CSCE 4430 ASSIGNMENT #1
Due Wednesday, September 15, 2021
R is a programming language for statistical computing and data analysis. Consider the following
extended BNF grammar for a subset of R, called MicroR.
Program       ::= source ( "List.R" ) {FunctionDef} MainDef
FunctionDef   ::= id <- function ( [id {, id}] ) { {Statement} return ( Expr ) ; }
MainDef       ::= main <- function ( ) { StatementList }
StatementList ::= Statement { Statement }
Statement     ::= if ( Cond ) { StatementList } [else { StatementList }]
                | while ( Cond ) { StatementList }
                | id <- Expr ;
                | print ( Expr ) ;
Cond          ::= AndExpr {|| AndExpr}
AndExpr       ::= RelExpr {&& RelExpr}
RelExpr       ::= [!] Expr RelOper Expr
RelOper       ::= < | | >= | == | !=
Expr          ::= MulExpr {AddOper MulExpr}
AddOper       ::= + | -
MulExpr       ::= PrefixExpr {MulOper PrefixExpr}
MulOper       ::= * | /
PrefixExpr    ::= [AddOper] SimpleExpr
SimpleExpr    ::= integer | ( Expr ) | as.integer ( readline ( ) ) | id [ ( [Expr {, Expr}] ) ]
                | cons ( Expr , Expr ) | head ( Expr ) | tail ( Expr ) | null ( )
Syntactic and Semantic Conventions The keywords and the token symbols in MicroR are in
bold. Note that MicroR, like C, C++ and Java, has symbols { and }, which are distinguished from
grammar metasymbols { and }, respectively, by underlining.
An id can contain only letters (alphabetic characters), digits, and underscores (_), with the
restrictions that it must begin with a letter, cannot end with an underscore, and cannot have two
consecutive underscores. For example, give_2_Joe, tell_me, and A45Asm3 are valid identifiers, but
6gh, two__bad, and no_end_ are not. integer is an unsigned integer. A comment is introduced by
the # character.
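
The identifier restrictions can be captured in a single regular expression. The following is a minimal sketch in Java, not a required solution; the class name IdCheck, the method isValidId, and the sample strings are hypothetical and chosen only to exercise each restriction. The pattern requires a leading letter and lets an underscore appear only when a letter or digit follows it, which rules out both a trailing underscore and two underscores in a row.

```java
import java.util.regex.Pattern;

class IdCheck {
    // A letter, then any number of units, where each unit is either a single
    // letter/digit or an underscore immediately followed by a letter/digit.
    static final Pattern ID =
        Pattern.compile("[A-Za-z]([A-Za-z0-9]|_[A-Za-z0-9])*");

    // True only if the whole string satisfies the identifier rule.
    static boolean isValidId(String s) {
        return ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Illustrative samples (assumed, one per restriction).
        String[] samples = {
            "give_2_Joe", "tell_me", "A45Asm3", "6gh", "two__bad", "no_end_"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValidId(s));
        }
    }
}
```

Keywords such as if and while also match this pattern, so a lexer built on it would still need to check a matched word against the keyword list before classifying it as an id.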
1. Determine the set of tokens which a lexical analyzer would need to recognize.
2. Design and implement a lexical analyzer procedure to read a source program in the above
language and print the next token in the input stream. If the token detected is a valueless
token, such as a keyword, then it is sufficient to print only the keyword. If it has a value,
then both the token type and lexeme should be printed.
3. You will be given several MicroR programs with which to test your lexical analyzer. These
will be located on the class WWW page and will be of the form Test1.R, Test2.R, etc.
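
The printing rule in item 2 can be illustrated with a small token type. This is only a sketch under stated assumptions: the names TokenDemo, Kind, and Token are hypothetical, and the output format (type, a space, then the lexeme) is one possible choice, since the handout requires only that both the token type and the lexeme appear for tokens that carry a value.

```java
class TokenDemo {
    // Assumed token categories; the real set comes from item 1.
    enum Kind { KEYWORD, SYMBOL, ID, INTEGER }

    static final class Token {
        final Kind kind;
        final String lexeme;

        Token(Kind kind, String lexeme) {
            this.kind = kind;
            this.lexeme = lexeme;
        }

        // Only identifiers and integers carry a value in this sketch.
        boolean hasValue() {
            return kind == Kind.ID || kind == Kind.INTEGER;
        }

        // Valueless tokens print as the bare lexeme; valued tokens print
        // the token type followed by the lexeme.
        @Override
        public String toString() {
            return hasValue() ? kind + " " + lexeme : lexeme;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Token(Kind.KEYWORD, "while")); // prints "while"
        System.out.println(new Token(Kind.ID, "count"));      // prints "ID count"
        System.out.println(new Token(Kind.INTEGER, "42"));    // prints "INTEGER 42"
    }
}
```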
Suggestions:
1. Construct a token data structure, including a method to print a token.
2. Use the JFlex tool to automatically construct a lexical analyzer in Java for MicroR from a set
of regular expressions specifying tokens. This can interface with your token class to return a
token to the main program which then prints the token.
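
To make the division of labor concrete, here is a hand-written stand-in for the scanner, using java.util.regex rather than JFlex. It is not JFlex output and not a JFlex specification; it only sketches the same idea of matching a set of regular expressions at the current input position and handing one token at a time to the caller. Several details are assumptions: the class name MiniLexer, the keyword list (the handout marks keywords by bold type), the operators <= and > (the RelOper production as extracted appears incomplete), comments running to the end of the line as in R, and the output strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniLexer {
    // Assumed keyword list; matched words not in this set are identifiers.
    static final Set<String> KEYWORDS = Set.of(
        "source", "function", "return", "main", "if", "else", "while",
        "print", "as.integer", "readline", "cons", "head", "tail", "null");

    // Alternatives are tried left to right, so two-character symbols are
    // listed before the one-character symbols they start with.
    static final Pattern TOKEN = Pattern.compile(
          "(?<SKIP>\\s+|#[^\\n]*)"
        + "|(?<STR>\"[^\"\\n]*\")"
        + "|(?<WORD>as\\.integer(?![A-Za-z0-9_])"
        +         "|[A-Za-z](?:[A-Za-z0-9]|_[A-Za-z0-9])*)"
        + "|(?<INT>[0-9]+)"
        + "|(?<SYM><-|<=|>=|==|!=|\\|\\||&&|[-+*/<>!(){};,])");

    // Scans the whole input and returns one printable string per token.
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            m.region(pos, src.length());
            if (!m.lookingAt()) {
                // No rule applies here: report the character and move on.
                out.add("ERROR " + src.charAt(pos));
                pos++;
                continue;
            }
            if (m.group("WORD") != null) {
                String w = m.group("WORD");
                out.add(KEYWORDS.contains(w) ? w : "ID " + w);
            } else if (m.group("INT") != null) {
                out.add("INTEGER " + m.group("INT"));
            } else if (m.group("STR") != null) {
                out.add("STRING " + m.group("STR"));
            } else if (m.group("SYM") != null) {
                out.add(m.group("SYM"));
            }
            // SKIP (whitespace and comments) produces no token.
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokenize("x <- 12; # note\nprint(x);")) {
            System.out.println(t);
        }
    }
}
```

One difference worth keeping in mind: a regex alternation in Java takes the first alternative that matches, whereas JFlex picks the longest match among all rules, so the ordering tricks used in this sketch are not needed in a JFlex specification.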