The second project involves modifying the syntactic analyzer for the attached compiler byadding to the existing grammar. The full grammar of the language is shown below. Thehighlighted portions of the grammar show what you must either modify or add to the existinggrammar.function:function_header {variable} bodyfunction_header:FUNCTION IDENTIFIER [parameters] RETURNS type ;variable:IDENTIFIER : type IS statement ; |IDENTIFIER : LIST OF type IS list ;list:( expression {, expression} )parameters:parameter {, parameter}parameter:IDENTIFIER : typetype:INTEGER | REAL | CHARACTERbody:BEGIN statement END ;statement:expression ; |WHEN condition , expression : expression ; |SWITCH expression IS {case} OTHERS ARROW statement ENDSWITCH ; |IF condition THEN statement {ELSIF condition THEN statement}ELSE statement ENDIF ; |FOLD direction operator list_choice ENDFOLD ;case:CASE INT_LITERAL ARROW statementdirection:LEFT | RIGHToperator:ADDOP | MULOPlist_choice:list |IDENTIFIERcondition:expression RELOP expression |condition logical_operator condition |( condition ) |NOTOP conditionlogical_operator:ANDOP | OROPexpression:( expression ) |expression arithmetic_operator expression |NEGOP expression |INT_LITERAL | REAL_LITERAL | CHAR_LITERAL |IDENTIFIER ( expression ) |IDENTIFIERarithmetic_operator: ADDOP | MULOP | MODOP | EXPOPIn the above grammar, the red symbols are nonterminals, the blue symbols are terminals and theblack punctuation are EBNF metasymbols. The braces denote repetition 0 or more times and thebrackets denote optional.You must rewrite the grammar to eliminate the EBNF brace and bracket metasymbols and toincorporate the significance of parentheses, operator precedence and associativity for alloperators. The precedence and associativity rules are as follows: Among binary arithmetic operators the exponentiation operator has highest precedencefollowed by the multiplying operators and then the adding operators. But the unarynegation operator ~ has higher precedence that all of the binary operators. All relational operators have the same precedence. Among the binary logical operators, & has higher precedence than |. But the unary logicaloperator ! has higher precedence than either of the binary logical operators. All binary operators except the exponentiation operator are left associative.The directives to specify precedence and associativity, such as %prec and %left, may not beused.Your parser should be able to correctly parse any syntactically correct program without anyproblem.You must modify the syntactic analyzer to detect and recover from additional syntax errors usingthe semicolon as the synchronization token. To accomplish detecting additional errors an errorproduction must be added to the function body, to the variable declaration and to the whenclause.Your bison input file should not produce any shift/reduce or reduce/reduce errors. Eliminatingthem can be difficult so the best strategy is not introduce any. That is best achieved by makingsmall incremental additions to the grammar and ensuring that no addition introduces any sucherrors.An example of compilation listing output containing syntax errors is shown below:1 — Multiple errors23 function main a integer returns real;Syntax Error, Unexpected INTEGER, expecting ‘:’4 b: integer is * 2;Syntax Error, Unexpected MULOP5 c: real is 6.0;6 begin7 if a > c then8 b + / .4;Syntax Error, Unexpected MULOP9 else10 case b is11 when => 2;Syntax Error, Unexpected ARROW, expecting INT_LITERAL12 when 2 => c;13 endcase;Syntax Error, Unexpected ENDCASE, expecting OTHERS or WHEN14 endif;15 end;Lexical Errors 0Syntax Errors 5Semantic Errors 0You are to submit two files. The first is a .zip file that contains all the source code for the project. The .zip fileshould contain the flex input file, which should be a .l file, the bison file, which shouldbe a .y file, all .cc and .h files and a makefile that builds the project. The second is a Word document (PDF or RTF is also acceptable) that contains thedocumentation for the project, which should include the following:a. A discussion of how you approached the projectb. A test plan that includes test cases that you have created indicating what aspectsof the program each one is testing and a screen shot of your compiler run on thattest casec. A discussion of lessons learned from the project and any improvements that couldbe made Here is the recommended approach for project 2.
1) First build the skeleton for project 2, run it and be sure that you understand how it works.
2) Replace the lexical analyzer in the project 2 skeleton with your lexical analyzer from project
1. Be sure to remove the main method from your scanner.l. Also replace the listing.h and
listing.cc files in the skeleton with the ones from your version of project 1. The add the
necessary %token declarations for the new tokens. Note that the tokens.h from project 1 is no
longer needed. It will be generated by bison from the %token declarations. Verify that the project
builds correctly at this point. Then confirm that test cases test1.txt – test4.txt that were
provided as test cases for the skeleton code parse properly and that the test case syntax1.txt
produces a syntax error.
3) The simplest modification to make first is to modify the grammar to include variables and
literals of the real data type. You will need to modify the type and primary productions. Use
test5.txt to test this modification. Shown below is the output that should result when using
that test case as input:
$ ./compile < test5.txt
1
2
3
4
5
6
// Function with a Real and Character Variables and Literals
function main returns character;
a: real is 7.8e-1;
begin
when a < .45, '\n' : 'A';
end;
Compiled Successfully
4) Next, add the if statement to the statement production in the grammar. The if statement
allows zero or more elsif clauses represented in EBNF using braces. The grammar in the
specification defines the language but is not in the form needed for bison. It contains EBNF
symbols, that must be removed. In that part of the grammar, the EBNF braces are used to
indicate that a if statement contains zero or more elsif clauses. Because bison does not support
these EBNF symbols, a recursive production must be used instead. Study how the recursive
cases production in the skeleton defines zero or more case statements. You should use a similar
approach.
Be sure that you understand the difference between the meaning of ; and ';' in the bison input
file. The former is the symbol that terminates a production. The latter represents the semicolon
symbol in the target language as does the ; in the project specification. It is also important to
distinguish between the statement and the statement_ productions that were provided in the
skeleton code. The latter incorporates the semicolon that ends the if statement in the target
language and also provides recovery should an error occur within a statement. Use test6.txt to
test this modification. Shown below is the output that should result when using that test case as
input:
$ ./compile < test6.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// Function with an If Statement
function main returns integer;
a: integer is #A;
begin
if a < 10 then
1;
elsif a < 20 then
2;
elsif a < 30 then
3;
else
4;
endif;
end;
Compiled Successful
5) The fold statement would be best implemented next. Note that it will require two subordinate
production direction and operator. Use test7.txt to test this modification. Shown below is
the output that should result when using that test case as input:
$ ./compile < test7.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// Left and Right Fold Statement
function main returns integer;
a: list of integer is (1, 2, 3);
begin
switch a(1) is
case 1 =>
fold right – (2,3, 4) endfold;
case 2 =>
fold left + a endfold;
others =>
0;
endswitch;
end;
Compiled Successfully
6) The provided skeleton allows an optional, 0 or 1, variable declaration. The requirements state
that 0 or more should be permitted. This modification again requires replacing the EBNF braces
with a recursive production. Use test8.txt to test this modification. Shown below is the output
that should result when using that test case as input:
$ ./compile < test8.txt
1
2
3
4
5
6
// Multiple Integer Variable Initializations
function main returns integer;
b: integer is 5 + 1 - 4;
c: integer is 2 + 3;
begin
7
8
b + 1 - c;
end;
Compiled Successfully
7) The next feature to add is 0 or more parameter declarations in the function header. Because
the parameters require comma separators, notice that the grammar is written to indicate that the
parameters are optional because the parameters production must contain at least one. Study how
optional elements are included in the grammar by noticing how an optional variable declaration
was implemented in the skeleton. Also study how the elements production is implemented. Like
the parameters, it involves comma separators. Two test cases are provided to test this change.
Before using either of them, use one of the early test cases to confirm that your parser still allows
no parameters. Then proceed to use, test9.txt, to verify that program with one parameter
declaration parses correctly. Shown below is the output that should result when using that test
case as input:
$ ./compile < test9.txt
1
2
3
4
5
6
// Single Parameter Declaration
function main a: integer returns integer;
begin
a + 1;
end;
Compiled Successfully
The next test case, test10.txt contains two parameter declarations. Shown below is the output
that should result when using that test case as input:
$ ./compile < test10.txt
1
2
3
4
5
6
7
// Two Parameter Declarations
function main a: integer, b: real returns real;
c: real is .7;
begin
a + b * c;
end;
Compiled Successfully
8) The additional binary arithmetic operators for remainder and exponentiation and the unary
minus operator should be added next. Notice how the existing operators are implemented in the
skeleton code. There must be one production for each level of precedence. Because the
remainder operator has the same precedence as the multiplying operators, it should be included
as another right-hand side to the existing production. But because the exponentiation operator
has higher precedence a production must be introduced. Because that operator is right
associative, its production must be right recursive. Because the unary minus operator has higher
precedence than all the binary arithmetic operators. In the Course Resources area of the
classroom you will find Module 1 from CMSC 330. You may want to read section II-B if you
need a better understanding of how to construct a grammar so that it produces the correct parse
tree to account for precedence and associativity.
Then proceed to use, test11.txt, to verify that program containing every arithmetic operator
parses correctly. Shown below is the output that should result when using that test case as input:
$ ./compile < test11.txt
1
2
3
4
5
6
// Arithmetic Operators
function main returns integer;
begin
9 + ~2 - (5 - 1) / 2 % 3 * 3 ^ 1 ^ 2;
end;
Compiled Successfully
9) The additional logical operators, which include the | and ! operators should be added next.
Each has a separate precedence level. The | operator has the lowest precedence of all operators
and ! has the highest. So, two new productions must be added to the grammar. Shown below is
the output that should result when using that test case as input:
$ ./compile < test12.txt
1
2
3
4
5
6
// Relational and Logical Operators
function main returns integer;
begin
when 5 > 8 & 3 = 3 | 9 < 1 & !(3 7) | 6 = 9, 1 : 0;
end;
Compiled Successfully
10) At this point your parser should contain the complete grammar for the language. As a final
check, use test cases test13.txt, which contains a nested if together with most elements of the
language and test14.txt, which contains a nested switch statement. Shown below is the
output that should result when using test13.txt as input:
$ ./compile < test13.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// Comprehensive Test with Nested If
function main a: integer, b: character, c: real returns real;
d: integer is #8e;
e: real is 3.75;
f: character is 'A';
g: list of integer is (1, 3, 5);
begin
if ~a > 5 & a < 1 & !(c = 5.8 | c .7E4) then
if c >= 7.5E-2 & c = d, a + 2 – 7.9E+2 / 9 * 4 : 8.9;
elsif g(1) = a ^ 2 % 3 then
a % 2 – 5 / c;
else
15
16
17
18
19
20
fold left + (1, 2, 3) endfold;
endif;
else
fold right – g endfold;
endif;
end;
Compiled Successfully
Shown below is the output that should result when using test14.txt as input:
$ ./compile < test14.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// Comprehensive Test with Nested Switch
function main a: integer, b: real returns real;
c: integer is 8;
d: real is .7E2;
begin
switch a is
case 1 => a * 2 / d ^ 2;
case 2 => a + 5.3E+2 – b;
case 3 =>
switch d is
case 1 => a % 2;
others => 9.1E-1;
endswitch;
case 4 => a / 2 – c;
others => a + 4.7 * b;
endswitch;
end;
Compiled Successfully
11) At this point, you are ready to implement the error recovery portion of the program. Verify
first that test case syntax1.txt that was include with the project 2 test data produces the correct
output. Then modify the grammar so that errors in the function header will be properly recovered
from. You will need to introduce an additional production that includes the error token. You
should use the statement_ production in the skeleton code as a model. Once you have
implemented the error recovery production for the function header, use test case syntax2.txt to
test it. Shown below is the output that should result when using syntax2.txt as input:
$ ./compile < syntax2.txt
1 // Error in Function Header, Missing Colon
2
3 function main a integer returns integer;
Syntax Error, Unexpected INTEGER, expecting ':'
4
b: integer is 3 * 2;
5 begin
6
if a 5 then
6
a * 3;
7
else
8
2 + a;
9
endif;
10
c: real is 3.5;
11 begin
12
if a a 2;
Syntax Error, Unexpected INT_LITERAL, expecting ';'
7
case 2 => 5;
8
endswitch;
Syntax Error, Unexpected ENDSWITCH, expecting CASE or OTHERS
9 end;
Lexical Errors 0
Syntax Errors 2
Semantic Errors 0
The fact that the second error, the one indicating that the others clause is missing, confirms that
recovery from the first error message has occurred.
14) As a final test use test case syntax5.txt, which contains a variety of syntax errors. Shown
below is the output that should result when using that test case as input:
$ ./compile < syntax5.txt
1 // Multiple Errors
2
3 function main a integer returns real;
Syntax Error, Unexpected INTEGER, expecting ':'
4
b: integer is * 2;
Syntax Error, Unexpected MULOP
5
c: real is 6.0;
6 begin
7
if a > c then
8
b + / 4.7;
Syntax Error, Unexpected MULOP
9
else
10
switch b is
11
case => 2;
Syntax Error, Unexpected ARROW, expecting INT_LITERAL
12
case 2 => c;
13
endswitch;
Syntax Error, Unexpected ENDSWITCH, expecting CASE or OTHERS
14
endif;
15 end;
Lexical Errors 0
Syntax Errors 5
Semantic Errors 0
If you have implemented all the required error recovery, your compiler should detect all five
syntax errors.
All of the test cases discussed above are included in the attached .zip file.
You are certainly encouraged to create any other test cases that you wish to incorporate in your
test plan. Keep in mind that your parser should parse all syntactically correct programs, so it is
recommended that you choose some different test cases as a part of your test plan. Your
instructor may use a comparable but different set of test cases when testing your project.
CMSC 430 Project 2
The second project involves modifying the syntactic analyzer for the attached compiler by
adding to the existing grammar. The full grammar of the language is shown below. The
highlighted portions of the grammar show what you must either modify or add to the existing
grammar.
function:
function_header {variable} body
function_header:
FUNCTION IDENTIFIER [parameters] RETURNS type ;
variable:
IDENTIFIER : type IS statement ; |
IDENTIFIER : LIST OF type IS list ;
list:
( expression {, expression} )
parameters:
parameter {, parameter}
parameter:
IDENTIFIER : type
type:
INTEGER | REAL | CHARACTER
body:
BEGIN statement END ;
statement:
expression ; |
WHEN condition , expression : expression ; |
SWITCH expression IS {case} OTHERS ARROW statement ENDSWITCH ; |
IF condition THEN statement {ELSIF condition THEN statement}
ELSE statement ENDIF ; |
FOLD direction operator list_choice ENDFOLD ;
case:
CASE INT_LITERAL ARROW statement
direction:
LEFT | RIGHT
operator:
ADDOP | MULOP
list_choice:
list |
IDENTIFIER
condition:
expression RELOP expression |
condition logical_operator condition |
( condition ) |
NOTOP condition
logical_operator:
ANDOP | OROP
expression:
( expression ) |
expression arithmetic_operator expression |
NEGOP expression |
INT_LITERAL | REAL_LITERAL | CHAR_LITERAL |
IDENTIFIER ( expression ) |
IDENTIFIER
arithmetic_operator: ADDOP | MULOP | MODOP | EXPOP
In the above grammar, the red symbols are nonterminals, the blue symbols are terminals and the
black punctuation are EBNF metasymbols. The braces denote repetition 0 or more times and the
brackets denote optional.
You must rewrite the grammar to eliminate the EBNF brace and bracket metasymbols and to
incorporate the significance of parentheses, operator precedence and associativity for all
operators. The precedence and associativity rules are as follows:
Among binary arithmetic operators the exponentiation operator has highest precedence
followed by the multiplying operators and then the adding operators. But the unary
negation operator ~ has higher precedence that all of the binary operators.
All relational operators have the same precedence.
Among the binary logical operators, & has higher precedence than |. But the unary logical
operator ! has higher precedence than either of the binary logical operators.
All binary operators except the exponentiation operator are left associative.
The directives to specify precedence and associativity, such as %prec and %left, may not be
used.
Your parser should be able to correctly parse any syntactically correct program without any
problem.
You must modify the syntactic analyzer to detect and recover from additional syntax errors using
the semicolon as the synchronization token. To accomplish detecting additional errors an error
production must be added to the function body, to the variable declaration and to the when
clause.
Your bison input file should not produce any shift/reduce or reduce/reduce errors. Eliminating
them can be difficult so the best strategy is not introduce any. That is best achieved by making
small incremental additions to the grammar and ensuring that no addition introduces any such
errors.
An example of compilation listing output containing syntax errors is shown below:
1 — Multiple errors
2
3 function main a integer returns real;
Syntax Error, Unexpected INTEGER, expecting ‘:’
4
b: integer is * 2;
Syntax Error, Unexpected MULOP
5
c: real is 6.0;
6 begin
7
if a > c then
8
b + / .4;
Syntax Error, Unexpected MULOP
9
else
10
case b is
11
when => 2;
Syntax Error, Unexpected ARROW, expecting INT_LITERAL
12
when 2 => c;
13
endcase;
Syntax Error, Unexpected ENDCASE, expecting OTHERS or WHEN
14
endif;
15 end;
Lexical Errors 0
Syntax Errors 5
Semantic Errors 0
You are to submit two files.
The first is a .zip file that contains all the source code for the project. The .zip file
should contain the flex input file, which should be a .l file, the bison file, which should
be a .y file, all .cc and .h files and a makefile that builds the project.
The second is a Word document (PDF or RTF is also acceptable) that contains the
documentation for the project, which should include the following:
a. A discussion of how you approached the project
b. A test plan that includes test cases that you have created indicating what aspects
of the program each one is testing and a screen shot of your compiler run on that
test case
c. A discussion of lessons learned from the project and any improvements that could
be made