Page1 of 6
ACSC373: Compiler Writing
Semester: Fall 2011 Date: 12 Dec.12
Lecturer: Dr. Chrysostomos Chrysostomou Deadline: 11 Jan. 13
Project: Construction of a Simple Parser
INSTRUCTIONS:
You will illustrate the basic phases of the compilation process (lexical, syntax, and
semantic analysis) through a simple compiler for a programming language model
“MYLANG”.
The programming language “MYLANG” is very simple.
A. Lexical Conventions of MYLANG
1. The keywords of the language are the following:
declare read write
All keywords are reserved, and must be written in lower case.
2. Special symbols are the following:
{ } ( ) = + –
;
3. Other tokens are NAME that represents a string of letters and numbers, starting with a
letter, and NUMBER that represents a sequence of digits.
Lower and upper case letters are distinct.
4. White space consists of blanks, newlines, and tabs. White space is ignored except it
must separate NAME’s, NUMBER’s, and keywords.
Page 2 of 6
B. Syntax and Semantics of MYLANG
The syntax of “MYLANG” is described by the grammar rules defined as follows:
program : { statement_list
}
;
statement_list : statement ; statement_list
| ε
;
statement : declaration
| assignment
| read_statement
| write_statement
;
declaration : declare var
;
assignment : var = expression
;
read_statement : read var
;
write_statement : write expression
;
expression : term
| term + term
| term – term
;
term : NUMBER
| var
| ( expression )
;
var : NAME
;
The semantics of ‘MYLANG” should be clear: a “MYLANG” program consists of a
sequence of read/write or assignment statements. There are integer-valued variables
Page 3 of 6
(which need to be declared before they are used), and expressions are restricted to
addition and subtraction.
C. Example of MYLANG source program
A simple MYLANG program is shown below:
f
{
declare xyz;
xyz = (33+3)-35;
write xyz;
}
The output of the above program is, of course, 1.
D. Project Implementation
The project consists of three phases:
Phase I: Lexical Analysis
With the aid of the lexical analysis generator tool Flex, you will construct the lexical
analyzer, in order to transform a sequence of input characters into a sequence of tokens.
Phase II: Syntax Analysis
With the aid of the syntax analysis generator tool Bison, you will construct the syntax
analyzer, the parser, in order to check whether the sequence of tokens is grammatically
correct, according to the grammar rules that define the syntax of the source language.
Looking at the grammar rules for “MYLANG” (see section B, above) it seems clear that
a program is syntactically correct if the structure of the tokens matches the structure of a
Phase III: Semantic Analysis
Having established that the source text is syntactically correct, the compiler must now
perform additional checks such as determining the type of expressions and checking that
Page 4 of 6
all statements are correct with respect to the typing rules, that variables have been
properly declared before they are used, etc.
This phase is carried out using information from the parse tree and the symbol table.
In our project, very little needs to be checked, due to the extreme simplicity of the
language. The only checks that are performed verify that a variable has been declared
before it is used, and whether a variable has been re-declared.
E. Project Implementation Hints
Some important notes follow to help you in the implementation of the project.
1. In the Yacc/Bison specification, you need to specify the types of the attributes the
grammar symbols can hold. For example:
%union {
char *str;
int val;
}
. . .
%type
%type
. . .
2. For the symbol table implementation, you need to define a data structure (e.g., NODE)
with the appropriate members. For example, you may need to have a member of type
string to hold the name of the identifier, a member of type integer to hold the
kind of the identifier (e.g., READ, WRITE, or NAME), and a member of type
integer to mark a symbol in the table as declared.
Further, you may declare an array of type NODE, with a maximum size, say 100, or
you may create a dynamic linked list (the latter is considered a more effective
solution than the former one).
Page 5 of 6
Also, for the symbol table management, you need to define two important operations:
The insert and lookup (find) operations.
The insert function will create a new entry in the symbol table, whenever a new
identifier is declared.
The lookup (find) operation will search for a specific identifier in the symbol
table and return whether it has found it or not.
Moreover, you may need to insert the keywords read, write, and declare in the
symbol table, before the parsing begins (before the call of the yyparse() function
in the main()), in order not to be used as normal variables (identifiers).
F. Error Messages
A simple “MYLANG” program with errors is shown below:
f
{
read x;
x = x+2;
y = x+3;
write y;
declare z;
z = 3- 2;
declare z;
}
When you run your compiler, the following messages should appear:
$ ./out test-errors.ti
error no 1: Variable “x” not declared (line 2)
error no 2: Variable “x” not declared (line 3)
error no 3: Variable “x” not declared (line 3)
error no 4: Variable “x” not declared (line 4)
error no 5: Variable “y” not declared (line 4)
error no 6: Variable “y” not declared (line 5)
error no 7: Variable “z” already declared (line 8)
Page 6 of 6
Submission Deadline:
• You are allowed to form groups in order to do the coursework (project). Each group
consists of at most three students. Send an email to the instructor
(ch.chrysostomou@frederick.ac.cy) indicating the exact students names of each
group, until Friday, December 14, 2012.
• Submit all the necessary files electronically until Friday 14/01/13, through the e-
learning platform (moodle), http://e-learning.frederick.ac.cy/course/view.php?id=462
Attention:
It is very important for you to do your own writing for homework. The use of material
from other sources, to support your work if needed, should be acknowledged in the text.
• Do not copy another student’s project.
• Do not download a project from the internet, in whole or in part.
• Do not ask someone to do your project for you.
• Do not ask someone to do your writing for you.
The consequences for committing any of the previous acts of academic dishonesty can
include, at the least, a failing grade for the project.
Therefore, be careful not to plagiarize or cheat.