Regular Languages: Finite State Machines and Regular Expressions

Hello there,

Save Time On Research and Writing
Hire a Pro to Write You a 100% Plagiarism-Free Paper.
Get My Paper

Here, I’m attaching all the homework documents and required readings. Let me know if you can access all of them.

Chapter 1
Introduction
n
Programming a physical machine requires composing ordered sequences of machine-specific operation codes
ut
io
for instructions along with providing codes for data and access to data. The operation codes are decoded
by circuits that activate other circuits in the machine to execute the instructions. With the first electronic
computers, it was difficult to read, debug, and modify the numeric codes representing the programs, thus
rib
taking more time to get programs to run. Also, deploying the programs were time-consuming because it often
took many hours or days to load the program codes into the “memory” since it involved physically wiring
st
some of the circuits together. Because of these reasons, programming languages were designed to express
programs that resembled the syntax of natural languages, like English, but with very precise meaning.
di
Translator programs were developed to translate those programs into the machine code.
The first languages (called assembler languages) and translators (assemblers) were developed to overcome
or
the challenges of reading and writing machine level code. Assembler languages use mnemonic names for the
ot
f
operations, such as ADD; operands represented as names of registers or memory addresses (variables), such
as R1 or X; and operands as constant values represented with specific syntax, such as 2 or ’A’. The assembler
instructions stand in almost a one-to-one correspondence with the machine instructions. This correspondence
N
together with mnemonic names and names for operands are what makes assembler programs easier to read
and write than the respective machine program. An example of an assembler program is shown later in this
chapter. Assembler language and assemblers were introduced in the early 1950s. Since then, and still today,
assembler languages are used when certain applications require access to specific low-level hardware (e.g.,
device drivers) or where the programmer needs to write more time and/or space efficient code than that
produced by other higher-level language translators.
High-level programming languages were developed to express abstractions found in mathematical concepts, algorithms, and computational models and used by many programmers. One of the first high-level
programming languages developed originally at IBM in the mid 1950s was ForTran (which stands for Formula Translator), which was more abstract than assembler languages. Fortran was intended for representing
scientific computations and related abstractions, although has been used for other application domains. In
the late 1950s and early 1960s COBOL, Algol60, and Lisp (List Processing) were designed and implemented
with translators that generated assembler code or machine code. COBOL, Algol60 and Lisp were designed
1
Introduction
to “abstract away” lower-level views of hardware. COBOL is closest to English and was originally designed
for business applications. Many institutions still have large amounts of legacy COBOL code in use today.
Algol60 was introduced to be more general purpose in terms of use in application domains. Algol60 also introduced many programming concepts relevant to many widely-used programming languages of today. Lisp
is based upon the concepts of expressing programs as pure mathematical functions and treating functions
as values or objects as operands to other functions. The design philosophy of Lisp was based upon a pure
function evaluation without regards to an underlying computer.
Programming language concepts consist of abstractions for expressing computations and representing
data. They serve as means to examine and compare languages. Examples of concepts include scoping
rules, lifetimes of objects, bindings and binding times, recursion, block-structures, data types, orthogonality,
abstract control structures, concurrency, functions and procedures, and parameter passing rules. There
are many more. This book is organized around groups of related concepts. For each concept, several
n
languages are sampled and compared in terms of how they utilize and implement the concept and other
ut
io
related concepts. Many concepts are interrelated and cannot be studied in isolation. Again, they are
viewed from the perspectives of the designer, programmer (user), and implementor. For example, functions,
recursion, binding times, and parameters are all interrelated. We look at the motivation of the designer to
rib
include or not include some concept in a language, the pros and cons of how a language employs a concept
from the programmer perspective, and how implementors implement that concept on a computational model
st
according to the concept’s semantics.
di
To help study programming languages, it helps to have an understanding of three areas of study: 1) theory
of computation (theoretical computer science); 2) software design methodologies or techniques (software
or
engineering); and 3) general knowledge of computer architecture and organization (computer engineering).
Each of all three views (designer, user, implementor) of programming languages utilize all three knowledge
ot
f
areas. For example, from an implementor point of view, one must have a working knowledge of theoretical
computer science, such as formal grammar theory and related finite automata and push-down automata
computational model theory; must have a good working knowledge of design and implementation practices of
engineering.
N
software engineering; and have an understanding of machine architectures and operation related to computer
For the remainder of this chapter, discussion expands on topics already mentioned above, reflects on
several other aspects of the study of programming languages, and introduces translators, both compilers and
interpreters, and hybrids of these two types. First, because programming languages are used to represent
algorithms, which are models of some process in a real or virtual world, and that concepts of programming
languages are abstractions, a brief review of the roles of models and abstractions is given. Next, we look
at several reasons for why there are so many languages. Several examples of how some changes in technology influences programming language design and implementation mentioned above are presented. A brief
philosophical look at the purpose of programming languages is also included. Then, to provide the foundation for introducing language translators, semi-formal presentations of computational problems, algorithms,
computational models, syntax, semantics and pragmatics are given. Finally, we look at the implementation
of programming language translators, namely the high-level designs of them and supportive runtime sys-
Copyright 2024
2
Introduction
1.1 Models and Abstraction
tems, linkers and libraries. In this discussion an introduction of the role and application of abstractions,
computational models, and formal grammars are given. How two types of language translators, compilers
and interpreters, work along with their advantages and disadvantages are compared. Finally, categories of
languages and some programming languages are introduced. All sections in this chapter are intended to
provide a foundation for the rest of the book.
1.1
Models and Abstraction
Scientists and engineers use models and abstractions to understand and design complex systems. They
try to understand a complex system within the natural world by forming models and designing an abstract
representation of chosen relevant objects and relationships between them. Deciding upon the relevant objects
along with deciding upon what properties and relationships to represent is the primary issue in forming a
n
given model. Abstraction is the concept of choosing the details of the representation of the objects and
ut
io
operations required for the model. Part of constructing models involves deciding on the level of abstraction for
the representation of objects, relations, and operations. For example, engineers will not use the abstraction
level of the molecular structure of steel used in the model of a bridge to be built. They will use relevant
rib
objects and an appropriate abstraction level, such as representations of steel beams to be used in a physical
bridge, and use the appropriate physics models, such as the mathematics representing static and dynamic
st
forces.
Building software products is an engineering endeavor. Therefore, practices and concepts used by other
di
engineering disciplines are needed. Similarly, software developers write code that simulates certain aspects
of the natural world, such as the code used in flight simulators, air-traffic control systems, and physics in a
or
video game. In the study of collisions of objects, scientists develop models that consider the properties of
ot
f
mass, speed, and direction of the objects and arrive at the phenomena, such as conservation of momentum.
In simulation software, a software developer expresses a scientific model or engineering design model into a
model within the constraints of a computing model. Ultimately, the scientific or engineering model has to
N
be realized on a physical computing device.
Software engineers build programs as their artifacts and use models and abstraction in: 1) uncovering
requirements; 2) formally specifying the scope of the system; 3) the creating a design of the system; and
4)the implementation of the design in a given programming language. In this book, the term, software,
includes not just the target program or programs making up the system, but all of the documents created
during the construction of the program(s). At all stages of software development, abstractions are used and
the program acts a model of some task or phenomenon.
One of the general approaches to using abstractions for understanding a software system is to organize a
system into levels of abstraction. This is used for understanding many software systems, such as operating
systems and virtual machines. Abstraction is used extensively in building translators and support code,
like runtime systems and virtual machines for the purpose of fully implementing a programming language
semantics. The study of programming languages involves the foundations of computational theory, the
use of principles of software engineering design, and the knowledge of thinking on all layers of abstraction,
Copyright 2024
3
1.2 Some Reasons for Studying Programming Language Concepts
Introduction
from the lowest-layer to highest-layer of abstraction. A deeper discussion of levels of abstraction is given in
Section 1.10.
1.2
Some Reasons for Studying Programming Language Concepts
Knowing how to program in more than one language helps expand occupational opportunities for software
development, helps people who work in a wide spectrum of disciplines, and gives programmers insights into
solving problems. So, while there are good arguments for knowing several languages, why should one be
interested in comparing the concepts surrounding programming languages? While most developers are not
going to design a full language or write compilers or interpreters, why should one be well-versed in the
implementation of language or translators? There are many reasons for studying programming language
concepts and related theory. Some are given here and will be illustrated and developed throughout the
n
book.
ut
io
Enhancing ones ability to express and implement algorithms and data structures. Knowledge of a programming language helps one’s understanding of a computational problem and the algorithm used. (Computational problems and algorithms are more formally defined later in this introduction.) As with natural
rib
languages, those fluent in a language can better express their thoughts and ideas for solving problems. Studying the design goals of a programming language and the underlying computational models or mathematical
st
models for expressing the semantics of the language helps in not only implementing an algorithm, but may
give rise to new insights into how to solve problems or may provide more efficient solutions based upon the
di
underlying computational model. An example of this is the use of move-semantics in C++11 (and later
versions) for copy constructors or assignment-member functions. In this case, knowledge of the language
or
semantics and implementation of how objects and parameter passing leads to more time and space efficient
ot
f
C++ code.
Study of programming language concepts helps develop the ability to learn new languages. Change in
the software and hardware industries is inevitable and ubiquitous. New hardware gives rise to new possible
N
applications and provides room for new innovations. These new applications and innovations also results
in new approaches for software design and implementation strategies, which usually calls for new ways to
develop software or to express solutions. Another example of the ability to learn a new language quickly helps
when confronted with an internal, proprietary language within an organization. By studying how grammars
formally define the syntax of languages and how grammars are used for building translators, one can have a
more structured approach to learning a language and deeper understanding of the syntax of the language;
this helps learn languages. Knowing how concepts apply to other languages helps with understanding the
design and proper use or pragmatics of a new language.
Knowledge of constructs of one language helps simulate concepts in another language. Knowing how to
program in a language that has built-in features to express and implement some abstraction, concept or
software design, helps one simulate it in another language that does not have the corresponding features.
Sometimes, a programmer does not have the luxury of the language choice. For example, suppose that a
programmer has to program in C and that programmer knows C++. If one knows how to use pointers to
Copyright 2024
4
Introduction
1.2 Some Reasons for Studying Programming Language Concepts
functions, how to pass pointers to structures, and knows the meaning of static variables, the C programmer
can simulate the implementation of classes and objects. Simulating iterators and generators of current
languages (e.g. Python, Ruby) in older languages is another example. Later in the book we’ll take a look at
this in detail.
Viewing user interfaces, data cleaning, data manipulation, and report generation as mini-language design
and implementation. Most software engineers do not directly design or build translators for full languages.
However, designing a user interface is similar to designing a language. For example, the task of data validation
in a front-end web page is a kind of syntax checking of legal or illegal strings. Often grammars or regular
expressions are used to formally specify what is legal and illegal text for, say, a data field in a web form. The
languages models (e.g., regular expressions, grammars) and translator-building tools (e.g., Antrl4, flex) can
be used for identifying patterns in data, thus “mining” or extracting useful data or information. Similarly,
knowledge of these tools can help develop software for identifying errors in data and for transforming data
n
into another form. A particular pattern represents a set of legal phrases or sentences that are members
ut
io
of a “mini-language.” Another example is creating a report according to a specified format. Strings that
adhere to the format forms a language, namely the language of legal reports as specified by the requirements
of a client. Having a view of these tasks as languages helps develop programs for these common tasks.
rib
There are several programming languages that have libraries or built-in functionality that can be utilized
for implementing these tasks. Moreover, this approach can be used when using software applications that
st
process text documents, spreadsheets, multi-media documents, to name a few categories of data formats.
di
XML is a mark-up language with a strict grammatical structure to represent data that is utilized by many
web applications. Knowing how parsing works in a translator helps understand code to process XML code.
or
There are many translator-building tools (e.g., Antrl4, yacc) that can be used for building applications to
process XML. XSLT is language used for easing the effort in processing XML code.
ot
f
Understanding utility tools, lower-level designs, impacts of hardware architecture can be useful. Highlevel languages exist to avoid examining low-level code or bind the code to specific hardware configurations.
However, sometimes when debugging code, the programmer must invoke the debugger utility to look at
N
the code generated by a compiler or to look at how the supporting runtime environment handles stack
management or the memory allocation. This becomes important when programming embedded systems or
an “internet-of-things” application. Knowing how the runtime works is related to the programming language
semantics and language implementation; this knowledge helps the programmer debug code and/or write more
efficient code. This is also important with security issues, such as buffer overflow problems with the runtime
stack. Sometimes, when using several different languages with separate compilation one has to use a linker to
change the way parameters are passed (e.g., stacked or queued) between assembler or machine code files. An
example of how knowing the hardware configuration and how the compiler represents data structures is that
it helps the programmer write code so that cache memory hits are higher. Knowing the parallel architecture
or the distributed computer network topology can help the programmer write better code for parallel or
distributive programming. For example, the simple knowledge of the number of processors and whether
shared or local distributed memory is configured helps the way the code is written. Studying programming
language theory helps ask the right questions and helps to know how to answer them.
Copyright 2024
5
1.3 The Evolution of Languages
1.3
Introduction
The Evolution of Languages
There are over 1000 programming languages according to some counts. There are dialects, much like U.S.
English versus British English. The difference between a dialect and a new language is not clear. Nevertheless,
there are many languages and here is an attempt to try to address why there are so many.
New approaches to software design. In early days of software design it was clear that some structure was
needed in expressing the flow of an algorithms representing programs. Flowcharts were used and in the late
1960s and early 1970s saw the introduction of the discipline of writing structured flowcharts. There should
be one entrance and one exit for each process, where each process is a simple task or a selection of one of
two sub-processes or an iteration of one subprocess. These three constructs were sequenced. Fortran and
COBOL used goto statements and had no selection statements or iteration statements. So new languages
were designed with if-then block, while-loop blocks, nested blocks, and other similar structures to reflect
n
this structured design. Thus, structured programming was adopted and new languages all included the
ut
io
higher-level control constructs to reflect this software design approach. It became apparent that non-local
variables shared between code in the nested blocks became difficult to maintain. Object-oriented design
demonstrated many advantages in development, maintenance, and reuse of code and in the 1980s and
rib
1990s new programming languages and dialects began to appear to reflect object-oriented design concepts.
Languages like Smalltalk, C++, Eiffel, and Java were introduced.
st
Changes to new hardware technology and related software designs. Smaller and faster processors, denser
and faster main memories, faster graphics processors, new display technology, and new input technology
di
(e.g., mice, touch-pads) led to personal computers and mobile devices with graphical user interfaces, which
led to the need for new approaches to software designs, such as event-driven program, related design patterns,
or
such as the model-view-controller design pattern. New languages were developed to help reflect the software
designs in a more direct way that with say older (but widely-used) languages, like C. C could be used, but
ot
f
the implementing the designs were closely modeled and it was easier to express the underlying assumptions
of the designs, such as concurrent operations. For example, threading was included in the semantics of the
N
first versions, such as Java, Ruby, Swift, Objective-C, and C#. The concept of delegation is important in
event-driven program and the concept of a delegate is in the language definition of C#.
Domain-Specific Languages. Lisp was developed for A.I. applications. In these applications symbols and
nested data structures are important in representing algorithms and data structures in these applications.
Prolog was used for processing natural languages and A.I. applications through the use of logic for representing knowledge bases. R is good for statistical applications. There are several dialects of Lisp and Prolog and
they all have been used for more general applications. Some companies develop code for a particular product and rather than use a general-purpose language, choose to develop an in-house (sometimes proprietary)
language for developing code.
Legacy Languages. Programs and software systems that were written originally in early versions of
Fortran, PL/I, and COBOL still are in use today. There has been a large amount of time and money
invested into some large systems of programs using these older languages. Several systems are tailored for
core subsystems (e.g., transaction systems) in insurance, banking, government, and health institutions. It
would be too expensive to rewrite these systems with new languages and also opens up the possibility of
Copyright 2024
6
Introduction
1.4 Reflections on the Purpose of Programming Languages
introducing errors. New software engineers and programmers have to learn (at least part) of these legacy
languages and to create interfaces with the new code written in current languages that utilize the older
systems.
1.4
Reflections on the Purpose of Programming Languages
All human languages are used for communicating thoughts and ideas and to share knowledge. One use of a
language is to convey or express knowledge of how to complete a task or solve a problem, such as transforming some data into another form of data, how to construct an artifact, or to describe how to react to the
surrounding environment given certain stimuli. Computer scientists call the steps to complete tasks, algorithms, which must be expressed in a language and they are used to solve computational problems. Human
or natural languages are used for communicating solutions or algorithms directly from human to human
n
through notations or sounds and sometimes only with sounds. Algorithms can be viewed as computations
ut
io
that are carried out by some machines or cooperative machines with or without some kind of memory and
input/output interfaces, called a computer system or, simply, a computer. Computers or machines can be
abstract (computational models) or physical. Languages used to express computations or algorithms on
rib
physical or abstract machines are called programming languages. Algorithms, computational models, computational models and computational problems are all terms that have rigorous meanings and are covered
st
in Section 1.5.
As done within the study of human-to-human communication of computations, we look at the descrip-
di
tion and use of programming languages for human-to-machine communication of computations. With all
communication there is a sender and a receiver where the exchanged content is encoded in some language. In
or
the case of human-to-machine communication, content (the message) are the instructions that are expressed
ot
f
in a programming language, the sender is a programmer who is the generator of the instructions in that
language and the receiver is a machine that is designed to decode the instructions and execute the necessary
actions. A programming language also has to have the expressiveness for representing the objects, such as
N
numbers, and their properties utilized within the algorithm. This is a very simplistic and birds-eye view of
communication, languages, and the concept of a machine. It is intended to get the reader to think about
relationships of algorithms, machines (or computational models), and languages and their roles in computations. It will provide the foundation of looking at the design, implementation, and use of programming
languages. Before moving on, let’s take a look at a simple example.
Example 1.1 To help illustrate the use of some concepts and terminology introduced thus far, let’s look at
a simple computational problem. Suppose one person wants to communicate to another person on how to
sum a finite sequence of numbers. We want to transform these numbers into one number, their collective
sum. We need an algorithm expressed in some language, such as a natural language like English. Suppose
the sender (e.g., teacher) in this case, might give the following instructions in English:
“Keep a running total. Read the first number and let that be the current running total. As long as
there is another number, add that number to the current total. When there are no more numbers
the total is your answer.”
Copyright 2024
7
1.4 Reflections on the Purpose of Programming Languages
Introduction
The receiver (student) most likely will formulate some mental or physical model of a computer system to
execute instruction according to the semantics of the English phrases. From this example we see how humanto-human communication can take place. In natural language, many assumptions are made. In this example
it assumed the total is initially zero since no numbers are read at the start of the task. There is no explicit
statement about when the sequence ends; it may be assumed that there is some sort of ending signal from
the list, such as no more numbers on the source document containing the numbers. When communicating
with a machine or telling the machine how to carry out this algorithm, we need a more detailed and formal
description of a language and machine.
When communicating an algorithm to a machine, we need to know how the machine works, use a specific
syntax with rigorous semantics, and understand how the input data to the algorithm are retrieved and where
and how to send the output. Shown below is a C++ program to represent the solution to the computational
problem of summing a finite sequence of numbers. In contrast to the general problem, suppose we had to
n
pick a particular set of numbers, such as integers, on the underlying abstract machine, which is, in turn,
ut
io
restricted by the underlying physical machine. These are just a couple of items that have to be rigorously
defined and specified.
ot
f
or
di
st
rib
Sum Function in C++
#include
using namespace std;
int main()
{
int sumTotal = 0;
int num;
while (!cin.eof()) {
cin >> num;
sumTotal += num;
}
cout

Still stressed from student homework?
Get quality assistance from academic writers!

Order your essay today and save 25% with the discount code LAVENDER