COMP 3320 – Programming Languages, Spring 2024
Due: May 5, 2024, 11:30 PM
1. What common programming language statement, in your opinion, is most detrimental to readability? Explain. (2.5 points)
2. In your opinion, what major features would a perfect programming language include? (2.5 points)
3. Describe the advantages and disadvantages of some programming environment you have used. (5 points)
4. Write a BNF description of the Boolean expressions of Java, including the three operators &&, ||, and ! and the relational expressions. (5 points)
5. Using the grammar below, show a parse tree and a leftmost derivation for each of the following statements: (4 x 2.5 = 10 points)
a. A = (A + B) * C
b. A = B + C + A
c. A = A * (B + C)
d. A = B * (C * (A + B))
6. Consider the following grammar: (10 points)

<S> → a <S> c <B> | <A> | b
<A> → c <A> | c
<B> → d | <A>

Which of the following sentences are in the language generated by this grammar? Show your steps.
a. abcd
b. acccbd
c. acccbcc
d. acd
e. accc
7. Given the following grammar and the right sentential form, draw a parse tree and show the phrases and simple phrases, as well as the handle. (3 x 5 = 15 points)

S → AbB | bAc
A → Ab | aBB
B → Ac | cBb | c

a. aAcccbbc
b. AbcaBccb
c. baBcBbbc
9. Consider the following Python program: (5 points)
List all the variables, along with the program units where they are declared, that are visible in the bodies of sub1, sub2, and sub3, assuming static scoping is used.
10. Describe a situation in which the add operator in a programming language would not be commutative. (5 points)
11. Rewrite the following pseudocode segment using a loop structure in the specified languages. Assume all the variables and constants are of floating-point type. Discuss which language, for this code, has the best writability, the best readability, and the best combination of the two. (5 x 5 = 25 points)
k = (j + 13) / 27
loop:
if k > 10 then goto out
k = k + 1.2
i = 3 * k - 1
goto loop
out: …
a. C++
b. C#
c. Java
d. Python
e. Ruby
12. A functional language could use some data structure other than the list. For example, it could use strings of single-character symbols. What primitives would such a language have in place of the CAR, CDR, and CONS primitives of Scheme? (5 points)
13. Write a Scheme function that takes a simple list of numbers as a parameter and returns a list with the largest and smallest numbers in the input list. (5 points)
14. Write a Scheme function that returns the reverse of its simple list parameter. (5 points)
Concepts of Programming Languages
Eleventh Edition, Global Edition
Robert W. Sebesta
University of Colorado at Colorado Springs
Global Edition contributions by
Soumen Mukherjee
RCC Institute of Information Technology
Arup Kumar Bhattacharjee
RCC Institute of Information Technology
Boston Columbus Indianapolis New York San Francisco Hoboken
Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto
Delhi Mexico City Sao Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo
Editorial Director: Marcia Horton
Executive Editor: Matt Goldstein
Editorial Assistant: Kelsey Loanes
VP of Marketing: Christy Lesko
Director of Field Marketing: Tim Galligan
Product Marketing Manager: Bram van Kempen
Field Marketing Manager: Demetrius Hall
Marketing Assistant: Jon Bryant
Director of Product Management: Erin Gregg
Team Lead Product Management: Scott Disanno
Program Manager: Carole Snyder
Production Project Manager: Pavithra Jayapaul,
Jouve India
Procurement Manager: Mary Fischer
Senior Specialist, Program Planning and
Support: Maura Zaldivar-Garcia
Assistant Acquisitions Editor, Global Edition:
Aditee Agarwal
Project Editor, Global Edition: Amrita Naskar
Manager, Media Production, Global Edition:
Vikram Kumar
Senior Manufacturing Controller, Production,
Global Edition: Trudy Kimber
Cover Designer: Lumina Datamatics Ltd.
Manager, Rights Management: Rachel Youdelman
Senior Project Manager, Rights Management:
Timothy Nicholls
Cover Art: Viachaslau Kraskouski/Shutterstock
Full-Service Project Management: Mahalatchoumy
Saravanan, Jouve India
Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England
and Associated Companies throughout the world
Visit us on the World Wide Web at:
www.pearsonglobaleditions.com
© Pearson Education Limited 2016
The rights of Robert W. Sebesta to be identified as the author of this work have been asserted by him in
accordance with the Copyright, Designs and Patents Act 1988.
Authorized adaptation from the United States edition, entitled Concepts of Programming Languages, 11th edition,
ISBN 978-0-13-394302-3, by Robert W. Sebesta, published by Pearson Education © 2016.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise,
without either the prior written permission of the publisher or a license permitting restricted copying in the
United Kingdom issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London
EC1N 8TS.
All trademarks used herein are the property of their respective owners. The use of any trademark in this
text does not vest in the author or publisher any trademark ownership rights in such trademarks, nor does
the use of such trademarks imply any affiliation with or endorsement of this book by such owners.
Many of the designations by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and the publisher was aware of a trademark
claim, the designations have been printed in initial caps or all caps.
ISBN 10: 1-292-10055-9
ISBN 13: 978-1-292-10055-5
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
10 9 8 7 6 5 4 3 2 1
Typeset in Janson Text LT Std 10/12 by Aptara
Printed and bound by Vivar in Malaysia
Changes for the Eleventh Edition
of Concepts of Programming Languages
• Chapter 6: Deleted the discussions of Ada’s subrange types, array initialization, records,
union types, pointers, and strong typing
• Chapter 7: Deleted the discussions of Ada operator associativity and mixed-mode
expressions
• Chapter 8: Expanded the paragraph on F# selection statements in Section 8.2.1.5
Deleted the discussion of the Ada for statement
• Chapter 9: Added three design issues for subprograms in Section 9.3
Deleted the discussions of Ada and Fortran multidimensional parameters
• Chapter 10: Replaced example program Main_2, written in Ada, with an equivalent
program written in JavaScript in Section 10.4.2
Changed Figure 10.9 to reflect this new JavaScript example
• Chapter 11: Deleted the discussions of Ada abstract data types, generic procedures,
and packages
Added a new paragraph to Section 11.4.3 (Abstract Data Types in Java)
• Chapter 12: Added a paragraph to Section 12.2.2 (Inheritance) that discusses access
control
Expanded the discussion of class variables in Section 12.2.2
Added a paragraph to Section 12.4.4 that discusses final classes in Objective-C
Reorganized Sections 12.5 to 12.9 into a single section
Added Table 12.1 on language design choices to Section 12.4.6.4
Added a new section, Section 12.6 (Reflection), including example programs in Java and C#
• Chapter 13: Deleted the discussions of Ada task termination and task priorities
• Chapter 14: Deleted exception handling in Ada
Added a new section, 14.4 (Exception Handling in Python and Ruby)
Preface
Changes for the Eleventh Edition
The goals, overall structure, and approach of this eleventh edition of
Concepts of Programming Languages remain the same as those of the ten
earlier editions. The principal goals are to introduce the fundamental
constructs of contemporary programming languages and to provide the reader
with the tools necessary for the critical evaluation of existing and future programming languages. A secondary goal is to prepare the reader for the study of
compiler design, by providing an in-depth discussion of programming language
structures, presenting a formal method of describing syntax, and introducing
approaches to lexical and syntactic analysis.
The eleventh edition evolved from the tenth through several different
kinds of changes. To maintain the currency of the material, much of the discussion of older programming languages, particularly Ada and Fortran, has
been removed. For example, the descriptions of Ada’s records, union types, and
pointers were removed from Chapter 6. Likewise, the description of Ada’s for
statement was removed from Chapter 8 and the discussion of Ada’s abstract
data types was removed from Chapter 11.
On the other hand, a section on reflection that includes two complete
program examples was added to Chapter 12, a section on exception handling
in Python and Ruby was added to Chapter 14, and a table of the design choices
of a few common languages for support for object-oriented programming was
added to Chapter 12.
In some cases, material has been moved. For example, Section 9.10 was
moved backward to become the new Section 9.8.
In one case, example program MAIN_2 in Chapter 10 was rewritten in
JavaScript, previously having been written in Ada.
Chapter 12 was substantially revised, with several new paragraphs, two new
sections, and numerous other changes to improve clarity.
The Vision
This book describes the fundamental concepts of programming languages by
discussing the design issues of the various language constructs, examining the
design choices for these constructs in some of the most common languages,
and critically comparing design alternatives.
Any serious study of programming languages requires an examination of
some related topics, among which are formal methods of describing the syntax
and semantics of programming languages, which are covered in Chapter 3.
Also, implementation techniques for various language constructs must be considered: Lexical and syntax analysis are discussed in Chapter 4, and implementation of subprogram linkage is covered in Chapter 10. Implementation of
some other language constructs is discussed in various other parts of the book.
The following paragraphs outline the contents of the eleventh edition.
Chapter Outlines
Chapter 1 begins with a rationale for studying programming languages. It then
discusses the criteria used for evaluating programming languages and language
constructs. The primary influences on language design, common design tradeoffs, and the basic approaches to implementation are also examined.
Chapter 2 outlines the evolution of the languages that are discussed in
this book. Although no attempt is made to describe any language completely,
the origins, purposes, and contributions of each are discussed. This historical
overview is valuable, because it provides the background necessary to understanding the practical and theoretical basis for contemporary language design.
It also motivates further study of language design and evaluation. Because none
of the remainder of the book depends on Chapter 2, it can be read on its own,
independent of the other chapters.
Chapter 3 describes the primary formal method for describing the syntax
of programming language—BNF. This is followed by a description of attribute
grammars, which describe both the syntax and static semantics of languages.
The difficult task of semantic description is then explored, including brief
introductions to the three most common methods: operational, denotational,
and axiomatic semantics.
Chapter 4 introduces lexical and syntax analysis. This chapter is targeted to
those Computer Science departments that no longer require a compiler design
course in their curricula. Similar to Chapter 2, this chapter stands alone and
can be studied independently of the rest of the book, except for Chapter 3, on
which it depends.
Chapters 5 through 14 describe in detail the design issues for the primary
constructs of programming languages. In each case, the design choices for several example languages are presented and evaluated. Specifically, Chapter 5
covers the many characteristics of variables, Chapter 6 covers data types, and
Chapter 7 explains expressions and assignment statements. Chapter 8 describes
control statements, and Chapters 9 and 10 discuss subprograms and their implementation. Chapter 11 examines data abstraction facilities. Chapter 12 provides
an in-depth discussion of language features that support object-oriented programming (inheritance and dynamic method binding), Chapter 13 discusses
concurrent program units, and Chapter 14 is about exception handling, along
with a brief discussion of event handling.
Chapters 15 and 16 describe two of the most important alternative programming paradigms: functional programming and logic programming.
However, some of the data structures and control constructs of functional
programming languages are discussed in Chapters 6 and 8. Chapter 15 presents an introduction to Scheme, including descriptions of some of its primitive functions, special forms, and functional forms, as well as some examples
of simple functions written in Scheme. Brief introductions to ML, Haskell,
and F# are given to illustrate some different directions in functional language
design. Chapter 16 introduces logic programming and the logic programming
language, Prolog.
To the Instructor
In the junior-level programming language course at the University of Colorado
at Colorado Springs, the book is used as follows: We typically cover Chapters 1
and 3 in detail, and though students find it interesting and beneficial reading,
Chapter 2 receives little lecture time due to its lack of hard technical content.
Because no material in subsequent chapters depends on Chapter 2, as noted
earlier, it can be skipped entirely, and because we require a course in compiler
design, Chapter 4 is not covered.
Chapters 5 through 9 should be relatively easy for students with extensive
programming experience in C++, Java, or C#. Chapters 10 through 14 are more
challenging and require more detailed lectures.
Chapters 15 and 16 are entirely new to most students at the junior level.
Ideally, language processors for Scheme and Prolog should be available for
students required to learn the material in these chapters. Sufficient material is
included to allow students to dabble with some simple programs.
Undergraduate courses will probably not be able to cover all of the material
in the last two chapters. Graduate courses, however, should be able to completely discuss the material in those chapters by skipping over some parts of
the early chapters on imperative languages.
Supplemental Materials
The following supplements are available to all readers of this book at
www.pearsonglobaleditions.com/Sebesta.
• A set of lecture note slides. PowerPoint slides are available for each chapter
in the book.
• All of the figures from the book.
A companion Web site to the book is available at www.pearsonglobaleditions.com/Sebesta.
This site contains mini-manuals (approximately 100-page tutorials) on
a handful of languages. These assume that the student knows how to program
in some other language, giving the student enough information to complete
the chapter materials in each language. Currently the site includes manuals for
C++, C, Java, and Smalltalk.
Solutions to many of the problem sets are available to qualified instructors
in our Instructor Resource Center at www.pearsonglobaleditions.com/Sebesta.
Language Processor Availability
Processors for and information about some of the programming languages
discussed in this book can be found at the following Web sites:
C, C++, Fortran, and Ada: gcc.gnu.org
C# and F#: microsoft.com
Java: java.sun.com
Haskell: haskell.org
Lua: www.lua.org
Scheme: www.plt-scheme.org/software/drscheme
Perl: www.perl.com
Python: www.python.org
Ruby: www.ruby-lang.org
JavaScript is included in virtually all browsers; PHP is included in virtually all
Web servers.
All this information is also included on the companion Web site.
Acknowledgments
The suggestions from outstanding reviewers contributed greatly to this book’s
present form and contents. In alphabetical order, they are:
Aaron Rababaah
Amar Raheja
Amer Diwan
Bob Neufeld
Bruce R. Maxim
Charles Nicholas
Cristian Videira Lopes
Curtis Meadow
David E. Goldschmidt
Donald Kraft
Duane J. Jarc
Euripides Montagne
Frank J. Mitropoulos
Gloria Melara
Hossein Saiedian
I-ping Chu
Ian Barland
K. N. King
Karina Assiter
Mark Llewellyn
Matthew Michael Burke
Michael Prentice
Nancy Tinkham
Neelam Soundarajan
Nigel Gwee
Pamela Cutter
Paul M. Jackowitz
Paul Tymann
Richard M. Osborne
Richard Min
Robert McCloskey
Ryan Stansifer
Salih Yurttas
Saverio Perugini
Serita Nelesen
Simon H. Lin
University of Maryland at Eastern Shore
California State Polytechnic University–Pomona
University of Colorado
Wichita State University
University of Michigan–Dearborn
University of Maryland–Baltimore County
University of California–Irvine
University of Maine
Louisiana State University
University of Maryland, University College
University of Central Florida
Nova Southeastern University
California State University–Northridge
University of Kansas
DePaul University
Radford University
Georgia State University
Wentworth Institute of Technology
University of Central Florida
SUNY Buffalo
Rowan University
Ohio State University
Southern University–Baton Rouge
Kalamazoo College
University of Scranton
Rochester Institute of Technology
University of Colorado–Denver
University of Texas at Dallas
University of Scranton
Florida Institute of Technology
Texas A&M University
University of Dayton
Calvin College
California State University–Northridge
Stephen Edwards, Virginia Tech
Stuart C. Shapiro, SUNY Buffalo
Sumanth Yenduri, University of Southern Mississippi
Teresa Cole, Boise State University
Thomas Turner, University of Central Oklahoma
Tim R. Norton, University of Colorado–Colorado Springs
Timothy Henry, University of Rhode Island
Walter Pharr, College of Charleston
Xiangyan Zeng, Fort Valley State University
Numerous other people provided input for the previous editions of Concepts of Programming Languages at various stages of its development. All of their
comments were useful and greatly appreciated. In alphabetical order, they are:
Vicki Allan, Henry Bauer, Carter Bays, Manuel E. Bermudez, Peter Brouwer,
Margaret Burnett, Paosheng Chang, Liang Cheng, John Crenshaw, Charles
Dana, Barbara Ann Griem, Mary Lou Haag, John V. Harrison, Eileen Head,
Ralph C. Hilzer, Eric Joanis, Leon Jololian, Hikyoo Koh, Jiang B. Liu, Meiliu
Lu, Jon Mauney, Robert McCoard, Dennis L. Mumaugh, Michael G. Murphy,
Andrew Oldroyd, Young Park, Rebecca Parsons, Steve J. Phelps, Jeffery
Popyack, Steven Rapkin, Hamilton Richard, Tom Sager, Raghvinder Sangwan,
Joseph Schell, Sibylle Schupp, Mary Louise Soffa, Neelam Soundarajan, Ryan
Stansifer, Steve Stevenson, Virginia Teller, Yang Wang, John M. Weiss, Franck
Xia, and Salih Yurnas.
Matt Goldstein, editor; Kelsey Loanes, editorial assistant; Team Lead
Product Management, Scott Disanno; Pavithra Jayapaul, and Mahalatchoumy
Saravanan, all deserve my gratitude for their efforts to produce the eleventh
edition both quickly and carefully.
The publishers would like to thank the following for reviewing the Global
Edition:
Sandeep B. L., M. S. Ramaiah Institute of Technology
Koushik S., M. S. Ramaiah Institute of Technology
Sheena V. M., Don Bosco College
About the Author
Robert Sebesta is an Associate Professor Emeritus in the Computer Science
Department at the University of Colorado–Colorado Springs. Professor
Sebesta received a BS in applied mathematics from the University of Colorado
in Boulder and MS and PhD degrees in computer science from Pennsylvania
State University. He has taught computer science for more than 40 years. His
professional interests are the design and evaluation of programming languages
and Web programming.
Contents

Chapter 1  Preliminaries
1.1 Reasons for Studying Concepts of Programming Languages
1.2 Programming Domains
1.3 Language Evaluation Criteria
1.4 Influences on Language Design
1.5 Language Categories
1.6 Language Design Trade-Offs
1.7 Implementation Methods
1.8 Programming Environments
Summary • Review Questions • Problem Set

Chapter 2  Evolution of the Major Programming Languages
2.1 Zuse's Plankalkül
2.2 Pseudocodes
2.3 The IBM 704 and Fortran
2.4 Functional Programming: Lisp
2.5 The First Step Toward Sophistication: ALGOL 60
2.6 Computerizing Business Records: COBOL
2.7 The Beginnings of Timesharing: Basic
Interview: Alan Cooper—User Design and Language Design
2.8 Everything for Everybody: PL/I
2.9 Two Early Dynamic Languages: APL and SNOBOL
2.10 The Beginnings of Data Abstraction: SIMULA 67
2.11 Orthogonal Design: ALGOL 68
2.12 Some Early Descendants of the ALGOLs
2.13 Programming Based on Logic: Prolog
2.14 History's Largest Design Effort: Ada
2.15 Object-Oriented Programming: Smalltalk
2.16 Combining Imperative and Object-Oriented Features: C++
2.17 An Imperative-Based Object-Oriented Language: Java
2.18 Scripting Languages
2.19 The Flagship .NET Language: C#
2.20 Markup-Programming Hybrid Languages
Summary • Bibliographic Notes • Review Questions • Problem Set • Programming Exercises

Chapter 3  Describing Syntax and Semantics
3.1 Introduction
3.2 The General Problem of Describing Syntax
3.3 Formal Methods of Describing Syntax
3.4 Attribute Grammars
History Note
3.5 Describing the Meanings of Programs: Dynamic Semantics
History Note
Summary • Bibliographic Notes • Review Questions • Problem Set

Chapter 4  Lexical and Syntax Analysis
4.1 Introduction
4.2 Lexical Analysis
4.3 The Parsing Problem
4.4 Recursive-Descent Parsing
4.5 Bottom-Up Parsing
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 5  Names, Bindings, and Scopes
5.1 Introduction
5.2 Names
History Note
5.3 Variables
5.4 The Concept of Binding
5.5 Scope
5.6 Scope and Lifetime
5.7 Referencing Environments
5.8 Named Constants
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 6  Data Types
6.1 Introduction
6.2 Primitive Data Types
6.3 Character String Types
History Note
6.4 Enumeration Types
6.5 Array Types
History Note
History Note
6.6 Associative Arrays
Interview: Roberto Ierusalimschy—Lua
6.7 Record Types
6.8 Tuple Types
6.9 List Types
6.10 Union Types
6.11 Pointer and Reference Types
History Note
6.12 Type Checking
6.13 Strong Typing
6.14 Type Equivalence
6.15 Theory and Data Types
Summary • Bibliographic Notes • Review Questions • Problem Set • Programming Exercises

Chapter 7  Expressions and Assignment Statements
7.1 Introduction
7.2 Arithmetic Expressions
7.3 Overloaded Operators
7.4 Type Conversions
History Note
7.5 Relational and Boolean Expressions
History Note
7.6 Short-Circuit Evaluation
7.7 Assignment Statements
History Note
7.8 Mixed-Mode Assignment
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 8  Statement-Level Control Structures
8.1 Introduction
8.2 Selection Statements
8.3 Iterative Statements
8.4 Unconditional Branching
History Note
8.5 Guarded Commands
8.6 Conclusions
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 9  Subprograms
9.1 Introduction
9.2 Fundamentals of Subprograms
9.3 Design Issues for Subprograms
9.4 Local Referencing Environments
9.5 Parameter-Passing Methods
History Note
History Note
9.6 Parameters That Are Subprograms
History Note
9.7 Calling Subprograms Indirectly
9.8 Design Issues for Functions
9.9 Overloaded Subprograms
9.10 Generic Subprograms
9.11 User-Defined Overloaded Operators
9.12 Closures
9.13 Coroutines
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 10  Implementing Subprograms
10.1 The General Semantics of Calls and Returns
10.2 Implementing "Simple" Subprograms
10.3 Implementing Subprograms with Stack-Dynamic Local Variables
10.4 Nested Subprograms
10.5 Blocks
10.6 Implementing Dynamic Scoping
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 11  Abstract Data Types and Encapsulation Constructs
11.1 The Concept of Abstraction
11.2 Introduction to Data Abstraction
11.3 Design Issues for Abstract Data Types
11.4 Language Examples
Interview: Bjarne Stroustrup—C++: Its Birth, Its Ubiquitousness, and Common Criticisms
11.5 Parameterized Abstract Data Types
11.6 Encapsulation Constructs
11.7 Naming Encapsulations
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 12  Support for Object-Oriented Programming
12.1 Introduction
12.2 Object-Oriented Programming
12.3 Design Issues for Object-Oriented Languages
12.4 Support for Object-Oriented Programming in Specific Languages
Interview: Bjarne Stroustrup—On Paradigms and Better Programming
12.5 Implementation of Object-Oriented Constructs
12.6 Reflection
Summary • Review Questions • Problem Set • Programming Exercises

Chapter 13  Concurrency
13.1 Introduction
13.2 Introduction to Subprogram-Level Concurrency
13.3 Semaphores
13.4 Monitors
13.5 Message Passing
13.6 Ada Support for Concurrency
13.7 Java Threads
13.8 C# Threads
13.9 Concurrency in Functional Languages
13.10 Statement-Level Concurrency
Summary • Bibliographic Notes • Review Questions • Problem Set • Programming Exercises

Chapter 14  Exception Handling and Event Handling
14.1 Introduction to Exception Handling
History Note
14.2 Exception Handling in C++
14.3 Exception Handling in Java
14.4 Exception Handling in Python and Ruby
14.5 Introduction to Event Handling
14.6 Event Handling with Java
14.7 Event Handling in C#
Summary • Bibliographic Notes • Review Questions • Problem Set • Programming Exercises

Chapter 15  Functional Programming Languages
15.1 Introduction
15.2 Mathematical Functions
15.3 Fundamentals of Functional Programming Languages
15.4 The First Functional Programming Language: Lisp
15.5 An Introduction to Scheme
15.6 Common Lisp
15.7 ML
15.8 Haskell
15.9 F#
15.10 Support for Functional Programming in Primarily Imperative Languages
15.11 A Comparison of Functional and Imperative Languages
Summary • Bibliographic Notes • Review Questions • Problem Set • Programming Exercises

Chapter 16  Logic Programming Languages
16.1 Introduction
16.2 A Brief Introduction to Predicate Calculus
16.3 Predicate Calculus and Proving Theorems
16.4 An Overview of Logic Programming
16.5 The Origins of Prolog
16.6 The Basic Elements of Prolog
16.7 Deficiencies of Prolog
16.8 Applications of Logic Programming
Summary • Bibliographic Notes • Review Questions • Problem Set • Programming Exercises

Bibliography
Index
1
Preliminaries
1.1 Reasons for Studying Concepts of Programming Languages
1.2 Programming Domains
1.3 Language Evaluation Criteria
1.4 Influences on Language Design
1.5 Language Categories
1.6 Language Design Trade-Offs
1.7 Implementation Methods
1.8 Programming Environments
Before we begin discussing the concepts of programming languages, we must
consider a few preliminaries. First, we explain some reasons why computer
science students and professional software developers should study general
concepts of language design and evaluation. This discussion is especially valuable
for those who believe that a working knowledge of one or two programming languages is sufficient for computer scientists. Then, we briefly describe the major
programming domains. Next, because the book evaluates language constructs and
features, we present a list of criteria that can serve as a basis for such judgments.
Then, we discuss the two major influences on language design: machine architecture
and program design methodologies. After that, we introduce the various categories
of programming languages. Next, we describe a few of the major trade-offs that
must be considered during language design.
Because this book is also about the implementation of programming languages,
this chapter includes an overview of the most common general approaches to implementation. Finally, we briefly describe a few examples of programming environments
and discuss their impact on software production.
1.1 Reasons for Studying Concepts of Programming Languages
It is natural for students to wonder how they will benefit from the study of programming language concepts. After all, many other topics in computer science
are worthy of serious study. The following is what we believe to be a compelling
list of potential benefits of studying concepts of programming languages:
• Increased capacity to express ideas. It is widely believed that the depth at which
people can think is influenced by the expressive power of the language in
which they communicate their thoughts. Those with only a weak understanding of natural language are limited in the complexity of their thoughts,
particularly in depth of abstraction. In other words, it is difficult for people
to conceptualize structures they cannot describe, verbally or in writing.
Programmers, in the process of developing software, are similarly constrained. The language in which they develop software places limits on
the kinds of control structures, data structures, and abstractions they can
use; thus, the forms of algorithms they can construct are likewise limited.
Awareness of a wider variety of programming language features can reduce
such limitations in software development. Programmers can increase the
range of their software development thought processes by learning new
language constructs.
It might be argued that learning the capabilities of other languages does
not help a programmer who is forced to use a language that lacks those
capabilities. That argument does not hold up, however, because often, language constructs can be simulated in other languages that do not support
those constructs directly. For example, a C programmer who had learned
the structure and uses of associative arrays in Perl (Christianson et al., 2012)
might design structures that simulate associative arrays in that language.
In other words, the study of programming language concepts builds an
appreciation for valuable language features and constructs and encourages
programmers to use them, even when the language they are using does not
directly support such features and constructs.
• Improved background for choosing appropriate languages. Some professional
programmers have had little formal education in computer science; rather,
they have developed their programming skills independently or through
in-house training programs. Such training programs often limit instruction
to one or two languages that are directly relevant to the current projects
of the organization. Other programmers received their formal training
years ago. The languages they learned then are no longer used, and many
features now available in programming languages were not widely known
at the time. The result is that many programmers, when given a choice of
languages for a new project, use the language with which they are most
familiar, even if it is poorly suited for the project at hand. If these programmers were familiar with a wider range of languages and language constructs, they would be better able to choose the language with the features
that best address the problem.
Some of the features of one language often can be simulated in another
language. However, it is preferable to use a feature whose design has been
integrated into a language than to use a simulation of that feature, which
is often less elegant, more cumbersome, and less safe.
• Increased ability to learn new languages. Computer programming is still a relatively young discipline, and design methodologies, software development
tools, and programming languages are still in a state of continuous evolution. This makes software development an exciting profession, but it also
means that continuous learning is essential. The process of learning a new
programming language can be lengthy and difficult, especially for someone
who is comfortable with only one or two languages and has never examined
programming language concepts in general. Once a thorough understanding of the fundamental concepts of languages is acquired, it becomes far
easier to see how these concepts are incorporated into the design of the language being learned. For example, programmers who understand the concepts of
object-oriented programming will have a much easier time learning
Ruby (Thomas et al., 2013) than those who have never used those concepts.
The same phenomenon occurs in natural languages. The better you
know the grammar of your native language, the easier it is to learn a second language. Furthermore, learning a second language has the benefit of
teaching you more about your first language.
The TIOBE Programming Community issues an index (http://www.tiobe.com/index.php/content/paperinfo/tpci/index.htm)
that is an indicator of the relative popularity of programming languages. For
example, according to the index, C, Java, and Objective-C were the three
most popular languages in use in February 2014.1 However, dozens of other
languages were widely used at the time. The index data also show that the
distribution of usage of programming languages is always changing. The
number of languages in use and the dynamic nature of the statistics imply
that every software developer must be prepared to learn different languages.
1. Note that this index is only one measure of the popularity of programming languages, and
its accuracy is not universally accepted.
Finally, it is essential that practicing programmers know the vocabulary
and fundamental concepts of programming languages so they can read and
understand programming language descriptions and evaluations, as well as
promotional literature for languages and compilers. These are the sources
of information needed in order to choose and learn a language.
• Better understanding of the significance of implementation. In learning the concepts of programming languages, it is both interesting and necessary to touch
on the implementation issues that affect those concepts. In some cases, an
understanding of implementation issues leads to an understanding of why
languages are designed the way they are. In turn, this knowledge leads to
the ability to use a language more intelligently, as it was designed to be used.
We can become better programmers by understanding the choices among
programming language constructs and the consequences of those choices.
Certain kinds of program bugs can be found and fixed only by a programmer who knows some related implementation details. Another benefit
of understanding implementation issues is that it allows us to visualize
how a computer executes various language constructs. In some cases, some
knowledge of implementation issues provides hints about the relative efficiency of alternative constructs that may be chosen for a program. For
example, programmers who know little about the complexity of the implementation of subprogram calls often do not realize that a small subprogram
that is frequently called can be a highly inefficient design choice.
Because this book touches on only a few of the issues of implementation, the previous two paragraphs also serve well as rationale for studying
compiler design.
• Better use of languages that are already known. Most contemporary programming languages are large and complex. Accordingly, it is uncommon for a
programmer to be familiar with and use all of the features of a language
he or she uses. By studying the concepts of programming languages, programmers can learn about previously unknown and unused parts of the
languages they already use and begin to use those features.
• Overall advancement of computing. Finally, there is a global view of computing that can justify the study of programming language concepts. Although
it is usually possible to determine why a particular programming language
became popular, many believe, at least in retrospect, that the most popular languages are not always the best available. In some cases, it might be
concluded that a language became widely used, at least in part, because
those in positions to choose languages were not sufficiently familiar with
programming language concepts.
For example, many people believe it would have been better if ALGOL
60 (Backus et al., 1963) had displaced Fortran (McCracken, 1961) in the
early 1960s, because it was more elegant and had much better control statements, among other reasons. That it did not, is due partly to the programmers and software development managers of that time, many of whom did
not clearly understand the conceptual design of ALGOL 60. They found its
description difficult to read (which it was) and even more difficult to understand. They did not appreciate the benefits of block structure, recursion,
and well-structured control statements, so they failed to see the benefits of
ALGOL 60 over Fortran.
Of course, many other factors contributed to the lack of acceptance of
ALGOL 60, as we will see in Chapter 2. However, the fact that computer users
were generally unaware of the benefits of the language played a significant role.
In general, if those who choose languages were well informed, perhaps
better languages would eventually squeeze out poorer ones.
1.2 Programming Domains
Computers have been applied to a myriad of different areas, from controlling
nuclear power plants to providing video games in mobile phones. Because of
this great diversity in computer use, programming languages with very different
goals have been developed. In this section, we briefly discuss a few of the areas
of computer applications and their associated languages.
1.2.1
Scientific Applications
The first digital computers, which appeared in the late 1940s and early 1950s,
were invented and used for scientific applications. Typically, the scientific applications of that time used relatively simple data structures, but required large
numbers of floating-point arithmetic computations. The most common data
structures were arrays and matrices; the most common control structures were
counting loops and selections. The early high-level programming languages
invented for scientific applications were designed to provide for those needs.
Their competition was assembly language, so efficiency was a primary concern.
The first language for scientific applications was Fortran. ALGOL 60 and most
of its descendants were also intended to be used in this area, although they were
designed to be used in related areas as well. For some scientific applications
where efficiency is the primary concern, such as those that were common in the
1950s and 1960s, no subsequent language is significantly better than Fortran,
which explains why Fortran is still used.
1.2.2
Business Applications
The use of computers for business applications began in the 1950s. Special
computers were developed for this purpose, along with special languages. The
first successful high-level language for business was COBOL (ISO/IEC, 2002),
the initial version of which appeared in 1960. It probably still is the most commonly used language for these applications. Business languages are characterized by facilities for producing elaborate reports, precise ways of describing and
storing decimal numbers and character data, and the ability to specify decimal
arithmetic operations.
There have been few developments in business application languages outside the development and evolution of COBOL. Therefore, this book includes
only limited discussions of the structures in COBOL.
1.2.3
Artificial Intelligence
Artificial intelligence (AI) is a broad area of computer applications characterized
by the use of symbolic rather than numeric computations. Symbolic computation
means that symbols, consisting of names rather than numbers, are manipulated.
Also, symbolic computation is more conveniently done with linked lists of data
rather than arrays. This kind of programming sometimes requires more flexibility than other programming domains. For example, in some AI applications
the ability to create and execute code segments during execution is convenient.
The first widely used programming language developed for AI applications
was the functional language Lisp (McCarthy et al., 1965), which appeared in
1959. Most AI applications developed prior to 1990 were written in Lisp or one
of its close relatives. During the early 1970s, however, an alternative approach
to some of these applications appeared—logic programming using the Prolog
(Clocksin and Mellish, 2013) language. More recently, some AI applications
have been written in systems languages such as C. Scheme (Dybvig, 2009), a
dialect of Lisp, and Prolog are introduced in Chapters 15 and 16, respectively.
1.2.4
Web Software
The World Wide Web is supported by an eclectic collection of languages, ranging from markup languages, such as HTML, which is not a programming language, to general-purpose programming languages, such as Java. Because of the
pervasive need for dynamic Web content, some computation capability is often
included in the technology of content presentation. This functionality can be
provided by embedding programming code in an HTML document. Such code
is often in the form of a scripting language, such as JavaScript or PHP (Tatroe,
2013). There are also some markup-like languages that have been extended to
include constructs that control document processing, which are discussed in
Section 1.5 and in Chapter 2.
1.3 Language Evaluation Criteria
As noted previously, the purpose of this book is to examine carefully the underlying concepts of the various constructs and capabilities of programming languages. We will also evaluate these features, focusing on their impact on the
software development process, including maintenance. To do this, we need a set
of evaluation criteria. Such a list of criteria is necessarily controversial, because
it is difficult to get even two computer scientists to agree on the value of some
given language characteristic relative to others. In spite of these differences,
most would agree that the criteria discussed in the following subsections are
important.
Some of the characteristics that influence three of the four most important
of these criteria are shown in Table 1.1, and the criteria themselves are discussed
in the following sections.2 Note that only the most important characteristics
are included in the table, mirroring the discussion in the following subsections.
One could probably make the case that if one considered less important characteristics, virtually all table positions could include “bullets.”
Note that some of these characteristics are broad and somewhat vague,
such as writability, whereas others are specific language constructs, such as
exception handling. Furthermore, although the discussion might seem to imply
that the criteria have equal importance, that implication is not intended, and
it is clearly not the case.
1.3.1
Readability
One of the most important criteria for judging a programming language is the
ease with which programs can be read and understood. Before 1970, software
development was largely thought of in terms of writing code. The primary
positive characteristic of programming languages was efficiency. Language
constructs were designed more from the point of view of the computer than
of the computer users. In the 1970s, however, the software life-cycle concept
(Booch, 1987) was developed; coding was relegated to a much smaller role,
Table 1.1 Language evaluation criteria and the characteristics that affect them

                                        Criteria
Characteristic              Readability   Writability   Reliability
Simplicity                       •             •             •
Orthogonality                    •             •             •
Data types                       •             •             •
Syntax design                    •             •             •
Support for abstraction                        •             •
Expressivity                                   •             •
Type checking                                                •
Exception handling                                           •
Restricted aliasing                                          •
2. The fourth primary criterion is cost, which is not included in the table because it is only
slightly related to the other criteria and the characteristics that influence them.
and maintenance was recognized as a major part of the cycle, particularly in
terms of cost. Because ease of maintenance is determined in large part by the
readability of programs, readability became an important measure of the quality
of programs and programming languages. This was an important juncture in
the evolution of programming languages. There was a distinct crossover from
a focus on machine orientation to a focus on human orientation.
Readability must be considered in the context of the problem domain. For
example, if a program that describes a computation is written in a language not
designed for such use, the program may be unnatural and convoluted, making
it unusually difficult to read.
The following subsections describe characteristics that contribute to the
readability of a programming language.
1.3.1.1 Overall Simplicity
The overall simplicity of a programming language strongly affects its readability. A language with a large number of basic constructs is more difficult to learn
than one with a smaller number. Programmers who must use a large language
often learn a subset of the language and ignore its other features. This learning
pattern is sometimes used to excuse the large number of language constructs,
but that argument is not valid. Readability problems occur whenever the program’s author has learned a different subset from that subset with which the
reader is familiar.
A second complicating characteristic of a programming language is feature
multiplicity—that is, having more than one way to accomplish a particular
operation. For example, in Java, a user can increment a simple integer variable
in four different ways:
count = count + 1
count += 1
count++
++count
Although the last two statements have slightly different meanings from each
other and from the others in some contexts, all of them have the same meaning when used as stand-alone expressions. These variations are discussed in
Chapter 7.
A third potential problem is operator overloading, in which a single operator symbol has more than one meaning. Although this is often useful, it can
lead to reduced readability if users are allowed to create their own overloading
and do not do it sensibly. For example, it is clearly acceptable to overload +
to use it for both integer and floating-point addition. In fact, this overloading
simplifies a language by reducing the number of operators. However, suppose
the programmer defined + used between single-dimensioned array operands
to mean the sum of all elements of both arrays. Because the usual meaning of
vector addition is quite different from this, this unusual meaning could confuse
both the author and the program’s readers. An even more extreme example of
program confusion would be a user defining + between two vector operands
to mean the difference between their respective first elements. Operator overloading is further discussed in Chapter 7.
Simplicity in languages can, of course, be carried too far. For example,
the form and meaning of most assembly language statements are models of
simplicity, as you can see when you consider the statements that appear in the
next section. This very simplicity, however, makes assembly language programs
less readable. Because they lack more complex control statements, program
structure is less obvious; because the statements are simple, far more of them
are required than in equivalent programs in a high-level language. These same
arguments apply to the less extreme case of high-level languages with inadequate control and data-structuring constructs.
1.3.1.2 Orthogonality
Orthogonality in a programming language means that a relatively small set
of primitive constructs can be combined in a relatively small number of ways
to build the control and data structures of the language. Furthermore, every
possible combination of primitives is legal and meaningful. For example, consider data types. Suppose a language has four primitive data types (integer, float,
double, and character) and two type operators (array and pointer). If the two
type operators can be applied to themselves and the four primitive data types,
a large number of data structures can be defined.
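To make the combinatorial point concrete, the following is a minimal C sketch, with invented variable names, of how the two type operators (array and pointer) combine with a primitive type and with each other; every declaration shown is a legal, distinct data structure.

#include <stdio.h>

int main(void) {
    int   i;           /* a primitive type                              */
    int  *p;           /* pointer applied to int                        */
    int   a[10];       /* array applied to int                          */
    int  *pa[10];      /* array applied to pointer (array of int *)     */
    int (*ap)[10];     /* pointer applied to array (pointer to int[10]) */
    int **pp;          /* pointer applied to pointer                    */
    int   m[4][10];    /* array applied to array                        */

    i = 7;
    p = &i;
    a[0] = i;
    pa[0] = &a[0];
    ap = &a;
    pp = &p;
    m[0][0] = **pp;

    printf("%d %d %d %d\n", *p, a[0], (*ap)[0], m[0][0]);  /* prints 7 7 7 7 */
    return 0;
}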
The meaning of an orthogonal language feature is independent of the
context of its appearance in a program. (The word orthogonal comes from the
mathematical concept of orthogonal vectors, which are independent of each
other.) Orthogonality follows from a symmetry of relationships among primitives. A lack of orthogonality leads to exceptions to the rules of the language.
For example, in a programming language that supports pointers, it should be
possible to define a pointer to point to any specific type defined in the language.
However, if pointers are not allowed to point to arrays, many potentially useful
user-defined data structures cannot be defined.
We can illustrate the use of orthogonality as a design concept by comparing one aspect of the assembly languages of the IBM mainframe computers
and the VAX series of minicomputers. We consider a single simple situation:
adding two 32-bit integer values that reside in either memory or registers and
replacing one of the two values with the sum. The IBM mainframes have two
instructions for this purpose, which have the forms
A Reg1, memory_cell
AR Reg1, Reg2
where Reg1 and Reg2 represent registers. The semantics of these are
Reg1 ← contents(Reg1) + contents(memory_cell)
Reg1 ← contents(Reg1) + contents(Reg2)
The VAX addition instruction for 32-bit integer values is
ADDL operand_1, operand_2
whose semantics is
operand_2 ← contents(operand_1) + contents(operand_2)
In this case, either operand can be a register or a memory cell.
The VAX instruction design is orthogonal in that a single instruction can
use either registers or memory cells as the operands. There are two ways to
specify operands, which can be combined in all possible ways. The IBM design
is not orthogonal. Only two out of four operand combinations possibilities are
legal, and the two require different instructions, A and AR. The IBM design is
more restricted and therefore less writable. For example, you cannot add two
values and store the sum in a memory location. Furthermore, the IBM design is
more difficult to learn because of the restrictions and the additional instruction.
Orthogonality is closely related to simplicity: The more orthogonal the
design of a language, the fewer exceptions the language rules require. Fewer
exceptions mean a higher degree of regularity in the design, which makes the
language easier to learn, read, and understand. Anyone who has learned a significant part of the English language can testify to the difficulty of learning its
many rule exceptions (for example, i before e except after c).
As examples of the lack of orthogonality in a high-level language, consider
the following rules and exceptions in C. Although C has two kinds of structured data types, arrays and records (structs), records can be returned from
functions but arrays cannot. A member of a structure can be any data type
except void or a structure of the same type. An array element can be any data
type except void or a function. Parameters are passed by value, unless they
are arrays, in which case they are, in effect, passed by reference (because the
appearance of an array name without a subscript in a C program is interpreted
to be the address of the array’s first element).
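The following short C program (an illustrative sketch, not taken from the text) shows two of these exceptions side by side: a struct is copied back to the caller by value, while an array parameter decays to a pointer, so the callee's changes are visible to the caller.

#include <stdio.h>

struct pair { int x, y; };

/* A struct can be returned from a function by value ... */
struct pair make_pair(int x, int y) {
    struct pair p = { x, y };
    return p;
}

/* ... but an array cannot; an array parameter is, in effect,
   passed by reference because it decays to a pointer. */
void bump_first(int a[], int n) {
    if (n > 0)
        a[0] += 1;                      /* modifies the caller's array */
}

int main(void) {
    struct pair p = make_pair(1, 2);    /* p receives a copy */
    int v[3] = { 10, 20, 30 };
    bump_first(v, 3);                   /* v[0] becomes 11 */
    printf("%d %d\n", p.x, v[0]);
    return 0;
}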
As an example of context dependence, consider the C expression
a + b
This expression often means that the values of a and b are fetched and added
together. However, if a happens to be a pointer and b is an integer, it affects
the value of b. For example, if a points to a float value that occupies four bytes,
then the value of b must be scaled—in this case multiplied by 4—before it is
added to a. Therefore, the type of a affects the treatment of the value of b. The
context of b affects its meaning.
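A minimal sketch of this context dependence, assuming a points into an array of float values:

#include <stdio.h>

int main(void) {
    float data[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float *a = data;
    int    b = 2;

    /* Because a is a pointer to float, b is scaled by sizeof(float):
       a + b points b elements, not b bytes, past a. */
    printf("%f\n", *(a + b));    /* prints 3.000000 */

    int i = 5, j = b;
    printf("%d\n", i + j);       /* here + is ordinary integer addition */
    return 0;
}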
Too much orthogonality can also cause problems. Perhaps the most
orthogonal programming language is ALGOL 68 (van Wijngaarden et al.,
1969). Every language construct in ALGOL 68 has a type, and there are no
restrictions on those types. In addition, most constructs produce values. This
combinational freedom allows extremely complex constructs. For example, a
conditional can appear as the left side of an assignment, along with declarations
and other assorted statements, as long as the result is an address. This extreme
form of orthogonality leads to unnecessary complexity. Furthermore, because
languages require a large number of primitives, a high degree of orthogonality
results in an explosion of combinations. So, even if the combinations are simple,
their sheer numbers lead to complexity.
Simplicity in a language, therefore, is at least in part the result of a combination of a relatively small number of primitive constructs and a limited use of
the concept of orthogonality.
Some believe that functional languages offer a good combination of simplicity and orthogonality. A functional language, such as Lisp, is one in which computations are made primarily by applying functions to given parameters. In contrast,
in imperative languages such as C, C++, and Java, computations are usually specified with variables and assignment statements. Functional languages offer potentially the greatest overall simplicity, because they can accomplish everything with
a single construct, the function call, which can be combined simply with other
function calls. This simple elegance is the reason why some language researchers
are attracted to functional languages as the primary alternative to complex nonfunctional languages such as Java. Other factors, such as efficiency, however, have
prevented functional languages from becoming more widely used.
1.3.1.3 Data Types
The presence of adequate facilities for defining data types and data structures
in a language is another significant aid to readability. For example, suppose a
numeric type is used for an indicator flag because there is no Boolean type in
the language. In such a language, we might have an assignment such as the
following:
timeOut = 1
The meaning of this statement is unclear, whereas in a language that includes
Boolean types, we would have the following:
timeOut = true
The meaning of this statement is perfectly clear.
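In C, for example, the same contrast can be seen between a numeric flag and the bool type introduced in C99 (a small sketch added here for illustration):

#include <stdbool.h>

int  done = 1;          /* numeric flag: is 1 a truth value, a count, or a code? */
bool timeOut = true;    /* with a Boolean type, the intent is unmistakable */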
1.3.1.4 Syntax Design
The syntax, or form, of the elements of a language has a significant effect on
the readability of programs. Following are some examples of syntactic design
choices that affect readability:
• Special words. Program appearance and thus program readability are
strongly influenced by the forms of a language’s special words (for example,
while, class, and for). Especially important is the method of forming
compound statements, or statement groups, primarily in control constructs.
Some languages have used matching pairs of special words or symbols to
form groups. C and its descendants use braces to specify compound statements. All of these languages have diminished readability because statement groups are always terminated in the same way, which makes it difficult
to determine which group is being ended when an end or a right brace
appears. Fortran 95 and Ada (ISO/IEC, 2014) make this clearer by using a
distinct closing syntax for each type of statement group. For example, Ada
uses end if to terminate a selection construct and end loop to terminate
a loop construct. This is an example of the conflict between simplicity that
results in fewer reserved words, as in Java, and the greater readability that
can result from using more reserved words, as in Ada.
Another important issue is whether the special words of a language can
be used as names for program variables. If so, the resulting programs can
be very confusing. For example, in Fortran 95, special words, such as Do
and End, are legal variable names, so the appearance of these words in a
program may or may not connote something special.
• Form and meaning. Designing statements so that their appearance at least
partially indicates their purpose is an obvious aid to readability. Semantics,
or meaning, should follow directly from syntax, or form. In some cases, this
principle is violated by two language constructs that are identical or similar
in appearance but have different meanings, depending perhaps on context.
In C, for example, the meaning of the reserved word static depends on
the context of its appearance. If used on the definition of a variable inside
a function, it means the variable is created at compile time. If used on the
definition of a variable that is outside all functions, it means the variable
is visible only in the file in which its definition appears; that is, it is not
exported from that file. (A brief sketch of these two cases appears after this list.)
One of the primary complaints about the shell commands of UNIX
(Robbins, 2005) is that their appearance does not always suggest their
function. For example, the meaning of the UNIX command grep can be
deciphered only through prior knowledge, or perhaps cleverness and familiarity with the UNIX editor, ed. The appearance of grep connotes nothing
to UNIX beginners. (In ed, the command /regular_expression/ searches for a
substring that matches the regular expression. Preceding this with g makes
it a global command, specifying that the scope of the search is the whole
file being edited. Following the command with p specifies that lines with
the matching substring are to be printed. So g/regular_expression/p, which
can obviously be abbreviated as grep, prints all lines in a file that contain
substrings that match the regular expression.)
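Returning to the static example mentioned in the list above, the following minimal C sketch (illustrative only) shows its two context-dependent meanings:

static int file_count = 0;    /* at file scope: visible only within this
                                 source file (not exported from the file) */

int next_id(void) {
    static int calls = 0;     /* inside a function: created once and retains
                                 its value across calls */
    calls++;
    return calls;
}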
1.3.2 Writability
Writability is a measure of how easily a language can be used to create programs
for a chosen problem domain. Most of the language characteristics that affect
readability also affect writability. This follows directly from the fact that the
process of writing a program requires the programmer frequently to reread the
part of the program that is already written.
As is the case with readability, writability must be considered in the context
of the target problem domain of a language. It simply is not fair to compare
the writability of two languages in the realm of a particular application when
one was designed for that application and the other was not. For example, the
writabilities of Visual BASIC (VB) (Halvorson, 2013) and C are dramatically
different for creating a program that has a graphical user interface (GUI), for
which VB is ideal. Their writabilities are also quite different for writing systems
programs, such as an operating system, for which C was designed.
The following subsections describe the most important characteristics
influencing the writability of a language.
1.3.2.1 Simplicity and Orthogonality
If a language has a large number of different constructs, some programmers
might not be familiar with all of them. This situation can lead to a misuse of
some features and a disuse of others that may be either more elegant or more
efficient, or both, than those that are used. It may even be possible, as noted
by Hoare (1973), to use unknown features accidentally, with bizarre results.
Therefore, a smaller number of primitive constructs and a consistent set of
rules for combining them (that is, orthogonality) is much better than simply
having a large number of primitives. A programmer can design a solution to a
complex problem after learning only a simple set of primitive constructs.
On the other hand, too much orthogonality can be a detriment to writability. Errors in programs can go undetected when nearly any combination of
primitives is legal. This can lead to code absurdities that cannot be discovered
by the compiler.
1.3.2.2 Expressivity
Expressivity in a language can refer to several different characteristics. In a language such as APL (Gilman and Rose, 1983), it means that there are very powerful operators that allow a great deal of computation to be accomplished with a
very small program. More commonly, it means that a language has relatively convenient, rather than cumbersome, ways of specifying computations. For example,
in C, the notation count++ is more convenient and shorter than count =
count + 1. Also, the and then Boolean operator in Ada is a convenient way of
specifying short-circuit evaluation of a Boolean expression. The inclusion of the
for statement in Java makes writing counting loops easier than with the use of
while, which is also possible. All of these increase the writability of a language.
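C offers the same convenience as Java here; the two functions below (a sketch added for illustration) compute the same sum, but the for version gathers the initialization, test, and increment in one place:

/* Equivalent counting loops: the for form collects initialization,
   test, and increment in one header, which aids writability. */
int sum_to_ten_with_for(void) {
    int sum = 0;
    for (int count = 0; count < 10; count++)
        sum += count;
    return sum;
}

int sum_to_ten_with_while(void) {
    int sum = 0;
    int count = 0;
    while (count < 10) {
        sum += count;
        count = count + 1;    /* the longer spelling of count++ */
    }
    return sum;
}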
1.3.3 Reliability
A program is said to be reliable if it performs to its specifications under all
conditions. The following subsections describe several language features that
have a significant effect on the reliability of programs in a given language.
1.3.3.1 Type Checking
Type checking is simply testing for type errors in a given program, either
by the compiler or during program execution. Type checking is an important factor in language reliability. Because run-time type checking is expensive,
compile-time type checking is more desirable. Furthermore, the earlier errors
in programs are detected, the less expensive it is to make the required repairs.
The design of Java requires checks of the types of nearly all variables and
expressions at compile time. This virtually eliminates type errors at run time
in Java programs. Types and type checking are discussed in depth in Chapter 6.
One example of how failure to type check, at either compile time or run
time, has led to countless program errors is the use of subprogram parameters
in the original C language (Kernighan and Ritchie, 1978). In this language, the
type of an actual parameter in a function call was not checked to determine
whether its type matched that of the corresponding formal parameter in the
function. An int type variable could be used as an actual parameter in a call to
a function that expected a float type as its formal parameter, and neither the
compiler nor the run-time system would detect the inconsistency. For example,
because the bit string that represents the integer 23 is essentially unrelated to
the bit string that represents a floating-point 23, if an integer 23 is sent to a
function that expects a floating-point parameter, any uses of the parameter in
the function will produce nonsense. Furthermore, such problems are often
difficult to diagnose. (In response to this and other similar problems, UNIX systems include a utility program named lint that checks C programs for such problems.) The current version of C has eliminated this problem
by requiring all parameters to be type checked. Subprograms and parameter-
passing techniques are discussed in Chapter 9.
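The sketch below (illustrative, assuming a modern C compiler) hints at the difference a prototype makes: with the parameter type declared, the integer argument is converted; in the original C, the integer's bit pattern would simply have been reinterpreted as a floating-point value.

#include <stdio.h>

double half(double x) {          /* prototype: parameter types are checked */
    return x / 2.0;
}

int main(void) {
    printf("%f\n", half(23));    /* 23 is converted to 23.0 because the
                                    prototype is visible; in K&R C the raw
                                    bits of 23 would have been misread */
    return 0;
}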
1.3.3.2 Exception Handling
The ability of a program to intercept run-time errors (as well as other
unusual conditions detectable by the program), take corrective measures, and
then continue is an obvious aid to reliability. This language facility is called
exception handling. Ada, C++, Java, and C# include extensive capabilities
for exception handling, but such facilities are practically nonexistent in some
widely used languages, for example C. Exception handling is discussed in
Chapter 14.
1.3.3.3 Aliasing
Loosely defined, aliasing is having two or more distinct names in a program
that can be used to access the same memory cell. It is now generally accepted
that aliasing is a dangerous feature in a programming language. Most programming languages allow some kind of aliasing—for example, two pointers (or references) set to point to the same variable, which is possible in most languages.
In such a program, the programmer must always remember that changing the
value pointed to by one of the two changes the value referenced by the other.
Some kinds of aliasing, as described in Chapters 5 and 9, can be prohibited by
the design of a language.
In some languages, aliasing is used to overcome deficiencies in the language’s data abstraction facilities. Other languages greatly restrict aliasing to
increase their reliability.
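A minimal C sketch of pointer aliasing (added here for illustration):

#include <stdio.h>

int main(void) {
    int total = 100;
    int *p = &total;
    int *q = &total;       /* p and q are aliases for the same memory cell */

    *p = 50;               /* changing the value through p ...             */
    printf("%d\n", *q);    /* ... changes what q refers to as well: 50     */
    return 0;
}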
1.3.3.4 Readability and Writability
Both readability and writability influence reliability. A program written in a
language that does not support natural ways to express the required algorithms
will necessarily use unnatural approaches. Unnatural approaches are less likely
to be correct for all possible situations. The easier a program is to write, the
more likely it is to be correct.
Readability affects reliability in both the writing and maintenance phases
of the life cycle. Programs that are difficult to read are difficult both to write
and to modify.
1.3.4 Cost
The total cost of a programming language is a function of many of its
characteristics.
First, there is the cost of training programmers to use the language, which
is a function of the simplicity and orthogonality of the language and the experience of the programmers. Although more powerful languages are not necessarily more difficult to learn, they often are.
Second, there is the cost of writing programs in the language. This is a
function of the writability of the language, which depends in part on its closeness in purpose to the particular application. The original efforts to design and
implement high-level languages were driven by the desire to lower the costs
of creating software.
Both the cost of training programmers and the cost of writing programs in
a language can be significantly reduced in a good programming environment.
Programming environments are discussed in Section 1.8.
Third, there is the cost of compiling programs in the language. A major
impediment to the early use of Ada was the prohibitively high cost of running the first-generation Ada compilers. This problem was diminished by the
appearance of improved Ada compilers.
Fourth, the cost of executing programs written in a language is greatly
influenced by that language’s design. A language that requires many run-time
type checks will prohibit fast code execution, regardless of the quality of the
compiler. Although execution efficiency was the foremost concern in the design
of early languages, it is now considered to be less important.
A simple trade-off can be made between compilation cost and execution
speed of the compiled code. Optimization is the name given to the collection
of techniques that compilers may use to decrease the size and/or increase the
execution speed of the code they produce. If little or no optimization is done,
compilation can be done much faster than if a significant effort is made to
produce optimized code. The choice between the two alternatives is influenced
by the environment in which the compiler will be used. In a laboratory for
beginning programming students, who often compile their programs many
times during development but use little code at execution time (their programs
are small and they must execute correctly only once), little or no optimization
should be done. In a production environment, where compiled programs are
executed many times after development, it is better to pay the extra cost to
optimize the code.
The fifth factor in the cost of a language is the cost of the language implementation system. One of the factors that explains the rapid acceptance of Java
is that free compiler/interpreter systems became available for it soon after its
design was released. A language whose implementation system is either expensive or runs only on expensive hardware will have a much smaller chance of
becoming widely used. For example, the high cost of first-generation Ada compilers helped prevent Ada from becoming popular in its early days.
Sixth, there is the cost of poor reliability. If the software fails in a critical
system, such as a nuclear power plant or an X-ray machine for medical use, the
cost could be very high. The failures of noncritical systems can also be very
expensive in terms of lost future business or lawsuits over defective software
systems.
The final consideration is the cost of maintaining programs, which includes
both corrections and modifications to add new functionality. The cost of software maintenance depends on a number of language characteristics, primarily
readability. Because maintenance is often done by individuals other than the
original author of the software, poor readability can make the task extremely
challenging.
The importance of software maintainability cannot be overstated. It has
been estimated that for large software systems with relatively long lifetimes,
maintenance costs can be as high as two to four times as much as development
costs (Sommerville, 2010).
Of all the contributors to language costs, three are most important: program development, maintenance, and reliability. Because these are functions of
writability and readability, these two evaluation criteria are, in turn, the most
important.
Of course, a number of other criteria could be used for evaluating programming languages. One example is portability, or the ease with which programs can be moved from one implementation to another. Portability is most
strongly influenced by the degree of standardization of the language. Some
languages are not standardized at all, making programs in these languages very
difficult to move from one implementation to another. This problem is alleviated in some cases by the fact that implementations for some languages now
have single sources. Standardization is a time-consuming and difficult process.
A committee began work on producing a standard version of C++ in 1989. It
was approved in 1998.
Generality (the applicability to a wide range of applications) and well-definedness (the completeness and precision of the language's official defining document) are two other criteria.
Most criteria, particularly readability, writability, and reliability, are neither precisely defined nor precisely measurable. Nevertheless, they are useful
concepts and they provide valuable insight into the design and evaluation of
programming languages.
A final note on evaluation criteria: language design criteria are weighed
differently from different perspectives. Language implementors are concerned
primarily with the difficulty of implementing the constructs and features of the
language. Language users are worried about writability first and readability
later. Language designers are likely to emphasize elegance and the ability to
attract widespread use. These characteristics often conflict with one another.
1.4 Influences on Language Design
In addition to those factors described in Section 1.3, several other factors influence the basic design of programming languages. The most important of these
are computer architecture and programming design methodologies.
1.4.1 Computer Architecture
The basic architecture of computers has had a profound effect on language
design. Most of the popular languages of the past 60 years have been designed
around the prevalent computer architecture, called the von Neumann
architecture, after one of its originators, John von Neumann (pronounced
“von Noyman”). These languages are called imperative languages. In a von
Neumann computer, both data and programs are stored in the same memory.
The central processing unit (CPU), which executes instructions, is separate
from the memory. Therefore, instructions and data must be transmitted, or
piped, from memory to the CPU. Results of operations in the CPU must be
moved back to memory. Nearly all digital computers built since the 1940s have
been based on the von Neumann architecture. The overall structure of a von
Neumann computer is shown in Figure 1.1.
Because of the von Neumann architecture, the central features of imperative languages are variables, which model the memory cells; assignment
statements, which are based on the piping operation; and the iterative form
of repetition, which is the most efficient way to implement repetition on this
architecture. Operands in expressions are piped from memory to the CPU,
and the result of evaluating the expression is piped back to the memory cell
represented by the left side of the assignment. Iteration is fast on von Neumann
computers because instructions are stored in adjacent cells of memory and
repeating the execution of a section of code requires only a branch instruction.
This efficiency discourages the use of recursion for repetition, although recursion is sometimes more natural.
Figure 1.1  The von Neumann computer architecture
[Figure: memory, which stores both instructions and data, sends instructions and data to the central processing unit (the arithmetic and logic unit together with the control unit) and receives the results of operations back; input and output devices are attached to the system.]
The execution of a machine code program on a von Neumann architecture
computer occurs in a process called the fetch-execute cycle. As stated earlier,
programs reside in memory but are executed in the CPU. Each instruction to
be executed must be moved from memory to the processor. The address of the
next instruction to be executed is maintained in a register called the program
counter. The fetch-execute cycle can be simply described by the following
algorithm:
initialize the program counter
repeat forever
fetch the instruction pointed to by the program counter
increment the program counter to point at the next instruction
decode the instruction
execute the instruction
end repeat
The “decode the instruction” step in the algorithm means the instruction is
examined to determine what action it specifies. Program execution terminates
when a stop instruction is encountered, although on an actual computer a stop
instruction is rarely executed. Rather, control transfers from the operating system to a user program for its execution and then back to the operating system
when the user program execution is complete. In a computer system in which
more than one user program may be in memory at a given time, this process
is far more complex.
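To make the cycle concrete, the following toy interpreter in C mimics it for an invented two-instruction machine (the opcodes and memory layout are assumptions made purely for illustration):

#include <stdio.h>

enum { OP_STOP = 0, OP_PRINT = 1 };

int main(void) {
    int memory[] = { OP_PRINT, 42, OP_PRINT, 7, OP_STOP };  /* program + data */
    int pc = 0;                          /* program counter */

    for (;;) {
        int instruction = memory[pc];    /* fetch */
        pc = pc + 1;                     /* increment the program counter */
        switch (instruction) {           /* decode and execute */
        case OP_PRINT:
            printf("%d\n", memory[pc]);  /* operand follows the opcode */
            pc = pc + 1;
            break;
        case OP_STOP:
            return 0;
        }
    }
}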
As stated earlier, a functional, or applicative, language is one in which
the primary means of computation is applying functions to given parameters.
Programming can be done in a functional language without the kind of variables that are used in imperative languages, without assignment statements, and
without iteration. Although many computer scientists have expounded on the
myriad benefits of functional languages, such as Scheme, it is unlikely that they
will displace the imperative languages until a non–von Neumann computer is designed that allows efficient execution of programs in functional languages.
Among those who have bemoaned this fact, the most eloquent is John Backus
(1978), the principal designer of the original version of Fortran.
In spite of the fact that the structure of imperative programming languages
is modeled on a machine architecture, rather than on the abilities and inclinations of the users of programming languages, some believe that using imperative languages is somehow more natural than using a functional language.
So, these people believe that even if functional programs were as efficient as
imperative programs, the use of imperative programming languages would still
dominate.
1.4.2 Programming Design Methodologies
The late 1960s and early 1970s brought an intense analysis, begun in large part
by the structured-programming movement, of both the software development
process and programming language design.
An important reason for this research was the shift in the major cost of
computing from hardware to software, as hardware costs decreased and programmer costs increased. Increases in programmer productivity were relatively
small. In addition, progressively larger and more complex problems were being
solved by computers. Rather than simply solving sets of equations to simulate
satellite tracks, as in the early 1960s, programs were being written for large
and complex tasks, such as controlling large petroleum-refining facilities and
providing worldwide airline reservation systems.
The new software development methodologies that emerged as a result
of the research of the 1970s were called top-down design and stepwise refinement. The primary programming language deficiencies that were discovered
were incompleteness of type checking and inadequacy of control statements
(requiring the extensive use of gotos).
In the late 1970s, a shift from procedure-oriented to data-oriented program design methodologies began. Simply put, data-oriented methods emphasize data design, focusing on the use of abstract data types to solve problems.
For data abstraction to be used effectively in software system design, it
must be supported by the languages used for implementation. The first language to provide even limited support for data abstraction was SIMULA 67
(Birtwistle et al., 1973), although that language certainly was not propelled
to popularity because of it. The benefits of data abstraction were not widely
recognized until the early 1970s. However, most languages designed since the
late 1970s support data abstraction, which is discussed in detail in Chapter 11.
The latest step in the evolution of data-oriented software development,
which began in the early 1980s, is object-oriented design. Object-oriented
methodology begins with data abstraction, which encapsulates processing with
data objects and controls access to data, and adds inheritance and dynamic
method binding. Inheritance is a powerful concept that greatly enhances the
potential reuse of existing software, thereby providing the possibility of significant increases in software development productivity. This is an important factor
in the increase in popularity of object-oriented languages. Dynamic (run-time)
method binding allows more flexible use of inheritance.
Object-oriented programming developed along with a language that
supported its concepts: Smalltalk (Goldberg and Robson, 1989). Although
Smalltalk never became as widely used as many other languages, support for
object-oriented programming is now part of most popular imperative languages,
including Java, C++, and C#. Object-oriented concepts have also found their
way into functional programming in CLOS (Bobrow et al., 1988) and F# (Syme
et al., 2010), as well as logic programming in Prolog++ (Moss, 1994). Language
support for object-oriented programming is discussed in detail in Chapter 12.
Procedure-oriented programming is, in a sense, the opposite of data-oriented programming. Although data-oriented methods now dominate software development, procedure-oriented methods have not been abandoned. On
the contrary, in recent years, a good deal of research has occurred in procedure-
oriented programming, especially in the area of concurrency. These research
efforts brought with them the need for language facilities for creating and
controlling concurrent program units. Java and C# include such capabilities.
Concurrency is discussed in detail in Chapter 13.
All of these evolutionary steps in software development methodologies led
to new language constructs to support them.
1.5 Language Categories
Programming languages are often categorized into four bins: imperative,
functional, logic, and object oriented. However, we do not consider languages
that support object-oriented programming to form a separate category of
languages. We have described how the most popular languages that support
object-oriented programming grew out of imperative languages. Although
the object-oriented software development paradigm differs significantly from
the procedure-oriented paradigm usually used with imperative languages, the
extensions to an imperative language required to support object-oriented programming are not extensive. For example, the expressions, assignment statements, and control statements of C and Java are nearly identical. (On the other
hand, the arrays, subprograms, and semantics of Java are very different from
those of C.) Similar statements can be made for functional languages that support object-oriented programming.
Some authors refer to scripting languages as a separate category of programming languages. However, languages in this category are bound together
more by their implementation method, partial or full interpretation, than by
a common language design. The languages that are typically called scripting
languages, among them Perl, JavaScript (Flanagan, 2011), and Ruby, are imperative languages in every sense.
A logic programming language is an example of a rule-based language. In
an imperative language, an algorithm is specified in great detail, and the specific order of execution of the instructions or statements must be included. In
a rule-based language, however, rules are specified in no particular order, and
the language implementation system must choose an order in which the rules
are used to produce the desired result. This approach to software development
is radically different from those used with the other two categories of languages
and clearly requires a completely different kind of language. Prolog, the most
commonly used logic programming language, and logic programming are discussed in Chapter 16.
In recent years, a new category of languages has emerged, the markup/
programming hybrid languages. Markup languages are not programming languages. For instance, HTML, the most widely used markup language, is used to
specify the layout of information in Web documents. However, some programming capability has crept into some extensions to HTML and XML. Among
these are the Java Server Pages Standard Tag Library (JSTL) and eXtensible Stylesheet Language Transformations (XSLT). Both of these are briefly
introduced in Chapter 2. Those languages cannot be compared to any of the
complete programming languages and therefore will not be discussed after
Chapter 2.
1.6 Language Design Trade-Offs
The programming language evaluation criteria described in Section 1.3 provide a framework for language design. Unfortunately, that framework is self-contradictory. In his insightful paper on language design, Hoare (1973) stated
that “there are so many important but conflicting criteria, that their reconciliation and satisfaction is a major engineering task.”
Two criteria that conflict are reliability and cost of execution. For example,
the Java language definition demands that all references to array elements be
checked to ensure that the index or indices are in their legal ranges. This step
adds a great deal to the cost of execution of Java programs that contain large
numbers of references to array elements. C does not require index range checking, so C programs execute faster than semantically equivalent Java programs,
although Java programs are more reliable. The designers of Java traded execution efficiency for reliability.
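A small C sketch (illustrative only) shows what the absence of index range checking means in practice; a semantically similar Java program would raise an exception at the marked line instead of silently misbehaving:

#include <stdio.h>

int main(void) {
    int a[5] = { 1, 2, 3, 4, 5 };
    int i = 7;               /* out of the legal range 0..4 */
    printf("%d\n", a[i]);    /* no check is made in C: undefined behavior */
    return 0;
}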
As another example of conflicting criteria that leads directly to design
trade-offs, consider the case of APL. APL includes a powerful set of operators
for array operands. Because of the large number of operators, a significant number of new symbols had to be included in APL to represent the operators. Also,
many APL operators can be used in a single, long, complex expression. One
result of this high degree of expressivity is that, for applications involving many
array operations, APL is very writable. Indeed, a huge amount of computation
can be specified in a very small program. Another result is that APL programs
have very poor readability. A compact and concise expression has a certain
mathematical beauty but it is difficult for anyone other than the programmer
to understand. Well-known author Daniel McCracken (1970) once noted that
it took him four hours to…