Study the Hidden-Surface Removal problem and implement the Painter’s algorithm using WebGL.
Clearly provide the details of your program, including screenshots of the working program.
- Describe the object (primitive) that you are working with.
- Adequately comment your source code.
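The heart of the Painter's algorithm is drawing opaque polygons from back to front with the depth test disabled, so that nearer polygons overwrite farther ones. Below is a minimal JavaScript sketch of that ordering step only; it assumes an existing WebGL context gl, triangles given as three [x, y, z] vertices in camera coordinates (more negative z meaning farther away), and a hypothetical helper drawTriangle that buffers and draws a single triangle. A complete program still needs the usual shader, buffer, and attribute setup, and a plain depth sort can fail for overlapping or interpenetrating polygons (the cases treated under depth sort in Section 8.11.7).

// Painter's algorithm sketch: with the depth test disabled, whatever is drawn
// last covers what was drawn before it, so sort the triangles by depth and
// draw the farthest ones first.
function averageDepth(tri) {
  // tri is [[x, y, z], [x, y, z], [x, y, z]] in camera coordinates
  return (tri[0][2] + tri[1][2] + tri[2][2]) / 3;
}

function painterDraw(gl, triangles) {
  gl.disable(gl.DEPTH_TEST);                 // rely on drawing order, not the z-buffer
  const backToFront = triangles.slice().sort(
    (a, b) => averageDepth(a) - averageDepth(b)  // most negative z first = farthest first
  );
  for (const tri of backToFront) {
    drawTriangle(gl, tri);                   // hypothetical helper: upload and draw one triangle
  }
}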
Editorial Director, ECS: Marcia Horton
Acquisitions Editor: Matt Goldstein
Program Manager: Kayla Smith-Tarbox
Director of Marketing: Christy Lesko
Marketing Assistant: Jon Bryant
Director of Production: Erin Gregg
Senior Managing Editor: Scott Disanno
Senior Project Manager: Marilyn Lloyd
Manufacturing Buyer: Linda Sager
Cover Designer: Joyce Cosentino Wells
Manager, Text Permissions: Tim Nicholls
Text Permission Project Manager: William Opaluch
Media Project Manager: Renata Butera
Full-Service Project Management: Cypress Graphics, Paul C. Anagnostopoulos
Printer/Binder: Courier Kendallville
Cover Printer: Lehigh Phoenix-Color
Text Font: Minion and Avenir
Cover Image: One frame of a particle physics simulation created with DomeGL, a version of
OpenGL designed for generating images for multiprojector domed environments. Used with
permission from Matthew Dosanjh, Jeff Bowles, and Joe Kniss, ARTS Lab, University of New
Mexico.
Credits and acknowledgments borrowed from other sources and reproduced, with permission,
appear on the appropriate page in the text.
Copyright © 2015, 2012, 2009 Pearson Education, Inc., publishing as Addison-Wesley. All rights
reserved. Printed in the United States of America. This publication is protected by Copyright, and
permission should be obtained from the publisher prior to any prohibited reproduction, storage
in a retrieval system, or transmission in any form or by any means, electronic, mechanical,
photocopying, recording, or likewise. To obtain permission(s) to use material from this work,
please submit a written request to Pearson Education, Inc., Permissions Department, One Lake
Street, Upper Saddle River, New Jersey 07458, or you may fax your request to 201-236-3290.
Many of the designations by manufacturers and sellers to distinguish their products are claimed
as trademarks. Where those designations appear in this book, and the publisher was aware of a
trademark claim, the designations have been printed in initial caps or all caps.
The programs and applications presented in this book have been included for their instructional
value. They have been tested with care, but are not guaranteed for any particular purpose. The
publisher does not offer any warranties or representations, nor does it accept any liabilities with
respect to the programs or applications.
Library of Congress Cataloging-in-Publication Data
Angel, Edward.
Interactive computer graphics : a top-down approach with WebGL / Edward Angel,
Dave Shreiner. — 7th edition.
pages cm
Includes bibliographical references and indexes.
ISBN-13: 978-0-13-357484-5 (alkaline paper)
ISBN-10: 0-13-357484-9 (alkaline paper)
1. Computer graphics. 2. Interactive computer systems. 3. WebGL (Computer program language) 4. OpenGL. I. Shreiner, Dave. II. Title.
T385.A5133 2014
006.6633—dc23
2013050594
10 9 8 7 6 5 4 3 2 1—V011—18 17 16 15 14
ISBN 10: 0-13-357484-9
ISBN 13: 978-0-13-357484-5
To Rose Mary
—E.A.
To Vicki, Bonnie, Bob, Cookie, and Goatee
—D.S.
CONTENTS

Preface xxi

CHAPTER 1 GRAPHICS SYSTEMS AND MODELS 1
1.1 Applications of Computer Graphics 2
1.1.1 Display of Information 2
1.1.2 Design 3
1.1.3 Simulation and Animation 3
1.1.4 User Interfaces 4
1.2 A Graphics System 5
1.2.1 Pixels and the Framebuffer 5
1.2.2 The CPU and the GPU 6
1.2.3 Output Devices 7
1.2.4 Input Devices 9
1.3 Images: Physical and Synthetic 10
1.3.1 Objects and Viewers 10
1.3.2 Light and Images 12
1.3.3 Imaging Models 13
1.4 Imaging Systems 15
1.4.1 The Pinhole Camera 15
1.4.2 The Human Visual System 17
1.5 The Synthetic-Camera Model 18
1.6 The Programmer's Interface 20
1.6.1 The Pen-Plotter Model 21
1.6.2 Three-Dimensional APIs 23
1.6.3 A Sequence of Images 26
1.6.4 The Modeling–Rendering Paradigm 27
1.7 Graphics Architectures 28
1.7.1 Display Processors 29
1.7.2 Pipeline Architectures 29
1.7.3 The Graphics Pipeline 30
1.7.4 Vertex Processing 31
1.7.5 Clipping and Primitive Assembly 31
1.7.6 Rasterization 32
1.7.7 Fragment Processing 32
1.8 Programmable Pipelines 32
1.9 Performance Characteristics 33
1.10 OpenGL Versions and WebGL 34
Summary and Notes 36
Suggested Readings 36
Exercises 37

CHAPTER 2 GRAPHICS PROGRAMMING 39
2.1 The Sierpinski Gasket 39
2.2 Programming Two-Dimensional Applications 42
2.3 The WebGL Application Programming Interface 47
2.3.1 Graphics Functions 47
2.3.2 The Graphics Pipeline and State Machines 49
2.3.3 OpenGL and WebGL 50
2.3.4 The WebGL Interface 50
2.3.5 Coordinate Systems 51
2.4 Primitives and Attributes 53
2.4.1 Polygon Basics 55
2.4.2 Polygons in WebGL 56
2.4.3 Approximating a Sphere 57
2.4.4 Triangulation 58
2.4.5 Text 59
2.4.6 Curved Objects 60
2.4.7 Attributes 61
2.5 Color 62
2.5.1 RGB Color 64
2.5.2 Indexed Color 66
2.5.3 Setting of Color Attributes 67
2.6 Viewing 68
2.6.1 The Orthographic View 68
2.6.2 Two-Dimensional Viewing 71
2.7 Control Functions 71
2.7.1 Interaction with the Window System 72
2.7.2 Aspect Ratio and Viewports 73
2.7.3 Application Organization 74
2.8 The Gasket Program 75
2.8.1 Sending Data to the GPU 78
2.8.2 Rendering the Points 78
2.8.3 The Vertex Shader 79
2.8.4 The Fragment Shader 80
2.8.5 Combining the Parts 80
2.8.6 The initShaders Function 81
2.8.7 The init Function 82
2.8.8 Reading the Shaders from the Application 83
2.9 Polygons and Recursion 83
2.10 The Three-Dimensional Gasket 86
2.10.1 Use of Three-Dimensional Points 86
2.10.2 Naming Conventions 88
2.10.3 Use of Polygons in Three Dimensions 88
2.10.4 Hidden-Surface Removal 91
Summary and Notes 93
Suggested Readings 94
Exercises 95

CHAPTER 3 INTERACTION AND ANIMATION 99
3.1 Animation 99
3.1.1 The Rotating Square 100
3.1.2 The Display Process 102
3.1.3 Double Buffering 103
3.1.4 Using a Timer 104
3.1.5 Using requestAnimFrame 105
3.2 Interaction 106
3.3 Input Devices 107
3.4 Physical Input Devices 108
3.4.1 Keyboard Codes 108
3.4.2 The Light Pen 109
3.4.3 The Mouse and the Trackball 109
3.4.4 Data Tablets, Touch Pads, and Touch Screens 110
3.4.5 The Joystick 111
3.4.6 Multidimensional Input Devices 111
3.4.7 Logical Devices 112
3.4.8 Input Modes 113
3.5 Clients and Servers 115
3.6 Programming Event-Driven Input 116
3.6.1 Events and Event Listeners 117
3.6.2 Adding a Button 117
3.6.3 Menus 119
3.6.4 Using Keycodes 120
3.6.5 Sliders 121
3.7 Position Input 122
3.8 Window Events 123
3.9 Picking 125
3.10 Building Models Interactively 126
3.11 Design of Interactive Programs 130
Summary and Notes 130
Suggested Readings 131
Exercises 132

CHAPTER 4 GEOMETRIC OBJECTS AND TRANSFORMATIONS 135
4.1 Scalars, Points, and Vectors 136
4.1.1 Geometric Objects 136
4.1.2 Coordinate-Free Geometry 138
4.1.3 The Mathematical View: Vector and Affine Spaces 138
4.1.4 The Computer Science View 139
4.1.5 Geometric ADTs 140
4.1.6 Lines 141
4.1.7 Affine Sums 141
4.1.8 Convexity 142
4.1.9 Dot and Cross Products 142
4.1.10 Planes 143
4.2 Three-Dimensional Primitives 145
4.3 Coordinate Systems and Frames 146
4.3.1 Representations and N-Tuples 148
4.3.2 Change of Coordinate Systems 149
4.3.3 Example: Change of Representation 151
4.3.4 Homogeneous Coordinates 153
4.3.5 Example: Change in Frames 155
4.3.6 Working with Representations 157
4.4 Frames in WebGL 159
4.5 Matrix and Vector Types 163
4.5.1 Row versus Column Major Matrix Representations 165
4.6 Modeling a Colored Cube 165
4.6.1 Modeling the Faces 166
4.6.2 Inward- and Outward-Pointing Faces 167
4.6.3 Data Structures for Object Representation 167
4.6.4 The Colored Cube 168
4.6.5 Color Interpolation 170
4.6.6 Displaying the Cube 170
4.6.7 Drawing with Elements 171
4.7 Affine Transformations 172
4.8 Translation, Rotation, and Scaling 175
4.8.1 Translation 175
4.8.2 Rotation 176
4.8.3 Scaling 177
4.9 Transformations in Homogeneous Coordinates 179
4.9.1 Translation 179
4.9.2 Scaling 181
4.9.3 Rotation 181
4.9.4 Shear 183
4.10 Concatenation of Transformations 184
4.10.1 Rotation About a Fixed Point 185
4.10.2 General Rotation 186
4.10.3 The Instance Transformation 187
4.10.4 Rotation About an Arbitrary Axis 188
4.11 Transformation Matrices in WebGL 191
4.11.1 Current Transformation Matrices 192
4.11.2 Basic Matrix Functions 193
4.11.3 Rotation, Translation, and Scaling 194
4.11.4 Rotation About a Fixed Point 195
4.11.5 Order of Transformations 195
4.12 Spinning of the Cube 196
4.12.1 Uniform Matrices 198
4.13 Interfaces to Three-Dimensional Applications 200
4.13.1 Using Areas of the Screen 201
4.13.2 A Virtual Trackball 201
4.13.3 Smooth Rotations 204
4.13.4 Incremental Rotation 205
4.14 Quaternions 206
4.14.1 Complex Numbers and Quaternions 206
4.14.2 Quaternions and Rotation 207
4.14.3 Quaternions and Gimbal Lock 209
Summary and Notes 210
Suggested Readings 211
Exercises 211

CHAPTER 5 VIEWING 215
5.1 Classical and Computer Viewing 215
5.1.1 Classical Viewing 217
5.1.2 Orthographic Projections 217
5.1.3 Axonometric Projections 218
5.1.4 Oblique Projections 220
5.1.5 Perspective Viewing 221
5.2 Viewing with a Computer 222
5.3 Positioning of the Camera 224
5.3.1 Positioning of the Camera Frame 224
5.3.2 Two Viewing APIs 229
5.3.3 The Look-At Function 232
5.3.4 Other Viewing APIs 233
5.4 Parallel Projections 234
5.4.1 Orthogonal Projections 234
5.4.2 Parallel Viewing with WebGL 235
5.4.3 Projection Normalization 236
5.4.4 Orthogonal Projection Matrices 237
5.4.5 Oblique Projections 239
5.4.6 An Interactive Viewer 242
5.5 Perspective Projections 244
5.5.1 Simple Perspective Projections 245
5.6 Perspective Projections with WebGL 248
5.6.1 Perspective Functions 249
5.7 Perspective Projection Matrices 250
5.7.1 Perspective Normalization 250
5.7.2 WebGL Perspective Transformations 254
5.7.3 Perspective Example 256
5.8 Hidden-Surface Removal 256
5.8.1 Culling 258
5.9 Displaying Meshes 259
5.9.1 Displaying Meshes as Surfaces 262
5.9.2 Polygon Offset 264
5.9.3 Walking through a Scene 265
5.10 Projections and Shadows 265
5.10.1 Projected Shadows 266
5.11 Shadow Maps 270
Summary and Notes 271
Suggested Readings 272
Exercises 272

CHAPTER 6 LIGHTING AND SHADING 275
6.1 Light and Matter 276
6.2 Light Sources 279
6.2.1 Color Sources 280
6.2.2 Ambient Light 280
6.2.3 Point Sources 281
6.2.4 Spotlights 282
6.2.5 Distant Light Sources 282
6.3 The Phong Reflection Model 283
6.3.1 Ambient Reflection 285
6.3.2 Diffuse Reflection 285
6.3.3 Specular Reflection 286
6.3.4 The Modified Phong Model 288
6.4 Computation of Vectors 289
6.4.1 Normal Vectors 289
6.4.2 Angle of Reflection 292
6.5 Polygonal Shading 293
6.5.1 Flat Shading 293
6.5.2 Smooth and Gouraud Shading 294
6.5.3 Phong Shading 296
6.6 Approximation of a Sphere by Recursive Subdivision 297
6.7 Specifying Lighting Parameters 299
6.7.1 Light Sources 299
6.7.2 Materials 301
6.8 Implementing a Lighting Model 301
6.8.1 Applying the Lighting Model in the Application 302
6.8.2 Efficiency 304
6.8.3 Lighting in the Vertex Shader 305
6.9 Shading of the Sphere Model 310
6.10 Per-Fragment Lighting 311
6.11 Nonphotorealistic Shading 313
6.12 Global Illumination 314
Summary and Notes 315
Suggested Readings 316
Exercises 316

CHAPTER 7 DISCRETE TECHNIQUES 319
7.1 Buffers 320
7.2 Digital Images 321
7.3 Mapping Methods 325
7.4 Two-Dimensional Texture Mapping 327
7.5 Texture Mapping in WebGL 333
7.5.1 Texture Objects 334
7.5.2 The Texture Image Array 335
7.5.3 Texture Coordinates and Samplers 336
7.5.4 Texture Sampling 341
7.5.5 Working with Texture Coordinates 344
7.5.6 Multitexturing 345
7.6 Texture Generation 348
7.7 Environment Maps 349
7.8 Reflection Map Example 353
7.9 Bump Mapping 357
7.9.1 Finding Bump Maps 358
7.9.2 Bump Map Example 361
7.10 Blending Techniques 365
7.10.1 Opacity and Blending 366
7.10.2 Image Blending 367
7.10.3 Blending in WebGL 367
7.10.4 Antialiasing Revisited 369
7.10.5 Back-to-Front and Front-to-Back Rendering 371
7.10.6 Scene Antialiasing and Multisampling 371
7.10.7 Image Processing 372
7.10.8 Other Multipass Methods 374
7.11 GPGPU 374
7.12 Framebuffer Objects 378
7.13 Buffer Ping-Ponging 384
7.14 Picking 387
Summary and Notes 392
Suggested Readings 393
Exercises 394

CHAPTER 8 FROM GEOMETRY TO PIXELS 397
8.1 Basic Implementation Strategies 398
8.2 Four Major Tasks 400
8.2.1 Modeling 400
8.2.2 Geometry Processing 401
8.2.3 Rasterization 402
8.2.4 Fragment Processing 403
8.3 Clipping 403
8.4 Line-Segment Clipping 404
8.4.1 Cohen-Sutherland Clipping 404
8.4.2 Liang-Barsky Clipping 407
8.5 Polygon Clipping 408
8.6 Clipping of Other Primitives 410
8.6.1 Bounding Boxes and Volumes 410
8.6.2 Curves, Surfaces, and Text 412
8.6.3 Clipping in the Framebuffer 413
8.7 Clipping in Three Dimensions 413
8.8 Rasterization 416
8.9 Bresenham's Algorithm 418
8.10 Polygon Rasterization 420
8.10.1 Inside–Outside Testing 421
8.10.2 WebGL and Concave Polygons 422
8.10.3 Fill and Sort 423
8.10.4 Flood Fill 423
8.10.5 Singularities 424
8.11 Hidden-Surface Removal 424
8.11.1 Object-Space and Image-Space Approaches 424
8.11.2 Sorting and Hidden-Surface Removal 426
8.11.3 Scan Line Algorithms 426
8.11.4 Back-Face Removal 427
8.11.5 The z-Buffer Algorithm 429
8.11.6 Scan Conversion with the z-Buffer 431
8.11.7 Depth Sort and the Painter's Algorithm 432
8.12 Antialiasing 435
8.13 Display Considerations 437
8.13.1 Color Systems 437
8.13.2 The Color Matrix 441
8.13.3 Gamma Correction 441
8.13.4 Dithering and Halftoning 442
Summary and Notes 443
Suggested Readings 445
Exercises 445

CHAPTER 9 MODELING AND HIERARCHY 449
9.1 Symbols and Instances 450
9.2 Hierarchical Models 451
9.3 A Robot Arm 453
9.4 Trees and Traversal 456
9.4.1 A Stack-Based Traversal 457
9.5 Use of Tree Data Structures 460
9.6 Animation 464
9.7 Graphical Objects 465
9.7.1 Methods, Attributes, and Messages 466
9.7.2 A Cube Object 467
9.7.3 Objects and Hierarchy 468
9.7.4 Geometric and Nongeometric Objects 469
9.8 Scene Graphs 470
9.9 Implementing Scene Graphs 472
9.10 Other Tree Structures 474
9.10.1 CSG Trees 474
9.10.2 BSP Trees 475
9.10.3 Quadtrees and Octrees 478
Summary and Notes 479
Suggested Readings 480
Exercises 480

CHAPTER 10 PROCEDURAL METHODS 483
10.1 Algorithmic Models 483
10.2 Physically Based Models and Particle Systems 485
10.3 Newtonian Particles 486
10.3.1 Independent Particles 488
10.3.2 Spring Forces 488
10.3.3 Attractive and Repulsive Forces 490
10.4 Solving Particle Systems 491
10.5 Constraints 494
10.5.1 Collisions 494
10.5.2 Soft Constraints 496
10.6 A Simple Particle System 497
10.6.1 Displaying the Particles 498
10.6.2 Updating Particle Positions 498
10.6.3 Collisions 499
10.6.4 Forces 500
10.6.5 Flocking 500
10.7 Agent-Based Models 501
10.8 Language-Based Models 503
10.9 Recursive Methods and Fractals 507
10.9.1 Rulers and Length 508
10.9.2 Fractal Dimension 509
10.9.3 Midpoint Division and Brownian Motion 510
10.9.4 Fractal Mountains 511
10.9.5 The Mandelbrot Set 512
10.9.6 Mandelbrot Fragment Shader 516
10.10 Procedural Noise 517
Summary and Notes 521
Suggested Readings 521
Exercises 522

CHAPTER 11 CURVES AND SURFACES 525
11.1 Representation of Curves and Surfaces 525
11.1.1 Explicit Representation 525
11.1.2 Implicit Representations 527
11.1.3 Parametric Form 528
11.1.4 Parametric Polynomial Curves 529
11.1.5 Parametric Polynomial Surfaces 530
11.2 Design Criteria 530
11.3 Parametric Cubic Polynomial Curves 532
11.4 Interpolation 533
11.4.1 Blending Functions 534
11.4.2 The Cubic Interpolating Patch 536
11.5 Hermite Curves and Surfaces 538
11.5.1 The Hermite Form 538
11.5.2 Geometric and Parametric Continuity 540
11.6 Bézier Curves and Surfaces 541
11.6.1 Bézier Curves 542
11.6.2 Bézier Surface Patches 544
11.7 Cubic B-Splines 545
11.7.1 The Cubic B-Spline Curve 545
11.7.2 B-Splines and Basis 548
11.7.3 Spline Surfaces 549
11.8 General B-Splines 550
11.8.1 Recursively Defined B-Splines 551
11.8.2 Uniform Splines 552
11.8.3 Nonuniform B-Splines 552
11.8.4 NURBS 553
11.8.5 Catmull-Rom Splines 554
11.9 Rendering Curves and Surfaces 555
11.9.1 Polynomial Evaluation Methods 556
11.9.2 Recursive Subdivision of Bézier Polynomials 557
11.9.3 Rendering Other Polynomial Curves by Subdivision 560
11.9.4 Subdivision of Bézier Surfaces 561
11.10 The Utah Teapot 562
11.11 Algebraic Surfaces 565
11.11.1 Quadrics 565
11.11.2 Rendering of Surfaces by Ray Casting 566
11.12 Subdivision Curves and Surfaces 567
11.12.1 Mesh Subdivision 568
11.13 Mesh Generation from Data 571
11.13.1 Height Fields Revisited 571
11.13.2 Delaunay Triangulation 571
11.13.3 Point Clouds 575
11.14 Graphics API Support for Curves and Surfaces 576
11.14.1 Tessellation Shading 576
11.14.2 Geometry Shading 577
Summary and Notes 577
Suggested Readings 578
Exercises 578

CHAPTER 12 ADVANCED RENDERING 581
12.1 Going Beyond Pipeline Rendering 581
12.2 Ray Tracing 582
12.3 Building a Simple Ray Tracer 586
12.3.1 Recursive Ray Tracing 586
12.3.2 Calculating Intersections 588
12.3.3 Ray-Tracing Variations 590
12.4 The Rendering Equation 591
12.5 Radiosity 593
12.5.1 The Radiosity Equation 594
12.5.2 Solving the Radiosity Equation 595
12.5.3 Computing Form Factors 597
12.5.4 Carrying Out Radiosity 599
12.6 Global Illumination and Path Tracing 600
12.7 RenderMan 602
12.8 Parallel Rendering 603
12.8.1 Sort-Middle Rendering 605
12.8.2 Sort-Last Rendering 606
12.8.3 Sort-First Rendering 610
12.9 Hardware GPU Implementations 611
12.10 Implicit Functions and Contour Maps 612
12.10.1 Marching Squares 613
12.10.2 Marching Triangles 617
12.11 Volume Rendering 618
12.11.1 Volumetric Data Sets 618
12.11.2 Visualization of Implicit Functions 619
12.12 Isosurfaces and Marching Cubes 621
12.13 Marching Tetrahedra 624
12.14 Mesh Simplification 625
12.15 Direct Volume Rendering 625
12.15.1 Assignment of Color and Opacity 626
12.15.2 Splatting 627
12.15.3 Volume Ray Tracing 628
12.15.4 Texture Mapping of Volumes 629
12.16 Image-Based Rendering 630
12.16.1 A Simple Example 630
Summary and Notes 632
Suggested Readings 633
Exercises 634

APPENDIX A INITIALIZING SHADERS 637
A.1 Shaders in the HTML file 637
A.2 Reading Shaders from Source Files 640

APPENDIX B SPACES 643
B.1 Scalars 643
B.2 Vector Spaces 644
B.3 Affine Spaces 646
B.4 Euclidean Spaces 647
B.5 Projections 648
B.6 Gram-Schmidt Orthogonalization 649
Suggested Readings 650
Exercises 650

APPENDIX C MATRICES 651
C.1 Definitions 651
C.2 Matrix Operations 652
C.3 Row and Column Matrices 653
C.4 Rank 654
C.5 Change of Representation 655
C.6 The Cross Product 657
C.7 Eigenvalues and Eigenvectors 657
C.8 Vector and Matrix Objects 659
Suggested Readings 659
Exercises 660

APPENDIX D SAMPLING AND ALIASING 661
D.1 Sampling Theory 661
D.2 Reconstruction 666
D.3 Quantization 668

References 669
WebGL Index 681
Subject Index 683
PREFACE
This book is an introduction to computer graphics with an emphasis on applications programming. The first edition, which was published in 1997, was somewhat revolutionary in using OpenGL and a top-down approach. Over the succeeding
16 years and 6 editions, this approach has been adopted by most introductory classes
in computer graphics and by virtually all the competing textbooks.
The sixth edition reflected the recent major changes in graphics software due to
major changes in graphics hardware. In particular, the sixth edition was fully shader-based, enabling readers to create applications that could fully exploit the capabilities
of modern GPUs. We noted that these changes are also part of OpenGL ES 2.0, which
is being used to develop applications for embedded systems and handheld devices,
such as cell phones and tablets, and of WebGL, its JavaScript implementation. At the
time, we did not anticipate the extraordinary interest in WebGL that began as soon
as web browsers became available that support WebGL through HTML5.
As we continued to write our books, teach our SIGGRAPH courses, and pursue
other graphics-related activities, we became aware of the growing excitement about
WebGL. WebGL applications were running everywhere, including on some of the
latest smart phones, and even though WebGL lacks some of the advanced features
of the latest versions of OpenGL, the ability to integrate it with HTML5 opened up a
wealth of new application areas. As an added benefit, we found it much better suited
than desktop OpenGL for teaching computer graphics. Consequently, we decided to
do a seventh edition that uses WebGL exclusively. We believe that this edition is every
bit as revolutionary as any of the previous editions.
New to the Seventh Edition
WebGL is used throughout.
All code is written in JavaScript.
All code runs in recent web browsers.
A new chapter on interaction is included.
Additional material on render-to-texture has been added.
Additional material on displaying meshes has been added.
An efficient matrix–vector package is included.
An introduction to agent-based modeling has been added.
A Top-Down Approach
Recent advances and the success of the first six editions continue to reinforce our
belief in a top-down, programming-oriented approach to introductory computer
graphics. Although many computer science and engineering departments now support more than one course in computer graphics, most students will take only a
single course. Such a course usually is placed in the curriculum after students have already studied programming, data structures, algorithms, software engineering, and
basic mathematics. Consequently, a class in computer graphics allows the instructor to build on these topics in a way that can be both informative and fun. We want
these students to be programming three-dimensional applications as soon as possible. Low-level algorithms, such as those that draw lines or fill polygons, can be dealt
with later, after students are creating graphics.
When asked “why teach programming,” John Kemeny, a pioneer in computer
education, used a familiar automobile analogy: You don’t have to know what’s under
the hood to be literate, but unless you know how to program, you’ll be sitting in the
back seat instead of driving. That same analogy applies to the way we teach computer
graphics. One approach—the algorithmic approach—is to teach everything about
what makes a car function: the engine, the transmission, the combustion process.
A second approach—the survey approach—is to hire a chauffeur, sit back, and see
the world as a spectator. The third approach—the programming approach that we
have adopted here—is to teach you how to drive and how to take yourself wherever
you want to go. As the old auto rental commercial used to say, “Let us put you in the
driver’s seat.”
Programming with WebGL and JavaScript
When Ed began teaching computer graphics 30 years ago, the greatest impediment
to implementing a programming-oriented course, and to writing a textbook for that
course, was the lack of a widely accepted graphics library or application programming
interface (API). Difficulties included high cost, limited availability, lack of generality,
and high complexity. The development of OpenGL resolved most of the difficulties
many of us had experienced with other APIs and with the alternative of using home-brewed software. OpenGL today is supported on all platforms and is widely accepted
as a cross-platform standard.
A graphics class teaches far more than the use of a particular API, but a good API
makes it easier to teach key graphics topics, including three-dimensional graphics,
lighting and shading, client–server graphics, modeling, and implementation algorithms. We believe that OpenGL’s extensive capabilities and well-defined architecture
lead to a stronger foundation for teaching both theoretical and practical aspects of
the field and for teaching advanced concepts, including texture mapping, compositing, and programmable shaders.
Ed switched his classes to OpenGL about 18 years ago and the results astounded him. By the middle of the semester, every student was able to write a
moderately complex three-dimensional application that required understanding of
three-dimensional viewing and event-driven input. In the previous years of teaching
computer graphics, he had never come even close to this result. That class led to the
first edition of this book.
This book is a textbook on computer graphics; it is not an OpenGL or WebGL
manual. Consequently, it does not cover all aspects of the WebGL API but rather
explains only what is necessary for mastering this book’s contents. It presents WebGL
at a level that should permit users of other APIs to have little difficulty with the
material.
Unlike previous editions, this one uses WebGL and JavaScript for all the examples. WebGL is a JavaScript implementation of OpenGL ES 2.0 and runs in most
recent browsers. Because it is supported by HTML5, not only does it provide compatibility with other applications but also there are no platform dependences; WebGL
runs within the browser and makes use of the local graphics hardware. Although
JavaScript is not the usual programming language with which we teach most programming courses, it is the language of the Web. Over the past few years, JavaScript
has become increasingly more powerful and our experience is that students who are
comfortable with Java, C, or C++ will have little trouble programming in JavaScript.
All the modern versions of OpenGL, including WebGL, require every application
to provide two shaders written in the OpenGL Shading Language (GLSL). GLSL is
similar to C but adds vectors and matrices as basic types, along with some C++
features such as operator overloading. We have added a JavaScript library MV.js that
supports both our presentation of graphics functions and the types and operations
in GLSL.
Intended Audience
This book is suitable for advanced undergraduates and first-year graduate students
in computer science and engineering and for students in other disciplines who have
good programming skills. The book also will be useful to many professionals. Between us, we have taught well over 100 short courses for professionals; our experiences with these nontraditional students have had a great influence on what we chose
to include in the book.
Prerequisites for the book are good programming skills in JavaScript, C, C++, or
Java; an understanding of basic data structures (linked lists, trees); and a rudimentary
knowledge of linear algebra and trigonometry. We have found that the mathematical
backgrounds of computer science students, whether undergraduates or graduates,
vary considerably. Hence, we have chosen to integrate into the text much of the linear
algebra and geometry that is required for fundamental computer graphics.
Organization of the Book
The book is organized as follows. Chapter 1 provides an overview of the field and
introduces image formation by optical devices; thus, we start with three-dimensional
concepts immediately. Chapter 2 introduces programming using WebGL. Although
the first example program that we develop (each chapter has one or more complete
programming examples) is two-dimensional, it is embedded in a three-dimensional
setting and leads to a three-dimensional extension. We introduce interactive graphics
in Chapter 3 and develop event-driven graphics within the browser environment.
Chapters 4 and 5 concentrate on three-dimensional concepts. Chapter 4 is concerned
with defining and manipulating three-dimensional objects, whereas Chapter 5 is
concerned with viewing them. Chapter 6 introduces light–material interactions and
shading. Chapter 7 introduces many of the new discrete capabilities that are now
supported in graphics hardware and by WebGL. All these techniques involve working
with various buffers. These chapters should be covered in order and can be taught in
about 10 weeks of a 15-week semester.
The last five chapters can be read in almost any order. All five are somewhat
open-ended and can be covered at a survey level, or individual topics can be pursued
in depth. Chapter 8 surveys implementation. It gives one or two major algorithms for
each of the basic steps, including clipping, line generation, and polygon fill. Chapter 9 includes a number of topics that fit loosely under the heading of hierarchical
modeling. The topics range from building models that encapsulate the relationships
between the parts of a model, to high-level approaches to graphics over the Internet. Chapter 9 also includes an introduction to scene graphs. Chapter 10 introduces a
number of procedural methods, including particle systems, fractals, and procedural
noise. Curves and surfaces, including subdivision surfaces, are discussed in Chapter 11. Chapter 12 surveys alternate approaches to rendering. It includes expanded
discussions of ray tracing and radiosity, and an introduction to image-based rendering and parallel rendering.
Appendix A presents the details of the WebGL functions needed to read, compile,
and link the application and shaders. Appendices B and C contain a review of the
background mathematics. Appendix D discusses sampling and aliasing starting with
Nyquist’s theorem and applying these results to computer graphics.
Changes from the Sixth Edition
The reaction of readers to the first six editions of this book was overwhelmingly
positive, especially to the use of OpenGL and the top-down approach. In the sixth
edition, we abandoned the fixed-function pipeline and went to full shader-based
OpenGL. In this edition, we move to WebGL, which is not only fully shader-based—
each application must provide at least a vertex shader and a fragment shader—but also
a version that works within the latest web browsers.
Applications are written in JavaScript. Although JavaScript has its own idiosyncrasies, we do not expect that students with experience in a high-level language, such
as Java, C, or C++, will experience any serious problems with it.
As we pointed out earlier in this preface, every application must provide its own
shaders. Consequently, programmable shaders and GLSL need to be introduced in
Chapter 2. Many of the examples produce the same output as in previous editions,
but the code is very different.
In the sixth edition, we eliminated a separate chapter on input and interaction,
incorporating the material in other chapters. With this edition, we revert to a separate
chapter. This decision is based on the ease and flexibility with which we can integrate
event-driven input with WebGL through HTML5.
We have added additional material on off-screen rendering and render-to-texture. These techniques have become fundamental to using GPUs for a variety of
compute-intensive applications such as image processing and simulation.
Given the positive feedback we’ve received on the core material from Chapters 1–6 in previous editions, we’ve tried to keep the changes to those chapters to
a minimum. We see Chapters 1–7 as the core of any introductory course in computer
graphics. Chapters 8–12 can be used in almost any order, either as a survey in a one-semester course or as the basis of a two-semester sequence.
Support Materials
The support for the book is on the Web, both through the author’s website www.cs.unm.edu/~angel and at www.pearsonhighered.com. Support material that is available to all readers of this book includes
Sources of information on WebGL
Program code
Solutions to selected exercises
PowerPoint lectures
Figures from the book
Additional support materials, including solutions to all the nonprogramming
exercises, are available only to instructors adopting this textbook for classroom
use. Please contact your school’s Pearson Education representative or visit www.pearsonhighered.com/irc for information on obtaining access to this material.
Acknowledgments
Ed has been fortunate over the past few years to have worked with wonderful students
at the University of New Mexico. They were the first to get him interested in OpenGL,
and he has learned much from them. They include Ye Cong, Pat Crossno, Tommie
Daniel, Chris Davis, Lisa Desjarlais, Kim Edlund, Lee Ann Fisk, Maria Gallegos,
Brian Jones, Christopher Jordan, Takeshi Hakamata, Max Hazelrigg, Sheryl Hurley,
Thomas Keller, Ge Li, Pat McCormick, Al McPherson, Ken Moreland, Martin Muller,
David Munich, Jim Pinkerton, Jim Prewett, Dave Rogers, Hal Smyer, Dave Vick, Hue
(Bumgarner-Kirby) Walker, Brian Wylie, and Jin Xiong. Many of the examples in the
color plates were created by these students.
The first edition of this book was written during Ed’s sabbatical; various parts
were written in five different countries. The task would not have been accomplished
without the help of a number of people and institutions that made their facilities
available to him. He is greatly indebted to Jonas Montilva and Chris Birkbeck of
the Universidad de los Andes (Venezuela), to Rodrigo Gallegos and Aristides Novoa
of the Universidad Tecnologica Equinoccial (Ecuador), to Long Wen Chang of the
National Tsing Hua University (Taiwan), and to Kim Hong Wong and Pheng Ann
Heng of the Chinese University of Hong Kong. Ramiro Jordan of ISTEC and the
University of New Mexico made possible many of these visits. John Brayer and Jason
Stewart at the University of New Mexico and Helen Goldstein at Addison-Wesley
somehow managed to get a variety of items to him wherever he happened to be. His
website contains a description of his adventures writing the first edition.
David Kirk and Mark Kilgard at NVIDIA were kind enough to provide graphics
cards for testing many of the algorithms. A number of other people provided significant help. Ed thanks Ben Bederson, Gonzalo Cartagenova, Tom Caudell, Kathi
Collins, Kathleen Danielson, Roger Ehrich, Robert Geist, Chuck Hansen, Mark
Henne, Bernard Moret, Dick Nordhaus, Helena Saona, Dave Shreiner, Vicki Shreiner,
Gwen Sylvan, and Mason Woo. Mark Kilgard, Brian Paul, and Nate Robins are owed
a great debt by the OpenGL community for creating software that enables OpenGL
code to be developed over a variety of platforms.
At the University of New Mexico, the Art, Research, Technology, and Science
Laboratory (ARTS Lab) and the Center for High Performance Computing have provided support for many of Ed’s projects. The Computer Science Department, the
Arts Technology Center in the College of Fine Arts, the National Science Foundation,
Sandia National Laboratories, and Los Alamos National Laboratory have supported
many of Ed’s students and research projects that led to parts of this book. David Beining, formerly with the Lodestar Astronomy Center and now at the ARTS Lab, has
provided tremendous support for the Fulldome Project. Sheryl Hurley, Christopher
Jordan, Laurel Ladwig, Jon Strawn and Hue (Bumgarner-Kirby) Walker provided
some of the images in the color plates through Fulldome projects. Hue Walker has
done the wonderful covers for previous editions and some of the examples in the
Color Plates.
Ed would also like to acknowledge the informal group that started at the Santa
Fe Complex, including Jeff Bowles, Ruth Chabay, Stephen Guerin, Bruce Sherwood,
Scott Wittenberg, and especially JavaScript evangelist Owen Densmore, who convinced him to teach a graphics course in Santa Fe in exchange for getting him involved
with JavaScript. We’ve all gained by the experience.
Dave would like first to thank Ed for asking him to participate in this project.
We’ve exchanged ideas on OpenGL and how to teach it for many years, and it’s
exciting to advance those concepts to new audiences. Dave would also like to thank
those who created OpenGL, and who worked at Silicon Graphics Computer Systems,
leading the way in their day. He would like to recognize the various Khronos working
groups who continue to evolve the API and bring graphics to unexpected places.
Finally, as Ed mentioned, SIGGRAPH has featured prominently in the development
of these materials, and is definitely owed a debt of gratitude for providing access to
enthusiastic test subjects for exploring our ideas.
Reviewers of the manuscript drafts provided a variety of viewpoints on what we
should include and what level of presentation we should use. These reviewers for
previous editions include Gur Saran Adhar (University of North Carolina at Wilmington), Mario Agrular (Jacksonville State University), Michael Anderson (University
of Hartford), Norman I. Badler (University of Pennsylvania), Mike Bailey (Oregon
State University), Marty Barrett (East Tennessee State University), C. S. Bauer (University of Central Florida), Bedrich Benes (Purdue University), Kabekode V. Bhat
(The Pennsylvania State University), Isabelle Bichindaritz (University of Washington,
Tacoma), Cory D. Boatright (University of Pennsylvania), Eric Brown, Robert P. Burton (Brigham Young University), Sam Buss (University of California, San Diego), Kai
H. Chang (Auburn University), James Cremer (University of Iowa), Ron DiNapoli
(Cornell University), John David N. Dionisio (Loyola Marymount University), Eric
Alan Durant (Milwaukee School of Engineering), David S. Ebert (Purdue University), Richard R. Eckert (Binghamton University), W. Randolph Franklin (Rensselaer
Polytechnic Institute), Natacha Gueorguieva (City University of New York/College of
Staten Island), Jianchao (Jack) Han (California State University, Dominguez Hills),
Chenyi Hu (University of Central Arkansas), George Kamberov (Stevens Institute
of Technology), Mark Kilgard (NVIDIA Corporation), Lisa B. Lancor (Southern
Connecticut State University), Chung Lee (California State Polytechnic University,
Pomona), John L. Lowther (Michigan Technological University), R. Marshall (Boston
University and Bridgewater State College), Hugh C. Masterman (University of Massachusetts, Lowell), Bruce A. Maxwell (Swathmore College), Tim McGraw (West Virginia University), James R. Miller (University of Kansas), Rodrigo Obando (Columbus State University), Jon A. Preston (Southern Polytechnic State University), Andrea
Salgian (The College of New Jersey), Lori L. Scarlatos (Brooklyn College, CUNY),
Han-Wei Shen (The Ohio State University), Oliver Staadt (University of California, Davis), Stephen L. Stepoway (Southern Methodist University), Bill Toll (Taylor
University), Michael Wainer (Southern Illinois University, Carbondale), Yang Wang
(Southern Methodist State University), Steve Warren (Kansas State University), Mike
Way (Florida Southern College), George Wolberg (City College of New York), Xiaoyu Zhang (California State University San Marcos), Ye Zhao (Kent State University), and Ying Zhu (Georgia State University). Although the final decisions may not
reflect their views—which often differed considerably from one another—each reviewer forced us to reflect on every page of the manuscript.
The reviewers for this edition were particularly supportive. They include Mike
Bailey (Oregon State University), Patrick Cozzi (University of Pennsylvania and Analytic Graphics, Inc) and Jeff Parker (Harvard University). All of them were familiar
with previous editions and excited about the potential of moving their classes to
WebGL.
We would also like to acknowledge the entire production team at AddisonWesley. Ed’s editors, Peter Gordon, Maite Suarez-Rivas, and Matt Goldstein, have
been a pleasure to work with through seven editions of this book and the OpenGL
primer. For this edition, Marilyn Lloyd and Kayla Smith-Tarbox at Pearson have provided considerable help. Through seven editions, Paul Anagnostopoulos at Windfall
Software has always been more than helpful in assisting with TeX problems. Ed is
especially grateful to Lyn Dupré. If the readers could see the original draft of the first
edition, they would understand the wonders that Lyn does with a manuscript.
Ed wants to particularly recognize his wife, Rose Mary Molnar, who did the
figures for his first graphics book, many of which form the basis for the figures
in this book. Probably only other authors can fully appreciate the effort that goes
into the book production process and the many contributions and sacrifices our
partners make to that effort. The dedication to this book is a sincere but inadequate
recognition of all of Rose Mary’s contributions to Ed’s work.
Dave would like to recognize the support and encouragement of Vicki, his wife,
without whom creating works like this would never occur. Not only does she provide
warmth and companionship but also provides invaluable feedback on our presentation and materials. She’s been a valuable, unrecognized partner in all of Dave’s
OpenGL endeavors.
Ed Angel
Dave Shreiner
CHAPTER 1
GRAPHICS SYSTEMS AND MODELS
It would be difficult to overstate the importance of computer and communication
technologies in our lives. Activities as wide-ranging as filmmaking, publishing,
banking, and education have undergone revolutionary changes as these technologies
alter the ways in which we conduct our daily activities. The combination of computers, networks, and the complex human visual system, through computer graphics,
has been instrumental in these advances and has led to new ways of displaying information, seeing virtual worlds, and communicating with both other people and
machines.
Computer graphics is concerned with all aspects of producing pictures or images using a computer. The field began humbly 50 years ago, with the display of a few
lines on a cathode-ray tube (CRT); now, we can generate images by computer that
are indistinguishable from photographs of real objects. We routinely train pilots with
simulated airplanes, generating graphical displays of a virtual environment in real
time. Feature-length movies made entirely by computer have been successful, both
critically and financially.
In this chapter, we start our journey with a short discussion of applications of
computer graphics. Then we overview graphics systems and imaging. Throughout
this book, our approach stresses the relationships between computer graphics and
image formation by familiar methods, such as drawing by hand and photography. We
will see that these relationships can help us to design application programs, graphics
libraries, and architectures for graphics systems.
In this book, we will use WebGL, a graphics software system supported by most
modern web browsers. WebGL is a version of OpenGL, which is the widely accepted
standard for developing graphics applications. WebGL is easy to learn, and it possesses most of the characteristics of the full (or desktop) OpenGL and of other important graphics systems. Our approach is top-down. We want you to start writing,
as quickly as possible, application programs that will generate graphical output. After you begin writing simple programs, we shall discuss how the underlying graphics
library and the hardware are implemented. This chapter should give a sufficient overview for you to proceed to writing programs.
1.1 APPLICATIONS OF COMPUTER GRAPHICS
The development of computer graphics has been driven both by the needs of the user
community and by advances in hardware and software. The applications of computer
graphics are many and varied; we can, however, divide them into four major areas:
1. Display of information
2. Design
3. Simulation and animation
4. User interfaces
Although many applications span two or more of these areas, the development of the
field was based largely on separate work in each.
1.1.1 Display of Information
Classical graphics techniques arose as a medium to convey information among
people. Although spoken and written languages serve a similar purpose, the human
visual system is unrivaled both as a processor of data and as a pattern recognizer.
More than 4000 years ago, the Babylonians displayed floor plans of buildings on
stones. More than 2000 years ago, the Greeks were able to convey their architectural
ideas graphically, even though the related mathematics was not developed until the
Renaissance. Today, the same type of information is generated by architects, mechanical designers, and draftspeople using computer-based drafting systems.
For centuries, cartographers have developed maps to display celestial and geographical information. Such maps were crucial to navigators as these people explored
the ends of the earth; maps are no less important today in fields such as geographic
information systems. Now, maps can be developed and manipulated in real time over
the Internet.
During the past 100 years, workers in the field of statistics have explored techniques for generating plots that aid the viewer in determining the information in a
set of data. Now, we have computer plotting packages that provide a variety of plotting techniques and color tools that can handle multiple large data sets. Nevertheless,
it is still the human ability to recognize visual patterns that ultimately allows us to
interpret the information contained in the data. The field of information visualization is becoming increasingly more important as we have to deal with understanding
complex phenomena, from problems in bioinformatics to detecting security threats.
Medical imaging poses interesting and important data analysis problems. Modern imaging technologies—such as computed tomography (CT), magnetic resonance
imaging (MRI), ultrasound, and positron-emission tomography (PET)—generate
three-dimensional data that must be subjected to algorithmic manipulation to provide useful information. Color Plate 20 shows an image of a person’s head in which
the skin is displayed as transparent and the internal structures are displayed as
opaque. Although the data were collected by a medical imaging system, computer
graphics produced the image that shows the structures.
Supercomputers now allow researchers in many areas to solve previously intractable problems. The field of scientific visualization provides graphical tools that
help these researchers interpret the vast quantity of data that they generate. In fields
such as fluid flow, molecular biology, and mathematics, images generated by conversion of data to geometric entities that can be displayed have yielded new insights into
complex processes. For example, Color Plate 19 shows fluid dynamics in the mantle
of the earth. The system used a mathematical model to generate the data. We present
various visualization techniques as examples throughout the rest of the text.
1.1.2 Design
Professions such as engineering and architecture are concerned with design. Starting
with a set of specifications, engineers and architects seek a cost-effective and aesthetic
solution that satisfies the specifications. Design is an iterative process. Rarely in the
real world is a problem specified such that there is a unique optimal solution. Design
problems are either overdetermined, such that they possess no solution that satisfies
all the criteria, much less an optimal solution, or underdetermined, such that they
have multiple solutions that satisfy the design criteria. Thus, the designer works in an
iterative manner. She generates a possible design, tests it, and then uses the results as
the basis for exploring other solutions.
The power of the paradigm of humans interacting with images on the screen
of a CRT was recognized by Ivan Sutherland over 50 years ago. Today, the use of
interactive graphical tools in computer-aided design (CAD) pervades fields such as
architecture and the design of mechanical parts and of very-large-scale integrated
(VLSI) circuits. In many such applications, the graphics are used in a number of
distinct ways. For example, in a VLSI design, the graphics provide an interactive
interface between the user and the design package, usually by means of such tools
as menus and icons. In addition, after the user produces a possible design, other
tools analyze the design and display the analysis graphically. Color Plates 9 and 10
show two views of the same architectural design. Both images were generated with the
same CAD system. They demonstrate the importance of having the tools available to
generate different images of the same objects at different stages of the design process.
1.1.3 Simulation and Animation
Once graphics systems evolved to be capable of generating sophisticated images in
real time, engineers and researchers began to use them as simulators. One of the most
important uses has been in the training of pilots. Graphical flight simulators have
proved both to increase safety and to reduce training expenses. The use of special
VLSI chips has led to a generation of arcade games as sophisticated as flight simulators. Games and educational software for home computers are almost as impressive.
The success of flight simulators led to the use of computer graphics for animation in the television, motion picture, and advertising industries. Entire animated
movies can now be made by computer at a cost less than that of movies made with
traditional hand-animation techniques. The use of computer graphics with hand animation allows the creation of technical and artistic effects that are not possible with
either alone. Whereas computer animations have a distinct look, we can also generate
photorealistic images by computer. Images that we see on television, in movies, and
in magazines often are so realistic that we cannot distinguish computer-generated or
computer-altered images from photographs. In Chapter 6, we discuss many of the
lighting effects used to produce computer animations. Color Plates 15 and 23 show
realistic lighting effects that were created by artists and computer scientists using animation software. Although these images were created for commercial animations,
interactive software to create such effects is widely available.
The field of virtual reality (VR) has opened up many new horizons. A human
viewer can be equipped with a display headset that allows her to see separate images
with her right eye and her left eye so that she has the effect of stereoscopic vision. In
addition, her body location and position, possibly including her head and finger positions, are tracked by the computer. She may have other interactive devices available,
including force-sensing gloves and sound. She can then act as part of a computergenerated scene, limited only by the image generation ability of the computer. For
example, a surgical intern might be trained to do an operation in this way, or an astronaut might be trained to work in a weightless environment. Color Plate 22 shows
one frame of a VR simulation of a simulated patient used for remote training of medical personnel.
Simulation and virtual reality have come together in many exciting ways in the
film industry. Recently, stereo (3D) movies have become both profitable and highly
acclaimed by audiences. Special effects created using computer graphics are part of
virtually all movies, as are more mundane uses of computer graphics such as removal
of artifacts from scenes. Simulations of physics are used to create visual effects ranging
from fluid flow to crowd dynamics.
1.1.4 User Interfaces
Our interaction with computers has become dominated by a visual paradigm that includes windows, icons, menus, and a pointing device, such as a mouse. From a user’s
perspective, windowing systems such as the X Window System, Microsoft Windows,
and the Macintosh Operating System differ only in details. More recently, millions of
people have become users of the Internet. Their access is through graphical network
browsers, such as Firefox, Chrome, Safari, and Internet Explorer, that use these same
interface tools. We have become so accustomed to this style of interface that we often
forget that what we are doing is working with computer graphics.
Although personal computers and workstations evolved by somewhat different
paths, at present they are indistinguishable. When you add in smart phones, tablets,
and game consoles, we have an incredible variety of devices with considerable computing power, all of which can access the World Wide Web through a browser. For
lack of a better term, we will tend to use computer to include all these devices.
Color Plate 13 shows the interface used with a high-level modeling package.
It demonstrates the variety of tools available in such packages and the interactive
devices the user can employ in modeling geometric objects. Although we are familiar
with this style of graphical user interface, devices such as smart phones and tablets
have popularized touch-sensitive interfaces that allow the user to interact with every
pixel on the display.
1.2 A GRAPHICS SYSTEM
A computer graphics system is a computer system; as such, it must have all the
components of a general-purpose computer system. Let us start with the high-level
view of a graphics system, as shown in the block diagram in Figure 1.1. There are six
major elements in our system:
1. Input devices
2. Central Processing Unit
3. Graphics Processing Unit
4. Memory
5. Framebuffer
6. Output devices
This model is general enough to include workstations and personal computers, interactive game systems, mobile phones, GPS systems, and sophisticated image generation systems. Although most of the components are present in a standard computer,
it is the way each element is specialized for computer graphics that characterizes this
diagram as a portrait of a graphics system. As more and more functionality can be
included in a single chip, many of the components are not physically separate. The
CPU and GPU can be on the same chip and their memory can be shared. Nevertheless, the model still describes the software architecture and will be helpful as we study
the various parts of computer graphics systems.
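As a small, concrete point of reference for how an application reaches this system, the sketch below shows a browser application obtaining a WebGL context from an HTML5 canvas and clearing the color buffer; the canvas id "gl-canvas" is an assumed example, and everything beyond this handshake is the subject of Chapter 2.

// The WebGL context is the application's handle to the GPU and the framebuffer.
const canvas = document.getElementById("gl-canvas");  // assumed canvas element id
const gl = canvas.getContext("webgl");
if (!gl) {
  alert("WebGL isn't available in this browser");
}
gl.clearColor(0.0, 0.0, 0.0, 1.0);   // opaque black
gl.clear(gl.COLOR_BUFFER_BIT);       // clear the color buffer of the framebuffer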
1.2.1 Pixels and the Framebuffer
Virtually all modern graphics systems are raster based. The image we see on the
output device is an array—the raster—of picture elements, or pixels, produced by
the graphics system. As we can see from Figure 1.2, each pixel corresponds to a location, or small area, in the image. Collectively, the pixels are stored in a part of
FIGURE 1.1 A graphics system.
FIGURE 1.2 Pixels. (a) Image of Yeti the cat. (b) Detail of area around one eye showing individual pixels.
memory called the framebuffer. (Some references use frame buffer rather than framebuffer.) The framebuffer can be viewed as the core element of a graphics system. Its resolution—the number of pixels in the framebuffer—
determines the detail that you can see in the image. The depth, or precision, of the
framebuffer, defined as the number of bits that are used for each pixel, determines
properties such as how many colors can be represented on a given system. For example, a 1-bit-deep framebuffer allows only two colors, whereas an 8-bit-deep framebuffer allows 2^8 (256) colors. In full-color systems, there are 24 (or more) bits per
pixel. Such systems can display sufficient colors to represent most images realistically.
They are also called true-color systems, or RGB color systems, because individual
groups of bits in each pixel are assigned to each of the three primary colors—red,
green, and blue—used in most displays. High dynamic range (HDR) systems use 12
or more bits for each color component. Until recently, framebuffers stored colors in
integer formats. Recent framebuffers use floating point and thus support HDR colors
more easily.
In a simple system, the framebuffer holds only the colored pixels that are displayed on the screen. In most systems, the framebuffer holds far more information,
such as depth information needed for creating images from three-dimensional data.
In these systems, the framebuffer comprises multiple buffers, one or more of which
are color buffers that hold the colored pixels that are displayed. For now, we can use
the terms framebuffer and color buffer synonymously without confusion.
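As a rough illustration (assuming a WebGL context gl as in the earlier sketch), the resolution and per-channel precision of the default color buffer can be queried from the context; the exact bit counts reported vary with the browser and platform.

// Query the size and precision of the default framebuffer.
const width  = gl.drawingBufferWidth;              // resolution in pixels
const height = gl.drawingBufferHeight;
const redBits   = gl.getParameter(gl.RED_BITS);    // bits per color component
const greenBits = gl.getParameter(gl.GREEN_BITS);
const blueBits  = gl.getParameter(gl.BLUE_BITS);
const depthBits = gl.getParameter(gl.DEPTH_BITS);  // bits in the depth buffer

// With b bits of color per pixel, 2^b distinct colors can be represented.
const colorBits = redBits + greenBits + blueBits;
console.log(width + " x " + height + " pixels, " + colorBits + "-bit color ("
            + Math.pow(2, colorBits) + " colors), " + depthBits + "-bit depth");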
1.2.2 The CPU and the GPU
In a simple system, there may be only one processor, the central processing unit
(CPU), which must perform both the normal processing and the graphical processing. The main graphical function of the processor is to take specifications of graphical
primitives (such as lines, circles, and polygons) generated by application programs
and to assign values to the pixels in the framebuffer that best represent these entities.
For example, a triangle is specified by its three vertices, but to display its outline by the
three line segments connecting the vertices, the graphics system must generate a set of
pixels that appear as line segments to the viewer. The conversion of geometric entities
to pixel colors and locations in the framebuffer is known as rasterization or scan conversion. In early graphics systems, the framebuffer was part of the standard memory
that could be directly addressed by the CPU. Today, virtually all graphics systems are
characterized by special-purpose graphics processing units (GPUs), custom-tailored
to carry out specific graphics functions. The GPU can be located on the motherboard
of the system or on a graphics card. The framebuffer is accessed through the graphics
processing unit and usually is on the same circuit board as the GPU.
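A toy JavaScript sketch can illustrate what scan conversion must accomplish. Real GPUs use far more sophisticated methods; here the framebuffer is modeled simply as a one-dimensional array of pixel colors, and the function name is chosen for this sketch.

// Illustrative scan conversion: color the pixels that best represent the
// line segment from (x0, y0) to (x1, y1). Not how a GPU actually does it.
function rasterizeLine(framebuffer, width, x0, y0, x1, y1, color) {
  var steps = Math.max(Math.abs(x1 - x0), Math.abs(y1 - y0));
  for (var i = 0; i <= steps; ++i) {
    var t = (steps === 0) ? 0 : i / steps;
    var x = Math.round(x0 + t * (x1 - x0));  // nearest pixel column
    var y = Math.round(y0 + t * (y1 - y0));  // nearest pixel row
    framebuffer[y * width + x] = color;      // write into the pixel array
  }
}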
GPUs have evolved to the point where they are as complex or even more complex
than CPUs. They are characterized both by special-purpose modules geared toward
graphical operations and by a high degree of parallelism—recent GPUs contain over
100 processing units, each of which is user programmable. GPUs are so powerful that
they can often be used as mini supercomputers for general-purpose computing. We
will discuss GPU architectures in more detail in Section 1.7.
1.2.3 Output Devices
Until recently, the dominant type of display (or monitor) was the cathode-ray tube
(CRT). A simplified picture of a CRT is shown in Figure 1.3. When electrons strike the
phosphor coating on the tube, light is emitted. The direction of the beam is controlled
by two pairs of deflection plates. The output of the computer is converted, by digital-to-analog converters, to voltages across the x and y deflection plates. Light appears
on the surface of the CRT when a sufficiently intense beam of electrons is directed at
the phosphor.
If the voltages steering the beam change at a constant rate, the beam will trace
a straight line, visible to the viewer. Such a device is known as the random-scan,
FIGURE 1.3 The cathode-ray tube (CRT), showing the electron gun, focus, x and y deflection plates, and the phosphor coating.
calligraphic, or vector CRT, because the beam can be moved directly from any
position to any other position. If the intensity of the beam is turned off, the beam can
be moved to a new position without changing any visible display. This configuration
was the basis of early graphics systems that predated the present raster technology.
A typical CRT will emit light for only a short time—usually, a few milliseconds—
after the phosphor is excited by the electron beam. For a human to see a steady,
flicker-free image on most CRT displays, the same path must be retraced, or refreshed, by the beam at a sufficiently high rate, the refresh rate. In older systems,
the refresh rate is determined by the frequency of the power system, 60 cycles per second or 60 hertz (Hz) in the United States and 50 Hz in much of the rest of the world.
Modern displays are no longer coupled to these low frequencies and operate at rates
up to about 85 Hz.
In a raster system, the graphics system takes pixels from the framebuffer and
displays them as points on the surface of the display in one of two fundamental
ways. In a noninterlaced system, the pixels are displayed row by row, or scan line
by scan line, at the refresh rate. In an interlaced display, odd rows and even rows
are refreshed alternately. Interlaced displays are used in commercial television. In an
interlaced display operating at 60 Hz, the screen is redrawn in its entirety only 30
times per second, although the visual system is tricked into thinking the refresh rate
is 60 Hz rather than 30 Hz. Viewers located near the screen, however, can tell the
difference between the interlaced and noninterlaced displays. Noninterlaced displays
are becoming more widespread, even though these displays must process pixels at
twice the rate of the interlaced display.
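A quick calculation, here for an assumed 1920 × 1080 raster refreshed at 60 Hz, shows the difference in required pixel rates.

// Back-of-the-envelope pixel rates (illustrative numbers only).
var width = 1920, height = 1080, refreshRate = 60;   // Hz
var noninterlaced = width * height * refreshRate;    // every pixel, every refresh
var interlaced = noninterlaced / 2;                  // only half the rows per refresh
console.log(noninterlaced);  // 124416000 pixels per second
console.log(interlaced);     // 62208000 pixels per second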
Color CRTs have three different-colored phosphors (red, green, and blue), arranged in small groups. One common style arranges the phosphors in triangular
groups called triads, each triad consisting of three phosphors, one of each primary.
Most color CRTs have three electron beams, corresponding to the three types of phosphors. In the shadow-mask CRT (Figure 1.4), a metal screen with small holes—the
shadow mask—ensures that an electron beam excites only phosphors of the proper
color.
FIGURE 1.4 Shadow-mask CRT, with red, green, and blue electron guns, the shadow mask, and a phosphor triad.
FIGURE 1.5 Generic flat-panel display, with horizontal and vertical grids addressing the light-emitting elements.
Although CRTs are still common display devices, they are rapidly being replaced
by flat-screen technologies. Flat-panel monitors are inherently raster based. Although
there are multiple technologies available, including light-emitting diodes (LEDs),
liquid-crystal displays (LCDs), and plasma panels, all use a two-dimensional grid
to address individual light-emitting elements. Figure 1.5 shows a generic flat-panel
monitor. The two outside plates each contain parallel grids of wires that are oriented
perpendicular to each other. By sending electrical signals to the proper wire in each
grid, the electrical field at a location, determined by the intersection of two wires, can
be made strong enough to control the corresponding element in the middle plate.
The middle plate in an LED panel contains light-emitting diodes that can be turned
on and off by the electrical signals sent to the grid. In an LCD display, the electrical
field controls the polarization of the liquid crystals in the middle panel, thus turning
on and off the light passing through the panel. A plasma panel uses the voltages on
the grids to energize gases embedded between the glass panels holding the grids. The
energized gas becomes a glowing plasma.
Most projection systems are also raster devices. These systems use a variety of
technologies, including CRTs and digital light projection (DLP). From a user perspective, they act as standard monitors with similar resolutions and precisions. Hard-copy
devices, such as printers and plotters, are also raster based but cannot be refreshed.
Stereo (3D) television displays use alternate refresh cycles to switch the display
between an image for the left eye and an image for the right eye. The viewer wears
special glasses that are coupled to the refresh cycle. 3D movie projectors produce
two images with different polarizations. The viewer wears polarized glasses so that
each eye sees only one of the two projected images. As we shall see in later chapters,
producing stereo images is basically a matter of changing the location of the viewer
for each frame to obtain the left- and right-eye views.
1.2.4 Input Devices
Most graphics systems provide a keyboard and at least one other input device. The
most common input devices are the mouse, the joystick, and the data tablet. Each
provides positional information to the system, and each is usually equipped with one
or more buttons to provide signals to the processor. Often called pointing devices,
these devices allow a user to indicate a particular location on the display.
Modern systems, such as game consoles, provide a much richer set of input
devices, with new devices appearing almost weekly. In addition, there are devices
that provide three- (and more) dimensional input. Consequently, we want to provide
a flexible model for incorporating the input from such devices into our graphics
programs. We will discuss input devices and how to use them in Chapter 3.
1.3 IMAGES: PHYSICAL AND SYNTHETIC
For many years, the pedagogical approach to teaching computer graphics started with
how to construct raster images of simple two-dimensional geometric entities (for
example, points, line segments, and polygons) in the framebuffer. Next, most textbooks discussed how to define two- and three-dimensional mathematical objects in
the computer and image them with the set of two-dimensional rasterized primitives.
This approach worked well for creating simple images of simple objects. In modern systems, however, we want to exploit the capabilities of the software and hardware
to create realistic images of computer-generated three-dimensional objects—a task
that involves many aspects of image formation, such as lighting, shading, and properties of materials. Because such functionality is supported directly by most present
computer graphics systems, we prefer to set the stage for creating these images now,
rather than to expand a limited model later.
Computer-generated images are synthetic or artificial, in the sense that the objects being imaged do not exist physically. In this chapter, we argue that the preferred
method to form computer-generated images is similar to traditional imaging methods, such as cameras and the human visual system. Hence, before we discuss the
mechanics of writing programs to generate images, we discuss the way images are
formed by optical systems. We construct a model of the image formation process that
we can then use to understand and develop computer-generated imaging systems.
In this chapter, we make minimal use of mathematics. We want to establish a paradigm for creating images and to present a computer architecture for implementing
that paradigm. Details are presented in subsequent chapters, where we shall derive
the relevant equations.
1.3.1 Objects and Viewers
We live in a world of three-dimensional objects. The development of many branches
of mathematics, including geometry and trigonometry, was in response to the desire to systematize conceptually simple ideas, such as the measurement of the size of
objects and the distance between objects. Often, we seek to represent our understanding of such spatial relationships with pictures or images, such as maps, paintings,
and photographs. Likewise, the development of many physical devices—including
cameras, microscopes, and telescopes—was tied to the desire to visualize spatial relationships among objects. Hence, there always has been a fundamental link between
the physics and the mathematics of image formation—one that we can exploit in our
development of computer image formation.
FIGURE 1.6 Image seen by three different viewers. (a) A’s view. (b) B’s view. (c) C’s view.
Two basic entities must be part of any image formation process, be it mathematical or physical: object and viewer. The object exists in space independent of any
image formation process and of any viewer. In computer graphics, where we deal with
synthetic objects, we form objects by specifying the positions in space of various geometric primitives, such as points, lines, and polygons. In most graphics systems, a set
of locations in space, or of vertices, is sufficient to define, or approximate, most objects. For example, a line can be specified by two vertices; a polygon can be specified
by an ordered list of vertices; and a sphere can be specified by two vertices that specify
its center and any point on its circumference. One of the main functions of a CAD system is to provide an interface that makes it easy for a user to build a synthetic model
of the world. In Chapter 2, we show how WebGL allows us to build simple objects;
in Chapter 9, we learn to define objects in a manner that incorporates relationships
among objects.
Every imaging system must provide a means of forming images from objects.
To form an image, we must have someone or something that is viewing our objects,
be it a human, a camera, or a digitizer. It is the viewer that forms the image of our
objects. In the human visual system, the image is formed on the back of the eye. In a
camera, the image is formed in the film plane. It is easy to confuse images and objects.
We usually see an object from our single perspective and forget that other viewers,
located in other places, will see the same object differently. Figure 1.6(a) shows two
viewers observing the same building. This image is what is seen by an observer A
who is far enough away from the building to see both the building and the two other
viewers B and C. From A’s perspective, B and C appear as objects, just as the building
does. Figure 1.6(b) and (c) shows the images seen by B and C, respectively. All three
images contain the same building, but the image of the building is different in all
three.
Figure 1.7 shows a camera system viewing a building. Here we can observe that
both the object and the viewer exist in a three-dimensional world. However, the image that they define—what we find on the projection plane—is two-dimensional. The
process by which the specification of the object is combined with the specification of
the viewer to produce a two-dimensional image is the essence of image formation,
and we will study it in detail.
FIGURE 1.7 Camera system.
FIGURE 1.8 A camera system with an object and a light source.
1.3.2 Light and Images
The preceding description of image formation is far from complete. For example, we
have yet to mention light. If there were no light sources, the objects would be dark,
and there would be nothing visible in our image. Nor have we indicated how color
enters the picture or what the effects of the surface properties of the objects are.
Taking a more physical approach, we can start with the arrangement in Figure 1.8, which shows a simple physical imaging system. Again, we see a physical object
and a viewer (the camera); now, however, there is a light source in the scene. Light
from the source strikes various surfaces of the object, and a portion of the reflected
light enters the camera through the lens. The details of the interaction between light
and the surfaces of the object determine how much light enters the camera.
Light is a form of electromagnetic radiation. Taking the classical view, we look at electromagnetic energy as traveling as waves that can be characterized either by their wavelengths or by their frequencies; the two are related by f λ = c, where f is the frequency, λ is the wavelength, and c is the speed of light. (In Chapter 12, we will introduce photon mapping, which is based on light being emitted in discrete packets.) The electromagnetic spectrum (Figure 1.9)
includes radio waves, infrared (heat), and a portion that causes a response in our
visual systems. This visible spectrum, which has wavelengths in the range of 350
to 780 nanometers (nm), is called (visible) light. A given light source has a color
determined by the energy that it emits at various wavelengths. Wavelengths in the
middle of the range, around 520 nm, are seen as green; those near 450 nm are seen
as blue; and those near 650 nm are seen as red. Just as with a rainbow, light at
wavelengths between red and green we see as yellow, and at wavelengths shorter than
blue we see as violet.
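Using the relationship f λ = c, we can, for example, estimate the frequency of a 520 nm (green) wave; the numbers below are approximate and only illustrative.

// f = c / lambda for green light.
var c = 3.0e8;             // speed of light in m/s (approximate)
var lambda = 520e-9;       // 520 nm expressed in meters
console.log(c / lambda);   // roughly 5.8e14 Hz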
Light sources can emit light either as a set of discrete frequencies or over a
continuous range. A laser, for example, emits light at a single frequency, whereas an
incandescent lamp emits energy over a range of frequencies. Fortunately, in computer graphics, except for recognizing that distinct frequencies are visible as distinct colors, we rarely need to deal with the physical properties of light.
FIGURE 1.9 The electromagnetic spectrum, from radio waves through visible light (roughly 350 nm blue to 780 nm red) to X-rays.
Instead, we can follow a more traditional path that is correct when we are operating with sufficiently high light levels and at a scale where the wave nature of light is not
a significant factor. Geometric optics models light sources as emitters of light energy,
each of which has a fixed intensity. Modeled geometrically, light travels in straight
lines, from the sources to those objects with which it interacts. An ideal point source
emits energy from a single location at one or more frequencies equally in all directions. More complex sources, such as a lightbulb, can be characterized as emitting
light over an area and by emitting more light in one direction than another. A particular source is characterized by the intensity of light that it emits at each frequency and
by that light’s directionality. We consider only point sources for now. More complex
sources often can be approximated by a number of carefully placed point sources.
Modeling of light sources is discussed in Chapter 6.
1.3.3 Imaging Models
There are multiple approaches to forming images from a set of objects, the light-reflecting properties of these objects, and the properties of the light sources in the
scene. In this section, we introduce two physical approaches. Although these approaches are not suitable for the real-time graphics that we ultimately want, they will
give us some insight into how we can build a useful imaging architecture. We return
to these approaches in Chapter 12.
We can start building an imaging model by following light from a source. Consider the scene in Figure 1.10; it is illuminated by a single point source. We include
the viewer in the figure because we are interested in the light that reaches her eye.
The viewer can also be a camera, as shown in Figure 1.18. A ray is a semi-infinite line
that emanates from a point and travels to infinity in a particular direction. Because
light travels in straight lines, we can think in terms of rays of light emanating in all
directions from our point source. A portion of these infinite rays contributes to the
image on the film plane of our camera. For example, if the source is visible from the
camera, some of the rays go directly from the source through the lens of the camera
and strike the film plane. Most rays, however, go off to infinity, neither entering the
FIGURE 1.10 Scene with a single point light source.
FIGURE 1.11 Ray interactions. Ray A enters camera directly. Ray B goes off to infinity. Ray C is reflected by a mirror. Ray D goes through a transparent sphere.
camera directly nor striking any of the objects. These rays contribute nothing to the
image, although they may be seen by some other viewer. The remaining rays strike
and illuminate objects. These rays can interact with the objects’ surfaces in a variety
of ways. For example, if the surface is a mirror, a reflected ray might—depending on
the orientation of the surface—enter the lens of the camera and contribute to the image. Other surfaces scatter light in all directions. If the surface is transparent, the light
ray from the source can pass through it and may interact with other objects, enter the
camera, or travel to infinity without striking another surface. Figure 1.11 shows some
of the possibilities.
Ray tracing and photon mapping are image formation techniques that are based
on these ideas and that can form the basis for producing computer-generated images.
We can use the ray-tracing idea to simulate physical effects as complex as we wish, as
long as we are willing to carry out the requisite computing. Although tracing rays can
provide a close approximation to the physical world, it is usually not well suited for
real-time computation.
Other physical approaches to image formation are based on conservation of
energy. The most important in computer graphics is radiosity. This method works
best for surfaces that scatter the incoming light equally in all directions. Even in this
case, radiosity requires more computation than can be done in real time. We defer
discussion of these techniques until Chapter 12.
1.4 IMAGING SYSTEMS
We now introduce two imaging systems: the pinhole camera and the human visual
system. The pinhole camera is a simple example of an imaging system that will enable
us to understand the functioning of cameras and of other optical imagers. We emulate it to build a model of image formation. The human visual system is extremely
complex but still obeys the physical principles of other optical imaging systems. We
introduce it not only as an example of an imaging system but also because understanding its properties will help us to exploit the capabilities of computer graphics
systems.
1.4.1 The Pinhole Camera
The pinhole camera in Figure 1.12 provides an example of image formation that we
can understand with a simple geometric model. A pinhole camera is a box with a
small hole in the center of one side; the film is placed inside the box on the side
opposite the pinhole. Suppose that we orient our camera along the z-axis, with the
pinhole at the origin of our coordinate system. We assume that the hole is so small
that only a single ray of light, emanating from a point, can enter it. The film plane is
FIGURE 1.12 Pinhole camera.
15
16
Chapter 1
Graphics Systems and Models
FIGURE 1.13 Side view of pinhole camera.
located a distance d from the pinhole. A side view (Figure 1.13) allows us to calculate
where the image of the point (x, y, z) is on the film plane z = −d. Using the fact that
the two triangles in Figure 1.13 are similar, we find that the y coordinate of the image
is at yp, where
yp = −y/(z/d).
A similar calculation, using a top view, yields
xp = −x/(z/d).
The point (xp, yp, −d) is called the projection of the point (x, y, z). In our idealized
model, the color on the film plane at this point will be the color of the point (x, y, z).
The field, or angle, of view of our camera is the angle made by the largest object that
our camera can image on its film plane. We can calculate the field of view with the
aid of Figure 1.14. If h is the height of the camera (if we consider the problem in three rather than two dimensions, the diagonal length of the film substitutes for h), the field of view (or angle of view) θ is

θ = 2 tan⁻¹(h/(2d)).

FIGURE 1.14 Field of view.
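The projection equations and the field of view are easy to check numerically. The following fragment is only an illustration; the function names are chosen for this sketch.

// Image of the point (x, y, z) on the film plane z = -d of the pinhole camera.
function project(x, y, z, d) {
  return [-(x / (z / d)), -(y / (z / d)), -d];
}
// Field of view for a camera of height h and pinhole-to-film distance d.
function fieldOfView(h, d) {
  return 2 * Math.atan(h / (2 * d));                 // angle in radians
}
console.log(project(1.0, 2.0, 5.0, 1.0));            // [-0.2, -0.4, -1]
console.log(fieldOfView(2.0, 1.0) * 180 / Math.PI);  // 90 degrees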
The ideal pinhole camera has an infinite depth of field: Every point within its
field of view is in focus. Every point in its field of view projects to a point on the back
of the camera. The pinhole camera has two disadvantages. First, because the pinhole
is so small—it admits only a single ray from a point source—almost no light enters
the camera. Second, the camera cannot be adjusted to have a different field of view.
The jump to more sophisticated cameras and to other imaging systems that have
lenses is a small one. By replacing the pinhole with a lens, we solve the two problems
of the pinhole camera. First, the lens gathers more light than can pass through the
pinhole. The larger the aperture of the lens, the more light the lens can collect.
Second, by picking a lens with the proper focal length—a selection equivalent to
choosing d for the pinhole camera—we can achieve any desired field of view (up to
180 degrees). Lenses, however, do not have an infinite depth of field: Not all distances
from the lens are in focus.
For our purposes, in this chapter we can work with a pinhole camera whose focal
length is the distance d from the front of the camera to the film plane. Like the pinhole
camera, computer graphics produces images in which all objects are in focus.
1.4.2 The Human Visual System
Our extremely complex visual system has all the components of a physical imaging
system such as a camera or a microscope. The major components of the visual system
are shown in Figure 1.15. Light enters the eye through the cornea, a transparent
structure that protects the eye, and the lens. The iris opens and closes to adjust the
amount of light entering the eye. The lens forms an image on a two-dimensional
structure called the retina at the back of the eye. The rods and cones (so named
because of their appearance when magnified) are light sensors and are located on
the retina. They are excited by electromagnetic energy in the range of 350 to 780 nm.
The rods are low-level-light sensors that account for our night vision and are not
color sensitive; the cones are responsible for our color vision. The sizes of the rods
and cones, coupled with the optical properties of the lens and cornea, determine the
resolution of our visual systems, or our visual acuity. Resolution is a measure of what
size objects we can see. More technically, it is a measure of how close we can place two
points and still recognize that there are two distinct points.
The sensors in the human eye do not react uniformly to light energy at different
wavelengths. There are three types of cones and a single type of rod. Whereas intensity
is a physical measure of light energy, brightness is a measure of how intense we
perceive the light emitted from an object to be. The human visual system does not
have the same response to a monochromatic (single-frequency) red light as to a
monochromatic green light. If these two lights were to emit the same energy, they
would appear to us to have different brightness, because of the unequal response
FIGURE 1.15 The human visual system (cornea, iris, lens, retina with rods and cones, and optic nerve).
of the cones to red and green light. We are most sensitive to green light, and least
sensitive to red and blue.
Brightness is an overall measure of how we react to the intensity of light. Human
color-vision capabilities are due to the different sensitivities of the three types of
cones. The major consequence of having three types of cones is that, instead of having
to work with all visible wavelengths individually, we can use three standard primaries
to approximate any color that we can perceive. Consequently, most image production
systems, including film and video, work with just three basic, or primary, colors. We
discuss color in greater depth in Chapters 2 and 8.
The initial processing of light in the human visual system is based on the same
principles used by most optical systems. However, the human visual system has a
back end much more complex than that of a camera or telescope. The optic nerve
is connected to the rods and cones in an extremely complex arrangement that has
many of the characteristics of a sophisticated signal processor. The final processing
is done in a part of the brain called the visual cortex, where high-level functions,
such as object recognition, are carried out. We shall omit any discussion of high-level
processing; instead, we can think simply in terms of an image that is conveyed from
the rods and cones to the brain.
1.5 THE SYNTHETIC-CAMERA MODEL
Our models of optical imaging systems lead directly to the conceptual foundation
for modern three-dimensional computer graphics. We look at creating a computer-generated image as being similar to forming an image using an optical system. This
paradigm has become known as the synthetic-camera model. Consider the imaging
system shown in Figure 1.16. We again see objects and a viewer. In this case, the viewer
is a bellows camera (a camera in which the front plane, where the lens is located, and the back plane, holding the film, are connected by flexible sides, so that the back can be moved independently of the front; we use this flexibility in Chapter 5). The image is formed on the film plane at the back of the camera.
So that we can emulate this process to create artificial images, we need to identify a
few basic principles.
First, the specification of the objects is independent of the specification of the
viewer. Hence, we should expect that, within a graphics library, there will be separate
functions for specifying the objects and the viewer.
Second, we can compute the image using simple geometric calculations, just as
we did with the pinhole camera. Consider the side view of the camera and a simple
object in Figure 1.17. The view in part (a) of the figure is similar to that of the
pinhole camera. Note that the image of the object is flipped relative to the object.
Whereas with a real camera we would simply flip the film to regain the original
orientation of the object, with our synthetic camera we can avoid the flipping by a
simple trick. We draw another plane in front of the lens (Figure 1.17(b)) and work in
three dimensions, as shown in Figure 1.18.
FIGURE 1.16 Imaging system.
FIGURE 1.17 Equivalent views of image formation. (a) Image formed on the back of the camera. (b) Image plane moved in front of the camera.
We find the image of a point on the object
on the virtual image plane by drawing a line, called a projector, from the point to
the center of the lens, or the center of projection (COP). Note that all projectors
are rays emanating from the center of projection. In our synthetic camera, the virtual
image plane that we have moved in front of the lens is called the projection plane. The
image of the point is located where the projector passes through the projection plane.
In Chapter 5, we discuss this process in detail and derive the relevant mathematical
formulas.
We must also consider the limited size of the image. As we saw, not all objects
can be imaged onto the pinhole camera’s film plane. The field of view expresses this
limitation. In the synthetic camera, we can move this limitation to the front by placing a clipping rectangle, or clipping window, in the projection plane (Figure 1.19).
This rectangle acts as a window through which a viewer, located at the center of projection, sees the world. Given the location of the center of projection, the location
and orientation of the projection plane, and the size of the clipping rectangle, we can
determine which objects will appear in the image.
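The following sketch illustrates the idea under simple assumptions: the COP at the origin, the projection plane at z = d in front of it, and an axis-aligned clipping window in that plane. The helper names are chosen only for this fragment.

// Where the projector from point p through the COP crosses the plane z = d.
function projectToPlane(p, d) {
  return { x: p.x * d / p.z, y: p.y * d / p.z };
}
// Is the projected point inside the clipping window?
function insideClippingWindow(q, win) {
  return q.x >= win.left && q.x <= win.right &&
         q.y >= win.bottom && q.y <= win.top;
}
var point = { x: 1.0, y: 0.5, z: 4.0 };
var image = projectToPlane(point, 2.0);   // { x: 0.5, y: 0.25 }
console.log(insideClippingWindow(image, { left: -1, right: 1, bottom: -1, top: 1 }));  // true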
FIGURE 1.18 Imaging with the synthetic camera.
FIGURE 1.19 Clipping. (a) Window in initial position. (b) Window shifted.
1.6 THE PROGRAMMER’S INTERFACE
There are numerous ways that a user can interact with a graphics system. With
completely self-contained packages such as those used in the CAD community, a user
develops images through interactions with the display using input devices such as
a mouse and a keyboard. In a typical application, such as the painting program in
Figure 1.20, the user sees menus and icons that represent possible actions. By clicking
on these items, the user guides the software and produces images without having to
write programs.
Of course, someone has to develop the code for these applications, and many
of us, despite the sophistication of commercial products, still have to write our own
graphics application programs (and even enjoy doing so).
The interface between an application program and a graphics system can be
specified through a set of functions that resides in a graphics library.
FIGURE 1.20 Interface for a painting program.
FIGURE 1.21 Application programmer’s model of graphics system (keyboard, mouse, application program, graphics library (API), drivers, display).
These specifications are called the application programming interface (API). The application
programmer’s model of the system is shown in Figure 1.21. The application programmer sees only the API and is thus shielded from the details of both the hardware and
the software implementation of the graphics library. The software drivers are responsible for interpreting the output of the API and converting these data to a form that
is understood by the particular hardware. From the perspective of the writer of an
application program, the functions available through the API should match the conceptual model that the user wishes to employ to specify images. By developing code
that uses the API, the application programmer is able to develop applications that can
be used with different hardware and software platforms.
1.6.1 The Pen-Plotter Model
Historically, most early graphics systems were two-dimensional systems. The conceptual model that they used is now referred to as the pen-plotter model, referring to the
output device that was available on these systems. A pen plotter (Figure 1.22) produces images by moving a pen held by a gantry, a structure that can move the pen
in two orthogonal directions across the paper.
FIGURE 1.22 Pen plotter.
The plotter can raise and lower the
pen as required to create the desired image. Pen plotters are still in use; they are well
suited for drawing large diagrams, such as blueprints. Various APIs—such as LOGO
and PostScript—have their origins in this model. The HTML5 canvas upon which
we will display the output from WebGL also has its origins in the pen-plotter model.
Although they differ from one another, they have a common view of the process of
creating an image as being similar to the process of drawing on a pad of paper. The
user works on a two-dimensional surface of some size. She moves a pen around on
this surface, leaving an image on the paper.
We can describe such a graphics system with two drawing functions:
moveto(x, y);
lineto(x, y);
FIGURE 1.23 Output of pen-plotter program for (a) a square, and (b) a projection of a cube.
Execution of the moveto function moves the pen to the location (x, y) on the paper
without leaving a mark. The lineto function moves the pen to (x, y) and draws a
line from the old to the new location of the pen. Once we add a few initialization
and termination procedures, as well as the ability to change pens to alter the drawing
color or line thickness, we have a simple—but complete—graphics system. Here is a
fragment of a simple program in such a system:
moveto(0, 0);
lineto(1, 0);
lineto(1, 1);
lineto(0, 1);
lineto(0, 0);
This fragment would generate the output in Figure 1.23(a). If we added the code
moveto(0, 1);
lineto(0.5, 1.866);
lineto(1.5, 1.866);
lineto(1.5, 0.866);
lineto(1, 0);
moveto(1, 1);
lineto(1.5, 1.866);
we would have the image of a cube formed by an oblique projection, as is shown in
Figure 1.23(b).
For certain applications, such as page layout in the printing industry, systems
built on this model work well. For example, the PostScript page description language,
a sophisticated extension of these ideas, is a standard for controlling typesetters and
printers.
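Because the HTML5 canvas element has its roots in the same model, the square of Figure 1.23(a) can be reproduced with its moveTo and lineTo functions. The sketch below assumes a canvas element with id plot and scales the unit square to 100 pixels; note that canvas y coordinates increase downward.

// Pen-plotter-style drawing with the HTML5 canvas API (illustrative sketch).
var ctx = document.getElementById("plot").getContext("2d");
ctx.beginPath();
ctx.moveTo(0, 0);     // lift the pen and move it
ctx.lineTo(100, 0);   // draw while moving
ctx.lineTo(100, 100);
ctx.lineTo(0, 100);
ctx.lineTo(0, 0);
ctx.stroke();         // make the path visible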
An alternate raster-based (but still limited) two-dimensional model relies on
writing pixels directly into a framebuffer. Such a system could be based on a single
function of the form
writePixel(x, y, color);
where x,y is the location of the pixel in the framebuffer and color gives the color to
be written there. Such models are well suited to writing the algorithms for rasterization and processing of digital images.
We are much more interested, however, in the three-dimensional world. The
pen-plotter model does not extend well to three-dimensional graphics systems. For
example, if we wish to use the pen-plotter model to produce the image of a three-dimensional object on our two-dimensional pad, either by hand or by computer, then
we have to figure out where on the page to place two-dimensional points corresponding to points on our three-dimensional object. These two-dimensional points are,
as we saw in Section 1.5, the projections of points in three-dimensional space. The
mathematical process of determining projections is an application of trigonometry.
We develop the mathematics of projection in Chapter 5; understanding projection
is crucial to understanding three-dimensional graphics. We prefer, however, to use
an API that allows users to work directly in the domain of their problems and to use
computers to carry out the details of the projection process automatically, without the
users having to make any trigonometric calculations within the application program.
That approach should be a boon to users who have difficulty learning to draw various
projections on a drafting board or sketching objects in perspective. More important,
users can rely on hardware and software implementations of projections within the
implementation of the API that are far more efficient than any possible implementation of projections within their programs would be.
Three-dimensional printers are revolutionizing design and manufacturing, allowing the fabrication of items as varied as mechanical parts, art, and biological items
constructed from living cells. They illustrate the importance of separating the low-level production of the final piece from the high-level software used for design. At
the physical level, they function much like our description of a pen plotter except
that rather than depositing ink, they can deposit almost any material. The three-dimensional piece is built up in layers, each of which can be described using our
pen-plotter model. However, the design is done in three dimensions with a high-level
API that can output a file that is converted into a stack of layers for the printer.
1.6.2 Three-Dimensional APIs
The synthetic-camera model is the basis for a number of popular APIs, including
OpenGL and Direct3D. If we are to follow the synthetic-camera model, we need
functions in the API to specify the following:
Objects
A viewer
Light sources
Material properties
Objects are usually defined by sets of vertices. For simple geometric objects—
such as line segments, rectangles, and polygons—there is a simple relationship between a list of vertices, or positions in space, and the object. For more complex
objects, there may be multiple ways of defining the object from a set of vertices. A circle, for example, can be defined by three points on its circumference, or by its center
and one point on the circumference.
Most APIs provide similar sets of primitive objects for the user. These primitives
are usually those that can be displayed rapidly on the hardware. The usual sets include
points, line segments, and triangles. WebGL programs specify primitives through lists
of vertices. The following code fragment shows one way to specify three vertices in
JavaScript for use with WebGL:
var vertices = [ ];
vertices[0] = [0.0, 0.0, 0.0]; // Vertex A
vertices[1] = [0.0, 1.0, 0.0]; // Vertex B
vertices[2] = [0.0, 0.0, 1.0]; // Vertex C
Or we could use
var vertices = [ ];
vertices.push([0.0, 0.0, 0.0]); // Vertex A
vertices.push([0.0, 1.0, 0.0]); // Vertex B
vertices.push([0.0, 0.0, 1.0]); // Vertex C
FIGURE 1.24 A triangle (vertices A, B, and C).
FIGURE 1.25 Camera specification (COP, film-plane width w and height h).
We could either send this array to the GPU each time that we want it to be displayed or store it on the GPU for later display. Note that these three vertices only give
three locations in a three-dimensional space and do not specify the geometric entity
that they define. The locations could describe a triangle, as in Figure 1.24, or we could
use them to specify two line segments, using the first two locations to specify the first
segment and the second and third locations to specify the second segment. We could
also use the three points to display three pixels at locations in the framebuffer corresponding to the three vertices. We make this choice in our application by setting
a parameter corresponding to the geometric entity we would like these locations to
specify. For example, in WebGL we would use gl.TRIANGLES, gl.LINE_STRIP,
or gl.POINTS for the three possibilities we just described. Although we are not yet
ready to describe all the details of how we accomplish this task, we can note that, regardless of which geometric entity we wish our vertices to specify, we are specifying
the geometry and leaving it to the graphics system to determine which pixels to color
in the framebuffer.
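As a preview of Chapter 2, the fragment below sketches how the three vertices might be flattened, stored on the GPU, and drawn as a triangle. It assumes a WebGL context gl and a compiled shader program with a vertex attribute named aPosition; those names are chosen only for this example.

// Flatten the vertex list into a typed array the GPU can use.
var flat = new Float32Array([0.0, 0.0, 0.0,    // Vertex A
                             0.0, 1.0, 0.0,    // Vertex B
                             0.0, 0.0, 1.0]);  // Vertex C
var buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, flat, gl.STATIC_DRAW);   // store the data on the GPU
var aPosition = gl.getAttribLocation(program, "aPosition");
gl.vertexAttribPointer(aPosition, 3, gl.FLOAT, false, 0, 0);
gl.enableVertexAttribArray(aPosition);
gl.drawArrays(gl.TRIANGLES, 0, 3);   // gl.LINE_STRIP or gl.POINTS would reuse the same data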
Some APIs let the user work directly in the framebuffer by providing functions
that read and write pixels. Additionally, some APIs provide curves and surfaces as
primitives; often, however, these types are approximated by a series of simpler primitives within the application program. WebGL provides access to the framebuffer
through texture maps.
We can define a viewer or camera in a variety of ways. Available APIs differ both
in how much flexibility they provide in camera selection and in how many different
methods they allow. If we look at the camera in Figure 1.25, we can identify four types
of necessary specifications:
1. Position The camera location usually is given by the position of the center
of the lens, which is the center of projection (COP).
2. Orientation Once we have positioned the camera, we can place a camera
coordinate system with its origin at the center of projection. We can then
rotate the camera independently around the three axes of this system.
3. Focal length The focal length of the lens determines the size of the image
on the film plane or, equivalently, the portion of the world the camera sees.
4. Film plane The back of the camera has a height and a width. On the bellows
camera, and in some APIs, the orientation of the back of the camera can be
adjusted independently of the orientation of the lens.
These specifications can be satisfied in various ways. One way to develop the
specifications for the camera location and orientation is through a series of
coordinate-system transformations. These transformations convert object positions
represented in a coordinate system that specifies object vertices to object positions in
a coordinate system centered at the COP. This approach is useful, both for doing implementation and for getting the full set of views that a flexible camera can provide.
We use this approach extensively, starting in Chapter 5.
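As a preview of that approach, the sketch below builds a camera coordinate frame from an eye position, a point to look at, and an up direction. The helper names are chosen only for this example; Chapter 5 develops the actual transformations.

function subtract(a, b) { return [a[0] - b[0], a[1] - b[1], a[2] - b[2]]; }
function cross(a, b) {
  return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]];
}
function normalize(v) {
  var len = Math.sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
  return [v[0]/len, v[1]/len, v[2]/len];
}
// Camera frame: origin at the eye (the COP), axes u (right), v (up), n (back).
function cameraFrame(eye, at, up) {
  var n = normalize(subtract(eye, at));   // points away from the scene
  var u = normalize(cross(up, n));        // camera "right" axis
  var v = cross(n, u);                    // camera "up" axis
  return { u: u, v: v, n: n, origin: eye };
}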
Having many parameters to adjust, however, can also make it difficult to get a
desired image. Part of the problem lies with the synthetic-camera model. Classical
viewing techniques, such as are used in architecture, stress the relationship between
the object and the viewer, rather than the independence that the synthetic-camera
model emphasizes. Thus, the class…