One of the most visually appealing branches of computer graphics is the production of shaded images of three dimensional scenes. The results of such a process can be readily appreciated by almost anyone, not only by computer scientists. Until recently, the equipment and knowledge needed to produce such images have not been widespread. Now, however, there are several companies manufacturing hardware for real time image generation and several others offering their services for off line production of computer generated film animation. The constraints of real time operation place limits on the quality of the produced images. This paper will describe the effects which can be obtained when these limits are removed.
A computer synthesized image consists of a rectangular array of small spots called picture elements, or "pixels". Typical sizes of these picture element arrays range from 256x256 to 4096x4096. The computer must calculate an intensity value for each picture element so that, when viewed from a distance, the array of intensities merges into a photograph-like image of the desired scene. The calculation of the pixel intensities is basically a simulation process. The first step is a geometrical simulation. A line of sight is extended from the eye position through a particular pixel and out into the scene. Any objects intersecting this line of sight are identified by geometrical calculations. The intersection closest to the eye is selected as being the object fragment visible at that pixel. The second step is a light reflection simulation. This calculates the intensity of the pixel from knowledge of the direction of the simulated incident light, the relative orientation of the surface and the simulated properties of the surface. The degree of realism of the image can then be characterized by the accuracy with which these simulations are performed. For each portion of the process, some approximation is made to the functions which describe the real world. Simple, crude approximations may be used either to speed calculations or because the real function is not known. Progress in the generation of more and more realistic images has consisted of replacing simple functional approximations with more accurate ones.
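The geometrical step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the scene is assumed to contain nothing but spheres, and the names `ray_sphere_hits`, `visible_object` and `scene` are hypothetical, not taken from any system discussed in this paper.

```python
import math

def ray_sphere_hits(origin, direction, center, radius):
    """Return the distances t >= 0 along the ray at which it meets the sphere."""
    # Solve |origin + t*direction - center|^2 = radius^2, a quadratic in t.
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []  # the line of sight misses this sphere
    root = math.sqrt(disc)
    return [t for t in ((-b - root) / (2 * a), (-b + root) / (2 * a)) if t >= 0.0]

def visible_object(eye, pixel_point, spheres):
    """Return (name, t) for the hit closest to the eye, or None if nothing is hit."""
    # The line of sight runs from the eye through the pixel's position in space.
    direction = [p - e for p, e in zip(pixel_point, eye)]
    nearest = None
    for name, center, radius in spheres:
        for t in ray_sphere_hits(eye, direction, center, radius):
            if nearest is None or t < nearest[1]:
                nearest = (name, t)
    return nearest

# Two assumed spheres on the viewing axis; the nearer one hides the farther one.
scene = [("far", (0.0, 0.0, 10.0), 1.0), ("near", (0.0, 0.0, 5.0), 1.0)]
hit = visible_object((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene)
```

Here `hit` names the sphere centered at Z=5, since its intersection lies closer to the eye than either intersection with the sphere at Z=10. The light reflection step would then be applied to that surface fragment only.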
One major application area of shaded computer graphics is in real time flight simulation systems. Such systems are designed to train aircraft pilots by allowing them to fly a simulated airplane complete with a view out the window of the surrounding terrain. In such systems, a new image must be constructed some thirty times per second in synchronism with a television based display device. It is the job of the system designer to provide the best possible picture consistent with these time constraints. This severely limits the sophistication and thus the accuracy of the models used in the simulation. Even so, it requires approximately a million dollars worth of special purpose hardware to construct images at the correct rate.
In many situations real-time motion is not so crucial. Applications of non real time systems include the display of a few views of some geometrical construct for visualization purposes and the production of motion pictures in a stop-frame animation mode. Such systems typically use general purpose computers and record the image in a digital refresh memory or scan it out directly on film. The designer of such a system has much more freedom of choice. For example, the same simple geometric models and algorithms may be used as in real time systems. Without special purpose hardware such algorithms typically take from ten seconds to several minutes to produce a picture. On the other hand, with the removal of the tight time constraints, non real time systems have the opportunity of producing far higher quality images than real time systems. By using more accurate models of the imaging process, some extra time spent in computation is rewarded by more realistic looking images. For high quality animation applications, the non real time mode will always be best since it will always produce better images than can be obtained in real time. In effect, if the goal is realism, the system designer is attempting to take as long as possible to construct an image.
The geometrical model of the desired scene must provide two pieces of information to the image generation process. First, for visibility testing, it must be possible to find the intersection of each object of the scene with a particular scan ray from the eye. Second, for light reflection simulation, it must be possible to find the direction of a vector normal to each surface at each intersection. The simplest such modeling primitive is the flat polygon. Polygons have the advantage of being fairly general and very simple to manipulate mathematically. The linear nature of their defining equations makes it possible to devise fast algorithms for performing geometric calculations. One general technique for speeding computations is to turn absolute calculations into incremental calculations. Instead of re-calculating a functional value independently for several inputs, one calculates only the change from its previous value. For example, an edge of a polygon can be represented by the linear equation aX+bY+c=0. To find the intersection of this edge with a series of horizontal scan lines one could simply substitute the Y coordinate into the equation and solve for X, giving X=-(bY+c)/a, for each desired Y value. A faster algorithm would be to calculate one X value and repeatedly add the constant (-b/a) to it for successive unit steps in Y.
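The two ways of intersecting an edge with successive scan lines can be compared in a short sketch. The function names are hypothetical, scan lines are assumed to be one unit apart in Y, and the edge is assumed not to be horizontal (a is nonzero).

```python
def edge_x_absolute(a, b, c, ys):
    """Solve aX + bY + c = 0 for X independently at every scan line."""
    return [-(b * y + c) / a for y in ys]

def edge_x_incremental(a, b, c, y_start, count):
    """Solve for X once, then add the constant -b/a for each unit step in Y."""
    x = -(b * y_start + c) / a
    step = -b / a
    xs = []
    for _ in range(count):
        xs.append(x)
        x += step  # one addition replaces a multiply, an add and a divide
    return xs

# Assumed example edge: 2X - Y - 4 = 0, that is X = (Y + 4) / 2.
absolute = edge_x_absolute(2.0, -1.0, -4.0, range(5))
incremental = edge_x_incremental(2.0, -1.0, -4.0, 0, 5)
```

Both lists hold the same five X values for this edge. With general floating point coefficients the incremental form accumulates rounding error over a long run of scan lines, which is the usual price of the saving in arithmetic.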
The disadvantage in using flat polygons is that not all real objects are flat. Early attempts to model curved surfaces approximated them with a mesh of flat polygons. The resulting pictures bore a reasonable resemblance to the desired object but the polygonal nature of the approximation was very apparent. This was because the intensity of a polygon was calculated as the same over the whole polygon, giving a step function approximation to the actual smoothly varying intensity pattern across a real curved surface. Simple attempts to rectify this calculated average intensities at the corners of polygons and then linearly interpolated the intensities for the intervening picture elements [Gouraud]. The technique of linear interpolation of intensities is still quite simple to implement in hardware. Still better results have been obtained by interpolating the direction of the normal vector rather than the intensity [Phong]. Such techniques are not so easy to implement in hardware.
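The difference between the two interpolation schemes can be seen along a single span between two polygon corners. The sketch below is a simplification under stated assumptions: only a diffuse term is used for the shading, the corner normals and light direction are invented for the example, and the function names are hypothetical.

```python
import math

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lerp(a, b, t):
    return a + (b - a) * t

def diffuse(normal, light):
    """Cosine of the angle between the normal and the light, clamped at zero."""
    return max(0.0, dot(normalize(normal), normalize(light)))

def interpolate_intensity(n0, n1, light, steps):
    """Shade the two corners only, then interpolate the intensities (steps >= 2)."""
    i0, i1 = diffuse(n0, light), diffuse(n1, light)
    return [lerp(i0, i1, k / (steps - 1)) for k in range(steps)]

def interpolate_normal(n0, n1, light, steps):
    """Interpolate the normal across the span and shade every sample (steps >= 2)."""
    result = []
    for k in range(steps):
        t = k / (steps - 1)
        n = tuple(lerp(a, b, t) for a, b in zip(n0, n1))
        result.append(diffuse(n, light))
    return result

# Corner normals tilted to either side of a light shining straight down the Z axis.
by_intensity = interpolate_intensity((-1.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, 1.0), 5)
by_normal = interpolate_normal((-1.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, 1.0), 5)
```

With these inputs, intensity interpolation gives the same value at every sample because both corners receive equal light, while normal interpolation brightens toward the middle of the span where the interpolated normal faces the light directly. The second scheme costs a normalization and a shading evaluation at every picture element, which suggests why it is harder to build into hardware.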
In order to accurately draw pictures of curved surfaces it is necessary to model them as such. Two main mathematical representations of curved surfaces have been used. The first is the "quadric surface" [Mahl, MAGI]. This class includes such objects as spheres, cones and cylinders. Quadric surfaces are a little harder to operate on mathematically than planes, requiring the solution of second order equations rather than first order. The second surface type is the parametric surface. Such surfaces are defined by three bivariate functions, [X(u,v), Y(u,v), Z(u,v)]. As the values of the parameters vary between some defined limits the generated point sweeps out the surface. This surface primitive is much more general than the quadric and is quite popular with computer aided designers. Intersection calculation for such surfaces is basically the operation of inverting the functions. For a desired pixel position (X,Y) it is necessary to find the corresponding (u,v) values and substitute them into the Z defining function. In general this is not at all straightforward and often requires numerical analysis. It is only recently that algorithms have been developed for dealing with such surface types [Catmull, Blinn, Whitted].
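As an illustration of what inverting the functions involves, the sketch below applies Newton iteration to a small invented patch. Newton iteration is an assumption made for this example and is not claimed to be the method of any of the cited algorithms; the patch, the starting guess and the names `surface` and `invert_xy` are likewise hypothetical.

```python
def surface(u, v):
    """A hypothetical parametric patch [X(u,v), Y(u,v), Z(u,v)]."""
    return (u + 0.1 * v * v, v + 0.1 * u * u, u * v)

def invert_xy(x, y, u=0.5, v=0.5, tol=1e-12, max_iter=50):
    """Find (u, v) such that X(u,v)=x and Y(u,v)=y, or None on failure."""
    for _ in range(max_iter):
        sx, sy, _z = surface(u, v)
        fx, fy = sx - x, sy - y  # residual at the current guess
        if abs(fx) < tol and abs(fy) < tol:
            return u, v
        # Partial derivatives of X and Y for the patch defined above.
        xu, xv = 1.0, 0.2 * v
        yu, yv = 0.2 * u, 1.0
        det = xu * yv - xv * yu
        if det == 0.0:
            return None  # singular Jacobian, the iteration cannot proceed
        # Solve the 2x2 linear system by Cramer's rule and step toward the root.
        du = (fx * yv - xv * fy) / det
        dv = (xu * fy - fx * yu) / det
        u, v = u - du, v - dv
    return None

# Pick a known point on the patch, then recover its parameters from (X, Y) alone.
target_x, target_y, target_z = surface(0.3, 0.7)
found = invert_xy(target_x, target_y)
depth = surface(*found)[2] if found is not None else None
```

For this gently curved patch the iteration recovers the parameters in a handful of steps and `depth` then gives the Z value at the pixel. On real surfaces the iteration may diverge or find a root that is not the one nearest the eye, which is part of why the general problem is not straightforward.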
5. INTENSITY CALCULATIONS
The second major operation involved in image generation is the calculation of the intensity of a pixel. The intensity seen at the eye is basically the product of the amount of light hitting the surface with the amount of light reflected to the eye. The problem then breaks down into the simulation of the lighting environment of the scene and the simulation of the light reflective properties of the surfaces in it.
The simplest light reflection model is the so-called Lambert's law. This models a surface as a perfect diffuser so that the reflected light is only a function of the angle of incidence of the light with the surface. This produces images of ideally matte surfaces and is the model used in hardware simulators. A more accurate model also includes the effects of specular reflection [Phong, Blinn]. Specular reflection produces highlights on the surface and makes it look more shiny. In addition, light reflected specularly does not interact with pigments in the surface so that this component of the intensity desaturates the color of the object, making the highlights look white. Finally, other surface properties such as transparency may be simulated [Crow]. In this case, the intensity of a pixel is not only determined by the light reflective properties of the nearest surface fragment. A contribution also comes from the light transmission properties of the object and the reflective properties of the surface fragment behind it. Other properties, such as refraction and polarization, have yet to be explored.
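A diffuse term plus a specular term can be written compactly. The sketch below uses the halfway-vector form of the highlight, which is one common formulation and is chosen here for brevity. The coefficients `kd` and `ks`, the exponent `shininess`, and the assumption of white light are illustrative values, not figures from this paper.

```python
import math

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        return tuple(v)  # leave a zero vector alone rather than divide by zero
    return tuple(x / length for x in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shade(normal, to_light, to_eye, surface_color, kd=0.7, ks=0.3, shininess=20.0):
    """Return an (r, g, b) intensity under a white light of unit strength."""
    n, l, e = normalize(normal), normalize(to_light), normalize(to_eye)
    diffuse = max(0.0, dot(n, l))  # Lambert's law: cosine of the angle of incidence
    specular = 0.0
    if diffuse > 0.0:
        # The highlight peaks where the normal bisects the light and eye directions.
        half = normalize(tuple(a + b for a, b in zip(l, e)))
        specular = max(0.0, dot(n, half)) ** shininess
    # The diffuse term is tinted by the surface color; the specular term is not.
    return tuple(kd * diffuse * c + ks * specular for c in surface_color)

red = (1.0, 0.0, 0.0)
in_highlight = shade((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0), red)
off_highlight = shade((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), red)
```

Inside the highlight the green and blue channels of the red surface rise along with the red one, which is the desaturation toward white described above. Away from the highlight they fall back to nearly zero and the surface shows its own color.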
The lighting environment of a scene is most simply modeled as a single point light source located infinitely far away. The incoming light vector can then be assumed to be constant for all objects in the scene. This, again, is the model typically used in real time simulation systems. A simple generalization of this environment is to allow the light source to be at some specified point in space. The vector from a surface fragment to the light source must then be recalculated for each surface fragment. Another generalization is to use several point light sources. The contributions of the light sources to the displayed intensity are simply summed. A completely general environmental lighting model would require a spherical integration of the light impinging on the surface from all directions. This has only been crudely approximated for some restricted geometrical situations [Blinn Newell]. Finally, the topic of the lighting environment also includes the problem of shadows [Crow]. From the point of view of a particular surface fragment, the light source might be obscured by another object. Less light impinges on the surface fragment and thus less is reflected to the eye. The calculation of shadows is typically done by making one view of the scene from the point of view of the light source to identify those objects hidden from the light as being in shadow. In summary, the simulation of environmental lighting effects is one which can quickly lead to astronomical amounts of calculation. It is a fertile ground for further work to reduce the computational load.
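The two generalizations of the light source, a local position and several sources, can be combined in one short sketch. It assumes purely diffuse reflection and no falloff of intensity with distance, neither of which is prescribed by the text, and the function name and light list are hypothetical. Shadows are not handled.

```python
import math

def diffuse_from_point_lights(point, normal, lights):
    """Sum the diffuse contributions of `lights`, a list of (position, intensity)."""
    normal_length = math.sqrt(sum(c * c for c in normal))
    total = 0.0
    for position, intensity in lights:
        # A local source: the light vector is recalculated for this surface point.
        to_light = [p - q for p, q in zip(position, point)]
        distance = math.sqrt(sum(c * c for c in to_light))
        if distance == 0.0:
            continue  # the light sits exactly on the surface point
        cosine = sum(a * b for a, b in zip(normal, to_light)) / (normal_length * distance)
        total += intensity * max(0.0, cosine)  # sources behind the surface add nothing
    return total

# Three assumed sources: one above the surface, one below it, one at grazing incidence.
lights = [((0.0, 0.0, 5.0), 0.5), ((0.0, 0.0, -5.0), 0.8), ((3.0, 0.0, 0.0), 1.0)]
at_origin = diffuse_from_point_lights((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), lights)
elsewhere = diffuse_from_point_lights((4.0, 0.0, 0.0), (0.0, 0.0, 1.0), [((0.0, 0.0, 3.0), 1.0)])
```

At the origin only the source above the surface contributes. The value of `elsewhere` shows the same surface orientation receiving less light simply because the point has moved relative to a local source, an effect that a light at infinity cannot produce.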
The above techniques have been used to simulate curved surfaces with various homogeneous surface properties. In order to simulate the fine detail of textured surfaces a straightforward extension would be to model each wrinkle and color variation as a separate, small surface element. This quickly becomes impractical for textures of reasonable complexity. A simple method of achieving the visual effect of texture has been applied to parametrically defined surfaces [Catmull]. The color of the surface, in addition to its coordinates, is made a function of the two parameters u and v. After the picture generation process calculates the parameter values at a pixel, it uses them to evaluate the texturing function. The value of this function is then used to scale the displayed intensity. A texturing function is typically defined as a table of samples of the color at equally spaced intervals of u and v. A simple method of generating such tables is to use a digitized photograph of the desired texture. The image generation process then appears to wrap this photograph around the base surface in three dimensions. This technique raises the question of where the boundary lies between computer graphics and image processing. Are we doing computer graphics with texture mapping or are we performing a sophisticated image transform on the photograph? Finally, other surface properties such as shininess or transparency may be made non-homogeneous with this method. One other, not so obvious, property is the direction of the surface normal itself. By perturbing the direction of the normal according to a texturing function, small bumps and wrinkles can be simulated which appear correct with respect to the simulated light source.
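The table lookup at the heart of this method is brief. The sketch below assumes that u and v both run from 0 to 1 and takes the nearest stored sample with no filtering between samples; the tiny checker table stands in for a digitized photograph and all names are hypothetical.

```python
def sample_texture(table, u, v):
    """Return the stored sample nearest to (u, v); u and v are assumed to lie in [0, 1]."""
    rows, cols = len(table), len(table[0])
    # Equally spaced samples: scale the parameter and clamp the upper boundary.
    i = min(int(v * rows), rows - 1)
    j = min(int(u * cols), cols - 1)
    return table[i][j]

def textured_intensity(base_intensity, table, u, v):
    """Scale the intensity produced by the reflection model by the texture value."""
    return base_intensity * sample_texture(table, u, v)

# A 2x2 checker pattern of light (1.0) and dark (0.2) samples.
checker = [[1.0, 0.2],
           [0.2, 1.0]]
light_square = textured_intensity(0.8, checker, 0.1, 0.1)
dark_square = textured_intensity(0.8, checker, 0.9, 0.1)
```

The same lookup could return a shininess, a transparency, or an offset to the surface normal instead of a color scale. Only the use made of the returned value changes.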
In general, the more time one spends making an image the more realistic it can appear. In addition to providing high quality animation effects, slow running algorithms serve to show the boundaries of realism in the picture making process. Before applying simplifications to an algorithm to make it faster it is a good idea to first learn how to make the best possible image, even if it is slow. Finally, the speed/quality trade off is not absolute. It can be broken by the application of a third quantity, intelligence. As more and more becomes known about the imaging process, better simulations can be performed without brute force application of current techniques. The non-real time images of today will become the real time images of tomorrow.