This is the first in a sequence of educational 3D graphics renderers.

Basic Renderer

A "renderer" is a collection of algorithms that takes as its input a Scene data structure and produces as its output a FrameBuffer data structure.

{@code
                           Renderer
                       +--------------+
       Scene           |              |         FrameBuffer
       data     ====>  |  Rendering   |  ====>     data
     structure         |  algorithms  |          structure
                       |              |
                       +--------------+
}

A Scene data structure contains information that describes a "virtual scene" that we want to take a "picture" of. The renderer is kind of like a digital camera that takes a picture of the scene and stores the picture's data in the FrameBuffer data structure. The FrameBuffer holds the actual pixel information that describes the picture of the scene.

The rendering algorithms can be implemented in hardware (a graphics card or GPU) or in software. In this class we will write a software renderer using the Java programming language.

Our software renderer is made up of four "packages" of Java classes. Each package is contained in its own directory. The name of the directory is the name of the package.

The first package is the collection of input data structures. This is called the "scene" package. The data structures in the scene package are Scene, Camera, Model, Vertex, and LineSegment, each of which is described below.

The second package is the output data structure. It is called the "framebuffer" package and contains the file that defines the FrameBuffer data structure described below.

The third package is a collection of algorithms that manipulate the data structures from the other two packages. This package is called the "pipeline" package. Its files implement the rendering stages described below.

The fourth package is a library of geometric models. This package is called the "models" package. It contains a number of files for geometric shapes such as the sphere, cylinder, cube, cone, pyramid, tetrahedron, and dodecahedron, as well as mathematical curves and surfaces.

There is also a fifth collection of source files: client programs that use the renderer. These files are in the top-level directory of the renderer.
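
To see how these pieces fit together, here is a minimal sketch of a client program. The constructor and method names are illustrative assumptions, not necessarily the renderer's actual API.

{@code
   // A hypothetical client program: build a Scene, create a
   // FrameBuffer, render the one into the other, and save the result.
   Scene scene = new Scene();
   scene.addModel(new Sphere());         // a shape from the models package

   FrameBuffer fb = new FrameBuffer(512, 512);
   Pipeline.render(scene, fb);           // run the rendering algorithms

   fb.dumpFB2File("Sphere.ppm");         // write the pixel data to an image file
}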

Here is a brief description of the data structures from the scene and framebuffer packages.

Scene

A Scene object represents a collection of geometric models positioned in three-dimensional space. The models are in front of a Camera which is located at the origin and looks down the negative z-axis. Each Model object in a Scene object represents a distinct geometric shape in the scene. A Model object is a list of Vertex objects and a list of LineSegment objects. Each LineSegment object refers to two of the Model's Vertex objects. The Vertex objects represent points in the camera's coordinate system. The model's line segments represent the geometric object as a "wire-frame", that is, the geometric object is drawn as a collection of "edges". This is a fairly simplistic way of doing 3D graphics, and we will improve on it in later renderers.

http://en.wikipedia.org/wiki/Wire-frame_model
https://www.google.com/search?q=graphics+wireframe&tbm=isch
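
As a concrete illustration of the Model structure just described, here is a sketch of building a wire-frame square. The constructors and method names are assumptions for illustration, not necessarily the actual API.

{@code
   // A wire-frame square in the plane z = -3 (in front of the camera).
   Model square = new Model();
   square.addVertex(new Vertex(-1, -1, -3));     // v0
   square.addVertex(new Vertex( 1, -1, -3));     // v1
   square.addVertex(new Vertex( 1,  1, -3));     // v2
   square.addVertex(new Vertex(-1,  1, -3));     // v3
   // Each LineSegment refers to two of the model's vertices, here by index.
   square.addLineSegment(new LineSegment(0, 1));
   square.addLineSegment(new LineSegment(1, 2));
   square.addLineSegment(new LineSegment(2, 3));
   square.addLineSegment(new LineSegment(3, 0));
   scene.addModel(square);
}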

Camera

The Camera data structure represents a camera located at the origin, looking down the negative z-axis. A Camera has associated with it a "view volume" that determines what part of space the camera "sees" when we use the camera to take a picture (that is, when we render a Scene).

A camera can "take a picture" in one of two ways: using a perspective projection or using a parallel (orthographic) projection. Each way of taking a picture has a different shape for its view volume.

For the perspective projection, the view volume is the infinitely long pyramid whose apex is at the origin and whose cross section in the plane {@code z = -1} is the square with edges {@code x = -1}, {@code x = +1}, {@code y = -1}, and {@code y = +1}.

http://math.hws.edu/graphicsbook/c3/projection-frustum.png

For the orthographic projection, the view volume is an infinitely long rectangular cylinder parallel to the z-axis and with sides {@code x = -1}, {@code x = +1}, {@code y = -1}, and {@code y = +1} (an infinite parallelepiped).

http://math.hws.edu/graphicsbook/c3/projection-parallelepiped.png

When the graphics rendering pipeline uses a Camera to render a Scene, the renderer "sees" only the geometry from the scene that is contained in the camera's view volume. (Notice that this means the orthographic camera will see geometry that is behind the camera. In fact, the perspective camera also sees geometry that is behind the camera.)

The plane {@code z = -1} is the camera's image plane. The rectangle in the image plane with corners {@code (-1, -1, -1)} and {@code (+1, +1, -1)} is the camera's view rectangle. The view rectangle is like the film in a real camera: it is where the camera's image appears when you take a picture. The contents of the camera's view rectangle are what get rasterized, by the renderer's Rasterize pipeline stages, into a FrameBuffer's viewport.

https://webglfundamentals.org/webgl/frustum-diagram.html
https://threejs.org/examples/#webgl_camera
http://math.hws.edu/graphicsbook/demos/c3/transform-equivalence-3d.html
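
A quick sketch of what this containment means once a vertex has been projected (projection is described below): the projected point lands inside the camera's view rectangle exactly when both of its coordinates lie in the interval [-1, +1].

{@code
   double x_p = 0.25, y_p = -0.5;   // a projected point in the image plane z = -1
   boolean inViewRectangle = (-1 <= x_p && x_p <= 1)
                          && (-1 <= y_p && y_p <= 1);
}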

FrameBuffer

A FrameBuffer object holds an array of pixel data that represents an image that can be displayed on a computer's screen. For each pixel in the image, the framebuffer's array holds three byte values: one byte that represents the red component of the pixel's color, one byte that represents the green component, and one byte that represents the blue component. Since each of these values is a single byte (eight bits), each of the three colors has only 256 shades (but there are 256^3 = 16,777,216 distinct colors). The three bytes of color for each pixel are packed into one Java integer (which has four bytes, so one of the integer's bytes is not used).

If a FrameBuffer has dimensions {@code n} rows of pixels by {@code m} columns of pixels, then the FrameBuffer holds {@code n*m} integers. The pixel data is NOT stored as a "two-dimensional" {@code n} by {@code m} array of integers, nor is it stored as a "three-dimensional" {@code n} by {@code m} by 3 array of bytes. It is stored as a one-dimensional array of integers of length {@code n*m}. This array is in "row major" form, meaning that the first {@code m} integers in the array are the pixels from the image's first row, the next {@code m} integers are the pixels from the image's second row, etc. Finally, the first row of pixels is the top row of the image when it is displayed on the computer's screen.
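
Here is a sketch of the packing and indexing just described; this illustrates the layout, not the FrameBuffer class's actual code.

{@code
   // Pack one pixel's three color bytes into a Java int (top byte unused).
   int r = 255, g = 128, b = 0;
   int pixel = (r << 16) | (g << 8) | b;

   // Store it in a row-major array of n rows by m columns.
   int n = 480, m = 640;
   int[] pixelData = new int[n * m];
   int row = 20, col = 10;               // row 0 is the image's top row
   pixelData[row * m + col] = pixel;

   // Unpack the color components again.
   int red   = (pixel >> 16) & 0xFF;
   int green = (pixel >>  8) & 0xFF;
   int blue  =  pixel        & 0xFF;
}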

The FrameBuffer data structure also defines a Viewport, which is a rectangular sub-array of the pixel data in the framebuffer. The viewport is the active part of the framebuffer, the part of the framebuffer that the renderer is actually writing into. The viewport has width and height dimensions, {@code w} and {@code h}, with {@code w <= m} and {@code h <= n}. Quite often the viewport will be the whole framebuffer. But the viewport idea makes it easy for us to implement effects like "split screen" (two independent images in the FrameBuffer) or "picture in a picture" (a smaller picture superimposed on a larger picture). In future renderers (starting with renderer 7), a viewport smaller than the whole FrameBuffer will also be used to give the viewport the same aspect ratio as the Camera's view rectangle.

https://en.wikipedia.org/wiki/Split_screen_(computer_graphics)
https://en.wikipedia.org/wiki/Picture-in-picture
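
Because the viewport is a rectangular sub-array of the framebuffer, addressing a viewport pixel is just the row-major index computation offset by the viewport's upper-left corner. A sketch (the corner names below are illustrative, not actual field names):

{@code
   int m = 640;                  // the framebuffer's width (columns)
   int x0 = 100, y0 = 50;        // the viewport's upper-left corner
   int x = 3, y = 4;             // a pixel within the viewport
   int index = (y0 + y) * m + (x0 + x);   // its row-major framebuffer index
}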

Renderer

Here is a brief overview of how the renderer's algorithms process a Scene data structure to produce a filled-in viewport within the FrameBuffer object.

First of all, remember that a Scene object contains a list of Model objects, and each Model object contains a list of Vertex objects and a list of LineSegment objects.

The main job of the renderer is to "draw" in the FrameBuffer's viewport the appropriate pixels for each LineSegment in each Model from the Scene. The "appropriate pixels" are the pixels "seen" by the camera. At its top level, the renderer iterates through the Scene object's list of Model objects, and for each Model object the renderer iterates through the Model object's list of LineSegment objects. When the renderer has drilled down to a LineSegment object, it can render that line segment into the framebuffer's viewport. So the renderer really renders line segments.

The renderer does its work on a LineSegment object in a "pipeline" of stages. This simple renderer has just three pipeline stages; the stages that a LineSegment object passes through are described below.

To understand the algorithms used in the "project then rasterize" process, we need to trace what happens to each Vertex and LineSegment object from a Model as it passes through the rendering pipeline.

Start with a Model's list of vertices.

{@code
        v0 ...  vn     A Model's list of Vertex objects
         \     /
          \   /
            |
            | camera coordinates (of v0 ... vn)
            |
        +-------+
        |       |
        |  P1   |    Projection (of the vertices)
        |       |
        +-------+
            |
            | image plane coordinates (of v0 ... vn)
            |
          /---\
         /     \
        /  P2   \   Rasterization (of each line segment)
        \       /
         \     /
          \---/
            |
            |  pixels (for each line segment)
            |
           \|/
        FrameBuffer
}
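
In code, the top-level structure of this pipeline looks roughly like the following sketch. The field and method names are assumptions for illustration, not necessarily the renderer's actual API.

{@code
   for (Model model : scene.modelList) {
      // Stage P1: project all of the model's vertices onto the image plane.
      Vertex[] projected = project(model.vertexList, camera);

      // Stage P2: rasterize each of the model's line segments into
      // the framebuffer's viewport.
      for (LineSegment ls : model.lineSegmentList) {
         rasterize(projected[ls.index0], projected[ls.index1], fb);
      }
   }
}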

Projection

The projection stage takes the model's list of vertices in three dimensional (camera) space and computes the two-dimensional coordinates of where each vertex "projects" onto the camera's image plane (the plane with equation {@code z = -1}). The projection stage takes the vertices inside of the camera's view volume and projects them into the camera's view rectangle (and points outside of the camera's view volume will, of course, project to points outside of the view rectangle).

Let us derive the formulas for the perspective projection transformation. (The formulas for the parallel projection transformation are pretty obvious: {@code x_p = x_c} and {@code y_p = y_c}, since an orthographic camera projects straight along the z-axis.) We will derive the x-coordinate formula; the y-coordinate formula is similar.

Let {@code (x_c, y_c, z_c)} denote a point in the 3-dimensional camera coordinate system. Let {@code (x_p, y_p, -1)} denote the point's perspective projection into the image plane, {@code z = -1}. Here is a "picture" of just the xz-plane from camera space. This picture shows the point {@code (x_c, z_c)} and its projection to the point {@code (x_p, -1)} in the image plane.

{@code

           x                  /
           |                 /
       x_c +                + (x_c, z_c)
           |               /|
           |              / |
           |             /  |
           |            /   |
           |           /    |
           |          /     |
           |         /      |
           |        /       |
       x_p +       +        |
           |      /|        |
           |     / |        |
           |    /  |        |
           |   /   |        |
           |  /    |        |
           | /     |        |
           +-------+--------+------------> -z
        (0,0)     -1       z_c
}

We are looking for a formula that computes {@code x_p} in terms of {@code x_c} and {@code z_c}. There are two similar triangles in this picture that share a vertex at the origin. Using the properties of similar triangles we have the following ratios. (Remember that these are ratios of positive lengths, so we write {@code -z_c}, since {@code z_c} is negative.)

{@code
              x_p       x_c
             -----  =  -----
               1       -z_c
}

If we solve this equation for the unknown, {@code x_p}, we get the projection formula,

{@code
              x_p = -x_c / z_c.
}

The equivalent formula for the y-coordinate is

{@code
               y_p = -y_c / z_c.
}
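
In code, the perspective projection of a single vertex is just these two divisions (a sketch using the formulas above, not the renderer's actual code).

{@code
   // Project the camera-space point (x_c, y_c, z_c) onto the image
   // plane z = -1. For example, (2, 1, -4) projects to (0.5, 0.25).
   double x_c = 2.0, y_c = 1.0, z_c = -4.0;
   double x_p = -x_c / z_c;   // 0.5
   double y_p = -y_c / z_c;   // 0.25
}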

Rasterization

The rasterizing stage first takes the two-dimensional coordinates of a vertex in the camera's image plane and computes that vertex's location in a "logical pixel plane". This is referred to as the "viewport transformation". The purpose of the logical pixel plane and the viewport transformation is to make the rasterization stage easier to implement.

The camera's image plane contains a view rectangle with edges {@code x = +1}, {@code x = -1}, {@code y = +1}, and {@code y = -1}. The pixel plane contains a logical viewport rectangle with edges {@code x = 0.5}, {@code x = w+0.5}, {@code y = 0.5}, and {@code y = h+0.5} (where {@code h} and {@code w} are the height and width of the framebuffer's viewport).

Recall that the role of the camera's view rectangle is to determine what part of a scene is visible to the camera. Vertices inside of the camera's view rectangle should end up as pixels in the framebuffer's viewport. Another way to say this is that we want only that part of each projected line segment contained in the view rectangle to be visible to our renderer and rasterized into the framebuffer's viewport.

Any point inside of the image plane's view rectangle should be transformed to a point inside of the pixel plane's logical viewport. Any vertex outside of the image plane's view rectangle should be transformed to a point outside of the pixel plane's logical viewport.

{@code
                      View Rectangle
                (in the Camera's image plane)

                          y-axis
                            |
                            |       (+1,+1)
                  +---------|---------+
                  |         |         |
                  |         |         |
                  |         |         |
                  |         |         |
                  |         |         |
               -------------+---------------- x-axis
                  |         |         |
                  |         |         |
                  |         |         |
                  |         |         |
                  |         |         |
                  +---------|---------+
              (-1,-1)       |
                            |
                            |

                            ||
                            ||
                            ||  Viewport Transformation
                            ||
                            ||
                            \/

                      Logical Viewport
                                               (w+0.5, h+0.5)
      +----------------------------------------------+
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|   The logical pixels
      | . . . . . . . . . . . . . . . . . . . . . . .|   are the points in the
      | . . . . . . . . . . . . . . . . . . . . . . .|   logical viewport with
      | . . . . . . . . . . . . . . . . . . . . . . .|   integer coordinates.
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|
      | . . . . . . . . . . . . . . . . . . . . . . .|
      +----------------------------------------------+
 (0.5, 0.5)

                            ||
                            ||
                            ||  Rasterizer
                            ||
                            ||
                            \/

                         Viewport
                    (in the FrameBuffer)
      (0,0)
        +-------------------------------------------+
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|   The physical pixels
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|   are the entries in
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|   the FrameBuffer
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|   array.
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        +-------------------------------------------+
                                                (w-1,h-1)
}

After the viewport transformation of the two endpoints of a line segment, the rasterization stage will convert the given line segment in the pixel plane into pixels in the framebuffer's viewport. The rasterization stage computes all the pixels in the framebuffer's viewport that are on the line segment connecting the transformed vertices v0 and v1. Any point inside the logical viewport that is on this line segment is rasterized to a pixel inside the framebuffer's viewport. Any point on this line segment that is outside the logical viewport should not be rasterized to a pixel in the framebuffer.
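
Here is a simplified sketch of this idea for a mostly horizontal line segment whose endpoints are already in logical pixel coordinates. It illustrates the interpolate-and-round pattern, not the renderer's actual rasterization algorithm, and the helper named setPixel is hypothetical.

{@code
   double x0 = 1.0, y0 = 1.0, x1 = 20.0, y1 = 9.0;
   double slope = (y1 - y0) / (x1 - x0);
   // Step one unit in x, interpolate y, and round to the nearest
   // logical pixel. Assumes x0 < x1 and a slope between -1 and +1.
   for (int x = (int) Math.round(x0); x <= (int) Math.round(x1); ++x) {
      int y = (int) Math.round(y0 + slope * (x - x0));
      setPixel(x, y);  // hypothetical: write logical pixel (x, y) into the viewport
   }
}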

View Rectangle to Logical Viewport Transformation

The view rectangle in the camera's image plane has

{@code
       -1 <= x <= 1,
       -1 <= y <= 1.
}

The logical viewport in the pixel plane has

{@code
       0.5 <= x < w + 0.5,
       0.5 <= y < h + 0.5,
}

where {@code w} and {@code h} are the width and height of the framebuffer's viewport.

We want a transformation (formulas) that sends points from the camera's view rectangle to proportional points in the pixel plane's logical viewport.

The goal of this transformation is to put a logical pixel with integer coordinates at the center of each square physical pixel. The logical pixel with integer coordinates {@code (m, n)} represents the square physical pixel with

{@code
  m - 0.5 <= x < m + 0.5,
  n - 0.5 <= y < n + 0.5.
}

Notice that logical pixels have integer coordinates {@code (m,n)} with

{@code
  1 <= m <= w,
  1 <= n <= h.
}

Let us derive the formulas for the viewport transformation (we will derive the x-coordinate formula; the y-coordinate formula is similar).

Let {@code x_p} denote an x-coordinate in the image plane and let {@code x_vp} denote an x-coordinate in the viewport. If a vertex is on the left edge of the view rectangle (with {@code x_p = -1}), then it should be transformed to the left edge of the viewport (with {@code x_vp = 0.5}). And if a vertex is on the right edge of the view rectangle (with {@code x_p = 1}), then it should be transformed to the right edge of the viewport (with {@code x_vp = w + 0.5}). These two facts are all we need to know to find the linear function for the transformation of the x-coordinate.

We need to calculate the slope {@code m} and intercept {@code b} of a linear function

{@code
          x_vp = m * x_p + b
}

that converts image plane coordinates into viewport coordinates. We know, from what we said above about the left and right edges of the view rectangle, that

{@code
           0.5 = (m * -1) + b,
       w + 0.5 = (m *  1) + b.
}

If we add these last two equations together we get

{@code
         w + 1 = 2*b
}
or
{@code
         b = (w + 1)/2.
}

If we use {@code b} to solve for {@code m} we have

{@code
           0.5 = (m * -1) + (w + 1)/2
             1 = -2*m + w + 1
           2*m = w
             m = w/2.
}

So the linear transformation of the x-coordinate is

{@code
       x_vp = (w/2) * x_p + (w+1)/2
            = 0.5 + w/2 * (x_p + 1).
}

The equivalent formula for the y-coordinate is

{@code
       y_vp = 0.5 + h/2 * (y_p + 1).
}
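
Putting the two viewport transformation formulas into code (a sketch, with {@code w} and {@code h} the viewport's width and height):

{@code
   int w = 640, h = 480;
   double x_p = 0.25, y_p = -0.5;   // a projected point in the view rectangle
   double x_vp = 0.5 + (w / 2.0) * (x_p + 1.0);
   double y_vp = 0.5 + (h / 2.0) * (y_p + 1.0);
   // Rounding picks the nearest logical pixel, with integer
   // coordinates 1 <= m <= w and 1 <= n <= h as noted above.
   int m = (int) Math.round(x_vp);
   int n = (int) Math.round(y_vp);
}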