
IB Extended Essay

Computer Science

Comparison of Raytracing and Rasterization for Real-Time Rendering

May 2020

Word Count:

Research Question
How is raytracing an improvement over rasterization for realtime physically accurate rendering?


The world of 3D computer graphics is rapidly transforming: the recent shift towards the famed concept of realtime raytracing has generated a flurry of research. Behind the scenes, the entertainment industry has continued to discover innovative algorithms and models for accurate rendering of clouds, hair, fog, skin, and many other complex aspects of the real world.
Photorealistic rendering is a critical part of an immersive world. While many games and movies use unrealistic / cartoon styles for artistic purposes, this paper will focus on physically-based rendering.
Real-world measurements and units are employed so that objects look natural across a variety of conditions. Energy must be conserved: objects cannot reflect more light than they receive. Reflections and shadows must be correctly placed.

Entertainment is the most visible area for physically-based rendering; games and movies use it to increase the quality of their visuals. 3D engine developers are constantly looking for ways to render more efficiently, in terms of both time and resources. The various rendering methods have differing strengths and weaknesses with respect to efficiency.
Virtual reality is increasingly being used in diverse areas, from training emergency responders to practicing medical surgery. In these situations physical realism must be accurately maintained to ensure immersion. Applications such as flight simulators equally benefit from visual fidelity.

There are two main classes of rendering methods: rasterization-based and ray-based methods.
Rasterization is traditionally the faster way of rendering in 3D; it has been used since the beginnings of computer graphics in the 1960s. All objects are decomposed into triangles, which are the simplest shape to draw (they are always convex and planar). In rasterization, each triangle is individually drawn onto the screen by traversing the pixels that it covers. A depth buffer is used to ensure that closer objects are drawn on top of farther ones.

Ray tracing has long been used in non-realtime rendering, such as animated movies. It was first described by Arthur Appel in 1968. Ray tracing uses rays to calculate visibility. For each pixel on the screen, a ray is generated that points in a specific direction. Each ray is tested to see if it intersects the objects in the scene (which can be analytical primitives such as spheres, or triangles). The closest object hit then transfers its color / attributes to the pixel on the screen.
There are several types of ray tracing...

Performance differences impact the amount of content (screen resolution, scene complexity) that can be rendered smoothly; framerate considerations are especially important in Virtual Reality applications where users may experience motion sickness at lower framerates and resolutions.

        Firstly, the concept of physically based rendering will be explained. Both rendering frameworks will be introduced, along with their important differences with respect to physical accuracy. This is followed by an explanation of how parallelization and threading can speed up rendering for both methods.
        High-level comparison: practical investigation of visual and performance-wise differences using empirical data. For example, running a scene with raytracing then rasterization and comparing results. Performance benchmarks from secondary sources could also be considered. Analyze bottlenecks (CPU, GPU, large-scale memory transfer) in both methods and how to improve them. Examine high-level optimizations, like occlusion culling and bounding box hierarchies / k-d trees. Compare effectiveness of techniques used for shadows, reflections, global illumination, volumetric rendering, anti-aliasing, and other aspects of realism. Examine integration of these techniques in commercial game engines (CryEngine, Unreal) and compare performance of rendering methods.
        Shortcuts: explanation of techniques to improve performance with minimal visual impact, and their effect on performance. Techniques like Level of Detail and 2d sprites are practical in rasterization but may not have as much of an effect with raytracing. Deep learning can reduce or even eliminate the need for supersampling in raytracing.
        Low-level comparison: theoretical investigation of rendering pipeline. Look at pseudocode and real examples of shaders and light calculations for various illumination models. Compare number of instructions and memory accesses for both rendering methods. Analyze impacts of vectorization and cache coherence on program efficiency.
        Hardware limitations. Does hardware acceleration for each method make a large difference in speed? Does the use of different APIs (DirectX, OpenGL) affect performance? Explore cost, accessibility, and scalability for both rendering methods. Compare results on devices with different GPU, CPU, and memory capabilities. Examine applicability to virtual reality and mobile devices. Consider future hardware improvements and their effect on performance.

Theoretical / fundamental structure


Ray tracing allows for accurate interactions beyond direct visibility calculations. 

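Rasterization
For each triangle {
  For each pixel {
    If pixel in triangle (and closer than the stored depth) then fill
  }
}

Ray tracing
For each pixel {
  For each triangle {
    If ray through pixel hits triangle (and hit is the closest so far) then fill
  }
}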

Above is the high-level pseudocode for both frameworks. The main difference is the swapping of the inner and outer loops.
The outer loop can be efficiently parallelized in both frameworks: in rasterization each processing thread is assigned a triangle, while in ray tracing each thread is assigned a ray / pixel.
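
The loop swap can be made concrete with a small sketch. The scene below is a hypothetical one invented for illustration: two overlapping 2D triangles that are assumed to be already projected to an 8x8 screen, each with a constant depth. The names `covers`, `rasterize`, and `raytrace` are not from any library. Both functions should produce the same image; only the loop order and the place where the nearest depth is remembered differ.

```python
# Hypothetical tiny scene: 2D triangles already projected to screen space.
# Each entry is (vertices, depth, colour id); a smaller depth is closer.
WIDTH, HEIGHT = 8, 8
TRIANGLES = [
    (((0, 0), (7, 0), (0, 7)), 2.0, 1),   # far triangle
    (((2, 2), (7, 2), (2, 7)), 1.0, 2),   # near triangle, partly overlapping
]

def edge(a, b, p):
    """Signed area test: which side of the edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def covers(tri, p):
    """True if p is inside (or on the border of) the 2D triangle."""
    a, b, c = tri
    w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)

def rasterize():
    """Outer loop over triangles, inner loop over pixels, with a depth buffer."""
    colour = [[0] * WIDTH for _ in range(HEIGHT)]
    depth = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
    for tri, z, cid in TRIANGLES:
        for y in range(HEIGHT):
            for x in range(WIDTH):
                if covers(tri, (x, y)) and z < depth[y][x]:
                    depth[y][x] = z
                    colour[y][x] = cid
    return colour

def raytrace():
    """Outer loop over pixels, inner loop over triangles, keeping the closest hit."""
    colour = [[0] * WIDTH for _ in range(HEIGHT)]
    for y in range(HEIGHT):
        for x in range(WIDTH):
            closest = float("inf")
            for tri, z, cid in TRIANGLES:
                if covers(tri, (x, y)) and z < closest:
                    closest = z
                    colour[y][x] = cid
    return colour
```

Note that the rasterizer needs a full-screen depth buffer, while the ray tracer only needs one `closest` value per pixel being processed.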

Texture and shading operations have a similar structure in both frameworks and are not included in performance tests.

High-level comparison

As a rough, unmeasured estimate, a scene that runs at 30 fps with rasterization might run at around 5 fps with ray tracing.

Intensive scenes are usually rendered with _deferred_ rasterization. In this case, several full-resolution buffers are used to store the properties of visible elements. These buffers take up significant memory space, especially with higher screen resolutions. Normals, diffuse color, and texture coordinates require large buffers.
Processor - calculating barycentric coordinates
The rasterization pipeline: Vertex processing => Triangle setup => Triangle projection => Pixel processing

(pseudocode for triangle projection or pixel processing)

Memory - traversing tree
Processor - traversing tree, intersecting
The raytracing pipeline: Tree processing => Ray generation => Ray-scene intersection => Shading => More ray intersection & shading

(pseudocode for ray-object intersection)
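
One possible routine for the ray-scene intersection stage is the Möller-Trumbore ray-triangle test, sketched below. This is a swapped-in, commonly used algorithm rather than one the essay has settled on; the helper names (`sub`, `dot`, `cross`, `intersect_triangle`) and the tolerance `eps` are illustrative assumptions.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance t along the ray to the triangle, or None on a miss."""
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None                 # ray is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv             # first barycentric coordinate of the hit
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv     # second barycentric coordinate of the hit
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None   # ignore hits behind the ray origin
```

The values `u` and `v` computed along the way are barycentric coordinates of the hit point, so they can be reused to interpolate vertex attributes during shading.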

Effect of greater scene complexity or more triangles
Rasterization is prone to overdraw - when many objects overlap each other, the resources spent on shading and texturing the farther ones are wasted because they are overwritten by the closer ones. The impact of overdraw can be decreased by using deferred rendering. More threads are necessary but each thread still has the same amount of work.
For raytracing, intersection with objects is expensive; more objects means more computations. This inner loop cannot be as effectively parallelized away, but the impact of more objects can be decreased by using acceleration structures which are explained in the next section.

Effect of increasing resolution

A basic optimization for rasterization is frustum culling. The part of a scene visible to the camera is typically a truncated rectangular pyramid (a frustum); objects outside this volume can be completely skipped.
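
A minimal sketch of this test, assuming each object is wrapped in a bounding sphere and the camera looks down +z with a 90-degree field of view. The plane layout, `make_frustum`, and `sphere_visible` are hypothetical names chosen for this example.

```python
import math

def make_frustum(near, far):
    """Planes (normal, offset) of a 90-degree frustum looking down +z.
    A point p is on the inner side of a plane when dot(normal, p) + offset >= 0."""
    s = 1.0 / math.sqrt(2.0)
    return [
        ((0.0, 0.0, 1.0), -near),   # near plane: z >= near
        ((0.0, 0.0, -1.0), far),    # far plane:  z <= far
        ((s, 0.0, s), 0.0),         # left:   x >= -z
        ((-s, 0.0, s), 0.0),        # right:  x <= z
        ((0.0, s, s), 0.0),         # bottom: y >= -z
        ((0.0, -s, s), 0.0),        # top:    y <= z
    ]

def sphere_visible(planes, centre, radius):
    """False only if the sphere lies completely outside at least one plane."""
    for normal, offset in planes:
        distance = sum(n * c for n, c in zip(normal, centre)) + offset
        if distance < -radius:
            return False
    return True
```

The test is conservative: a sphere near a corner of the frustum may be kept even though it is not visible, but a visible object is never discarded.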
At the per-pixel level, modern tiled rasterization techniques essentially perform raytracing on the 2D triangles after they have been transformed to screen space. Using barycentric coordinates, it is possible to get the vertex weights for the current pixel.
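
As an illustration, barycentric weights for a pixel can be computed from ratios of signed areas. The sketch below assumes 2D screen-space vertices; `barycentric` and `interpolate` are illustrative names.

```python
def barycentric(a, b, c, p):
    """Barycentric weights of the 2D point p relative to triangle (a, b, c).
    Returns None for a degenerate (zero-area) triangle."""
    def edge(u, v, w):
        # Twice the signed area of triangle (u, v, w).
        return (v[0] - u[0]) * (w[1] - u[1]) - (v[1] - u[1]) * (w[0] - u[0])

    area = edge(a, b, c)
    if area == 0:
        return None
    wa = edge(b, c, p) / area   # weight of vertex a
    wb = edge(c, a, p) / area   # weight of vertex b
    wc = edge(a, b, p) / area   # weight of vertex c
    return wa, wb, wc

def interpolate(weights, va, vb, vc):
    """Blend one scalar per-vertex attribute (e.g. depth) with the weights."""
    return weights[0] * va + weights[1] * vb + weights[2] * vc
```

The pixel lies inside the triangle exactly when all three weights are non-negative, so the same computation serves as both the coverage test and the attribute interpolation.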

(explain barycentric coordinates?)

Finding the intersection of a ray and objects can be accelerated by using Bounding Volume Hierarchies. Several objects can be grouped together; if the ray does not hit this collective volume then all of the included objects can be skipped.

(explain BVH)

In a bounding volume hierarchy, each primitive (triangle) is assigned a bounding box which simplifies collision / intersection computation. Bounding volumes which are close to each other are grouped together, and a single larger bounding volume is constructed containing all of the constituent volumes. The enclosed volumes are added as children of the larger volume. This way, if a ray is found to not intersect the large bounding volume, the children can be completely skipped, saving computation and memory access time. This process is repeated recursively until the whole scene is enclosed by a single root node or a few top-level nodes.

The axis-aligned bounding box is chosen for its simplicity and compact data representation - it only needs 2 points describing opposite corners.
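
The two-corner representation also makes the ray test cheap. A common approach is the slab test, sketched here under the assumption that the box is stored as `box_min` and `box_max` corners; the function name is illustrative.

```python
def box_intersect(origin, direction, box_min, box_max):
    """Slab test: True if the ray hits the axis-aligned box at some t >= 0."""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        o, d = origin[axis], direction[axis]
        lo, hi = box_min[axis], box_max[axis]
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)   # latest entry across all slabs
        t_far = min(t_far, t1)     # earliest exit across all slabs
        if t_near > t_far:
            return False
    return True
```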

An example of a BVH class is as follows:

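class boundingVolume {
    boundingVolume* children;   // child volumes; null for a leaf
    float3 max_point;           // corner with the largest x, y, z
    float3 min_point;           // corner with the smallest x, y, z
    int triangleIndex;          // triangle stored in this volume (leaves only)
};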

Generating the BVH structure is a complex task. Top-down or bottom-up methods are used for the static geometry, and insertion is used for dynamic objects.

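Traversal of this structure can be done recursively or iteratively. Depth-first traversal with an explicit stack is the usual choice; visiting the nearer child first allows farther nodes to be skipped once a closer hit has been found.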

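function traverseBVH (ray, startBV) {
    stack = [startBV];
    while (stack is not empty) {
        currentBV = stack.pop();
        if (!boxIntersect(ray, currentBV)) continue;   // skip the whole subtree
        if (currentBV.hasChildren) {
            for i in currentBV.children { stack.push(i); }
        } else {
            intersectTriangle(ray, currentBV.triangleIndex);
        }
    }
}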
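
To show how building and traversal fit together, here is a small runnable sketch. It makes several simplifying assumptions that are not from the text above: primitives are given only by their bounding boxes, the build is a top-down median split along the longest axis, and traversal returns candidate primitive indices instead of running a triangle test. All names (`Node`, `build`, `traverse`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Node:
    min_point: Vec3
    max_point: Vec3
    children: List["Node"] = field(default_factory=list)
    triangle_index: Optional[int] = None   # set on leaves only

def union(boxes):
    """Smallest axis-aligned box enclosing all given (min, max) boxes."""
    lo = tuple(min(b[0][i] for b in boxes) for i in range(3))
    hi = tuple(max(b[1][i] for b in boxes) for i in range(3))
    return lo, hi

def build(boxes, indices=None):
    """Top-down build: split at the median centroid along the longest axis."""
    if indices is None:
        indices = list(range(len(boxes)))
    lo, hi = union([boxes[i] for i in indices])
    if len(indices) == 1:
        return Node(lo, hi, triangle_index=indices[0])
    axis = max(range(3), key=lambda a: hi[a] - lo[a])
    indices = sorted(indices, key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    mid = len(indices) // 2
    return Node(lo, hi, [build(boxes, indices[:mid]), build(boxes, indices[mid:])])

def box_intersect(origin, direction, lo, hi):
    """Slab test for a ray against an axis-aligned box (t >= 0)."""
    t_near, t_far = 0.0, float("inf")
    for a in range(3):
        if direction[a] == 0.0:
            if origin[a] < lo[a] or origin[a] > hi[a]:
                return False
            continue
        t0 = (lo[a] - origin[a]) / direction[a]
        t1 = (hi[a] - origin[a]) / direction[a]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def traverse(root, origin, direction):
    """Depth-first traversal; returns (candidate indices, nodes visited)."""
    hits, visited, stack = [], 0, [root]
    while stack:
        node = stack.pop()
        visited += 1
        if not box_intersect(origin, direction, node.min_point, node.max_point):
            continue                      # the whole subtree is skipped
        if node.children:
            stack.extend(node.children)
        else:
            hits.append(node.triangle_index)
    return hits, visited
```

For example, with four unit boxes spaced along the x axis, a ray aimed at the third box reaches it after visiting 5 of the 7 nodes in the tree; the saving grows as the scene gets larger.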

Binary Space Partitions are another way of accelerating collision computation. However, they are less flexible than BVHs, and most applications require dynamic objects, for which BSPs are unsuitable.

With rasterization, shadows are made with a shadow map or stencil buffer. While shadow maps are faster, they exhibit aliasing artifacts at lower resolutions.
With raytracing a ray is cast towards the light to determine shadow. This necessitates the creation of another ray which is intersected with the scene.
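
A shadow ray can be sketched as follows, assuming a point light and a scene made only of spheres so that the example stays short. `ray_sphere` and `in_shadow` are illustrative names, and the small `1e-6` offset is an assumed guard against the ray re-hitting the surface it starts on.

```python
import math

def ray_sphere(origin, direction, centre, radius):
    """Smallest positive t where a normalised ray hits the sphere, else None."""
    oc = tuple(o - c for o, c in zip(origin, centre))
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-6:
            return t
    return None

def in_shadow(point, light_pos, spheres):
    """True if any sphere blocks the segment from the point to the light."""
    to_light = tuple(l - p for l, p in zip(light_pos, point))
    dist = math.sqrt(sum(v * v for v in to_light))
    direction = tuple(v / dist for v in to_light)
    for centre, radius in spheres:
        t = ray_sphere(point, direction, centre, radius)
        if t is not None and t < dist:   # occluder lies before the light
            return True
    return False
```

Unlike a primary ray, a shadow ray only needs to know whether anything is hit, not which object is closest, so the loop can stop at the first occluder.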

Global Illumination

Global illumination is an integral part of physically realistic rendering. Light bounces around the world, picking up color from the objects it hits. Simulating this complex interaction creates much more appealing and realistic visual results.

With rasterization, global illumination is typically pre-rendered and 'baked' into an extra texture.
The finite element method (radiosity) is commonly used with rasterization. In this method, a hemicube is rendered around each vertex and the incoming light is integrated according to the rendering equation. The result is multiplied by the diffuse color of the vertex to get the final color. This process must be repeated for each light bounce, which makes the method very slow.

With raytracing, a number of rays can be cast from the surface in question to sample the hemisphere; these samples are then put into the rendering equation. This is a Monte Carlo method where the rays are randomly generated. Sufficient samples are needed to decrease noise. However, naive uniform random sampling converges slowly and leaves visible noise at practical sample counts.
Importance Sampling - if there is a bright light in the scene, most illuminance will come from it, and therefore more rays should be cast in its direction.
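
The effect of importance sampling can be illustrated with a simplified estimator. The sketch below does not sample towards a light; instead it importance-samples the cosine term, which is a related and commonly used variant. The incoming radiance function is a made-up "sky" that depends only on the angle from the surface normal, so only the cosine of that angle needs to be sampled. For this sky the exact integral of radiance times cosine over the hemisphere is 2π/3.

```python
import math
import random

def radiance(cos_theta):
    """Hypothetical sky: brightest straight up, dark at the horizon."""
    return cos_theta

def estimate_uniform(n, rng):
    """Uniform hemisphere sampling: pdf = 1 / (2*pi) per unit solid angle."""
    total = 0.0
    pdf = 1.0 / (2.0 * math.pi)
    for _ in range(n):
        cos_theta = rng.random()               # uniform directions give uniform cos
        total += radiance(cos_theta) * cos_theta / pdf
    return total / n

def estimate_cosine(n, rng):
    """Cosine-weighted sampling: pdf = cos(theta) / pi per unit solid angle."""
    total = 0.0
    for _ in range(n):
        cos_theta = math.sqrt(rng.random())    # more samples near the normal
        total += radiance(cos_theta) * math.pi # cos_theta / pdf simplifies to pi
    return total / n
```

Both estimators converge to the same value, but the cosine-weighted one has a much lower variance for the same number of rays because samples are concentrated where the integrand is large.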

The rendering equation.
Outgoing radiance = Emissive coefficient + integral of incoming radiance * diffuse coefficient
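
For reference, the standard form of the rendering equation can be written as below. The symbols are the conventional ones and are not defined elsewhere in this essay: $x$ is the surface point, $\omega_o$ and $\omega_i$ are the outgoing and incoming directions, $n$ is the surface normal, $\Omega$ is the hemisphere above the surface, and $f_r$ is the reflectance function. Note that the full form includes a cosine factor that the word equation above leaves out.

```latex
L_o(x, \omega_o) = L_e(x, \omega_o)
  + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (n \cdot \omega_i)\, \mathrm{d}\omega_i
```

For a purely diffuse (Lambertian) surface with albedo $\rho$, $f_r$ reduces to the constant $\rho / \pi$, which corresponds to the "diffuse coefficient" in the simplified statement.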

In a physically-based rendering framework, all materials have a Bidirectional Reflectance Distribution Function (BRDF) that dictates how much light is scattered at a certain angle; this can be determined empirically from measurements or approximated with an analytic model. This means that the light contribution is view-dependent (the brightness of a particular spot changes with the angle from which it is viewed), unlike in the Lambert diffuse model.

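With rasterization, a cubemap reflection is only accurate for the single point it was captured from, and planar reflections are expensive because the scene must be rendered a second time from the mirrored viewpoint.
Raytraced reflections are accurate for any surface, since a reflection ray is simply traced from the point that was hit.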

Using rasterization, effects such as fog and light rays can be approximated using depth peeling and screen-space techniques.
Deep opacity shadow maps can be used to determine volumetric shadowing.
(explain more)

Raytracing allows a physically accurate rendering of volumetrics. Raymarching with a set or adaptive number of samples is used for lighting calculations.
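
A minimal sketch of fixed-step raymarching, assuming a purely absorbing medium so that the light surviving along the ray follows the Beer-Lambert law. The `density` callback and the function name are illustrative assumptions; scattering and lighting inside the volume are left out.

```python
import math

def march_transmittance(density, origin, direction, length, steps):
    """Fraction of light surviving along the ray, by fixed-step raymarching."""
    dt = length / steps
    optical_depth = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                                  # sample at the step midpoint
        p = tuple(o + t * d for o, d in zip(origin, direction))
        optical_depth += density(p) * dt                    # accumulate absorption
    return math.exp(-optical_depth)                         # Beer-Lambert law
```

For example, uniform fog with density 0.5 over a distance of 4 lets through `exp(-2)`, about 13.5% of the light. An adaptive version would shrink `dt` where the density changes quickly and enlarge it in empty space.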
(explain more)

Both require many samples for good quality.


When objects are far away and small, they can be rendered at a lower detail since high-frequency features will not be visible.

Several raycasting effects can be emulated in _screen space_ with rasterization, for example ambient occlusion.

Machine learning can reduce the need for supersampling in raytracing. Neural networks can be trained to recognize noise caused by illumination sampling and smooth it out.

Low-level comparison & Effect of hardware

?? merge with hardware ??
Vectorization - barycentric coordinate calculations are good
Multithreading, divergence
Cache coherence ??? => Level of Detail, Tiles
Register <= Cache <= Main memory

The rasterization pipeline has greatly evolved; modern GPUs have specialized circuitry. Render output units write pixels to the framebuffer, texture mapping units sample and filter textures, and unified shader cores execute vertex and pixel shaders.
Hardware-accelerated raytracing is a very recent development. This hardware includes specialized units for traversing the bounding volume tree and computing ray-triangle intersections.

Current GPUs commonly use multithreading, with many ALUs and registers sharing each instruction dispatcher. A consequence of this design is that all threads in a group execute the same instruction together; when a conditional sends threads down different branches, each branch is executed in turn while the threads not on it sit idle, which costs additional time. This phenomenon is called thread divergence.

OpenCL is extensively used for raytracing. Vulkan is very new and supports raytracing. DirectX is a Microsoft proprietary API which also supports raytracing.

Results / high level

Conclusion & Evaluation

Raytracing is much more intensive but gives much better physically accurate effects. With the increasing power of modern graphics cards and dedicated hardware support, fully realtime raytracing is bound to become a reality. However, the performance of rasterization will likely ensure its prevalence in mobile and embedded applications.

Several extensions to this research are possible. A hybrid method using both frameworks could be considered; the latest game engines use rasterization for primary visibility and only use raycasting for complex effects. Raymarching, an alternative to raycasting, could also be considered. Raymarching can render complex structures like implicit functions and fractals more effectively than ray tracing; sphere tracing is one such technique.
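
As a pointer for that extension, sphere tracing can be sketched in a few lines. The scene is assumed to be described by a signed distance function (SDF) that returns the distance from a point to the nearest surface; the single-sphere SDF, the step limit, and the tolerances below are arbitrary choices for illustration.

```python
import math

def sphere_sdf(p, centre=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, centre) - radius

def sphere_trace(sdf, origin, direction, max_steps=64, max_dist=100.0, eps=1e-4):
    """March along a normalised ray; return the hit distance or None."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = sdf(p)
        if d < eps:
            return t          # close enough to the surface to count as a hit
        t += d                # safe step: nothing is closer than d
        if t > max_dist:
            break
    return None
```

Because each step advances by the distance to the nearest surface, the ray can never skip over geometry, and no explicit intersection formula is needed for the shape.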

Tomas Akenine-Möller, Eric Haines, Naty Hoffman, Angelo Pesce, Michał Iwanicki, Sébastien Hillaire. (2018). Real-Time Rendering, Fourth Edition. http://www.realtimerendering.com

NVIDIA Corporation. (2008). GPU Gems. https://developer.nvidia.com/gpugems/GPUGems

Matt Pharr, Wenzel Jakob, and Greg Humphreys. (2016). Physically Based Rendering: From Theory To Implementation. http://www.pbr-book.org

CryTek. (2018). CryEngine V Manual. https://docs.cryengine.com/display/CEMANUAL/CRYENGINE+V+Manual

Epic Games. (2016). Engine Features. https://docs.unrealengine.com/en-us/Engine

Romain Guy, Mathias Agopian. (2019). Physically-Based Rendering in Filament. https://google.github.io/filament/Filament.md.html