The NVIDIA OptiX ray tracing engine is a programmable system designed for NVIDIA GPUs and other highly parallel architectures. The OptiX engine builds on the key observation that most ray tracing algorithms can be implemented using a small set of programmable operations. Consequently, the core of OptiX is a domain-specific just-in-time compiler that generates custom ray tracing kernels by combining user-supplied programs for ray generation, material shading, object intersection, and scene traversal. This enables the implementation of a highly diverse set of ray tracing-based algorithms and applications, including interactive rendering, offline rendering, collision detection systems, artificial intelligence queries, and scientific simulations such as sound propagation. OptiX achieves high performance through a compact object model and application of several ray tracing-specific compiler optimizations. For ease of use it exposes a single-ray programming model with full support for recursion and a dynamic dispatch mechanism similar to virtual function calls.
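To make the programming model concrete, here is a minimal ray generation program sketched in the style of the classic (pre-7) OptiX device API. The camera basis (eye, U, V, W), the payload struct, and the buffer names are illustrative assumptions patterned on the SDK samples, not code from the paper.

    // Sketch of an OptiX ray generation program (classic, pre-7 API style).
    // The eye/U/V/W camera basis, payload struct, and buffer names are
    // assumptions modeled on the SDK samples, not a specific application.
    #include <optix.h>
    #include <optixu/optixu_math_namespace.h>

    using namespace optix;

    struct PerRayData { float3 result; };

    rtDeclareVariable(uint2,    launch_index, rtLaunchIndex, );
    rtDeclareVariable(uint2,    launch_dim,   rtLaunchDim, );
    rtDeclareVariable(rtObject, top_object, , );  // root of the scene graph
    rtDeclareVariable(float3,   eye, , );         // camera position
    rtDeclareVariable(float3,   U, , );           // camera basis vectors
    rtDeclareVariable(float3,   V, , );
    rtDeclareVariable(float3,   W, , );
    rtBuffer<float4, 2> output_buffer;

    RT_PROGRAM void pinhole_camera() {
        // Map this launch index to a point on the film plane in [-1, 1]^2.
        float2 d = make_float2(launch_index) / make_float2(launch_dim) * 2.f - 1.f;
        float3 dir = normalize(d.x * U + d.y * V + W);

        // rtTrace invokes the user-supplied intersection and shading
        // programs; recursion and dynamic dispatch happen behind this call.
        Ray ray = make_Ray(eye, dir, /* ray type */ 0, 1e-3f, RT_DEFAULT_MAX);
        PerRayData prd;
        prd.result = make_float3(0.f);
        rtTrace(top_object, ray, prd);

        output_buffer[launch_index] = make_float4(prd.result, 1.f);
    }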
We describe the design space for real-time photon density estimation, the key step in rendering global illumination (GI) via photon mapping. We then detail and analyze efficient GPU implementations of four best-of-breed algorithms. All produce reasonable results in real time on an NVIDIA GeForce 670 at 1920x1080 for complex scenes with multiple-bounce diffuse effects, caustics, and glossy reflection. Across the designs, we conclude that tiled, deferred photon gathering in a compute shader gives the best combination of performance and quality.
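A minimal CUDA sketch of the tiled, deferred gathering idea follows. The paper's implementation is a compute shader; the binning layout, names, and smoothing kernel here are assumptions. Photons are bucketed per 16x16 screen tile in a prior pass, each thread block stages its tile's photons in shared memory, and every pixel then accumulates a density estimate from on-chip storage.

    // Hypothetical tiled, deferred photon-gather kernel. Assumes photons
    // were pre-binned per 16x16 screen tile, with tileOffset/tileCount
    // giving each tile's slice of the photon array.
    #include <cuda_runtime.h>

    #define TILE 16
    #define MAX_TILE_PHOTONS 256   // assumed per-tile cap

    struct Photon { float3 position; float3 power; };

    __device__ float3 f3(float x, float y, float z) { return make_float3(x, y, z); }

    __global__ void gatherPhotons(const Photon* photons,
                                  const int* tileOffset, const int* tileCount,
                                  const float3* gbufPosition,  // world-space position G-buffer
                                  float3* radiance,
                                  int width, int height, int tilesX, float radius) {
        __shared__ Photon s_photons[MAX_TILE_PHOTONS];

        int tile  = blockIdx.y * tilesX + blockIdx.x;
        int count = min(tileCount[tile], MAX_TILE_PHOTONS);
        int base  = tileOffset[tile];

        // The block cooperatively stages this tile's photons in shared
        // memory so every pixel's gather loop hits on-chip storage.
        for (int i = threadIdx.y * TILE + threadIdx.x; i < count; i += TILE * TILE)
            s_photons[i] = photons[base + i];
        __syncthreads();

        int x = blockIdx.x * TILE + threadIdx.x;
        int y = blockIdx.y * TILE + threadIdx.y;
        if (x >= width || y >= height) return;
        int p = y * width + x;

        float3 P  = gbufPosition[p];
        float  r2 = radius * radius;
        float3 sum = f3(0.f, 0.f, 0.f);

        // Density estimate: accumulate the power of photons within the
        // gather radius, with a simple quadratic falloff kernel (assumed).
        for (int i = 0; i < count; ++i) {
            float dx = s_photons[i].position.x - P.x;
            float dy = s_photons[i].position.y - P.y;
            float dz = s_photons[i].position.z - P.z;
            float d2 = dx * dx + dy * dy + dz * dz;
            if (d2 < r2) {
                float w = 1.f - d2 / r2;
                sum.x += s_photons[i].power.x * w;
                sum.y += s_photons[i].power.y * w;
                sum.z += s_photons[i].power.z * w;
            }
        }
        // Normalize by the kernel's integral over the gather disk,
        // 2 / (pi * r^2) for this falloff; the constant depends on the
        // chosen density-estimation kernel.
        float norm = 2.f / (3.14159265f * r2);
        radiance[p] = f3(sum.x * norm, sum.y * norm, sum.z * norm);
    }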
Modern game production is in crisis. The primary bottleneck on quality, budgets, and schedules for most games is the limited available time of experienced artists. There is little opportunity to increase the number of artists because inflation-adjusted profit margins are stagnant or falling. Simultaneously, consumer expectations for the fidelity of the entertainment experience continue to rise. The solution to this crisis is to recognize that the digital artist's ultimate tool is computation, and that newer GPUs and the cloud disrupt historical trends in available computation for production workflows. We can't make more artists, but we can augment existing ones through computation to increase their effective skill and multiply their efforts. In this talk, I chart this space through case studies of production realities and research possibilities, including scalable assets and algorithms, procedural assets, digital content creation tools, and cloud & crowd resources.
Producing visual effects that scale with resolution and hardware capabilities is a major challenge in real-time 3D graphics production today. Effects must run on older hardware, cost no more than linear time in resolution to support HD displays, and exhibit an increase in quality on faster hardware (that may not even exist today) without artist intervention. These course notes describe practical implementation details of two phenomenologically based algorithms for motion blur and ambient occlusion that exhibit this scalability.
I've been preparing clean, easy-to-use versions of popular graphics research and education data for distribution. About half of the data is available now and the rest will be coming online throughout 2013.
SAO is a new screen-space ambient occlusion/obscurance algorithm that produces radiosity-like ambient shadowing over distances from centimeters to meters. Most previous screen-space AO algorithms run disproportionately slowly as sampling radius and pixel density increase, because they become dominated by main-memory DRAM accesses rather than on-chip L1 and L2 cache accesses; this limits them to centimeter scale. SAO scales linearly in the number of pixels sampled, with per-pixel performance independent of pixel density and world-space sampling radius.
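The scaling comes from pairing the sample pattern with a mip hierarchy built over the depth buffer: taps far from the shaded pixel read a coarser level, so the bytes touched per pixel stay bounded no matter how large the sampling radius grows on screen. Below is a hedged CUDA sketch of that level-selection step; the published implementation is a pixel shader, and the constants and the DepthMips layout here are assumptions.

    // Hedged sketch of SAO-style mip selection in CUDA. Constants and
    // the DepthMips layout are assumptions for illustration.
    #include <cuda_runtime.h>
    #include <math.h>

    #define LOG_MAX_OFFSET 3   // assumed: taps within 2^3 pixels stay at mip 0
    #define MAX_MIP_LEVEL  5   // assumed depth of the z mip chain

    struct DepthMips {
        const float* level[MAX_MIP_LEVEL + 1];  // base pointer for each mip
        int width, height;                      // level-0 dimensions
    };

    __device__ float fetchDepth(const DepthMips& z, int x, int y, int mip) {
        int w = max(z.width  >> mip, 1);
        int h = max(z.height >> mip, 1);
        x = min(max(x, 0), w - 1);
        y = min(max(y, 0), h - 1);
        return z.level[mip][y * w + x];
    }

    // One sample tap: the farther the tap is from the shaded pixel, the
    // coarser the mip it reads, so the working set stays in cache even
    // for sampling radii of hundreds of pixels.
    __device__ float tapDepth(const DepthMips& z, int2 pixel, float2 unitDir,
                              float screenRadius, int tapIndex, int numTaps) {
        float alpha = (tapIndex + 0.5f) / numTaps;  // tap position along the pattern
        float r = alpha * screenRadius;             // offset in level-0 pixels
        int mip = min(max((int)floorf(log2f(r)) - LOG_MAX_OFFSET, 0), MAX_MIP_LEVEL);
        int sx = pixel.x + (int)(r * unitDir.x);
        int sy = pixel.y + (int)(r * unitDir.y);
        // Each mip halves resolution, so shift coordinates down by 'mip'.
        return fetchDepth(z, sx >> mip, sy >> mip, mip);
    }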
This paper describes a novel filter for simulating motion blur phenomena in real time by applying ideas from offline stochastic reconstruction. The filter operates as a 2D post-process on a conventional framebuffer augmented with a screen-space velocity buffer. We demonstrate results on video game scenes rendered and reconstructed in real time on NVIDIA GeForce 480 and Xbox 360 platforms, and show that the same filter can be applied to cinematic post-processing of offline-rendered images and real photographs. The technique is fast and robust enough that we deployed it in a production game engine used at Vicarious Visions.
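At its core the reconstruction is a "scatter-as-gather" weighting of taps taken along the tile neighborhood's dominant velocity. The CUDA sketch below is adapted from the published pseudocode; the buffer layout, tap count, depth convention, and depth-softness constant are assumptions.

    // Hedged CUDA sketch of the reconstruction pass. Buffer layout,
    // tap count, and SOFT_Z_EXTENT are illustrative assumptions.
    #include <cuda_runtime.h>
    #include <math.h>

    #define NUM_TAPS      15
    #define SOFT_Z_EXTENT 0.10f   // assumed world-space depth softness

    __device__ float clampf(float x, float lo, float hi) { return fminf(fmaxf(x, lo), hi); }
    __device__ float smoothstepf(float a, float b, float x) {
        float t = clampf((x - a) / (b - a), 0.f, 1.f);
        return t * t * (3.f - 2.f * t);
    }
    // ~1 when za is closer to the camera than zb (depth = view distance; assumed).
    __device__ float softCloser(float za, float zb) {
        return clampf(1.f - (za - zb) / SOFT_Z_EXTENT, 0.f, 1.f);
    }
    // A sample moving at 'speed' pixels/frame blurs over points within that cone.
    __device__ float cone(float dist, float speed) { return clampf(1.f - dist / speed, 0.f, 1.f); }
    // Both points blur over each other when they move together.
    __device__ float cylinder(float dist, float speed) {
        return 1.f - smoothstepf(0.95f * speed, 1.05f * speed, dist);
    }

    // vmax holds the dominant (tile-neighborhood max) velocity per pixel.
    __global__ void reconstructMotionBlur(const float3* color, const float* depth,
                                          const float2* velocity, const float2* vmax,
                                          float3* out, int width, int height) {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;
        int p = y * width + x;

        float2 V = vmax[p];
        if (sqrtf(V.x * V.x + V.y * V.y) < 0.5f) { out[p] = color[p]; return; }

        float2 vx = velocity[p];
        float speedX = fmaxf(sqrtf(vx.x * vx.x + vx.y * vx.y), 0.5f);
        float zx = depth[p];

        // Center tap gets inverse-speed weight so sharp pixels stay sharp.
        float weight = 1.f / speedX;
        float3 sum = make_float3(color[p].x * weight, color[p].y * weight, color[p].z * weight);

        for (int i = 0; i < NUM_TAPS; ++i) {
            if (i == NUM_TAPS / 2) continue;               // center already counted
            float t = ((i + 0.5f) / NUM_TAPS) * 2.f - 1.f; // t in (-1, 1)
            int sx = min(max(x + (int)(V.x * t), 0), width - 1);
            int sy = min(max(y + (int)(V.y * t), 0), height - 1);
            int q = sy * width + sx;

            float dist = sqrtf((float)((sx - x) * (sx - x) + (sy - y) * (sy - y)));
            float2 vy = velocity[q];
            float speedY = fmaxf(sqrtf(vy.x * vy.x + vy.y * vy.y), 0.5f);
            float zy = depth[q];

            // Blurry foreground sample covers x; x blurs to reveal the
            // background sample; or both are blurry together (cylinder).
            float w = softCloser(zy, zx) * cone(dist, speedY)
                    + softCloser(zx, zy) * cone(dist, speedX)
                    + cylinder(dist, speedY) * cylinder(dist, speedX) * 2.f;

            weight += w;
            sum.x += color[q].x * w; sum.y += color[q].y * w; sum.z += color[q].z * w;
        }
        out[p] = make_float3(sum.x / weight, sum.y / weight, sum.z / weight);
    }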