As the quest for more realistic computer graphics marches steadily
on, the demand for rich and detailed imagery is greater than ever.
However, the current "sweet spot" in terms of price, power
consumption, and performance is in commodity hardware. To render
scenes with tens or hundreds of millions of polygons as cheaply as
possible, we need an approach that makes full use of the commodity
hardware already at our disposal. We propose a distributed rendering
architecture based on message passing that partitions scene geometry across
a cluster of commodity machines, allowing the entire scene to remain
in-core and enabling parallel construction of hierarchical spatial
acceleration structures. The design uses feed-forward, asynchronous
messages to allow distributed traversal of acceleration structures
and evaluation of shaders without maintaining any suspended shader
execution state. We also provide a simple method for throttling work
generation to keep message queuing overhead small. The results of
our implementation show roughly an order of magnitude speedup in
rendering time compared to image-plane decomposition, while keeping
memory overhead for message queuing around 1%.
To appear at GRAPP 2013.
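
The feed-forward messaging and work throttling described above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' implementation: the names `RayMsg`, `render`, `max_in_flight`, and `hops` are hypothetical, the round-robin forwarding stands in for handing a ray to the node that owns the next region of the scene, and the halving of `weight` stands in for shading. The point it shows is that each message carries all the state it needs, so no node keeps suspended shader state, and that new rays are generated only while the number of queued messages is below a fixed limit.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RayMsg:
    """Hypothetical self-contained ray message: no state is left behind on the sender."""
    pixel: int       # destination pixel index
    weight: float    # accumulated throughput, carried inside the message
    hops_left: int   # remaining node visits before the ray terminates


def render(num_pixels, num_nodes=4, max_in_flight=8, hops=3):
    """Simulate feed-forward message passing with throttled ray generation."""
    queues = [deque() for _ in range(num_nodes)]  # one inbox per simulated node
    image = [0.0] * num_pixels
    next_pixel = 0
    peak = 0  # largest number of queued messages observed

    while next_pixel < num_pixels or any(queues):
        # Throttle: emit new primary rays only while the queues are below the limit.
        in_flight = sum(len(q) for q in queues)
        while next_pixel < num_pixels and in_flight < max_in_flight:
            queues[next_pixel % num_nodes].append(RayMsg(next_pixel, 1.0, hops))
            next_pixel += 1
            in_flight += 1
        peak = max(peak, in_flight)

        # Each node handles at most one message per step and never waits for a reply.
        for node, inbox in enumerate(queues):
            if not inbox:
                continue
            msg = inbox.popleft()
            if msg.hops_left == 0:
                image[msg.pixel] += msg.weight  # ray terminates: write its contribution
            else:
                dest = (node + 1) % num_nodes   # assumed owner of the next region
                queues[dest].append(
                    RayMsg(msg.pixel, msg.weight * 0.5, msg.hops_left - 1)
                )

    return image, peak


if __name__ == "__main__":
    image, peak = render(100)
    print(len(image), peak)
```

Because handling a message either forwards it or retires it, the number of queued messages never grows outside the generation step, so the queue memory in this sketch is bounded by `max_in_flight` regardless of image size.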