Abstract
Ray tracing naturally supports high-quality global illumination effects, but it is computationally costly. Traversal and intersection operations dominate the computation of ray tracing. To accelerate these two operations, we propose a hardware architecture that integrates three novel approaches. First, we present an ordered depth-first layout and a traversal architecture that uses this layout to reduce the required memory bandwidth. Second, we propose a three-phase ray-triangle intersection architecture that takes advantage of early exit. Third, we propose a latency-hiding architecture called the ray accumulation unit. Cycle-accurate simulation results indicate that our architecture can achieve interactive distributed ray tracing.
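To make the idea of early exit in ray-triangle intersection concrete, here is a minimal software sketch. It is not the paper's three-phase hardware design, whose phase boundaries the abstract does not specify. It uses the well-known Möller-Trumbore test instead, arranged so that each check returns as soon as a miss is certain and the later arithmetic is skipped. The function name `intersect` and the parameters `t_max` and `eps` are assumptions made for this illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersect(orig: Vec3, direction: Vec3,
              v0: Vec3, v1: Vec3, v2: Vec3,
              t_max: float = float("inf"),
              eps: float = 1e-9) -> Optional[Tuple[float, float, float]]:
    """Moller-Trumbore test; returns (t, u, v) on a hit, None on a miss."""
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None          # early exit: ray is parallel to the triangle plane
    inv_det = 1.0 / det

    s = sub(orig, v0)
    u = dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None          # early exit: first barycentric coordinate out of range

    q = cross(s, e1)
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None          # early exit: second barycentric coordinate out of range

    t = dot(e2, q) * inv_det
    if t <= eps or t >= t_max:
        return None          # hit lies behind the origin or beyond the current closest hit
    return (t, u, v)


if __name__ == "__main__":
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    print(intersect((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *tri))
    print(intersect((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *tri))
```

In this arrangement, a ray that misses on the first barycentric check never pays for the second cross product or the distance computation. A staged intersection unit can exploit the same property, although how the paper divides the work across its three phases is not stated in the abstract.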
Index Terms
- T&I engine: traversal and intersection engine for hardware accelerated ray tracing
SA '11: Proceedings of the 2011 SIGGRAPH Asia Conference