T&I engine: traversal and intersection engine for hardware accelerated ray tracing

Published: 12 December 2011

Abstract

Ray tracing naturally supports high-quality global illumination effects, but it is computationally costly. Traversal and intersection operations dominate the cost of ray tracing. To accelerate these two operations, we propose a hardware architecture that integrates three novel approaches. First, we present an ordered depth-first layout and a traversal architecture that uses this layout to reduce the required memory bandwidth. Second, we propose a three-phase ray-triangle intersection architecture that takes advantage of early exit. Third, we propose a latency-hiding architecture, which we call the ray accumulation unit. Cycle-accurate simulation results indicate that our architecture can achieve interactive distributed ray tracing.
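To illustrate what "early exit" means for ray-triangle intersection, here is a minimal software sketch. It is not the paper's three-phase hardware design, whose phases the abstract does not describe. It assumes the widely used Möller-Trumbore test, and the function name `intersect` and the `eps` tolerance are hypothetical choices made for this example. The test is arranged in stages so that a ray can be rejected before the remaining arithmetic is done.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersect(orig: Vec3, direction: Vec3,
              v0: Vec3, v1: Vec3, v2: Vec3,
              eps: float = 1e-9) -> Optional[Tuple[float, float, float]]:
    """Moller-Trumbore ray-triangle test (illustrative, not the paper's design).

    Returns (t, u, v) on a hit, or None as soon as any stage rejects the ray.
    """
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)

    # Stage 1: reject rays that are (nearly) parallel to the triangle plane.
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None
    inv_det = 1.0 / det

    # Stage 2: first barycentric coordinate; exit early if it is out of range.
    s = sub(orig, v0)
    u = dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None

    # Stage 3: second barycentric coordinate, then the hit distance.
    q = cross(s, e1)
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    if t <= eps:
        return None  # intersection lies behind the ray origin
    return (t, u, v)
```

For example, with the triangle (0,0,0), (1,0,0), (0,1,0), a ray from (0.25, 0.25, 1) pointing along (0, 0, -1) returns a hit at distance 1, while a ray from (2, 2, 1) in the same direction is rejected at stage 2 without computing `v` or `t`. Rejecting most rays early is what makes a staged pipeline attractive, since the later stages run only for a fraction of the tests.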

Published in ACM Transactions on Graphics, Volume 30, Issue 6 (December 2011), 678 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2070781

Copyright © 2011 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
