research-article

Efficient depth peeling via bucket sort

Authors:
Fang Liu, Chinese Academy of Sciences
Meng-Cheng Huang, Chinese Academy of Sciences
Xue-Hui Liu, Chinese Academy of Sciences
En-Hua Wu, Chinese Academy of Sciences and University of Macau

HPG '09: Proceedings of the Conference on High Performance Graphics 2009, August 2009, Pages 51–57
https://doi.org/10.1145/1572769.1572779

Published: 01 August 2009

ABSTRACT

In this paper we present an efficient algorithm for multi-layer depth peeling via bucket sort of fragments on the GPU, which makes it possible to capture up to 32 layers simultaneously, with correct depth ordering, in a single geometry pass. We exploit multiple render targets (MRT) as storage and construct a bucket array of size 32 per pixel. Each bucket can hold only one fragment and can be updated concurrently using the MAX/MIN blending operation. During rasterization, the depth range at each pixel location is divided uniformly into consecutive subintervals, and a linear bucket sort is performed so that the fragments within each subinterval are routed into the corresponding bucket. In a subsequent fullscreen shader pass, the bucket array is accessed sequentially to retrieve the sorted fragments for further applications. Collisions occur when more than one fragment is routed to the same bucket; they can be alleviated by a multi-pass approach. We also develop a two-pass approach, adaptive bucket depth peeling, to further reduce collisions. In the first geometry pass, the depth range is redivided into non-uniform subintervals according to the depth distribution, so that each subinterval contains only one fragment. In the following bucket sorting pass, only one fragment is then routed into each bucket, and collisions are substantially reduced. Our algorithm achieves a speedup of up to 32 times over classical depth peeling, especially for large scenes with high depth complexity, and the experimental results are visually faithful to the ground truth. It also requires neither pre-sorting of geometry nor post-sorting of fragments, and it is free of read-modify-write (RMW) hazards.
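The uniform bucket routing described in the abstract can be illustrated with a small CPU-side simulation for a single pixel. This is only a sketch under stated assumptions, not the authors' GPU implementation: the names `bucket_index` and `peel_pixel` are hypothetical, each bucket stores a bare depth value rather than a full fragment, and Python's `max` stands in for hardware MAX blending on the render targets.

```python
NUM_BUCKETS = 32          # bucket array size per pixel, as stated in the abstract
EMPTY = float("-inf")     # identity element for MAX blending (assumed clear value)


def bucket_index(depth, z_near, z_far, num_buckets=NUM_BUCKETS):
    """Map a depth in [z_near, z_far] to the index of its uniform subinterval."""
    t = (depth - z_near) / (z_far - z_near)
    # Clamp so that depth == z_far lands in the last bucket.
    return min(int(t * num_buckets), num_buckets - 1)


def peel_pixel(depths, z_near, z_far, num_buckets=NUM_BUCKETS):
    """Route one pixel's fragments into buckets and return (layers, collisions).

    Fragments may arrive in any order; because MAX is commutative and
    associative, the final bucket contents do not depend on that order.
    """
    buckets = [EMPTY] * num_buckets
    collisions = 0
    for d in depths:
        k = bucket_index(d, z_near, z_far, num_buckets)
        if buckets[k] != EMPTY:
            collisions += 1                  # bucket already occupied
        buckets[k] = max(buckets[k], d)      # simulated MAX blending
    # Reading the buckets in index order yields the surviving depths sorted.
    layers = [d for d in buckets if d != EMPTY]
    return layers, collisions


if __name__ == "__main__":
    layers, collisions = peel_pixel([0.9, 0.1, 0.5, 0.52, 0.3], 0.0, 1.0)
    print(layers, collisions)
```

In this sketch the fragments at depths 0.5 and 0.52 fall into the same subinterval, so one of them is lost. That is the collision case the abstract's multi-pass and adaptive variants are meant to reduce.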

References

Bavoil, L., and Myers, K. 2008. Order independent transparency with dual depth peeling. Tech. rep., NVIDIA Corporation.
Bavoil, L., Callahan, S. P., Lefohn, A., Comba, J. L. D., and Silva, C. T. 2007. Multi-fragment effects on the GPU using the k-buffer. In Proceedings of the 2007 symposium on Interactive 3D graphics and games, 97--104.
Carpenter, L. 1984. The A-buffer, an antialiased hidden surface method. In Proceedings of the 11th annual conference on Computer graphics and interactive techniques, 103--108.
Carr, N., Mech, R., and Miller, G. 2008. Coherent layer peeling for transparent high-depth-complexity scenes. In Proceedings of the 23rd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware, 33--40.
Catmull, E. E. 1974. A Subdivision Algorithm for Computer Display of Curved Surfaces. PhD thesis, University of Utah.
Eisemann, E., and Décoret, X. 2006. Fast scene voxelization and applications. In SIGGRAPH 2006 Sketches.
Everitt, C. 2001. Interactive order-independent transparency. Tech. rep., NVIDIA Corporation.
Govindaraju, N. K., Henson, M., Lin, M. C., and Manocha, D. 2005. Interactive visibility ordering and transparency computations among geometric primitives in complex environments. In Proceedings of the 2005 symposium on Interactive 3D graphics and games, 49--56.
Houston, M., Preetham, A., and Segal, M. 2005. A hardware F-buffer implementation. Tech. rep., Stanford University.
Jouppi, N. P., and Chang, C.-F. 1999. Z³: an economical hardware technique for high-quality antialiasing and transparency. 85--93.
Liu, B.-Q., Wei, L.-Y., and Xu, Y.-Q. 2006. Multi-layer depth peeling via fragment sort. Tech. rep., Microsoft Research Asia.
Mammen, A. 1989. Transparency and antialiasing algorithms implemented with the virtual pixel maps technique. IEEE Computer Graphics and Applications 9, 4, 43--55.
Mark, W. R., and Proudfoot, K. 2001. The F-buffer: a rasterization-order FIFO buffer for multi-pass rendering. In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics hardware, 57--64.
Myers, K., and Bavoil, L. 2007. Stencil routed A-buffer. ACM SIGGRAPH 2007 Technical Sketch Program.
NVIDIA. 2005. GPU programming exposed: the naked truth behind NVIDIA's demos. Tech. rep., NVIDIA Corporation.
Wexler, D., Gritz, L., Enderton, E., and Rice, J. 2005. GPU-accelerated high-quality hidden surface removal. In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware, 7--14.
Wittenbrink, C. M. 2001. R-buffer: a pointerless A-buffer hardware architecture. In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics hardware, 73--80.

Index Terms

Efficient depth peeling via bucket sort
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
        3D imaging
  2. Computer graphics
    1. Animation

Recommendations

FreePipe: a programmable parallel rendering architecture for efficient multi-fragment effects
I3D '10: Proceedings of the 2010 ACM SIGGRAPH symposium on Interactive 3D Graphics and Games

In the past decade, modern GPUs have provided increasing programmability with vertex, geometry and fragment shaders. However, many classical problems have not been efficiently solved using the current graphics pipeline where some stages are still fixed ...

Bucket depth peeling
SIGGRAPH '09: SIGGRAPH 2009: Talks

Efficient rendering of multi-fragment effects has long been a great challenge in computer graphics. The classical depth peeling algorithm [Everitt 2001] provides a simple but robust solution by peeling off one layer per pass, but multi rasterizations ...

Single pass depth peeling via CUDA rasterizer
SIGGRAPH '09: SIGGRAPH 2009: Talks

Multi-fragment effects play important roles on many graphics applications, which require operations on more than one fragment per pixel. The classical depth peeling algorithm [Everitt 2001] peels off one layer each pass, but the performance degrades for ...

Published in
HPG '09: Proceedings of the Conference on High Performance Graphics 2009
August 2009
185 pages
ISBN: 9781605586038
DOI: 10.1145/1572769
Editors: Stephen N. Spencer (University of Washington), David McAllister (NVIDIA), Matt Pharr (Intel), Ingo Wald (Intel)
General Chairs: David Luebke (NVIDIA), Philipp Slusallek (DFKI & Saarland University)
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
bucket sort
depth peeling
graphics hardware
histogram equalization
max/min blending
multiple render target (MRT)
order independent transparency (OIT)
Acceptance Rates
Overall Acceptance Rate: 15 of 44 submissions, 34%