Voxel Ray Traversal Research

Rendering 1 trillion voxels at 120fps through pseudo-octree optimization

The Challenge

I came across this video about the DDA ray traversal technique for rendering voxels at high framerates. The creator mentioned planning a follow-up about implementing octrees to render trillions of voxels at 100+ fps. I took that as a personal challenge.

Starting from the original implementation, I spent a month diving into graphics programming, trying different approaches, and learning a ton along the way.

The Journey (and All The Bugs)

The path to the solution was filled with trippy bugs and visual glitches. Here are some of the weird and wonderful things that happened while I was figuring this out:

The Breakthrough

After implementing traditional octrees, I hit a wall. The performance increase was minimal and still nowhere near my goal. After tweaking and optimizing a hundred inconsequential parts of my code, I finally figured out the crux of the problem: GPUs don't handle recursion and stacks well. Every pixel had to shoot a ray through millions of voxels to hit the surface, making traversal O(n) in the number of voxels along the ray, but the traditional octree solution added too much overhead to be worth it.
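To make the O(n) cost concrete, here is a minimal CPU-side sketch of single-level DDA stepping in Rust. It is not the project's shader code: the `Grid` type, its dense `Vec<bool>` storage, and the `dda_trace` name are assumptions made for illustration. The point is that the loop visits every cell between the ray origin and the surface, one at a time.

```rust
/// Dense cubic grid of solid/empty voxels (hypothetical layout, for illustration only).
struct Grid {
    size: i32,
    solid: Vec<bool>,
}

impl Grid {
    fn new(size: i32) -> Self {
        Grid { size, solid: vec![false; (size * size * size) as usize] }
    }

    fn index(&self, p: [i32; 3]) -> usize {
        ((p[2] * self.size + p[1]) * self.size + p[0]) as usize
    }

    fn set(&mut self, p: [i32; 3]) {
        let i = self.index(p);
        self.solid[i] = true;
    }

    fn in_bounds(&self, p: [i32; 3]) -> bool {
        p.iter().all(|&c| c >= 0 && c < self.size)
    }
}

/// Steps the ray one voxel at a time.
/// Returns the first solid voxel (if any) and the number of cells visited.
fn dda_trace(grid: &Grid, origin: [f32; 3], dir: [f32; 3]) -> (Option<[i32; 3]>, u32) {
    let mut cell = [0i32; 3];
    let mut step = [0i32; 3];
    let mut t_max = [f32::INFINITY; 3]; // ray parameter of the next boundary on each axis
    let mut t_delta = [f32::INFINITY; 3]; // ray parameter needed to cross one full cell
    for a in 0..3 {
        cell[a] = origin[a].floor() as i32;
        if dir[a] > 0.0 {
            step[a] = 1;
            t_delta[a] = 1.0 / dir[a];
            t_max[a] = (cell[a] as f32 + 1.0 - origin[a]) / dir[a];
        } else if dir[a] < 0.0 {
            step[a] = -1;
            t_delta[a] = -1.0 / dir[a];
            t_max[a] = (cell[a] as f32 - origin[a]) / dir[a];
        }
    }
    if step == [0, 0, 0] {
        return (None, 0); // a zero direction would never leave the start cell
    }

    let mut visited = 0;
    while grid.in_bounds(cell) {
        visited += 1;
        if grid.solid[grid.index(cell)] {
            return (Some(cell), visited);
        }
        // Advance along whichever axis reaches its next cell boundary first.
        let a = if t_max[0] <= t_max[1] && t_max[0] <= t_max[2] {
            0
        } else if t_max[1] <= t_max[2] {
            1
        } else {
            2
        };
        cell[a] += step[a];
        t_max[a] += t_delta[a];
    }
    (None, visited)
}
```

With a single solid voxel at `[40, 0, 0]` in a 64-voxel grid and a ray starting at `[0.5, 0.5, 0.5]` heading along +x, this sketch visits 41 cells before reporting the hit. The work grows linearly with the distance to the surface.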

The solution was ditching traditional octrees and hard coding the layers directly. Instead of a recursive structure, I built a "pseudo-octree" with layers:

This flattened structure reduced ray traversal from O(n) to O(log n), fixing the bottleneck completely. Each additional layer gave exponential performance improvements.
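The idea of hard-coded layers can be sketched roughly as follows. This is an illustrative CPU-side approximation, not the project's implementation: the layer sizes in `LAYER_CELL`, the `WORLD` size, the `Layers` type, and the epsilon nudge are all assumptions. Each layer is a flat occupancy array, and the ray checks the layers coarsest-first in a fixed loop, with no recursion and no stack. When a cell is empty at some layer, the ray jumps straight to the far side of that whole cell.

```rust
/// Edge length (in voxels) of one cell at each layer, coarsest first.
/// These sizes are illustrative assumptions, not the project's actual layers.
const LAYER_CELL: [i32; 3] = [16, 4, 1];
/// World edge length in voxels (also an assumption).
const WORLD: i32 = 64;

/// One flat occupancy array per layer: no pointers between nodes.
struct Layers {
    occ: [Vec<bool>; 3],
}

impl Layers {
    fn new() -> Self {
        let make = |cell: i32| {
            let n = (WORLD / cell) as usize;
            vec![false; n * n * n]
        };
        Layers { occ: [make(LAYER_CELL[0]), make(LAYER_CELL[1]), make(LAYER_CELL[2])] }
    }

    /// Flat index of the cell containing `voxel` at the given layer.
    fn index(layer: usize, voxel: [i32; 3]) -> usize {
        let cell = LAYER_CELL[layer];
        let n = WORLD / cell;
        let c = [voxel[0] / cell, voxel[1] / cell, voxel[2] / cell];
        ((c[2] * n + c[1]) * n + c[0]) as usize
    }

    /// Marks a voxel solid and flags the enclosing cell in every coarser layer.
    fn set(&mut self, voxel: [i32; 3]) {
        for layer in 0..LAYER_CELL.len() {
            let i = Self::index(layer, voxel);
            self.occ[layer][i] = true;
        }
    }
}

/// Ray parameter at which the ray leaves the `cell`-sized cell containing point `p`.
fn exit_t(origin: [f64; 3], dir: [f64; 3], p: [f64; 3], cell: f64) -> f64 {
    let mut t = f64::INFINITY;
    for a in 0..3 {
        if dir[a] != 0.0 {
            let lo = (p[a] / cell).floor() * cell;
            let edge = if dir[a] > 0.0 { lo + cell } else { lo };
            t = t.min((edge - origin[a]) / dir[a]);
        }
    }
    t
}

/// Traces a ray (direction assumed roughly unit length) through the fixed layers.
/// Returns the first solid voxel (if any) and the number of loop iterations used.
fn layered_trace(layers: &Layers, origin: [f64; 3], dir: [f64; 3]) -> (Option<[i32; 3]>, u32) {
    // Small nudge past each boundary so the next lookup lands in the next cell.
    // This is a common approximation; it can clip past geometry thinner than EPS.
    const EPS: f64 = 1e-4;
    if dir == [0.0; 3] {
        return (None, 0);
    }
    let mut t = 0.0;
    let mut steps = 0;
    loop {
        let p = [origin[0] + dir[0] * t, origin[1] + dir[1] * t, origin[2] + dir[2] * t];
        let voxel = [p[0].floor() as i32, p[1].floor() as i32, p[2].floor() as i32];
        if voxel.iter().any(|&c| c < 0 || c >= WORLD) {
            return (None, steps);
        }
        steps += 1;
        // Fixed-depth descent: find the coarsest layer whose cell here is empty.
        let mut empty_cell = None;
        for layer in 0..LAYER_CELL.len() {
            if !layers.occ[layer][Layers::index(layer, voxel)] {
                empty_cell = Some(LAYER_CELL[layer] as f64);
                break;
            }
        }
        match empty_cell {
            None => return (Some(voxel), steps), // occupied at every layer: solid voxel
            Some(cell) => t = exit_t(origin, dir, p, cell) + EPS, // skip the empty cell
        }
    }
}
```

For the same scene as a plain per-voxel walk (one solid voxel at `[40, 0, 0]`, ray from `[0.5, 0.5, 0.5]` along +x), this sketch reaches the hit in 5 iterations: two 16-voxel skips, two 4-voxel skips, and the final solid lookup. A per-voxel walk would need 41 lookups for the same ray.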

The Results

Final result 1
Final result 2
Final result 3
Final result 4

Final Benchmarks

  • 1 trillion voxels @ 120+ fps
  • 8 trillion voxels @ 30+ fps
  • 4-5GB total VRAM & RAM usage
  • O(log n) ray traversal complexity

Acknowledgements

This personal research project wouldn't have been possible without the foundation laid by the original implementation. I learned a massive amount about graphics programming, GPU optimization, and spatial data structures, but I definitely didn't do this on my own. My proficiency in Rust and Vulkan wasn't up to speed for such a complex project, so I relied on Claude Code to help test implementations and new designs. Even with assistance, this project took over a month to yield results, as the high-level technical design and decisions took lots of trial and error to sort out. I learned a ton and came out with the results I dreamed of!

Check out the original repo and the DDA paper that started it all.