
Ray Marching Explained: Complete Guide to Sphere Tracing & SDF Rendering

Master the fundamentals of ray marching algorithms, sphere tracing optimization techniques, and signed distance functions for high-performance volumetric GPU rendering.

Ray Marching vs Ray Tracing

Ray marching and ray tracing are two distinct approaches to rendering 3D scenes, each with unique strengths and computational characteristics. Understanding their differences is crucial for GPU benchmark development and performance analysis.

Ray Tracing: Intersection-Based Rendering

Traditional ray tracing calculates exact intersection points between rays and geometric primitives (triangles, spheres, etc.) using analytical mathematics. For a sphere, this involves solving a quadratic equation to find where the ray intersects the surface. This approach is extremely efficient for scenes composed of explicit geometry but struggles with complex implicit surfaces like fractals, clouds, or volumetric effects.
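
To make the contrast concrete, here is what the analytic approach looks like for a sphere: solving the quadratic directly in GLSL. This is a minimal sketch (the function name and signature are ours, and it assumes a normalized ray direction); we will reuse it later for bounding-volume culling.

Analytic Ray-Sphere Intersection (sketch)
// Solve |origin + t*dir - center|^2 = radius^2 for t (dir must be normalized).
// Returns the nearest positive hit distance, or -1.0 if the ray misses.
float intersectSphere(vec3 origin, vec3 dir, vec3 center, float radius) {
  vec3 oc = origin - center;
  float b = dot(oc, dir);                  // half of the quadratic's linear term
  float c = dot(oc, oc) - radius * radius;
  float h = b * b - c;                     // quarter of the discriminant
  if (h < 0.0) return -1.0;                // no real roots: ray misses the sphere
  h = sqrt(h);
  float tNear = -b - h;
  float tFar  = -b + h;
  if (tNear > 0.0) return tNear;           // sphere in front of the ray
  if (tFar  > 0.0) return tFar;            // ray starts inside the sphere
  return -1.0;                             // sphere is behind the ray
}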

Ray Marching: Iterative Sampling

Ray marching, by contrast, uses an iterative sampling approach. Instead of calculating exact intersections, it "marches" along the ray in small steps, checking at each step whether it has hit a surface. This makes ray marching ideal for rendering implicit surfaces defined by mathematical functions rather than explicit geometry.

The key advantage of ray marching for GPU benchmarking is its computational intensity and predictability. Each pixel requires hundreds of iterations, making it an excellent stress test for shader processing units and floating-point arithmetic performance.
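
For comparison with the sphere-traced implementation developed later in this article, here is a minimal fixed-step marcher. It is a sketch only: it assumes some scene function sceneSDF() whose sign distinguishes inside from outside (an example scene is composed in the SDF section), and the 0.01 step size is an arbitrary choice.

Fixed-Step Ray Marching (for comparison)
// Naive fixed-step march: advance by a constant step and test for a surface.
bool fixedStepMarch(vec3 origin, vec3 direction, float maxDist, out vec3 hitPos) {
  const float STEP = 0.01;                 // constant step size
  for (float t = 0.0; t < maxDist; t += STEP) {
    vec3 pos = origin + direction * t;
    if (sceneSDF(pos) < 0.0) {             // crossed into the object
      hitPos = pos;
      return true;
    }
  }
  return false;                            // left the scene without hitting anything
}

At a 0.01 step over a 20-unit scene, that is up to 2,000 scene evaluations per ray, which is exactly the cost sphere tracing is designed to avoid.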

Key Differences Summary

Ray Tracing

  • Exact intersection calculations
  • Best for explicit geometry (triangles)
  • Fixed computational cost per ray
  • Requires acceleration structures (BVH)
  • Industry standard for production rendering

Ray Marching

  • Iterative sampling approach
  • Ideal for implicit surfaces (SDFs)
  • Variable iterations per ray
  • No acceleration structure needed
  • Perfect for GPU stress testing

Sphere Tracing Fundamentals

Sphere tracing is a sophisticated optimization of basic ray marching that uses signed distance functions to take larger, safer steps along the ray. Instead of using fixed small steps, sphere tracing queries the distance field at each point and uses that distance as the maximum safe step size.

The Sphere Tracing Concept

Imagine you're walking through fog trying to find a wall. If someone tells you "the nearest wall is at least 10 meters away," you can safely walk 10 meters in any direction without hitting anything. Sphere tracing uses this principle: at each point, the signed distance function tells us the minimum distance to the nearest surface, allowing us to "march" that entire distance in a single step.

This dramatically reduces the number of iterations required compared to fixed-step ray marching. While basic ray marching might need 1000+ small steps, sphere tracing typically converges in 50-200 iterations for complex scenes.

Convergence and Accuracy

Sphere tracing continues until one of three conditions is met:

  • Surface hit: The distance function returns a value below a threshold (typically 0.0001), indicating we've reached the surface
  • Maximum iterations: We've exceeded the iteration limit without finding a surface (ray missed all geometry)
  • Maximum distance: We've traveled beyond the scene bounds (background/sky)

The accuracy threshold is critical for GPU benchmarking. Smaller thresholds (0.00001) provide sharper detail but require more iterations and computational power. Our benchmark uses 0.0001 as a balance between visual quality and consistent stress testing across different GPU architectures.

Signed Distance Functions (SDF)

Signed Distance Functions are mathematical functions that return the shortest distance from any point in 3D space to a surface. The "signed" aspect means the function returns positive values outside the object, negative values inside, and zero exactly on the surface.

SDF Mathematical Properties

For an exact SDF, the gradient has magnitude 1.0 everywhere the function is differentiable, which makes the function 1-Lipschitz: its value can never change faster than the distance you move. This Lipschitz property is what makes sphere tracing safe and efficient: we can trust that marching by the distance value won't overshoot the surface.
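
Stated formally, writing f for an exact SDF:

\[
\lVert \nabla f(p) \rVert = 1 \ \text{wherever } f \text{ is differentiable},
\qquad
\lvert f(p) - f(q) \rvert \le \lVert p - q \rVert \ \text{for all } p, q.
\]

The second inequality is the actual safety guarantee: starting at a point p outside the surface and moving at most f(p) in any direction, the function value cannot drop below zero, so a sphere-tracing step can touch the surface but never pass through it.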

Example: Sphere SDF
float sdSphere(vec3 p, float radius) {
  return length(p) - radius;
}

// For a sphere at origin with radius 1.0:
// Point at (2, 0, 0) → distance = 1.0 (outside)
// Point at (0.5, 0, 0) → distance = -0.5 (inside)
// Point at (1, 0, 0) → distance = 0.0 (on surface)

Complex SDFs and Distance Estimation

While primitive shapes have exact SDFs, complex objects like fractals use Distance Estimators (DE) that provide conservative estimates rather than exact distances. The Mandelbulb fractal in our benchmark uses a DE function that approximates the distance to the fractal surface.

Distance estimators must be conservative (never overestimate distance) to prevent the ray from stepping through surfaces. This is why fractal rendering is more expensive than simple primitives: the DE function itself requires dozens of iterations to compute, and sphere tracing needs smaller safety factors to account for estimation error.
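
For reference, the widely used power-8 Mandelbulb distance estimator looks roughly like the sketch below (the running-derivative formulation popularized by Inigo Quilez and others). The iteration count, power, and bailout radius here are typical values, not the benchmark's confirmed parameters.

Power-8 Mandelbulb Distance Estimator (sketch)
float mandelbulbDE(vec3 p) {
  vec3 z = p;
  float dr = 1.0;             // running derivative used for the distance estimate
  float r = 0.0;
  const float POWER = 8.0;

  for (int i = 0; i < 12; i++) {
    r = length(z);
    if (r > 2.0) break;       // bailout: the point has escaped

    // Convert to spherical coordinates
    float theta = acos(z.z / r);
    float phi = atan(z.y, z.x);
    dr = pow(r, POWER - 1.0) * POWER * dr + 1.0;

    // Raise to the 8th power in spherical coordinates
    float zr = pow(r, POWER);
    theta *= POWER;
    phi *= POWER;

    // Convert back to Cartesian coordinates and add the original point
    z = zr * vec3(sin(theta) * cos(phi),
                  sin(theta) * sin(phi),
                  cos(theta));
    z += p;
  }

  // Conservative estimate of the distance to the fractal surface
  return 0.5 * log(r) * r / dr;
}

Note that a single evaluation already involves a dozen transcendental operations per loop iteration, which is why each sphere-tracing step against the fractal costs far more than a step against the sphere SDF above.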

SDF Operations and Combinations

One powerful feature of SDFs is their composability. You can combine multiple SDFs using simple mathematical operations:

SDF Combination Operations
// Union (combine objects)
float opUnion(float d1, float d2) {
  return min(d1, d2);
}

// Subtraction (carve shape d1 out of d2)
float opSubtraction(float d1, float d2) {
  return max(-d1, d2);
}

// Intersection (only overlapping parts)
float opIntersection(float d1, float d2) {
  return max(d1, d2);
}

// Smooth union (blended transition)
float opSmoothUnion(float d1, float d2, float k) {
  float h = clamp(0.5 + 0.5*(d2-d1)/k, 0.0, 1.0);
  return mix(d2, d1, h) - k*h*(1.0-h);
}

Ray Marching Algorithm Implementation

Let's examine a complete sphere tracing implementation, breaking down each component and its GPU performance implications.

Complete Sphere Tracing GLSL Code
struct RayMarchResult {
  float distance;      // Total distance traveled
  int iterations;      // Number of steps taken
  bool hit;           // Did we hit a surface?
  vec3 position;      // Final ray position
};

RayMarchResult sphereTrace(vec3 origin, vec3 direction, float maxDist) {
  RayMarchResult result;
  result.distance = 0.0;
  result.iterations = 0;
  result.hit = false;
  result.position = origin;

  const int MAX_STEPS = 200;
  const float SURFACE_THRESHOLD = 0.0001;
  const float SAFETY_FACTOR = 0.95; // Conservative stepping

  for (int i = 0; i < MAX_STEPS; i++) {
    result.iterations = i;

    // Current position along ray
    vec3 pos = origin + direction * result.distance;
    result.position = pos;

    // Query distance field (this is the expensive part)
    float dist = sceneSDF(pos);

    // Check for surface hit
    if (dist < SURFACE_THRESHOLD) {
      result.hit = true;
      break;
    }

    // Check if we've gone too far
    if (result.distance > maxDist) {
      break;
    }

    // March forward by distance (with safety factor)
    result.distance += dist * SAFETY_FACTOR;
  }

  return result;
}
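
To show how this function is typically driven, here is a minimal fragment-shader entry point that builds one ray per pixel, marches it, and shades the result. The uniforms, camera model, and light direction are illustrative assumptions, not the benchmark's actual interface; calculateNormal() is defined in the next subsection.

Driving the Ray Marcher (sketch)
uniform vec2 uResolution;   // assumed viewport size in pixels
uniform vec3 uCameraPos;    // assumed camera position

out vec4 fragColor;

void main() {
  // Map the pixel to a centered, aspect-corrected coordinate
  vec2 uv = (2.0 * gl_FragCoord.xy - uResolution) / uResolution.y;

  // Simple pinhole camera looking down -Z
  vec3 rayDir = normalize(vec3(uv, -1.5));

  RayMarchResult result = sphereTrace(uCameraPos, rayDir, 100.0);

  if (result.hit) {
    vec3 n = calculateNormal(result.position);
    float diffuse = max(dot(n, normalize(vec3(0.8, 0.6, 0.4))), 0.0);
    fragColor = vec4(vec3(diffuse), 1.0);
  } else {
    float cost = float(result.iterations) / 200.0;  // fraction of MAX_STEPS spent
    fragColor = vec4(vec3(cost), 1.0);              // visualize wasted work on misses
  }
}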

Algorithm Performance Characteristics

Each component of the sphere tracing algorithm has specific GPU performance implications:

  • MAX_STEPS: Higher values allow finding more distant or detailed surfaces but increase worst-case execution time. Modern GPUs handle up to 500 steps reasonably, but 200 provides a good balance.
  • SURFACE_THRESHOLD: Smaller thresholds provide sharper detail but may require more iterations near complex surfaces. 0.0001 is standard for real-time applications.
  • SAFETY_FACTOR: Values less than 1.0 prevent overshooting when using distance estimators rather than exact SDFs. 0.95 is conservative for fractal rendering.

Normal Calculation

Once we've found a surface hit, we need to calculate the surface normal for lighting. This uses finite differences to approximate the gradient of the distance field:

Normal Calculation via Gradient
vec3 calculateNormal(vec3 pos) {
  const float h = 0.0001; // Small offset

  // Sample distance field at 6 offset points
  vec3 normal = vec3(
    sceneSDF(vec3(pos.x + h, pos.y, pos.z)) -
    sceneSDF(vec3(pos.x - h, pos.y, pos.z)),

    sceneSDF(vec3(pos.x, pos.y + h, pos.z)) -
    sceneSDF(vec3(pos.x, pos.y - h, pos.z)),

    sceneSDF(vec3(pos.x, pos.y, pos.z + h)) -
    sceneSDF(vec3(pos.x, pos.y, pos.z - h))
  );

  return normalize(normal);
}
This requires 6 additional SDF evaluations per surface hit, adding significant computational cost for lighting calculations.
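
A common alternative worth knowing, shown here as a sketch rather than as part of the implementation above, is the tetrahedron technique, which approximates the same gradient with 4 SDF evaluations instead of 6:

Tetrahedron Normal (4 SDF evaluations)
vec3 calculateNormal4(vec3 pos) {
  const float h = 0.0001;          // small offset
  const vec2 k = vec2(1.0, -1.0);

  // Sample the distance field at the 4 vertices of a small tetrahedron
  return normalize(k.xyy * sceneSDF(pos + k.xyy * h) +
                   k.yyx * sceneSDF(pos + k.yyx * h) +
                   k.yxy * sceneSDF(pos + k.yxy * h) +
                   k.xxx * sceneSDF(pos + k.xxx * h));
}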

Optimization Techniques

Adaptive Step Sizing

Advanced implementations use adaptive step sizing based on scene complexity. In areas with high detail (like fractal crevices), use smaller safety factors. In open space, take larger steps to converge faster. This can reduce average iteration count by 20-30% without sacrificing quality.
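
One minimal way to express this idea, sketched below rather than taken from the benchmark's actual heuristic, is to make the safety factor a function of the current distance value: tight near the surface, where estimation error matters most, and relaxed in open space. The constants are placeholders to tune per scene.

Adaptive Safety Factor (sketch)
// Hypothetical heuristic: relax the safety factor as the distance grows.
float adaptiveSafety(float dist) {
  // 0.75 close to the surface, approaching 0.98 in open space;
  // the 0.5 falloff distance is an arbitrary, scene-dependent constant.
  return mix(0.75, 0.98, clamp(dist / 0.5, 0.0, 1.0));
}

// Inside the sphere-tracing loop, replace the fixed SAFETY_FACTOR with:
// result.distance += dist * adaptiveSafety(dist);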

Bounding Volumes

Wrapping complex objects in simple bounding volumes (spheres, boxes) allows early rejection. If the ray doesn't hit the bounding sphere, skip expensive fractal DE calculations entirely. Our benchmark uses a bounding sphere to cull rays that will never hit the Mandelbulb.
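
Using the intersectSphere() sketch from earlier in this article, the culling step might look like the fragment below. The 1.2 bounding radius and the backgroundColor() helper are illustrative assumptions, and the snippet assumes the camera sits outside the bounding sphere.

Bounding Sphere Culling (sketch)
// Reject rays that can never reach the fractal before marching them.
float tEnter = intersectSphere(origin, direction, vec3(0.0), 1.2);
if (tEnter < 0.0) {
  return backgroundColor(direction);   // hypothetical sky/background shading
}

// Start marching at the bounding-sphere boundary instead of the camera,
// skipping all the empty space in front of it.
RayMarchResult result = sphereTrace(origin + direction * tEnter,
                                    direction, maxDist - tEnter);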

Binary Search Refinement

After sphere tracing converges, use binary search to refine the surface intersection point. This is especially important for fractal rendering where distance estimators may have error. Our implementation performs 8 binary search iterations for sub-threshold precision.

Binary Search Refinement
// After sphere tracing converges. lastStepSize is the length of the final
// march step, tracked inside the sphere-tracing loop before it exits.
float refinedDistance = result.distance;
float halfStep = lastStepSize * 0.5; // 'step' would shadow the GLSL built-in

for (int i = 0; i < 8; i++) {
  vec3 pos = origin + direction * refinedDistance;
  float dist = sceneSDF(pos);

  if (dist < 0.0) {
    // Inside the surface: overshot, step back
    refinedDistance -= halfStep;
  } else {
    // Still outside: undershot, step forward
    refinedDistance += halfStep;
  }

  halfStep *= 0.5; // Halve the step size each iteration
}

Parallel Ray Processing

Modern GPUs process pixels in parallel warps or wavefronts (typically 32 or 64 threads). For optimal performance, rays in the same warp should have similar iteration counts. This is why ray marching can show performance variations across the frame: complex regions cause warp divergence, where some threads finish early and then sit idle waiting for the rest of the warp.

GPU Implementation Considerations

Fragment Shader Execution Model

Ray marching is typically implemented in fragment shaders, with each pixel executing the full algorithm independently. This parallelism is both a strength (massive throughput) and a challenge (warp divergence when rays have different iteration counts).

Precision Considerations

GPUs typically run fragment shaders in single-precision (float32) arithmetic. For deep ray marches or extreme fractal zooms, precision loss becomes problematic: at a travel distance of around 10,000 units, adjacent float32 values are roughly 0.001 apart, already an order of magnitude coarser than the 0.0001 surface threshold, so distance increments can round away and the march never converges. This is why our benchmark limits zoom depth and uses conservative thresholds.

Memory and Bandwidth

Ray marching is compute-bound rather than memory-bound. The entire algorithm works with a handful of local variables, making it ideal for GPU shader cores. Heavy texture fetching would shift the bottleneck to memory bandwidth, which is why procedural distance fields are preferred for benchmarking.

Thermal and Power Characteristics

Ray marching provides consistent, predictable workload perfect for thermal testing. Unlike traditional game rendering with variable scene complexity, every frame executes nearly identical instruction counts. This makes it ideal for detecting thermal throttling over 10-30 minute stress tests.