Mandelbrot-C

A high-performance, multi-threaded Mandelbrot and Julia set explorer written in C99. This project utilizes an Engine-Centric Architecture targeting Native Desktop (CPU/AVX2), Web (WebAssembly/SIMD128), and hardware-accelerated GPU rendering (WebGL/Sokol GFX).

Live Web Demo: tiw302.github.io/mandelbrot-c/


Table of Contents

| Overview & UX | Engineering & Math | Dev & Ops | Project Lifecycle |
|---|---|---|---|
| Quick Start | The Mathematics | Prerequisites | Roadmap |
| Introduction | Technical Architecture | Build & Installation | Contributing |
| Technical Preview | Platform Implementations | Configuration | Development Methodology & AI Assistance |
| Core Features | Performance Benchmarks | Running Tests | Author's Note |
| Interactive Controls | | Project Structure | License |

Quick Start

# 1. Clone the repository
git clone https://github.com/tiw302/mandelbrot-c.git && cd mandelbrot-c

# 2. Build (interactive menu — pick CPU, GPU, or Web)
./build.sh

# 3. Run
./build_cpu/mandelbrot_cpu   # CPU engine
./build_gpu/mandelbrot_gpu   # GPU engine (requires OpenGL 3.3+)

For the Web build, see Build & Installation.


Introduction

Mandelbrot-C is an exploratory project focused on the intersection of low-level C programming and high-performance graphics. This journey began as a deep dive into C99 to understand pointers, memory management, and hardware acceleration. What started as a simple SDL2 experiment has evolved into a production-grade fractal engine.

Throughout the development process, I have explored advanced topics including SIMD intrinsics, multi-threaded load balancing, WebAssembly porting, and shader-based 64-bit precision emulation.


Technical Preview

Mandelbrot

[Three Mandelbrot screenshots]

Julia Set Exploration

[Three Julia set screenshots]


Core Features

  • Hybrid Rendering Pipeline: Choice between optimized multi-threaded CPU rendering or hardware-accelerated GPU rendering.
  • WASM Performance: Desktop-class performance in the browser via WebAssembly, SIMD128, and multi-threaded Web Workers.
  • Persistent State Sharing: Share mathematical discoveries via URL parameters that encode the full view state. Clicking "Copy Link" generates and copies the URL on demand without constant address bar updates.

The URL format encodes the following parameters:

| Parameter | Format | Description |
|---|---|---|
| re / im | 14 decimal places | View center on the complex plane |
| z | Exponential (6 decimal places) | Zoom level |
| it | Integer | Iteration count |
| p | Integer | Palette index (0–5) |
| j | 1 if active | Julia mode flag |
| jre / jim | 14 decimal places | Julia set c-parameter (only present in Julia mode) |

Example: ?re=-0.74364388797764&im=0.13182590414575&z=1.234568e+4&it=500&p=0
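To make the parameter table concrete, here is a minimal C sketch of how such a query string could be read back into a view state. The `view_state` struct and `parse_view_query` function are hypothetical names chosen for illustration; the project's actual parser (which may live in JavaScript on the web side) is not shown in this README.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the view state; field names are illustrative. */
typedef struct {
    double re, im;       /* view center (re / im) */
    double zoom;         /* z */
    long   iterations;   /* it */
    long   palette;      /* p */
    int    julia;        /* j */
    double jre, jim;     /* Julia c-parameter (jre / jim) */
} view_state;

/* Parse "?k=v&k=v" into *out. Unknown keys are skipped.
   Returns the number of recognized parameters. */
static int parse_view_query(const char *query, view_state *out) {
    int found = 0;
    const char *p = query;
    if (*p == '?') p++;

    while (*p) {
        const char *amp  = strchr(p, '&');
        const char *stop = amp ? amp : p + strlen(p);
        const char *eq   = memchr(p, '=', (size_t)(stop - p));

        if (eq) {
            size_t klen = (size_t)(eq - p);
            const char *val = eq + 1;
#define KEY_IS(name) (klen == sizeof(name) - 1 && strncmp(p, name, klen) == 0)
            /* strtod/strtol stop at the '&' that ends the value. */
            if      (KEY_IS("re"))  { out->re   = strtod(val, NULL); found++; }
            else if (KEY_IS("im"))  { out->im   = strtod(val, NULL); found++; }
            else if (KEY_IS("z"))   { out->zoom = strtod(val, NULL); found++; }
            else if (KEY_IS("it"))  { out->iterations = strtol(val, NULL, 10); found++; }
            else if (KEY_IS("p"))   { out->palette    = strtol(val, NULL, 10); found++; }
            else if (KEY_IS("j"))   { out->julia = (strtol(val, NULL, 10) == 1); found++; }
            else if (KEY_IS("jre")) { out->jre  = strtod(val, NULL); found++; }
            else if (KEY_IS("jim")) { out->jim  = strtod(val, NULL); found++; }
#undef KEY_IS
        }
        if (!amp) break;
        p = amp + 1;
    }
    return found;
}
```

Matching on the exact key length keeps `j` from being confused with `jre` / `jim`, and `it` from being confused with `im`.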

  • Hi-Lo Precision GPU Math: 64-bit precision emulation in GLSL shaders for deep-zoom exploration without pixelation artifacts.
  • Interactive Tour Mode: Automated exploration with two independent tour systems. The Mandelbrot tour cycles through 10 hand-picked deep-zoom coordinates in a three-phase sequence of Pan (1.8 s), Zoom In (4.0 s), and Zoom Out (3.2 s), with smoothstep easing between phases and a zoom depth of 6000x. On desktop, the Julia tour interpolates between 12 preset c-parameter keyframes (3.0 s move, 1.2 s dwell). Both tours pick the next target at random without repeating the previous one. On web, the Julia tour instead follows a continuous circular orbit, $c = 0.7885\,e^{it}$ (where $i$ is the imaginary unit and $t$ is the orbit angle), for smooth real-time animation.
  • Professional Screenshot System: Deferred capture logic that ensures high-fidelity PNG exports by synchronizing with the GPU rendering cycle. Both desktop and web save screenshots as mandelbrot_YYYYMMDD_HHMMSS.png. On desktop, stb_image_write handles PNG encoding with automatic ARGB-to-RGBA conversion. On web, the browser generates and downloads the file directly from the canvas.
  • Dynamic HUD: A redesigned, responsive Heads-Up Display showing 14-decimal precision coordinates.
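The three-phase Mandelbrot tour timing described in the list above can be sketched in C as follows. The phase durations come from this README; the function and enum names are hypothetical, and the way the real engine maps eased progress onto zoom is not specified here.

```c
/* Phase durations in seconds, as listed for the Mandelbrot tour. */
#define TOUR_PAN_S      1.8
#define TOUR_ZOOM_IN_S  4.0
#define TOUR_ZOOM_OUT_S 3.2

typedef enum { TOUR_PAN, TOUR_ZOOM_IN, TOUR_ZOOM_OUT, TOUR_DONE } tour_phase;

/* Classic smoothstep, 3t^2 - 2t^3 on [0, 1], clamped outside that range. */
static double smoothstep01(double t) {
    if (t <= 0.0) return 0.0;
    if (t >= 1.0) return 1.0;
    return t * t * (3.0 - 2.0 * t);
}

/* Map the time spent on the current tour stop to a phase and an eased
   progress value in [0, 1] within that phase. */
static tour_phase tour_phase_at(double elapsed_s, double *eased) {
    if (elapsed_s < TOUR_PAN_S) {
        *eased = smoothstep01(elapsed_s / TOUR_PAN_S);
        return TOUR_PAN;
    }
    elapsed_s -= TOUR_PAN_S;
    if (elapsed_s < TOUR_ZOOM_IN_S) {
        *eased = smoothstep01(elapsed_s / TOUR_ZOOM_IN_S);
        return TOUR_ZOOM_IN;
    }
    elapsed_s -= TOUR_ZOOM_IN_S;
    if (elapsed_s < TOUR_ZOOM_OUT_S) {
        *eased = smoothstep01(elapsed_s / TOUR_ZOOM_OUT_S);
        return TOUR_ZOOM_OUT;
    }
    *eased = 1.0;
    return TOUR_DONE;
}

/* Linear blend used with the eased value, e.g. for the view center while panning. */
static double lerp_d(double a, double b, double t) {
    return a + (b - a) * t;
}
```

Because smoothstep has zero slope at both ends, motion starts and stops gently at each phase boundary instead of jumping.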

Interactive Controls

| Action | Desktop Key | Web Key | Web UI / Touch |
|---|---|---|---|
| Zoom In | Left-Drag (Box) | Left-Drag (Box) / Scroll | Pinch-In |
| Pan | Right-Drag | Right-Drag | Two-Finger Drag |
| Undo | Ctrl + Z | Ctrl + Z | "Undo" Button |
| Screenshot | S | S | "Screenshot" Button |
| Mega Screenshot (8K) | X | - | - |
| Record Video | V | - | - |
| Tour Mode | T | T | "Tour" Button |
| GPU/CPU Toggle | G | G | "GPU" Button |
| Precision Toggle | E (CPU: 64/128-bit, GPU: 32/64-bit) | E | "32-bit / 64-bit" Button |
| Julia Toggle | J | J | "Julia" Button |
| Burning Ship Toggle | B | B | - |
| Palette Cycle | P | P | "Palette" Button |
| Iterations | Up/Down (Shift ×100) | Up/Down | Iter+/Iter- |
| Save Bookmark | M | - | - |
| Load Bookmark | L | - | - |
| Reset View | R | R | "Reset" Button |
| Copy Link | - | - | "Copy Link" Button |
| Quit | Esc / Q | - | - |

The Mathematics

The Mandelbrot set is defined as the set of complex numbers $c$ for which the function $f_c(z) = z^2 + c$ remains bounded when iterated from $z = 0$.
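As an illustration of this definition, the scalar escape-time loop below iterates $z \leftarrow z^2 + c$ and reports how many steps it took to leave the escape radius. It is a minimal sketch, not the contents of mandelbrot.c; the function names are hypothetical, and the radius of 10 is taken from the ESCAPE_RADIUS default listed under Configuration.

```c
/* Iterate z <- z^2 + c starting from (zr, zi).
   Returns the number of iterations completed before |z| exceeded
   escape_radius, or max_iter if the orbit stayed bounded that long. */
static int escape_time(double zr, double zi, double cr, double ci,
                       int max_iter, double escape_radius) {
    const double r2 = escape_radius * escape_radius; /* compare |z|^2, no sqrt */
    int n = 0;
    while (n < max_iter && zr * zr + zi * zi <= r2) {
        double next_zr = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = next_zr;
        n++;
    }
    return n;
}

/* Mandelbrot: z0 = 0, c = pixel. */
static int mandelbrot_iter(double cr, double ci, int max_iter) {
    return escape_time(0.0, 0.0, cr, ci, max_iter, 10.0);
}

/* Julia: z0 = pixel, c = fixed parameter. */
static int julia_iter(double pr, double pi_, double cr, double ci, int max_iter) {
    return escape_time(pr, pi_, cr, ci, max_iter, 10.0);
}
```

A point is treated as inside the set when the function returns `max_iter`; everything else is colored by its iteration count.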

Optimization Strategies

To maintain high frame rates in dense regions, the engine implements several mathematical optimizations:

  • Main Cardioid Rejection: Points inside the main cardioid are detected using a vectorized check to skip expensive iterations.
  • Period-2 Bulb Check: Similar to the cardioid, points within the largest circular bulb are filtered out early.
  • Normalized Iteration Count: Prevents color banding by using a fractional iteration formula, resulting in smooth gradients.
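The two rejection tests in the list above have well-known closed forms. The scalar sketch below shows them for a point $c = x + iy$; the function names are hypothetical, and the engine's vectorized versions are not reproduced here.

```c
/* Returns nonzero if (x, y) lies inside the main cardioid.
   With q = (x - 1/4)^2 + y^2, the test is q * (q + (x - 1/4)) <= y^2 / 4. */
static int in_main_cardioid(double x, double y) {
    double xm = x - 0.25;
    double q  = xm * xm + y * y;
    return q * (q + xm) <= 0.25 * y * y;
}

/* Returns nonzero if (x, y) lies inside the period-2 bulb,
   the disc of radius 1/4 centered at (-1, 0). */
static int in_period2_bulb(double x, double y) {
    double xp = x + 1.0;
    return xp * xp + y * y <= 0.0625;
}
```

Points that pass either test are known to be in the set, so the full iteration loop can be skipped and the pixel assigned the maximum iteration count directly.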

Technical Architecture

Engine-Centric Design

The codebase strictly adheres to a modular architecture to ensure Separation of Concerns (SoC):

  • Core [SSOT]: Pure mathematical definitions (mandelbrot.c, julia.c) are the Single Source of Truth, agnostic to rendering APIs.
  • Engine Layer: Manages high-level rendering logic, thread-pools, and platform-agnostic graphics abstractions (via Sokol GFX).
  • Application Layer: Platform-specific entry points (SDL2 for Desktop, Emscripten for Web) handle input and OS-level interactions.

WebAssembly Subsystem

The WASM implementation utilizes SharedArrayBuffer to enable real multi-threading in the browser. The built-in scripts/server.py is configured to handle the required COOP/COEP security headers for local development.
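For reference, a page is cross-origin isolated when it is served with the two response headers below. These are the standard values; whether scripts/server.py sets exactly these is an assumption, and browsers also accept `credentialless` for the embedder policy.

```text
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```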


Platform Implementations

Platform Support

| Platform | Renderer | SIMD | Status |
|---|---|---|---|
| Linux | CPU / GPU (OpenGL) | AVX-512 / AVX2 | Supported |
| macOS | CPU / GPU (OpenGL) | AVX-512 / AVX2 | Supported |
| Windows | CPU / GPU (OpenGL) | AVX-512 / AVX2 | Supported |
| Web (Browser) | CPU / GPU (WebGL 2.0) | SIMD128 | Supported |

CPU Rendering (Native Desktop)

The native CPU engine is designed for maximum throughput on multi-core systems:

  • Dynamic Load Balancing: Instead of static partitioning, the engine uses an Atomic Row Counter. Threads dynamically "claim" the next available row of pixels, ensuring that no CPU core sits idle while others are stuck rendering dense "black" regions of the fractal.
  • AVX2 Vectorization: Using 256-bit YMM registers, each instruction operates on 4 double-precision values at once, so the engine iterates 4 points in parallel. This gives a theoretical 4x speedup over scalar C code.
  • Persistent Thread Pool: To avoid OS overhead, threads are spawned once at startup and managed via condition variables, ready to render new frames instantly as the user navigates. The thread count is capped at 64 regardless of core count. On WebAssembly, this native pool always runs single-threaded due to platform constraints; multi-core Web Worker support is handled separately at the WASM subsystem level.
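The atomic row counter from the list above can be sketched as follows. This version assumes C11 `<stdatomic.h>`; the project targets C99, so its real implementation may rely on compiler builtins instead. `row_queue`, `row_queue_claim`, and `worker_drain` are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_int next_row;  /* shared counter: index of the next unclaimed row */
    int        height;    /* total number of rows in the frame */
} row_queue;

/* Prepare the queue for a new frame. */
static void row_queue_reset(row_queue *q, int height) {
    atomic_store(&q->next_row, 0);
    q->height = height;
}

/* Atomically claim one row. Returns its index, or -1 once every row is taken. */
static int row_queue_claim(row_queue *q) {
    int row = atomic_fetch_add(&q->next_row, 1);
    return row < q->height ? row : -1;
}

/* Loop each worker thread would run: keep claiming rows until none remain.
   render_row is a caller-supplied callback (may be NULL for a dry run).
   Returns how many rows this worker processed. */
static int worker_drain(row_queue *q,
                        void (*render_row)(int row, void *ctx), void *ctx) {
    int done = 0;
    int row;
    while ((row = row_queue_claim(q)) >= 0) {
        if (render_row) render_row(row, ctx);
        done++;
    }
    return done;
}
```

Because a thread only asks for its next row after finishing the previous one, threads that land on cheap rows simply claim more of them, which is what keeps cores busy without any up-front partitioning.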

Web Rendering (WebAssembly & WASM-SIMD)

Bringing desktop-class performance to the browser required solving several engineering challenges:

  • Multithreading via Web Workers: By leveraging Emscripten's pthreads support, the C engine runs across multiple Web Workers. These workers communicate via a SharedArrayBuffer, allowing them to share the same pixel memory space as the main thread.
  • WASM-SIMD128: We utilize the modern WebAssembly SIMD proposal (128-bit) to process 2 double-precision points simultaneously, bridging the gap between browser and native performance.
  • Security & Headers: To enable SharedArrayBuffer, the environment must be "Cross-Origin Isolated." We implemented a specialized Service Worker (coi-serviceworker.js) to automatically inject COOP and COEP headers, ensuring the engine runs on standard static hosting without server-side configuration.

GPU Rendering (WebGL & Hi-Lo Precision)

The GPU path offloads all calculations to the graphics card for real-time smoothness. The shader is written in GLSL and compiled via Sokol's sokol-shdc annotation format (@vs, @fs, @program).

  • Hi-Lo Double Precision Emulation: Each coordinate is passed to the shader as two vec2 uniforms (center_hi and center_lo). The shader uses Dekker double-single arithmetic (ds_add + ds_mul) to perform full compensated addition and multiplication. This recovers ~48 mantissa bits from two 24-bit floats, approaching the 53-bit mantissa of a native 64-bit double without hardware double support. Toggle between 32-bit and 64-bit mode at runtime with E.
  • Uniform Interface: The fragment shader receives center_hi, center_lo, zoom, iterations, aspect_ratio, palette_idx, julia_mode, julia_c_hi, julia_c_lo, and high_precision — giving the CPU full control over every rendering parameter per frame.
  • All 6 Palettes in Shader: The GLSL palette function exactly replicates the fractional iteration interpolation from color.c, ensuring GPU and CPU renders are visually identical when switching modes.
  • Cardioid and Period-2 Bulb Rejection: The shader performs the same early-exit checks as the CPU scalar path, skipping the iteration loop entirely for points confirmed inside the main set.
  • Julia Set Support: A julia_mode uniform switches the shader between Mandelbrot (z₀ = 0, c = pixel) and Julia (z₀ = pixel, c = fixed parameter passed as julia_c_hi + julia_c_lo).
  • Correct Escape Radius: The shader uses ESCAPE_RADIUS = 10.0 matching config.h, consistent with the CPU path.
  • Sokol GFX Integration: The same shader and pipeline logic runs on Native OpenGL (Desktop) and WebGL 2.0 (Browser) via Sokol GFX.
  • Deferred Readback: Screenshots in GPU mode utilize a "Deferred Capture" system, ensuring the pixel data is read back from the GPU memory only after the frame is fully validated.
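The hi-lo idea from the list above can be demonstrated on the CPU with plain C floats. The sketch below covers only the split and the compensated addition; it is not the project's GLSL, the `ds_t` type and function names are hypothetical, and it assumes the compiler evaluates `float` arithmetic in single precision without fast-math reordering.

```c
/* A value represented as an unevaluated sum hi + lo of two floats. */
typedef struct { float hi, lo; } ds_t;

/* Split a double into a leading float and the float nearest the remainder. */
static ds_t ds_from_double(double x) {
    ds_t r;
    r.hi = (float)x;
    r.lo = (float)(x - (double)r.hi);
    return r;
}

static double ds_to_double(ds_t a) {
    return (double)a.hi + (double)a.lo;
}

/* Knuth two-sum: returns s and e such that s + e equals a + b exactly. */
static ds_t two_sum(float a, float b) {
    float s  = a + b;
    float bb = s - a;
    float e  = (a - (s - bb)) + (b - bb);
    ds_t r = { s, e };
    return r;
}

/* Double-single addition: add the leading parts exactly, fold in the
   trailing parts, then renormalize so that |lo| stays small relative to hi. */
static ds_t ds_add(ds_t a, ds_t b) {
    ds_t  s  = two_sum(a.hi, b.hi);
    float e  = s.lo + (a.lo + b.lo);
    float hi = s.hi + e;
    float lo = e - (hi - s.hi);
    ds_t r = { hi, lo };
    return r;
}
```

A single float cannot distinguish 1.0 from 1.0 + 1e-10, but the pair keeps the small term in `lo`, which is why deep-zoom coordinates survive on hardware without native doubles.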

Performance Benchmarks

The following numbers were measured on a Linux system with an Intel CPU (AVX2-capable) and an integrated GPU. Results will vary by hardware.

CPU Engine (Multi-threaded, 8 cores)

| Mode | Resolution | Avg FPS | Throughput |
|---|---|---|---|
| 64-bit scalar (no SIMD) | 1920×1080 | ~30 fps | ~62 Mpx/s |
| 64-bit AVX2 (4× SIMD) | 1920×1080 | ~115 fps | ~239 Mpx/s |
| 128-bit simd-f128 (AVX2 double-double) | 1920×1080 | ~16 fps | ~33 Mpx/s |
| 64-bit AVX2 | 3840×2160 (4K) | ~30 fps | ~249 Mpx/s |

Note

128-bit mode uses software-emulated double-double arithmetic via AVX2. The ~7× slowdown versus 64-bit AVX2 (~115 fps down to ~16 fps) is expected, and the mode is still significantly faster than a naive __float128 implementation (~20–30× slower).
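The throughput column follows directly from resolution and frame rate. As a quick check on the two AVX2 rows:

$$1920 \times 1080 \times 115\ \text{fps} \approx 2.38 \times 10^{8}\ \text{px/s} \approx 238\ \text{Mpx/s}$$

$$3840 \times 2160 \times 30\ \text{fps} \approx 2.49 \times 10^{8}\ \text{px/s} \approx 249\ \text{Mpx/s}$$

Small differences from the table are expected because the FPS figures are rounded.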

GPU Engine (Sokol GL / OpenGL 3.3)

| Mode | Resolution | Avg FPS | Throughput |
|---|---|---|---|
| 32-bit shader (native float) | 1920×1080 | ~79 fps | ~163 Mpx/s |
| 64-bit emulation (Hi-Lo Dekker) | 1920×1080 | ~60 fps | ~124 Mpx/s |

Tip

To reproduce these numbers, build with -DBUILD_CPU=ON or -DBUILD_GPU=ON and run the benchmarks in benchmarks/cpu/ or benchmarks/gpu/ respectively.


Prerequisites

Before building, ensure the following tools and libraries are installed on your system.

Desktop — CPU Engine

| Dependency | Version | Notes |
|---|---|---|
| GCC / Clang | GCC 9+ / Clang 10+ | C99 support required |
| CMake | 3.10+ | Build system |
| SDL2 | 2.0.14+ | Windowing and input |
| SDL2_ttf | 2.0+ | Font rendering for HUD |
| libGL / OpenGL | 3.3+ | Required for Sokol GFX |

Linux (Debian/Ubuntu):

sudo apt install cmake libsdl2-dev libsdl2-ttf-dev libgl1-mesa-dev

macOS (Homebrew):

brew install cmake sdl2 sdl2_ttf

Desktop — GPU Engine

| Dependency | Version | Notes |
|---|---|---|
| GCC / Clang | GCC 9+ / Clang 10+ | C99 support required |
| CMake | 3.10+ | Build system |
| libGL / OpenGL | 3.3+ | Required for Sokol GFX |

The GPU engine does not depend on SDL2 or SDL2_ttf.

Linux (Debian/Ubuntu):

sudo apt install cmake libgl1-mesa-dev

macOS (Homebrew):

brew install cmake

Web (WebAssembly)

| Dependency | Version | Notes |
|---|---|---|
| Emscripten | 3.1.0+ | WASM compiler toolchain |
| Python | 3.x | Required for server.py |

Follow the Emscripten installation guide and ensure emcmake is available in your PATH.


Build & Installation

Interactive TUI Build (Recommended)

Run ./build.sh without arguments for a numbered menu:

./build.sh

CLI Build

Pass a target directly to skip the menu:

| Command | Action |
|---|---|
| ./build.sh cpu | Build CPU engine only |
| ./build.sh gpu | Build GPU engine only |
| ./build.sh web | Build web (WASM) engine only |
| ./build.sh all | Build all three targets |
| ./build.sh clean | Remove all build directories |

Manual Build

# Desktop — CPU engine
cmake -S . -B build_cpu -DBUILD_CPU=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build_cpu
./build_cpu/mandelbrot_cpu

# Desktop — GPU engine
cmake -S . -B build_gpu -DBUILD_GPU=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build_gpu
./build_gpu/mandelbrot_gpu

# Web (WASM)
emcmake cmake -S . -B build_web -DBUILD_WEB=ON
cmake --build build_web
# Output is automatically copied to the deploy/ folder

Running the Web Build Locally

The web build requires specific HTTP security headers (COOP/COEP) to enable SharedArrayBuffer. Use the included server script:

python3 scripts/server.py

Then open http://localhost:8081 in your browser.

Optional arguments:

| Argument | Default | Description |
|---|---|---|
| --port | 8081 | Port to listen on |
| --dir | web | Directory to serve |

# Example: serve the deploy/ folder on port 9000
python3 scripts/server.py --dir deploy --port 9000

Configuration

Rendering parameters can be tuned in include/config.h:

| Parameter | Default | Description |
|---|---|---|
| WINDOW_WIDTH / WINDOW_HEIGHT | 1024 / 768 | Initial window resolution |
| DEFAULT_ITERATIONS | 500 | Initial iteration depth |
| MAX_ITERATIONS_LIMIT | 10000 | Upper bound for runtime adjustments |
| DEFAULT_THREAD_COUNT | 4 | Number of parallel threads (0 = auto-detect from CPU cores, max 64) |
| ESCAPE_RADIUS | 10 | Mathematical threshold for divergence |
| DEFAULT_PALETTE | 0 | Starting color palette index (see table below) |
| INITIAL_CENTER_RE / INITIAL_CENTER_IM | -0.5 / 0.0 | Initial view center (complex plane) |
| INITIAL_ZOOM | 3.0 | Initial zoom level |
| MAX_HISTORY_SIZE | 100 | Maximum undo history depth |
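The DEFAULT_THREAD_COUNT rule (0 means auto-detect, with a hard cap of 64) amounts to a small clamp. The helper below is a hypothetical sketch of that rule, not code from the repository; `detected_cores` stands in for whatever the platform's core-count query returns.

```c
#define MAX_THREADS 64  /* cap stated in this README */

/* Resolve the effective worker count.
   configured     : value of DEFAULT_THREAD_COUNT (0 = auto-detect)
   detected_cores : core count reported by the OS (assumed input) */
static int resolve_thread_count(int configured, int detected_cores) {
    int n = configured > 0 ? configured : detected_cores;
    if (n < 1) n = 1;                      /* always keep at least one worker */
    if (n > MAX_THREADS) n = MAX_THREADS;  /* never exceed the cap */
    return n;
}
```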

Color Palettes

The engine ships with 6 built-in palettes, selectable at runtime with P (or the "Palette" button on web), or set as default via DEFAULT_PALETTE in config.h:

| Index | Name | Character |
|---|---|---|
| 0 | Sine Wave | Smooth cycling colors using sine-wave interpolation |
| 1 | Grayscale | Pure luminance, iteration count mapped to brightness |
| 2 | Fire | Blue-to-white ramp, cool-to-hot gradient |
| 3 | Electric | Red-dominant, high-contrast neon feel |
| 4 | Ocean | Warm amber tones with subtle blue undertones |
| 5 | Inferno | Deep blue-to-white, high-zoom detail emphasis |

All palettes use fractional iteration interpolation to eliminate color banding at region boundaries.
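One common form of the fractional (normalized) iteration count, for a point that first exceeds the escape radius $R$ at iteration $n$, is shown below. Whether color.c uses exactly this expression is an assumption.

$$\nu = n + 1 - \log_2\!\left(\frac{\ln |z_n|}{\ln R}\right)$$

Since $|z_n|$ lies between $R$ and roughly $R^2$ at the moment of escape, the logarithmic term falls between 0 and about 1, so $\nu$ varies continuously from one integer band to the next and the palette can be sampled at $\nu$ instead of $n$.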


Running Tests

The test suite covers core mathematical correctness, AVX2 vs scalar consistency, threading correctness, and I/O validation. Tests are integrated into the CMake build system and run via ctest.

cmake -S . -B build_cpu -DBUILD_CPU=ON
cmake --build build_cpu
ctest --test-dir build_cpu --output-on-failure

| Test | Description |
|---|---|
| test_math | Verifies Mandelbrot/Julia/Burning Ship escape math, cardioid/period-2 bulb rejection, and AVX2 vs scalar consistency within 1e-7 |
| test_renderer | Validates the persistent thread pool dispatch; ensures pixel output is correctly produced across all worker threads |
| test_color | Confirms all 6 palette functions produce valid ARGB values and gradient continuity |
| test_bookmark | Tests bookmark serialization and round-trip load/save correctness |
| test_tour | Validates tour phase state machine transitions and coordinate interpolation |
| test_config | Verifies settings.txt parsing and default fallback values |

AVX2 tests are compiled and run automatically if the host CPU supports it. On machines without AVX2, the scalar path is used and consistency tests are skipped.


Project Structure

.
├── include/             # Global configuration and platform headers
├── src/
│   ├── core/           # Pure Mathematical Engine (Single Source of Truth)
│   ├── engine/         # Platform-Agnostic Renderers, Tours, and Logic
│   └── app/            # Platform-Specific Entries (Desktop, Web)
├── shaders/             # GLSL shader source files
├── web/                 # Web Frontend (HTML, CSS, JS)
├── assets/              # Shared Typography and Media
├── tests/               # Automated Unit Testing Suite
├── benchmarks/
│   ├── cpu/            # CPU benchmarks (math kernels, renderer throughput, I/O)
│   └── gpu/            # GPU benchmarks (Sokol shader throughput)
├── third_party/         # Vendored external libraries
│   ├── sokol/          # Sokol headers (GFX, App, GL, Time, Fontstash)
│   ├── stb/            # stb_image_write for PNG/TGA export
│   ├── fons/           # Fontstash for HUD text rendering
│   └── simd-f128/      # AVX2-accelerated 128-bit double-double precision
├── scripts/             # Utility scripts (local dev server, etc.)
├── deploy/              # Generated by web build — ready-to-serve package
├── CMakeLists.txt       # Unified Cross-platform Build System
└── build.sh             # Interactive TUI Build Wrapper

Roadmap

Performance Optimization

  • Implement dynamic load balancing using atomic row-counters to maximize CPU utilization.
  • Integrate a pre-calculated Look-Up Table (LUT) for color mapping.
  • Implement smooth coloring algorithms using fractional iteration counts.
  • Deploy hardware-specific vectorization (AVX2 for Desktop, SIMD128 for WebAssembly).
  • Research and implement pure-shader fractal calculation for GPU rendering.
  • Optimize Julia set calculation using hardware-specific vectorization.

Features and Exploration

  • Add interactive runtime controls for iteration depth and palette switching.
  • Implement automated "camera path" and "tour" modes.
  • Connect HTML5 Frontend APIs to the web-engine for a responsive experience.
  • Implement URL-based state recovery and deep-linking for sharing discoveries.
  • Add mobile touch support (pinch-to-zoom and gesture-based panning).

Engineering and Quality

  • Establish a strict Engine-Centric Monorepo architecture.
  • Implement a high-performance CMake build system.
  • Expand unit testing coverage to ensure mathematical consistency (math, renderer, color, bookmark, tour, config).
  • Implement automatic CPU core detection for dynamic thread pool allocation.
  • Implement Hi-Lo 64-bit precision emulation for GPU shaders.
  • Implement 128-bit software double-double precision via simd-f128 (AVX2-accelerated) for deep CPU zoom.
  • Build a comprehensive benchmark suite covering CPU math kernels, multi-threaded renderer throughput (64-bit and 128-bit), image I/O, and GPU shader throughput.
  • Integrate automated performance benchmarks into all CI pipelines (Linux, macOS, Windows) with GitHub Step Summary reports.
  • Add Enterprise CI workflows: Code Formatter Enforcement (clang-format), Memory Safety (Valgrind), and Static Security Analysis (CodeQL).
  • Research and implement arbitrary-precision arithmetic for infinite zoom.

Contributing

Contributions, bug reports, and suggestions are welcome. Areas of particular interest include memory safety, SIMD optimization, and GPGPU improvements.

To contribute:

  1. Open an issue to discuss bugs or proposed changes.
  2. Fork the repository and open a pull request with your changes.
  3. Descriptive commit messages and clear explanations are appreciated.

Development Methodology & AI Assistance

Building a high-performance fractal engine in C involves navigating complex engineering tradeoffs — from SIMD vectorization strategies and IEEE 754 floating-point precision limits, to lock-free thread pool design and cross-platform shader compatibility.

To achieve this level of stability and performance, this project was architected and rigorously verified in collaboration with Advanced Agentic AI. AI was specifically utilized to:

  • Validate AVX2 intrinsic correctness and ensure scalar/SIMD result consistency within 1e-7 tolerance.
  • Assist in designing the persistent thread pool architecture (condition variable signalling, atomic row counter load balancing).
  • Verify Hi-Lo double-single arithmetic in GLSL shaders for 64-bit precision emulation without hardware double support.
  • Automate the generation of robust cross-platform CI/CD pipelines (Linux, macOS, Windows, WASM) including memory safety checks and static analysis.

However, human agency remains at the core of this project. Every line of code generated or suggested was manually inspected, audited, and verified. The core architecture, algorithms, and mathematical implementation were human-planned. This hybrid approach — combining human architectural vision with AI-driven debugging and verification — allowed this project to reach a level of engineering quality well beyond what a solo developer could achieve alone.


Author's Note

I'm just a kid building projects as a hobby. Thank you for showing interest in my little library! It really means a lot to me. :)


License

This project is licensed under the MIT License - see the LICENSE file for details.
