Fast Mandelbrot Rendering with GPU in C#

Using Nvidia graphics cards to plot fractals

Written by Guy Fernando

Created Nov 2024 - Last modified Nov 2024

Mandelbrot Set Screenshot

I recently upgraded to a new laptop capable of handling the increasingly resource-intensive tools required for modern software development. Although this laptop was primarily designed for gamers, thanks to its dedicated GPU, I initially paid little attention to this feature. However, my perspective changed upon discovering that it houses Nvidia's GeForce RTX 3060 GPU. This GPU, built on the Ampere architecture, offers 6GB of VRAM, 3,584 cores, and a boost clock speed of 1.8 GHz, providing a floating-point performance of 12,700 GFLOPS.

Let’s reflect on this figure for a moment. In 1985, the Cray-2 supercomputer delivered just 1.9 GFLOPS, required 200 kW of power, and cost approximately $50 million after adjusting for inflation. Only a select few government organizations could afford one; in the UK, only the Met Office and the Atomic Energy Research Establishment at Harwell were known to have one. Remarkably, the RTX 3060 GPU achieves floating-point performance approximately 7,000 times greater than that of the Cray-2!

The high-performance parallel processing capabilities of the GPU gave me the idea to write some code to render a Mandelbrot set fractal at speeds sufficient to produce a dynamic, moving image. By using the GPU’s thousands of cores, each working in parallel, in theory it should be possible to divide the fractal rendering task into smaller computations executed simultaneously across multiple threads.

Nvidia RTX 3060 GPU

"Remarkably, the RTX 3060 GPU achieves floating-point performance approximately 7,000 times greater than that of the Cray-2!"

Cray-2 Supercomputer

Nvidia provides a comprehensive CUDA toolkit for developing GPU-accelerated applications in C and C++. CUDA (Compute Unified Device Architecture) is Nvidia's parallel computing platform and programming model that leverages the power of Nvidia GPUs to perform highly parallel computations. Using CUDA, developers can write C, C++, or Fortran code that runs directly on the GPU, significantly speeding up computations that benefit from parallel processing.

In addition to the native CUDA toolkit, there is also a .NET-based solution for GPU programming called ILGPU, a high-performance just-in-time compiler for .NET that can translate C# code into PTX, the assembly-like instruction set used by CUDA devices. By enabling C# developers to harness GPU power without needing to write in C or C++, ILGPU broadens access to GPU acceleration for .NET applications. We will be using ILGPU here to develop fast Mandelbrot rendering in C#.

The Code and How it Works

This program visualizes the Mandelbrot set, a famous fractal that exhibits complex and infinite detail at every zoom level, by rendering it within a C# WPF (Windows Presentation Foundation) application. The Mandelbrot set is generated by evaluating a mathematical formula iteratively at each point in a 2D plane and determining whether the sequence for that point diverges to infinity or remains bounded. The number of iterations a point takes to diverge is used to assign its colour, which produces the intricate patterns that characterize the Mandelbrot fractal.
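As a minimal sketch of the escape-time idea described above, the following CPU-only C# method counts the iterations for a single point c. The names EscapeTimeSketch and EscapeCount are hypothetical and are not part of the article's source. The sketch uses plain doubles instead of a complex number type, and it assumes the usual bailout test of |z| ≤ 2, written as |z|² ≤ 4 to avoid a square root.

```csharp
// Hypothetical helper for illustration; not part of the article's source code.
public static class EscapeTimeSketch
{
    // Returns the number of iterations of z = z^2 + c (starting from z = 0)
    // performed before |z| exceeds 2, capped at maxIterations.
    public static int EscapeCount(double cRe, double cIm, int maxIterations)
    {
        double zRe = 0.0, zIm = 0.0;
        int iterations = 0;

        // |z|^2 <= 4 is the same test as |z| <= 2, without a square root.
        while (iterations < maxIterations && zRe * zRe + zIm * zIm <= 4.0)
        {
            double nextRe = zRe * zRe - zIm * zIm + cRe; // real part of z^2 + c
            double nextIm = 2.0 * zRe * zIm + cIm;       // imaginary part of z^2 + c
            zRe = nextRe;
            zIm = nextIm;
            iterations++;
        }

        return iterations;
    }
}
```

For example, EscapeCount(0, 0, 1000) reaches the cap of 1000 because the origin never escapes, whereas EscapeCount(2, 2, 1000) returns 1 because the first iterate already lies outside the radius-2 circle.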

Using ILGPU in combination with WPF enables this application to leverage GPU parallel processing for generating the Mandelbrot set efficiently. ILGPU allows the program to run thousands of calculations simultaneously on the GPU, with each pixel computed in parallel, resulting in a performance boost compared to CPU-only calculations. The WPF framework handles the UI and displays the rendered fractal as a colour bitmap in real time, and its event-driven nature allows for dynamic user interactions like zooming and panning. Together, ILGPU's parallel processing and WPF’s graphics capabilities enable the application to render the Mandelbrot set interactively and in high detail, supporting smooth real-time navigation even at extreme zoom levels.
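To judge the benefit over CPU-only calculation on your own hardware, a CPU baseline is needed for comparison. The sketch below is one possible baseline, assuming a row-parallel loop built on Parallel.For; the class CpuBaselineSketch and its Render method are hypothetical names, and no timing figures are implied.

```csharp
using System.Threading.Tasks;

// Hypothetical CPU-only baseline for comparison; not part of the article's source code.
public static class CpuBaselineSketch
{
    // Renders iteration counts for a width x height frame into a flat array,
    // one entry per pixel in row-major order.
    public static int[] Render(
        int width, int height,
        double centerX, double centerY, double scale, int maxIterations)
    {
        var output = new int[width * height];

        // Widen the longer axis so the image is not stretched.
        double aspectRatio = (double)width / height;
        double scaleX = aspectRatio >= 1.0 ? scale * aspectRatio : scale;
        double scaleY = aspectRatio >= 1.0 ? scale : scale / aspectRatio;

        // Rows are independent, so they can be shared among the CPU's worker threads.
        Parallel.For(0, height, y =>
        {
            for (int x = 0; x < width; x++)
            {
                double cRe = x * scaleX / width - scaleX / 2 + centerX;
                double cIm = y * scaleY / height - scaleY / 2 + centerY;
                output[y * width + x] = EscapeCount(cRe, cIm, maxIterations);
            }
        });

        return output;
    }

    // Iterates z = z^2 + c from z = 0 until |z|^2 > 4 or the cap is reached.
    private static int EscapeCount(double cRe, double cIm, int maxIterations)
    {
        double zRe = 0.0, zIm = 0.0;
        int iterations = 0;
        while (iterations < maxIterations && zRe * zRe + zIm * zIm <= 4.0)
        {
            double nextRe = zRe * zRe - zIm * zIm + cRe;
            double nextIm = 2.0 * zRe * zIm + cIm;
            zRe = nextRe;
            zIm = nextIm;
            iterations++;
        }
        return iterations;
    }
}
```

Wrapping a call such as CpuBaselineSketch.Render(1920, 1080, -0.74, 0.15, 2.5, 1000) in a System.Diagnostics.Stopwatch gives a frame time that can be set against the GPU path.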

The full source code is available on GitHub.

Main Window Code Behind

In this code, Context.Create initializes an ILGPU context, which provides the environment for GPU computations and configures the devices available to the program. Creating the context with builder => builder.Cuda() specifies that the program should use CUDA-compatible devices, such as Nvidia GPUs. Once the context is set up, context.GetPreferredDevice(preferCPU: false) selects the most suitable device within the context, prioritizing a GPU over the CPU because a GPU is typically faster for parallel operations like those needed to render the Mandelbrot set. The selected device is then used to create an Accelerator, which handles all GPU-specific tasks.

Next, accelerator.LoadAutoGroupedStreamKernel loads the ComputeMandelbrotFrame function as a kernel, or parallel processing function, on the GPU. Using LoadAutoGroupedStreamKernel lets ILGPU choose the thread grouping automatically and distribute the kernel’s workload across the GPU’s cores. The kernel itself is designed to compute Mandelbrot values in parallel, with each thread handling the calculations for one pixel. To store the results, accelerator.Allocate1D<int>(width * height) allocates a 1D buffer on the GPU with a size equal to the number of pixels in the image. Each int in this buffer holds the iteration count for a specific pixel, which is later used as its colour index.

To ensure that all GPU computations finish before the CPU proceeds, accelerator.Synchronize() is called. Synchronization is crucial here to avoid retrieving incomplete data from the GPU. Once synchronized, buffer.GetAsArray1D() copies the GPU buffer into a standard 1D array of int values, bringing the results back to the CPU. Each value in this array corresponds to the iteration count of a pixel, which is then mapped to a colour using a colour-mapping function. This colour data is written into a WPF WriteableBitmap, where each pixel is assigned a colour based on its iteration count. This approach produces a full-colour Mandelbrot image, efficiently computed by leveraging the GPU.
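The colour-mapping step can be sketched without any WPF types. The helper below is an illustrative assumption, with the hypothetical names ColourMapSketch, PackBgra32 and IterationToBgra32. It sweeps the hue wheel at full saturation and value, and packs the result into the 32-bit layout of a Bgra32 pixel.

```csharp
using System;

// Hypothetical helper for illustration; not part of the article's source code.
public static class ColourMapSketch
{
    // Packs four 8-bit channels into the 32-bit value of a Bgra32 pixel
    // (blue in the lowest byte, alpha in the highest).
    public static uint PackBgra32(byte a, byte r, byte g, byte b) =>
        ((uint)a << 24) | ((uint)r << 16) | ((uint)g << 8) | b;

    // Maps an iteration count to a packed colour: points that reach the cap
    // are drawn black, the rest sweep the hue wheel at full saturation and value.
    public static uint IterationToBgra32(int iterations, int maxIterations)
    {
        if (iterations >= maxIterations)
            return PackBgra32(255, 0, 0, 0);

        double hue = (double)iterations / maxIterations * 360.0; // 0 <= hue < 360
        double sector = hue / 60.0;
        int hi = (int)Math.Floor(sector) % 6;   // which sixth of the hue wheel
        double f = sector - Math.Floor(sector); // position within that sixth

        byte v = 255;                               // full value
        byte p = 0;                                 // value * (1 - saturation)
        byte q = (byte)Math.Round(255.0 * (1 - f)); // falling channel
        byte t = (byte)Math.Round(255.0 * f);       // rising channel

        return hi switch
        {
            0 => PackBgra32(255, v, t, p),
            1 => PackBgra32(255, q, v, p),
            2 => PackBgra32(255, p, v, t),
            3 => PackBgra32(255, p, q, v),
            4 => PackBgra32(255, t, p, v),
            _ => PackBgra32(255, v, p, q),
        };
    }
}
```

With a cap of 1000, an iteration count of 0 maps to opaque red (0xFFFF0000), a count of 500 maps to opaque cyan (0xFF00FFFF), and a count of 1000 maps to opaque black (0xFF000000).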



MainWindows.xaml.cs
        
          
          
// Fast Mandelbrot Rendering with GPU in C#.
// Guy Fernando - i4cy (2024)

using System.Globalization;
using System.Windows;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Media.Imaging;
using ILGPU;
using ILGPU.Runtime;
using ILGPU.Runtime.Cuda;
using Mandelbrot;

namespace Mandelbrot;

public static class MandelbrotConstants
{
    public const short MaxIterations = 1000;
}

public partial class MainWindow : Window
{
    private short width;
    private short height;

    private double centerX = -0.74;
    private double centerY = 0.15;
    private double scale = 2.5;

    private bool isPanning = false;
    private bool isZooming = false;
    private Point startPanPoint;

    private Context context;
    private Accelerator accelerator;
    private Action
        <Index1D, ArrayView1D<int, Stride1D.Dense>,
        double, double, double, short, short> kernel;

    public MainWindow()
    {
        InitializeComponent();

        // Initialize ILGPU context and accelerator.
        context = Context.Create(builder => builder.Cuda());
        accelerator =
            context.GetPreferredDevice(preferCPU: false).
            CreateAccelerator(context);

        // Load the kernel once during initialization.
        kernel =
            accelerator.LoadAutoGroupedStreamKernel
            <Index1D, ArrayView1D<int, Stride1D.Dense>,
            double, double, double, short, short>
            (MandelbrotKernel.ComputeMandelbrotFrame);

        // Add event handlers.
        this.KeyDown += MainWindow_KeyDown;
        this.SizeChanged += MainWindow_SizeChanged;

        // Generate the initial Mandelbrot set.
        GenerateMandelbrotFrame();
    }

    private void MainWindow_SizeChanged(object sender, SizeChangedEventArgs e)
    {
        // Update the width and height based on the new window size.
        width = (short)e.NewSize.Width;
        height = (short)e.NewSize.Height;

        // Regenerate the Mandelbrot set with the updated dimensions.
        GenerateMandelbrotFrame();
    }

    private void MainWindow_KeyDown(object sender, KeyEventArgs e)
    {
        if (e.Key == Key.Space)
        {
            if (!isZooming)
            {
                isZooming = true;
                StartAutoZoom();
            }
            else
            {
                isZooming = false;
            }
        }
    }

    private async void StartAutoZoom()
    {
        // Set fixed coordinates for auto-zoom, near a point of interest.
        centerX = -0.74335165531181;
        centerY = +0.13138323820835;

        // Zoom speed multiplier.
        const double zoomFactorIncrement = 0.95;

        // Stop once the scale is so small that the pixel spacing
        // approaches the resolution limit of double precision.
        while (isZooming && scale > 1e-13)
        {
            // Reduce the zoom scale.
            scale *= zoomFactorIncrement;

            // Render the Mandelbrot set at the new zoom level.
            GenerateMandelbrotFrame();

            // Small delay ensuring UI responsiveness.
            await Task.Delay(1);
        }
    }

    private void UpdateStatusBar()
    {
        CenterXText.Text = $"Center X: {centerX:F14}";
        CenterYText.Text = $"Center Y: {centerY:F14}";

        // Display the zoom factor (1 / scale) in fixed-point format with one decimal place.
        string zoomFormatted = 
            (1 / scale).ToString("F1", CultureInfo.InvariantCulture);
        ZoomFactorText.Text = $"Zoom: {zoomFormatted}";
    }

    protected override void OnClosed(EventArgs e)
    {
        // Cleanup resources on window close.
        base.OnClosed(e);
        accelerator.Dispose();
        context.Dispose();
    }

    private void GenerateMandelbrotFrame()
    {
        if (width <= 0 || height <= 0)
            return; // Skip rendering if dimensions are invalid.

        // Update Status Bar.
        UpdateStatusBar();

        // Aspect-ratio correction is applied inside the GPU kernel,
        // so only the base scale is passed to it below.

        // Allocate memory on the GPU.
        using var buffer = accelerator.Allocate1D<int>(width * height);

        // Execute the GPU kernel with the current parameters.
        kernel(
            (int)(width * height), buffer.View, 
            centerX, centerY, scale, width, height);

        // Wait for all GPU kernel processes to complete.
        accelerator.Synchronize();

        // Retrieve the results from GPU
        int[] result = buffer.GetAsArray1D();

        // Set the Image control source to display the Mandelbrot set.
        MandelbrotImage.Source = CreateFrameBitmap(result);
    }

    private WriteableBitmap CreateFrameBitmap(int[] pixels)
    {
        // Create a WriteableBitmap filled with the Mandelbrot set image.
        WriteableBitmap bitmap = 
            new WriteableBitmap(
                width, height, 96, 96, PixelFormats.Bgra32, null);

        bitmap.Lock();
        unsafe
        {
            IntPtr pBackBuffer = bitmap.BackBuffer;
            for (short y = 0; y < height; y++)
            {
                for (short x = 0; x < width; x++)
                {
                    Color color = GetPixelColor(pixels[y * width + x]);

                    *((uint*)pBackBuffer + y * width + x) = (uint)(
                        (color.A << 24) | 
                        (color.R << 16) | 
                        (color.G << 8)  | 
                        (color.B << 0) );
                }
            }
        }
        bitmap.AddDirtyRect(new Int32Rect(0, 0, width, height));
        bitmap.Unlock();

        return bitmap;
    }

    private static Color GetPixelColor(int iterations)
    {
        if (iterations >= MandelbrotConstants.MaxIterations)
        {
            return Colors.Black;
        }
        else
        {
            // Convert HSV to RGB for a more pleasing colour gradient.
            return ColorFromHSV(
                ((double)(iterations)) / 
                MandelbrotConstants.MaxIterations * 360.0,
                1.0,
                1.0
                );
        }
    }

    public static Color ColorFromHSV(
        double hue, double saturation, double value)
    {
        sbyte hi = Convert.ToSByte(Math.Floor(hue / 60) % 6);
        double f = hue / 60 - Math.Floor(hue / 60);

        value = value * 255;
        byte v = Convert.ToByte(value);
        byte p = Convert.ToByte(value * (1 - saturation));
        byte q = Convert.ToByte(value * (1 - f * saturation));
        byte t = Convert.ToByte(value * (1 - (1 - f) * saturation));

        if (hi == 0)
            return Color.FromArgb(255, v, t, p);
        else if (hi == 1)
            return Color.FromArgb(255, q, v, p);
        else if (hi == 2)
            return Color.FromArgb(255, p, v, t);
        else if (hi == 3)
            return Color.FromArgb(255, p, q, v);
        else if (hi == 4)
            return Color.FromArgb(255, t, p, v);
        else
            return Color.FromArgb(255, v, p, q);
    }
}

        
      

Kernel Code

The kernel code running on the GPU is responsible for calculating each pixel in the Mandelbrot set in parallel, making full use of the GPU’s processing power. It takes an Index1D as input and writes to an ArrayView1D<int, Stride1D.Dense> as output. The Index1D parameter is a unique 1D identifier for each thread, representing a specific pixel in a flattened image array. This allows each thread to determine independently which pixel it should calculate.
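The mapping between a flat thread index and a pixel position can be sketched in a few lines, assuming a row-major layout. The helper names PixelIndexSketch, ToPixel and ToIndex are hypothetical.

```csharp
// Hypothetical helper for illustration; not part of the article's source code.
public static class PixelIndexSketch
{
    // Converts a flat, row-major index into (x, y) pixel coordinates.
    public static (int X, int Y) ToPixel(int index, int width) =>
        (index % width, index / width);

    // Inverse mapping: the flat index of pixel (x, y).
    public static int ToIndex(int x, int y, int width) => y * width + x;
}
```

With an image width of 4, index 7 maps to pixel (3, 1), and ToIndex(3, 1, 4) maps back to 7.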

The ArrayView1D<int, Stride1D.Dense> is a 1D view of an integer array in GPU memory, where each element stores the Mandelbrot result for a pixel. The dense stride means that consecutive elements are contiguous in memory, which suits the GPU's memory access patterns. Using the Index1D value, each thread derives an x and y coordinate in the image and maps it to a position in the complex plane, represented as a Complex object. The kernel then iterates the Mandelbrot formula z = z² + c on this complex coordinate, with each thread iterating until the point escapes or the maximum number of iterations is reached.
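For reference, the iteration can be written out as follows. The symbols a, b, x_n, y_n and N_max are notation introduced here for illustration; the bailout radius of 2 and the pixel mapping follow the kernel listing.

```latex
% Mandelbrot recurrence for a point c = a + ib, starting from z_0 = 0
\[
  z_0 = 0, \qquad z_{n+1} = z_n^2 + c
\]
% The same step in real arithmetic, with z_n = x_n + i y_n
\[
  x_{n+1} = x_n^2 - y_n^2 + a, \qquad y_{n+1} = 2 x_n y_n + b
\]
% Iteration continues while the point is inside the radius-2 circle
% and the iteration cap has not been reached
\[
  x_n^2 + y_n^2 \le 4 \quad \text{and} \quad n < N_{\max}
\]
% Pixel (p_x, p_y) of a w-by-h image, centred on (c_x, c_y) with
% horizontal and vertical extents s_x and s_y, maps to
\[
  a = \frac{p_x \, s_x}{w} - \frac{s_x}{2} + c_x, \qquad
  b = \frac{p_y \, s_y}{h} - \frac{s_y}{2} + c_y
\]
```

As a worked case, c = 1 gives the sequence 0, 1, 2, 5 and escapes after three iterations, while c = -1 gives 0, -1, 0, -1, ... and stays bounded, so it belongs to the set.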

Each thread then stores its iteration count in the ArrayView1D array at the position defined by Index1D, representing the computed value for that pixel. Once all threads complete their calculations, this array holds the data needed for rendering the Mandelbrot set, with each entry in the array reflecting the colour or shading level based on the iteration count at each coordinate in the complex plane.



MandelbrotKernel.cs
        
          
          
// Fast Mandelbrot Rendering with GPU in C#.
// Guy Fernando - i4cy (2024)

using ILGPU;
using ILGPU.Runtime;
using System.Numerics;

namespace Mandelbrot;

public static class MandelbrotKernel
{
    public static void ComputeMandelbrotFrame(
        Index1D index, ArrayView1D<int, Stride1D.Dense> output,
        double centerX, double centerY,
        double scale, short width, short height)
    {
        int x = index % width;
        int y = index / width;

        // Calculate aspect ratio and scaling factors.
        double aspectRatio = (double)width / height;
        double adjustedScaleX, adjustedScaleY;
        if (aspectRatio >= 1.0)
        {
            adjustedScaleX = scale * aspectRatio;
            adjustedScaleY = scale;
        }
        else
        {
            adjustedScaleX = scale;
            adjustedScaleY = scale / aspectRatio;
        }

        // Calculate the complex coordinate.
        Complex c = GetComplexCoordinate(
            x, y, adjustedScaleX, adjustedScaleY, 
            centerX, centerY, width, height);

        // Perform Mandelbrot iteration.
        short iterations = CalculateMandelbrotPixel(c);

        // Write result to output.
        output[index] = iterations;
    }

    private static Complex GetComplexCoordinate(
        int x, int y, 
        double scaleX, double scaleY, 
        double centerX, double centerY, 
        short width, short height)
    {
        double real = (x * scaleX / width) - (scaleX / 2) + centerX;
        double imaginary = (y * scaleY / height) - (scaleY / 2) + centerY;

        return new Complex(real, imaginary);
    }

    private static short CalculateMandelbrotPixel(Complex c)
    {
        Complex z = Complex.Zero;
        short iterations = 0;

        while (iterations < MandelbrotConstants.MaxIterations && 
            (z.Real * z.Real + z.Imaginary * z.Imaginary) <= 4.0)
        {
            // z based on Mandelbrot iteration formula z = z^2 + c.
            z = z * z + c;
            iterations++;
        }

        return iterations;
    }
}

        
      
UML Sequence Diagram

Sequence Diagram

A UML sequence diagram helps break down the logic of method calls, responses, and dependencies between objects, showing a clear picture of how different parts of the program work together in a sequential, step-by-step manner.

The sequence diagram here represents the overall interaction between the MainWindow object, the MandelbrotKernel object, and the ILGPU objects, and in particular the call sequence between their methods. Notice the use of a UML interaction frame (the rectangle with a notched "par" descriptor box in its top left corner) to illustrate that the kernel ComputeMandelbrotFrame is a parallel processing function, executed on the GPU to perform each pixel computation in parallel.

The Program in Action

If you are unable to build and run the program, here is a short video of it in action. The real-time capture begins with a broad view of the iconic, symmetrically shaped Mandelbrot set, centred on the complex plane. As the video progresses, it zooms in on various intriguing regions, drawing the viewer deeper into the fractal's mesmerizing intricacies.

Conclusion

In this article, we explored parallel programming in C# using ILGPU in detail, focusing on implementing and optimizing a Mandelbrot fractal rendering program. Through a combination of C# and WPF for the UI and ILGPU for GPU-accelerated computations, we developed a high-performance interactive fractal visualizer that harnesses the power of parallel processing to demonstrate the beauty and infinite complexity of the Mandelbrot set in real time.

We began by setting up the GPU context and accelerator, explaining each component’s role in managing parallel workloads, and discussed how ILGPU’s kernel functions allow thousands of pixel calculations to be performed simultaneously on the GPU. This parallelization enabled the efficient generation of complex fractal images, overcoming the computational demands that make fractal rendering traditionally slow on CPUs.