The rise of 3D Gaussian Splatting: what is it and how is it changing the immersive media industry?

The landscape of immersive media is advancing at an unprecedented rate, with 3D Gaussian Splatting emerging as a pivotal breakthrough. This technique shows promise for a wide range of applications and could revolutionize the way we create and interact with digital environments.

In this article, we will explore the depth and potential of 3D Gaussian Splatting, compare it with predecessors like photogrammetry and NeRF, and survey some of the tools available on the market today.

What is 3D Gaussian Splatting?

3D Gaussian Splatting is a sophisticated computer graphics technique that creates high-fidelity, photorealistic 3D scenes by representing a scene as a collection of points, or "splats," derived from a point cloud, each modeled as a 3D Gaussian and projected ("splatted") onto the screen during rendering. This technique supports complex, view-dependent visual effects and surpasses traditional point cloud rendering by producing dynamic and lifelike visualizations.

One of the most popular research papers on the subject, "3D Gaussian Splatting for Real-Time Radiance Field Rendering", was published at SIGGRAPH 2023 (the full paper and Git repository are publicly available online), triggering renewed interest in the field.

Understanding the Basics: A Guide for Everyone

To better understand how 3D Gaussian splatting compares to traditional ways of visualizing point clouds, let's try a thought experiment. Think of yourself as an artist, but instead of painting on a regular canvas, you're creating in a space that surrounds you. You paint colored dots for each object in the scene, generating a collection of points, which we refer to as a "point cloud".

Pointillism painting by Georges-Pierre Seurat, "La Seine à la Grande-Jatte"

When we visualize a plain point cloud in a traditional viewer, the scene is built from tiny dots that form a picture, similar to Pointillism in art. Pointillism artists create images using small, separate dots. While these pictures look good from a distance, up close they're just individual dots. The same issue occurs in most traditional point cloud visualization techniques.

Impressionist painting by Claude Monet, "Woman with a Parasol - Madame Monet and Her Son"

Now, imagine painting like the Impressionist artists. They didn’t just use dots; they blended colors on the canvas, making the picture smoother and more realistic. This is the idea behind 3D Gaussian Splatting: instead of only using dots, it uses 'splats' that softly blend together. Each splat is like a gentle dot that has its own color and can be transparent.

GIF extracted from "Tilt Brush: Painting from a new perspective"

To understand this better, think about fog. With traditional point cloud visualization methods, it's like seeing each tiny droplet of fog individually. With Gaussian Splatting, it's like looking at the smooth, whole effect of fog, where each part blends into the next.

To achieve this, 3D Gaussian Splatting uses a mathematical function called the Gaussian, which makes this "continuous" visualization of space possible. Thanks to this formula, the scene looks more real, with depth and a natural appearance, instead of looking discretized and pixelated.

What Does the Gaussian Function Look Like?

The Gaussian function, fundamental to 3D Gaussian Splatting, resembles a bell curve, and it is crucial for transforming individual points into a vivid, continuous scene.

3D Gaussian splats use it to represent the following information:

  • Position (XYZ): Determines where each point is in the 3D space.

  • Covariance (3x3 matrix): Dictates how each point stretches or scales, affecting its shape and size.

  • Color (RGB): Decides the hue of each point, adding to the visual richness.

  • Alpha (α): Controls the transparency, making the scene more lifelike.
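To make these parameters concrete, here is a minimal NumPy sketch (with made-up values) showing how one splat's contribution at a point in space could be evaluated: it is strongest at the splat's center and fades smoothly with distance, which is what gives splats their soft, blended look.

```python
import numpy as np

def gaussian_falloff(x, mean, cov):
    """Unnormalized 3D Gaussian: exp(-0.5 * (x - mean)^T cov^-1 (x - mean))."""
    d = x - mean
    return float(np.exp(-0.5 * d @ np.linalg.inv(cov) @ d))

# One hypothetical splat, with the four parameter groups from the list above.
position   = np.array([0.0, 0.0, 0.0])        # XYZ center
covariance = np.diag([0.04, 0.04, 0.01])      # shape/scale: flattened along Z
color      = np.array([0.8, 0.2, 0.2])        # RGB (used at compositing time)
alpha      = 0.7                              # base opacity

# The splat's opacity at a given point = alpha scaled by the Gaussian falloff.
at_center = alpha * gaussian_falloff(position, position, covariance)
nearby    = alpha * gaussian_falloff(position + 0.2, position, covariance)

print(at_center)  # 0.7 at the center
print(nearby)     # much smaller: a soft edge rather than a hard dot
```

In the actual method, each splat's color is also view-dependent (encoded with spherical harmonics); that detail is omitted here for brevity.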

3D Gaussian Splatting vs. Photogrammetry vs. NeRF

When comparing 3D Gaussian Splatting with photogrammetry and NeRF (which are other established techniques to digitize real places or objects into digital twins), it is important to understand their differences.

Their workflows all share a common starting point: taking multiple overlapping photographs of an object or environment from various angles. Each technique then processes these images differently to produce a digital version of the captured environment.

Photogrammetry:

The images are used to construct a 3D mesh based on the camera positions detected for each picture.

Advantages: Photogrammetry is known for its relatively low computational footprint and the direct output of a 3D mesh. This mesh is readily usable in traditional game engine render pipelines. Additionally, these meshes can be skinned for animation, making them ideal for applications in game development and animated simulations.

Disadvantages: This technique struggles with shiny or transparent surfaces, which can cause holes and shape errors in the generated 3D model.

Use Cases: Photogrammetry is suitable when resource efficiency is key, and the end product needs to integrate seamlessly with standard game engines or animation tools, in the form of a 3D mesh model.

Here is a sample 3D model I reconstructed from photos using photogrammetry and Reality Capture:

Neural Radiance Fields (NeRF):

The images are processed using artificial intelligence and neural networks to generate any viewing angle of a scene, filling gaps or missing photos by blending information from the existing ones.

Advantages: NeRF excels in its AI-driven ability to generate any viewing angle of a scene, filling gaps or missing photos by blending existing images. This makes it particularly effective for complex scenes where photogrammetry might struggle. Additionally, NeRF does not require as many images from every angle as photogrammetry does, thanks to what its neural network learns from the available views.

Disadvantages: While NeRF is adept at handling the shortcomings of photogrammetry, it can be more computationally demanding and slower to render compared to photogrammetry and 3D Gaussian Splatting.

Use Cases: It's ideal for applications requiring high flexibility in viewpoint generation and for scenes where dealing with incomplete data is a challenge.

Here is a sample video of a NeRF capture I made using Nerfstudio:
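For intuition on what NeRF actually does with its network's output: each pixel is rendered by sampling points along a camera ray and integrating the densities and colors the network predicts there. Below is a toy NumPy sketch of that volume-rendering step, with hard-coded values standing in for the network outputs:

```python
import numpy as np

def render_ray(sigmas, colors, deltas):
    """NeRF-style volume rendering along a single camera ray.

    sigmas: density predicted at each sample point along the ray, shape (N,)
    colors: RGB predicted at each sample point, shape (N, 3)
    deltas: distance between consecutive samples, shape (N,)
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)          # per-sample opacity
    # Transmittance: how much light survives to reach each sample.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)

# Toy ray: empty space, then a dense red blob, then empty space again.
sigmas = np.array([0.0, 5.0, 0.0])
colors = np.array([[0.0, 0.0, 1.0],   # blue (zero density, contributes nothing)
                   [1.0, 0.0, 0.0],   # red
                   [0.0, 1.0, 0.0]])  # green (hidden behind the red blob)
deltas = np.array([1.0, 1.0, 1.0])

print(render_ray(sigmas, colors, deltas))  # almost pure red
```

The slowness mentioned below comes precisely from this step: the network must be queried at many sample points per ray, for every pixel.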

3D Gaussian Splatting:

Uses a rasterization technique that allows real-time rendering of photorealistic scenes from a relatively small set of images. It begins by estimating a point cloud from the input images using Structure from Motion. Each point is then converted into a Gaussian, described by parameters such as position, covariance, color, and transparency, and the Gaussians are optimized until their renderings match the input photos.
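The per-pixel blending in the rasterization step can be sketched as follows: the Gaussians overlapping a pixel are sorted by depth and composited front to back until the pixel is effectively opaque. This is a simplified, illustrative Python version; the real implementation is a tile-based GPU rasterizer.

```python
import numpy as np

def composite_pixel(splats):
    """Front-to-back alpha compositing of splat contributions at one pixel.

    splats: list of (depth, rgb, alpha) tuples, where alpha is the splat's
    opacity at this pixel (already weighted by its Gaussian falloff).
    """
    color = np.zeros(3)
    transmittance = 1.0                 # fraction of light still unblocked
    for _, rgb, a in sorted(splats, key=lambda s: s[0]):   # nearest first
        color += transmittance * a * np.asarray(rgb, dtype=float)
        transmittance *= 1.0 - a
        if transmittance < 1e-4:        # early exit once the pixel is opaque
            break
    return color

pixel = composite_pixel([
    (2.0, [0.0, 0.0, 1.0], 0.5),   # blue splat, farther away
    (1.0, [1.0, 0.0, 0.0], 0.5),   # red splat, nearest to the camera
])
print(pixel)  # the near red splat dominates; the blue one shows through faintly
```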

Advantages: Compared to NeRF (as of today), this technique is noted for its fast, real-time rasterization and its ability to create high-quality, photorealistic scenes. It is also particularly adept at convincingly rendering thin structures like hair.

Disadvantages: Gaussian Splatting (as of today) is noted for its high VRAM usage, and is not yet fully compatible with existing rendering pipelines.

Use Cases: Gaussian Splatting is beneficial in scenarios where real-time rendering and the visualization of intricate details (like hair or thin structures) are crucial, such as in virtual reality applications or high-end visualizations.

It is worth noting that since all three techniques use a dataset of input photos at the beginning of their workflow, it is relatively easy to interchange them while using the same dataset of images, to produce different visualizations according to the desired use case.

Here is a video comparing all three techniques, starting from the same dataset of input images:

And here is a video focusing on the differences between NeRF and Gaussian Splatting captures:

Real-World Applications and Industry Impact

3D Gaussian Splatting is not just a theoretical marvel; its practical applications are vast. Among them are:

  • Real Estate: Enhances virtual property tours, offering potential buyers a realistic experience. This can revolutionize how properties are showcased and explored remotely.

  • Urban Planning: Assists in creating digital twins of cities, aiding in better planning and management. By offering high-fidelity, real-time renderings of urban spaces, it contributes significantly to more effective urban development strategies.

  • Virtual Reality (VR) and Augmented Reality (AR): Gaussian Splatting is particularly effective in creating highly realistic VR backdrops (virtual environments or scenes that are created as a background setting for various applications), offering a new level of immersion in virtual environments.

  • E-Commerce and Graphic Design: It could revolutionize online shopping experiences by enabling high-quality, real-time 3D rendering of products, leading to more interactive and immersive shopping experiences. In graphic design, it can help create more realistic 3D models and animations, enhancing the quality and speed of the design process.

  • Photorealistic Avatars in Telepresence: Meta's experimentation with Gaussian Splatting for Codec Avatars demonstrates its potential in achieving photorealistic telepresence in VR environments. This application enhances the realism and lighting of avatars, a crucial element for immersive virtual communication.

GIF extracted from Meta’s video showcased in https://shunsukesaito.github.io/rgca/

  • Camera tracking and 3D reconstruction: SplaTAM is an advanced application in dense RGB-D SLAM that uses Gaussian Splatting for precise camera tracking and high-fidelity reconstruction in real-world scenarios, showcasing its utility in complex spatial mapping and 3D reconstruction.

GIF extracted from the video showcased in https://spla-tam.github.io/

  • Gaming: Elevates the gaming experience with more immersive and realistic environments. Plugins for Gaussian Splatting are already available for major game engines like Unity and Unreal Engine, enhancing the visual quality of game worlds.

Its ability to handle large datasets and its versatility are catching the eye of various sectors, signaling a potential revolution in how we interact with virtual worlds.

Integration in Gaming Engines and Collaborative 3D Platforms

The inclusion of 3D Gaussian Splatting in gaming engines like Unity and Unreal Engine marks a significant stride in game development. Plugins and packages are now available, bringing this technology to a wider audience and unlocking new possibilities in interactive gaming and simulation. 

For Unity, the GaussianSplatting package (Unity Asset Store) and for Unreal Engine, the UEGaussianSplatting plugin (Unreal Engine Marketplace) are examples of such developments.

Recently, another solution was implemented for the collaborative 3D platform Spline, which showcased its support for 3D Gaussian splatting with a very exciting demo, discussed further here: https://designmusketeer.com/spline-new-update-gaussian-splatting.

We can expect more plugins and tools to be released in the future, improving even further the support of Gaussian splats in digital products.

Gaussian Splatting on Android

While companies like Luma AI and Polycam targeted iOS and the web with their Gaussian splatting pipelines, support for Android was recently announced by the Kiri Innovations team in their new Kiri Engine 3.0, a 3D scanning app for both Android and iOS devices.

Kiri Engine was initially released in 2022 as a photogrammetry tool that reconstructs 3D models from source photos. The addition of 3D Gaussian Splatting represents a significant advancement, allowing the creation and viewing of high-quality, potentially fast-rendering 3D representations of objects or scenes.

Examples of 3D Gaussian Splats Viewers

Luma AI and other platforms offer interactive examples of Gaussian Splatting, turning everyday scenes into immersive experiences.

Some WebGL-based examples of 3D Gaussian Splatting captures are hosted here: https://gsplat.tech/

It is incredible to see how these tools can also be applied to old images to generate appealing Gaussian Splats of past events. This is the case for the 2009 MTV Movie Awards, which featured a "Fashion 360 rig" stage on the red carpet; its footage was recently used to generate captures of several well-known artists.

As an example, check out this lifelike Gaussian Splat capture of Miley Cyrus: https://poly.cam/tools/gaussian-splatting?capture=89f4003f-d23b-4da1-b158-6030b24ec8af

More captures from the same user for other celebrities can be found here: https://poly.cam/@tmp123

Community Engagement and Support

A vibrant community of developers, researchers, and enthusiasts is forming around 3D Gaussian Splatting, with GitHub repositories and social media groups buzzing with activity. These platforms facilitate the sharing of knowledge and foster collaboration (GitHub Topics on Gaussian Splatting).

How to Get Started

To get started using this technology, you can check the following beginner-friendly introduction by Reshot.ai: https://www.reshot.ai/3d-gaussian-splatting

Resources for Further Learning

For those seeking to delve deeper, resources like Hugging Face Blog, WangFeng18's GitHub Repository, and educational content from CableLabs offer a wealth of information. Tutorials, online courses, and extensive documentation are available for beginners and advanced users.

Challenges and Future Directions

Despite its promising advantages, 3D Gaussian Splatting faces challenges such as computational intensity and implementation complexity. However, the future looks bright as research delves into overcoming these hurdles, with discussions at forums and conferences like SIGGRAPH fueling further innovation. Among the recent solutions, there have been attempts to reduce the size of Gaussian Splat files, such as those from programmer Aras Pranckevičius.

I am personally and professionally very excited about 3D Gaussian Splatting and the opportunities it offers. As the technology matures and the community grows, its applications will only expand, promising to redefine immersive media.

As a final bonus for reaching the end of the article, here is a great video from “Bad Decisions Studio”, showcasing how they turned movie shots into 3D Gaussian Splats:

As always, I hope you will be successful in your journey and that my contribution will speed your learning up!
