Kevin Mapstone - Voxel Destruction

About

Voxel art (or voxel-based meshes) is a style of 3D modeling that uses voxels (uniformly-sized cubes) to compose a 3D model. You've seen multiple examples of this already in this portfolio, but here are some more examples in case you're interested:

Video games will often opt for pre-calculated fracture meshes for objects that need to be able to be destroyed in-game, since generating them in real-time is expensive. But voxel meshes have some unique characteristics that make real-time procedural destruction viable, such as:

The presence of internal information
Normal meshes use UV texture coordinates to define the color (and other physical information) of their surface. Voxel models, however, not only contain the voxels on their surface, but also the voxels that are on the "inside", which would not normally be visible. This means that when a fracture mesh includes a part of the model that was originally on the inside (imagine cutting a layered cake in half), it doesn't need to guess about what that face looks like, since all of the information already exists.
A finite smallest unit of volume
There is an effectively infinite number of ways to slice a normal mesh, which makes calculating the new faces created by a fracture complicated. Voxel models are simpler since they can be represented as a 3D grid, limiting the number of possible fracture points to be processed. This 3D grid structure is also optimal for SIMD operations, meaning that many of the required steps can be done in parallel for each voxel.
All faces have the same shape
Not only are all faces of a voxel model rectangles; they are also restricted to only 6 directions, since all voxels have the same orientation. This makes the generation of fracture meshes much easier, since as long as you know which voxels are visible on the surface, you can simply build the surface out of a series of quads.

The fractured object shown on the left is the result of determining two random fracture points. As you can see, internal information is revealed by the fracture meshes (notice how the inside pages of the book are visible from the side). This is possible because the full voxel grid is stored - a feature that wouldn’t be possible with normal meshes without manually creating a high-resolution 3D texture of some kind.

The destruction algorithm has three major steps: Partition, Merge, and Mesh. The Partition step separates all voxels into groups to determine the content of the new fracture meshes. The Merge step occurs for each fracture group to combine nearby voxels that share the same color. The purpose of this step is to simplify the fracture meshes for optimal rendering. The Mesh step generates the fracture meshes from the merged bounds calculated previously.

Development

Octrees

My initial idea was to use an octree for merging and storing voxel information efficiently, since voxels are well-suited for them, being uniformly-sized cubes. While subdividing, the octree would check if all contained voxels are the same color. A subdivision where all voxels have the same color would stop subdividing and become a leaf node. This is simple to do because each voxel is simply three unsigned integers (x, y, z), identifying its place on the grid and allowing for simple for-loop iteration.

However, it turned out that this approach was both inefficient for storage and impossible to completely parallelize. Because the subdivisions of an octree are cubes, adjacent voxels that are only similar along one or two dimensions are not merged at all. Also, because each level of subdivided node requires its parent to be processed first, voxel merging cannot be parallelized and therefore cannot enjoy the benefits of SIMD processing.

In hindsight, I should have been able to predict these drawbacks and shouldn't have gone deep into the implementation details before confirming that it was the best option. I won't go into the weeds with this one, but here's the code if you're curious:

Full-Compute - X/Y/Z Merge

In order for this to be viable in real-time, we need a full-compute algorithm that utilizes SIMD parallelization at every step possible. In other words, we should keep the resources used by the algorithm as shader resources for the entire process, until the new mesh's vertex/index buffers are populated in CSVoxelMesh.

Instead of using octrees to merge similar voxels, a number of parallel loops are performed to merge voxels along the X-axis, then merge the result along the Y-axis, and then merge that result along the Z-axis. For example, if the voxel model is 12x8x4, an X-merge pass will consist of 8*4=32 compute threads, each of which iterate through 12 voxels and merge similar voxels as they go. This will be followed by a Y-merge pass on the result and then a Z-merge pass on the Y-merge's result. This approach succeeds where octrees fail, since one or two-dimensional strips of similar voxels are always merged, no matter which axis they align along.

Here's a high-level outline of the full-compute algorithm described:

Initialization
- Create all global buffers and shader resources necessary for compute shaders
  - CSVoxelPartition
    - Constant Buffer
      - uint4 points[MAX_PARTITIONS] - Fracture points, can be any number of points up to MAX_PARTITIONS
      - uint pointCount - Number of fracture points
    - Texture3D<uint4>[MAX_PARTITIONS] UAV - The voxel grid corresponding to each of the fracture points
  - CSVoxelMerge
    - Constant Buffer
      - uint mergeAxis - Enum indicating which axis to merge in
    - Texture3D<uint4> UAV - Color region bounds (taken from C++ after interpreting output from CSVoxelPartition step)
    - Texture3D<uint4> UAV - Null region bounds (taken from C++ after interpreting output from CSVoxelPartition step)
  - CSVoxelMesh
    - Texture3D<uint4> SRV - Color region bounds (directly from the end of CSVoxelMerge step)
    - Texture3D<uint4> SRV - Null region bounds (directly from the end of CSVoxelMerge step)
    - Output vertex buffer/index buffer will be created per-entity
- Load the voxel grid(s) that will serve as the original model of the entities in the scene
On Fracture
- CSVoxelPartition (Execute for each voxel)
  - Of the points given (in constant buffer), find the point which is closest to this voxel
  - Select the 3D texture corresponding to the closest point
  - Add this voxel to the texture, indicating position and color index
- CSVoxelMerge (Execute once for each X-strip/Y-strip/Z-strip)
  - Merge all color regions in three passes (merge along X, merge along Y, merge along Z), including null region
  - Texture3D<uint4> maps voxel coords to bounds max, with w representing the index of the color region - all consecutive voxels with the same w coordinate are combined into a single bounding box
  - Put all empty bounds in the null regions texture for future reference
- CSVoxelMesh (Execute once for each color region bounds retrieved after merging)
  - Clip surface faces of this bounding box - for each of 6 faces,
    - Extend bounds in the face direction
    - Clip the extended bounds against all bounds in the null region
    - Perform simple merging of the resulting set of clipped bounds for this face
  - Add the clipped bounds to the output vertex buffer - for each clipped face,
    - Calculate the 4 vertices of the face
    - Optionally check for redundant vertices
    - Determine indices for this face (2 triangles)
    - Append vertices to vertex buffer
    - Append indices to index buffer

These steps are initiated in the function below. The Partition and Merge steps are both done in PartitionGrid() (the fractureCount parameter is hard-set to 2 in this case), and the Mesh step is done upon construction of the VoxelRenderer component after the fracture is complete. The CSVoxelCollider step, which is not completely relevant here, is done upon construction of the VoxelCollider component.

void VoxelDestructable::Fracture()
{

          std::unordered_map<IntVector, VoxelGridComponent*>
          fractureGroups = PartitionGrid(2);


          Transform* transform =
          entity->GetComponent<Transform>();

          for(auto& fractureGroup : fractureGroups)

          {

          
          Vector groupDir =
          voxelGrid->GridCoordsToNormalizedLocalSpace(fractureGroup.first);


          
          Entity* fractureEntity = new Entity();

          
          fractureEntity->AddComponent(new
          Transform(transform->GetLocation() + groupDir,
          transform->GetRotation(), transform->GetScale()));

          
          fractureEntity->AddComponent(fractureGroup.second);

          
          fractureEntity->AddComponent(new
          VoxelRenderer(fractureGroup.second,
          ResourceManager::GetMaterial(L"Palette")));

          
          fractureEntity->AddComponent(new
          VoxelCollider(fractureGroup.second));

          
          fractureEntity->AddComponent(new Rigidbody(Vector(0,
          -9.81f, 0)));

          
          fractureEntity->AddComponent(new
          VoxelDestructable(voxelGrid));

          
          EntityManager::AddEntity(fractureEntity);


          
          fractureEntity->GetComponent<Rigidbody>()->SetVelocity(groupDir);

          }


          EntityManager::DestroyEntity(entity);
}

Here's a more detailed look at the PartitionGrid() function:

std::unordered_map<IntVector, VoxelGridComponent*>
          VoxelDestructable::PartitionGrid(size_t fractureCount)
{

          VoxelGridComponent* voxelGridComponent =
          entity->GetComponent<VoxelGridComponent>();


          /* Generate random grid points and initialize point groups */


          std::vector<IntVector> points;

          for(size_t i = 0; i < fractureCount; i++)

          points.push_back(IntVector::Rand(IntVector::Zero(),
          voxelGridComponent->GetDimensions()));


          std::unordered_map<IntVector, VoxelGridComponent*>
          pointGroups;

          for(IntVector point : points)

          pointGroups.emplace(point, 

          new VoxelGridComponent(voxelGridComponent->width,
          voxelGridComponent->height, voxelGridComponent->depth));


          // Partition this grid into point groups

          CS_VoxelPartition(voxelGridComponent, pointGroups);


          // Merge the voxels of each of the partitioned grids

          for(auto& pointGroup : pointGroups)

          {

          VoxelGridComponent* voxelGridComponent = pointGroup.second;


          VoxelGridComponent::CS_VoxelMerge(

          voxelGridComponent->width, voxelGridComponent->height,
          voxelGridComponent->depth, 

          voxelGridComponent->boundsTextureUAV, 

          voxelGridComponent->colorBoundsBufferUAV, 

          voxelGridComponent->nullBoundsBufferUAV, 

          voxelGridComponent->colorBoundsCounterBuffer,
          voxelGridComponent->colorBoundsCount, 

          voxelGridComponent->nullBoundsCounterBuffer);

          }


          return pointGroups;
}

CS_VoxelPartition() is one of the functions that run a compute shader using DX11 API calls. All of these functions are mostly the same, so here's an example from Partition:

void VoxelDestructable::CS_VoxelPartition(VoxelGridComponent*
          voxelGridComponent, std::unordered_map<IntVector,
          VoxelGridComponent*> pointGroups)
{

          Microsoft::WRL::ComPtr<ID3D11ComputeShader>
          computeShader =
          ResourceManager::GetComputeShader(L"CS_VoxelPartition");

          Microsoft::WRL::ComPtr<ID3D11Buffer> constantBuffer =
          ResourceManager::GetConstantBuffer(L"CS_VoxelPartition");


          /* UAVs for the output textures are managed by their
          respective VoxelGridComponents */


          std::vector<IntVector> inputPoints;

          std::vector<ID3D11UnorderedAccessView*> outputUAVs;

          for(auto& pointGroup : pointGroups)

          {

          
          inputPoints.push_back(pointGroup.first);

          
          outputUAVs.push_back(pointGroup.second->GetBoundsTextureUAV().Get());

          }

          // Pad the rest of the vector with null UAVs

          while(outputUAVs.size() < MAX_PARTITIONS)

          
          outputUAVs.push_back(nullptr);


          Graphics::Context->CSSetShader(computeShader.Get(),
          nullptr, 0);


          Graphics::Context->CSSetShaderResources(0, 1,
          voxelGridComponent->boundsTextureSRV.GetAddressOf());


          Graphics::Context->CSSetUnorderedAccessViews(0,
          MAX_PARTITIONS, outputUAVs.data(), nullptr);


          Graphics::Context->CSSetConstantBuffers(0, 1,
          constantBuffer.GetAddressOf());


          /* Dispatch */

          {

          
          VoxelPartitionConstants constants = {};

          
          for(size_t i = 0; i < inputPoints.size(); i++)

          
          
          constants.points[i] = DirectX::XMUINT4(inputPoints[i].x,
          inputPoints[i].y, inputPoints[i].z, inputPoints[i].w);

          
          constants.pointCount = inputPoints.size();


          
          D3D11_MAPPED_SUBRESOURCE cbMapped = {};

          
          Graphics::Context->Map(constantBuffer.Get(), 0,
          D3D11_MAP_WRITE_DISCARD, 0, &cbMapped);

          
          memcpy(cbMapped.pData, &constants,
          sizeof(VoxelPartitionConstants));

          
          Graphics::Context->Unmap(constantBuffer.Get(), 0);


          
          Graphics::Context->Dispatch(voxelGridComponent->width,
          voxelGridComponent->height, voxelGridComponent->depth);

          }


          // Unbind

          ID3D11ShaderResourceView* nullSRVs[16] = {};

          Graphics::Context->CSSetShaderResources(0, 16,
          nullSRVs);

          ID3D11UnorderedAccessView* nullUAVs[MAX_PARTITIONS] = {};

          Graphics::Context->CSSetUnorderedAccessViews(0,
          MAX_PARTITIONS, nullUAVs, nullptr);
}

Future Work

Although I've achieved what I originally aimed for and plan to migrate this to Vulkan for my game engine (see this page for more info), there is more that can be done in terms of optimization. Specifically, collision interactions between the fractured meshes are still quite inaccurate and expensive. Even after adding a CSVoxelCollider stage, the approximated collision bounds look pretty awkward, and when there are many meshes close to each other (which is often the case immediately after fracturing), there's a noticable performance hit. I've read that this is a common problem in engines and is usually solved by adding a short period immediately after destruction where collision detection is disabled.

Demos

The model is broken into two meshes using random fracture points. Notice how the fracture occurs instantly, without any lag spikes.

Another fracture pattern. Notice how the surfaces of the new meshes account for all of the newly-visible surface voxels.

The fracture meshes are treated the same as the original, so they can be fractured as well. However, having too many fracture meshes can cause lag due to a suboptimal physics system and the frequency of collisions between the pieces.

Code

Compute Shaders

All of the main logic discussed here is found in the compute shaders; each step uses one compute shader and passes the resulting information to the next step via their shared shader resources on the GPU.

Procedural Voxel Destruction

About

Development

Demos

Code