DirectCompute tutorial for Unity 6: Consume buffers

This tutorial covers how to use consume buffers in DirectCompute. It was originally going to be combined with the append buffer tutorial, as consume and append buffers are more or less the same thing, but I decided to split them up because the combined tutorial would have been too long. In the last tutorial I had to add an edit because it turns out there are some issues with using append buffers in Unity: they do not appear to work on some graphics cards. A bug report has been submitted and hopefully this will be fixed some time in the future. To use consume buffers you need to use append buffers, so the same issue applies to this tutorial. If the last one did not work on your card, neither will this one.

(Like append buffers, consume buffers are not well supported on all hardware. Keep this in mind for the steps below: a strange problem may be caused by the hardware.)


I also want to point out that when using append or consume buffers, a mistake in the code can cause unpredictable results when it is run, even if you later fix the code and run it again. If this happens, especially if the error caused the GPU to crash, it is best to restart Unity to clear the GPU context.

(Note: after a crash it is often best to restart Unity to clear the GPU context, then try again.)


To get started you will need to add some data to an append buffer, as you can only consume data from an append buffer. Create a new C# script and add this code.



using UnityEngine;

public class ConsumeBufferExample : MonoBehaviour
{
    public Material material;
    public ComputeShader appendBufferShader;

    const int width = 32;
    const float size = 5.0f;

    ComputeBuffer buffer;
    ComputeBuffer argBuffer;

    void Start()
    {
        // One float3 position per element, created as an append buffer.
        buffer = new ComputeBuffer(width * width, sizeof(float) * 3, ComputeBufferType.Append);

        appendBufferShader.SetBuffer(0, "appendBuffer", buffer);
        appendBufferShader.SetFloat("size", size);
        appendBufferShader.SetFloat("width", width);
        // 8x8 threads per group, so this runs width * width threads in total.
        appendBufferShader.Dispatch(0, width/8, width/8, 1);

        argBuffer = new ComputeBuffer(4, sizeof(int), ComputeBufferType.DrawIndirect);

        // vertex count, instance count, start vertex, start instance
        int[] args = new int[]{ 0, 1, 0, 0 };
        argBuffer.SetData(args);

        // Copy the append buffer's element count into the vertex count slot.
        ComputeBuffer.CopyCount(buffer, argBuffer, 0);
        // Read the arguments back so the logged values are the real ones.
        argBuffer.GetData(args);

        Debug.Log("vertex count " + args[0]);
        Debug.Log("instance count " + args[1]);
        Debug.Log("start vertex " + args[2]);
        Debug.Log("start instance " + args[3]);
    }

    void OnPostRender ()
    {
        material.SetPass(0);
        material.SetBuffer ("buffer", buffer);
        Graphics.DrawProceduralIndirect(MeshTopology.Points, argBuffer, 0);
    }

    void OnDestroy ()
    {
        buffer.Release();
        argBuffer.Release();
    }
}

Here we are simply creating an append buffer and then adding a position to it from the “appendBufferShader” for each thread that runs.

(Create an append buffer, then run appendBufferShader to add position data to it.)


We also need a shader to render the results. The “Custom/AppendExample/BufferShader” shader posted in the last tutorial can be used, so I am not going to post that code again. You can find it in the append buffer tutorial or just download the project files (links at the end of this tutorial).



Now attach the script to the camera, bind the material and compute shader, and run the scene. You should see a grid of red points.

(The result is a grid of red points.)


We have appended some points to our buffer and next we will consume some. Add this variable to the script.



public ComputeShader consumeBufferShader;

Now add these two lines under the dispatch call to the append shader.



consumeBufferShader.SetBuffer(0, "consumeBuffer", buffer);

consumeBufferShader.Dispatch(0, width /8, width /8, 1);


This will run the compute shader that will consume the data from the append buffer. Create a new compute shader and then add this code to it.



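#pragma kernel CSMain

ConsumeStructuredBuffer<float3> consumeBuffer;

// 8x8 threads per group, matching the width/8 dispatch in the script.
[numthreads(8,8,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    // Removes one element from the buffer and returns its value.
    float3 pos = consumeBuffer.Consume();
}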

Now bind this shader to the script and run the scene. You should see nothing displayed. In the console you should see the vertex count as 0. So what happened to the data?

(Bind the shader to the C# script above and run it: nothing is displayed and the vertex count reads 0.)


It's this line here that is responsible.


float3 pos = consumeBuffer.Consume();

This removes an element from the append buffer each time it is called. Since we ran the same number of threads as there are elements in the append buffer, in the end everything was removed. Also notice that the Consume function returns the value that was removed.

(The reason: each time consumeBuffer takes out an element it also removes it from the buffer, so once this shader has run the buffer is empty.)


This is fairly simple but there are a few key steps to it. Notice that the buffer needs to be declared as a consume buffer in the compute shader like so…



ConsumeStructuredBuffer<float3> consumeBuffer;

But notice that in the script the buffer we bound to the uniform was not of a consume type. It was an append buffer, as you can see from where it was created.



buffer = new ComputeBuffer(width * width, sizeof(float) * 3, ComputeBufferType.Append);

There is no consume type, there is only append. How the buffer is used depends on how you declare it in the compute shader. Declare it as an “AppendStructuredBuffer” to append data to it and as a “ConsumeStructuredBuffer” to consume data from it.



Consuming data from a buffer is not without its risks. In the last tutorial I mentioned that appending more elements than the buffer's size will cause the GPU to crash. What would happen if you consumed more elements than the buffer has? You guessed it. The GPU will crash. Always try to verify that your code is working as expected by printing out the number of elements in the buffer during testing.
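One way to print the element count while testing is to copy the buffer's counter into a small second buffer and read that back. The sketch below is only an illustration: the class name `AppendBufferDebug` and the method `ReadCount` are hypothetical and not part of the project files. It also assumes Unity 5.4 or later, where the buffer type is called `ComputeBufferType.IndirectArguments` (older versions use `DrawIndirect`, as the script above does).

```csharp
using UnityEngine;

// Hypothetical debugging helper, not part of the tutorial's project files.
public static class AppendBufferDebug
{
    // Returns the current element count of an append/consume buffer.
    public static int ReadCount(ComputeBuffer appendBuffer)
    {
        // A one-element buffer that receives the counter value.
        ComputeBuffer countBuffer =
            new ComputeBuffer(1, sizeof(int), ComputeBufferType.IndirectArguments);
        try
        {
            // Copy the counter to byte offset 0 of countBuffer.
            ComputeBuffer.CopyCount(appendBuffer, countBuffer, 0);

            int[] count = new int[1];
            // GetData waits for the GPU, so keep this to testing code.
            countBuffer.GetData(count);
            return count[0];
        }
        finally
        {
            countBuffer.Release();
        }
    }
}
```

Calling `Debug.Log(AppendBufferDebug.ReadCount(buffer));` before and after the consume dispatch should show whether the shader removed the number of elements you expected.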



Removing every element from the buffer is a good way to clear the append buffer (it also appears to be the only way to clear a buffer without recreating it), but what happens if we only remove some of the elements?


Edit – Unity 5.4 added a ‘SetCounterValue’ function to the buffer, so you can now use that to clear an append or consume buffer.
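As a minimal sketch of that approach, assuming Unity 5.4 or later: the class name `ResetCounterExample` is hypothetical, and the buffer is created the same way as in the script above.

```csharp
using UnityEngine;

// Hypothetical example class, not part of the tutorial's project files.
public class ResetCounterExample : MonoBehaviour
{
    const int width = 32;

    ComputeBuffer buffer;

    void Start()
    {
        buffer = new ComputeBuffer(width * width, sizeof(float) * 3, ComputeBufferType.Append);

        // Set the buffer's counter to zero so it is treated as empty.
        // Only the counter changes; nothing has to be consumed first.
        buffer.SetCounterValue(0);
    }

    void OnDestroy()
    {
        buffer.Release();
    }
}
```

Resetting the counter like this before dispatching the append shader also guards against starting with a stale count.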



Change the dispatch call to the consume shader to this…



consumeBufferShader.Dispatch(0, width/2 /8, width/2 /8, 1);

Here we are only running the shader for a quarter of the elements in the buffer. But the question is which elements will be removed? Run the scene. You will see the points displayed again but some will be missing. If you look at the console you will see that there are now 768 elements in the buffer. There were 1024 and a quarter (256) have been removed, leaving 768. But there is a problem. The elements removed seem to be determined at random and will be (mostly) different each time you run the scene.
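The numbers above follow from the dispatch sizes. The plain C# sketch below works through the arithmetic, assuming the kernels run 8x8 threads per group, as the width/8 dispatch implies. The class name `DispatchMath` and the method `ThreadCount` are made up for illustration.

```csharp
using System;

// Hypothetical helper that only does the arithmetic; it does not touch the GPU.
public static class DispatchMath
{
    // Threads per group on each axis, assuming 8x8 threads per group.
    const int ThreadsPerAxis = 8;

    // Total threads run by a dispatch of groupsX by groupsY groups.
    public static int ThreadCount(int groupsX, int groupsY)
    {
        return groupsX * ThreadsPerAxis * groupsY * ThreadsPerAxis;
    }

    public static void Main()
    {
        const int width = 32;

        int appended = ThreadCount(width / 8, width / 8);         // 4 x 4 groups
        int consumed = ThreadCount(width / 2 / 8, width / 2 / 8); // 2 x 2 groups

        Console.WriteLine(appended);            // 1024
        Console.WriteLine(consumed);            // 256
        Console.WriteLine(appended - consumed); // 768
    }
}
```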



This fact reveals how append buffers work and why consume buffers have limited use. These buffers are LIFO structures. The elements are added and removed in the order the kernel is run by the GPU, but as each kernel invocation runs on its own thread the GPU can never guarantee the order they will run in. Every time you run the scene the order in which the elements are added and removed is different.
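To make the LIFO behaviour concrete, here is a rough CPU-side analogy in plain C#. It is not how Unity or the driver implements these buffers. It assumes the usual model of an array plus a counter, where appending writes at the counter and increments it, and consuming decrements the counter and reads that slot. The names `CounterBuffer` and `LifoDemo` are made up for illustration.

```csharp
using System;

// CPU-side analogy only: a fixed-size array plus a counter.
public class CounterBuffer
{
    readonly int[] data;

    public int Count { get; private set; }

    public CounterBuffer(int capacity)
    {
        data = new int[capacity];
    }

    // Like Append(): write at the counter, then increment it.
    public void Append(int value)
    {
        data[Count++] = value;
    }

    // Like Consume(): decrement the counter, then read that slot.
    public int Consume()
    {
        return data[--Count];
    }
}

public static class LifoDemo
{
    public static void Main()
    {
        var buffer = new CounterBuffer(4);

        // The GPU may run the four "threads" in any order.
        // 2, 0, 3, 1 is just one possible order.
        foreach (int threadId in new[] { 2, 0, 3, 1 })
        {
            buffer.Append(threadId);
        }

        Console.WriteLine(buffer.Consume()); // 1, the last value appended
        Console.WriteLine(buffer.Consume()); // 3
        Console.WriteLine(buffer.Count);     // 2 (values 2 and 0 remain)
    }
}
```

Change the order in the `foreach` and different values come out, which is the same effect as the GPU scheduling threads differently on each run.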



This does limit the use of consume buffers but does not mean they are useless. LIFO structures have never been available on the GPU before, and as long as the elements' exact order does not matter they will allow you to run algorithms that were impossible on the GPU in the past. DirectCompute also adds the ability to have some control over how threads are run by using thread synchronization, which will be covered in a later tutorial.

(This problem does limit how consume buffers can be used; take care not to be caught out by it.)
