http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter14.html
Michael Bunnell
NVIDIA Corporation
In this chapter we describe a new technique for computing diffuse light transfer and show how it can be used to compute global illumination for animated scenes. Our technique is efficient enough when implemented on a fast GPU to calculate ambient occlusion and indirect lighting data on the fly for each rendered frame. It does not have the limitations of precomputed radiance transfer (PRT) or precomputed ambient occlusion techniques, which are limited to rigid objects that do not move relative to one another (Sloan 2002). Figure 14-1 illustrates how ambient occlusion and indirect lighting enhance environment lighting.
【This chapter introduces an efficient GPU-based ambient occlusion technique. It removes the usual limitation of precomputed approaches, which work only for static objects.】
Figure 14-1 Adding Realism with Ambient Occlusion and Indirect Lighting
Our technique works by treating polygon meshes as a set of surface elements that can emit, transmit, or reflect light and that can shadow each other. This method is so efficient because it works without calculating the visibility of one element to another. Instead, it uses a much simpler and faster technique based on approximate shadowing to account for occluding (blocking) geometry.
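【The technique treats a polygon mesh as a set of surface elements that can emit, transmit, or reflect light and can shadow one another. With this approximation, occluding geometry can be accounted for simply and quickly.】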
14.1 Surface Elements
The first step in our algorithm is to convert the polygonal data to surface elements to make it easy to calculate how much one part of a surface shadows or illuminates another.
【The first step of the algorithm is to convert the polygon data into surface elements.】
Figure 14-2 illustrates the basic concept. We define a surface element as an oriented disk with a position, normal, and area. An element has a front face and a back face. Light is emitted and reflected from the front-facing side. Light is transmitted and shadows are cast from the back. We create one element per vertex of the mesh. Assuming that the vertices are defined with a position and normal already, we just need to calculate the area of each element. We calculate the area at a vertex as the sum of one-third of the area of the triangles that share the vertex (or one-fourth of the area for quads). Heron’s formula for the area of a triangle with sides of length a, b, and c is:
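$$\text{Area} = \sqrt{s(s-a)(s-b)(s-c)}$$
where s is half the perimeter of the triangle: s = (a + b + c)/2.
【The figure below illustrates this step. A surface element is defined as a disk with a position, a normal, and an area. It has a front face and a back face: light is emitted and reflected from the front, and transmitted and shadowed from the back.
One surface element is created for each vertex of the mesh. The vertex position and normal are assigned directly to the element, and its area is one-third of the total area of the triangles that use the vertex, computed with Heron's formula.】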
Figure 14-2 Converting a Polygonal Mesh to Elements
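The per-vertex area rule can be sketched in a few lines of Python. This is an illustration only, assuming a triangle-only mesh given as a list of vertex positions and a list of index triples; the function names `triangle_area` and `element_areas` are hypothetical and do not come from the chapter.

```python
import math

def triangle_area(p0, p1, p2):
    """Area of a triangle from its side lengths, using Heron's formula."""
    a = math.dist(p0, p1)
    b = math.dist(p1, p2)
    c = math.dist(p2, p0)
    s = (a + b + c) / 2.0  # half the perimeter
    # Clamp at zero: rounding can make the product slightly negative
    # for degenerate (nearly collinear) triangles.
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))

def element_areas(positions, triangles):
    """Give each vertex one-third of the area of every triangle that uses it."""
    areas = [0.0] * len(positions)
    for i0, i1, i2 in triangles:
        share = triangle_area(positions[i0], positions[i1], positions[i2]) / 3.0
        for i in (i0, i1, i2):
            areas[i] += share
    return areas

# A unit square split into two triangles along the 0-2 diagonal.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
areas = element_areas(positions, triangles)
```

The element areas always sum to the total surface area of the mesh, which makes a convenient sanity check. Vertices on the shared diagonal receive a share from both triangles, so they get larger elements than the other two corners.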
We store element data (position, normal, and area) in texture maps because we will be using a fragment program (that is, a pixel shader) to do all the ambient occlusion calculations. Assuming that vertex positions and normals will change for each frame, we need to be able to change the values in the texture map quickly.
One option is to keep vertex data in a texture map from the start and to do all the animation and transformation from object space to eye (or world) space with fragment programs instead of vertex programs. We can use render-to-vertex-array to create the array of vertices to be sent down the regular pipeline, and then use a simple pass-through vertex shader.
Another, less efficient option is to do the animation and transformation on the CPU and load a texture with the vertex data each frame.
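【The position, normal, and area of each surface element must be stored in a texture for use by the pixel shader. Because vertex positions and normals are assumed to change every frame, the texture values must be updated quickly.
One option is to keep the vertex data in a texture from the start and do all animation and the object-space to eye-space (or world-space) transformation in the pixel shader instead of the vertex shader. Render-to-vertex-array then produces the vertex array that goes down the regular pipeline, where a simple pass-through vertex shader is enough.
The other, less efficient option is to do the animation and transformation on the CPU and upload the vertex data to a texture every frame.】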
14.2 Ambient Occlusion
Ambient occlusion is a useful technique for adding shadowing to diffuse objects lit with environment lighting. Without shadows, diffuse objects lit from many directions look flat and unrealistic. Ambient occlusion provides soft shadows by darkening surfaces that are partially visible to the environment. It involves calculating the accessibility value, which is the percentage of the hemisphere above each surface point not occluded by geometry (Landis 2002). In addition to accessibility, it is also useful to calculate the direction of least occlusion, commonly known as the bent normal. The bent normal is used in place of the regular normal when shading the surface for more accurate environment lighting.
【An explanation of AO: soft shadows on object surfaces noticeably improve realism.】
We can calculate the accessibility value at each element as 1 minus the amount by which all the other elements shadow the element. We refer to the element that is shadowed as the receiver and to the element that casts the shadow as the emitter. We use an approximation based on the solid angle of an oriented disk to calculate the amount by which an emitter element shadows a receiver element. Given that A is the area of the emitter, the amount of shadow can be approximated by:
Equation 14-1 Shadow Approximation
【The accessibility value is 1 minus the shadow cast on the element by all other elements. An element is the receiver when it is shadowed and the emitter when it casts the shadow. Since the angles at the emitter and the receiver are both known, the formula above is used for the estimate, together with the diagram below. A is the area of the emitter.】
As illustrated in Figure 14-3, θE is the angle between the emitter’s normal and the vector from the emitter to the receiver. θR is the corresponding angle for the receiver element. The max(1, 4 × cos θR) term is added to the disk solid angle formula to ignore emitters that do not lie in the hemisphere above the receiver without causing rendering artifacts for elements that lie near the horizon.
【This paragraph explains the variables.】
Figure 14-3 The Relationship Between Receiver and Emitter Elements
Here is the fragment program function to approximate the element-to-element occlusion:
【The implementation of the function follows.】
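A minimal Python sketch of an element-to-element shadow function, offered as an illustration and not as the chapter's fragment program. It assumes the on-axis solid-angle fraction of a disk, 1 - r / sqrt(A/π + r²), scaled by the cosine at the emitter and by a receiver term clamped to [0, 1] so that emitters below the receiver's horizon contribute nothing. The names `element_shadow` and `saturate`, the convention that `v` is the unit vector from receiver to emitter, and the choice that an emitter shadows through its back face are all assumptions of this sketch.

```python
import math

def saturate(x):
    """Clamp x to the range [0, 1]."""
    return max(0.0, min(1.0, x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def element_shadow(v, r_squared, receiver_normal, emitter_normal, emitter_area):
    """Approximate fraction of the receiver's hemisphere shadowed by one emitter.

    v is the unit vector from the receiver to the emitter (assumed convention)
    and r_squared is the squared distance between them, which must be nonzero.
    """
    # Fraction of a hemisphere covered by a disk of area A seen head-on at
    # distance r: 1 - r / sqrt(A/pi + r^2), written here in terms of A / r^2.
    disk_term = 1.0 - 1.0 / math.sqrt(emitter_area / (math.pi * r_squared) + 1.0)
    # Shadows are cast from the emitter's back face, so the emitter normal
    # points away from the receiver when the emitter is shadowing it.
    cos_emitter = saturate(dot(emitter_normal, v))
    # Clamped receiver term: zero below the horizon, quickly reaching one above it.
    receiver_term = saturate(4.0 * dot(receiver_normal, v))
    return disk_term * cos_emitter * receiver_term

up = (0.0, 0.0, 1.0)
# A disk of radius 1 (area pi) two units directly above the receiver.
above = element_shadow((0.0, 0.0, 1.0), 4.0, up, up, math.pi)
# The same disk two units directly below the receiver.
below = element_shadow((0.0, 0.0, -1.0), 4.0, up, up, math.pi)
```

In this example the disk above shadows roughly a tenth of the hemisphere, while the disk below the horizon contributes nothing. Callers are expected to skip the case where emitter and receiver are the same element, since the distance is then zero.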
14.2.1 The Multipass Shadowing Algorithm
We calculate the accessibility values in two passes.
【The calculation takes two passes.】
In the first pass, we approximate the accessibility for each element by summing the fraction of the hemisphere subtended by every other element and subtracting the result from 1.
【The first pass uses the formula above to approximate the accessibility of each element.】
After the first pass, some elements will generally be too dark because other elements that are in shadow are themselves casting shadows. So we use a second pass to do the same calculation, but this time we multiply each form factor by the emitter element’s accessibility from the last pass.
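【After the first pass some elements are too dark because shadows are accumulated too many times. The second pass repeats the calculation, but multiplies each emitter's contribution by the accessibility value computed for that emitter in the first pass.】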
The effect is that elements that are in shadow will cast fewer shadows on other elements, as illustrated in Figure 14-4. After the second pass, we have removed any double shadowing.
【The result is shown in the figure below. The second pass fixes the excessive darkening caused by double shadowing.】
However, surfaces that are triple shadowed or more will end up being too light. We can use more passes to get a better approximation, but we can approximate the same answer by using a weighted average of the combined results of the first and second passes. Figure 14-5 shows the results after each pass, as well as a ray-traced solution for comparison. The bent normal calculation is done during the second pass. We compute the bent normal by first multiplying the normalized vector between elements and the form factor. Then we subtract this result from the original element normal.
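【The two passes still do not give a perfect result. The second pass removes only double shadowing; triple shadowing would need yet another pass, and so on without end. Instead, a weighted average of the first-pass and second-pass results gives a better approximation. The results are shown in Figure 14-5.】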
Figure 14-4 Correcting for Occlusion by Overlapping Objects
Figure 14-5 Comparing Models Rendered with Our Technique to Reference Images
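The two passes, the weighted average, and the bent normal can be sketched on the CPU as follows. This is a brute-force O(n²) illustration in Python, not the shader itself. The shadow function is the assumed disk solid-angle approximation with clamped cosine terms, the blend weight `WEIGHT` is hypothetical because the text does not give one, and for brevity the bent normal is accumulated in every pass instead of only the second.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def saturate(x):
    return max(0.0, min(1.0, x))

def element_shadow(v, r2, n_recv, n_emit, area):
    # Assumed disk solid-angle approximation with clamped cosine terms.
    disk = 1.0 - 1.0 / math.sqrt(area / (math.pi * r2) + 1.0)
    return disk * saturate(dot(n_emit, v)) * saturate(4.0 * dot(n_recv, v))

def shadow_pass(elements, prev_access=None):
    """Return (accessibility, bent_normal) for each (position, normal, area)."""
    out = []
    for i, (p_r, n_r, _) in enumerate(elements):
        total = 0.0
        bent = list(n_r)
        for j, (p_e, n_e, area) in enumerate(elements):
            d = [e - r for e, r in zip(p_e, p_r)]
            r2 = dot(d, d)
            if i == j or r2 == 0.0:
                continue  # an element does not shadow itself
            r = math.sqrt(r2)
            v = [c / r for c in d]  # unit vector from receiver to emitter
            value = element_shadow(v, r2, n_r, n_e, area)
            if prev_access is not None:
                # Second pass: emitters that are themselves in shadow cast less.
                value *= prev_access[j]
            total += value
            # Bent normal: subtract the weighted direction toward each occluder.
            bent = [b - value * c for b, c in zip(bent, v)]
        length = math.sqrt(dot(bent, bent)) or 1.0
        out.append((saturate(1.0 - total), tuple(b / length for b in bent)))
    return out

# Three disks of radius 1 stacked along +z, all with normal +z.
elements = [((0.0, 0.0, float(z)), (0.0, 0.0, 1.0), math.pi) for z in range(3)]
pass1 = shadow_pass(elements)
pass2 = shadow_pass(elements, [a for a, _ in pass1])
WEIGHT = 0.5  # hypothetical blend weight; the text does not specify one
final = [WEIGHT * a1 + (1.0 - WEIGHT) * a2
         for (a1, _), (a2, _) in zip(pass1, pass2)]
```

In this stack the bottom disk is shadowed by the middle disk, which is itself shadowed by the top one. The first pass therefore makes the bottom disk too dark, the second pass brightens it, and the blended value lies between the two.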
We calculate the occlusion result by rendering a single quad (or two triangles) so that one pixel is rendered for each surface element. The shader calculates the amount of shadow received at each element and writes it as the alpha component of the color of the pixel. The results are rendered to a texture map so the second pass can be performed with another render. In this pass, the bent normal is calculated and written as the RGB value of the color with a new shadow value that is written in the alpha component.
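【In each pass, one surface element is processed as one pixel. The shader writes the shadow value computed for the element to the pixel's alpha component, and the result is rendered to a texture map for use in the next pass. The bent normal is written to the RGB components.】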
14.2.2 Improving Performance
Even though the element-to-element shadow calculation is very fast (a GeForce 6800 can do 150 million of these calculations per second), we need to improve our algorithm to work on more than a couple of thousand elements in real time. We can reduce the amount of work by using simplified geometry for distant surfaces. This approach works well for diffuse lighting environments because the shadows are so soft that those cast by details in distant geometry are not visible. Fortunately, because we do not use the polygons themselves in our technique, we can create surface elements to represent simplified geometry without needing to create alternate polygonal models. We simply group elements whose vertices are neighbors in the original mesh and represent them with a single, larger element. We can do the same thing with the larger elements, creating fewer and even larger elements, forming a hierarchy. Now instead of traversing every single element for each pixel we render, we traverse the hierarchy of elements. If the receiver element is far enough away from the emitter (say, four times the radius of the emitter), we use it for our calculation. Only if the receiver is close to an emitter do we need to traverse its children (if it has any). See Figure 14-6. By traversing a hierarchy in this way, we can improve the performance of our algorithm from O(n²) to O(n log n) in practice. The chart in Figure 14-7 shows that the performance per vertex stays consistent as the number of vertices in the hierarchy increases.
【The element-to-element (pixel-to-pixel) calculation is already fast, but the algorithm needs to handle as many vertices (elements/pixels) as possible. The idea is to use spatial proximity: neighboring vertices are grouped and treated as a single element, and a group is subdivided only where needed. This is the usual hierarchical approach.】
Figure 14-6 Hierarchical Elements
Figure 14-7 Ambient Occlusion Shader Performance for Meshes of Different Densities
【Performance chart.】
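The traversal rule can be sketched as follows: use an emitter as-is when the receiver is farther away than a multiple of the emitter's radius, and otherwise descend into its children. The `Element` class, the function `emitters_for`, and the explicit stack are hypothetical choices for this Python illustration; the disk radius is derived from the element area as sqrt(A/π), and the factor of four comes from the text.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Element:
    position: tuple
    normal: tuple
    area: float
    children: list = field(default_factory=list)

def emitters_for(receiver_position, roots, distance_factor=4.0):
    """Collect the elements to use as emitters for one receiver."""
    selected = []
    stack = list(roots)
    while stack:
        element = stack.pop()
        radius = math.sqrt(element.area / math.pi)
        distance = math.dist(receiver_position, element.position)
        if element.children and distance < distance_factor * radius:
            # Too close for the coarse element: refine into its children.
            stack.extend(element.children)
        else:
            # Far enough away, or a leaf: use this element directly.
            selected.append(element)
    return selected

up = (0.0, 0.0, 1.0)
left = Element((0.0, 0.0, 0.0), up, 1.0)
right = Element((1.0, 0.0, 0.0), up, 1.0)
parent = Element((0.5, 0.0, 0.0), up, 2.0, [left, right])

far = emitters_for((10.0, 0.0, 0.0), [parent])   # coarse element is enough
near = emitters_for((2.0, 0.0, 0.0), [parent])   # must descend to the leaves
```

A distant receiver sees only the single parent element, while a nearby receiver sees the two leaves. A caller would still need to skip the leaf that corresponds to the receiver itself.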
We calculate a parent element’s data using its direct descendants in the hierarchy. We calculate the position and normal of a parent element by averaging the positions and normals of its children. We calculate its area as the sum of its children’s areas. We can use a shader for these calculations by making one pass of the shader for each level in the hierarchy, propagating the values from the leaf nodes up. We can then use the same technique to average the results of an occlusion pass that are needed for a following pass or simply treat parent nodes the same as children and avoid the averaging step. It is worth noting that the area of most animated elements varies little, if at all, even for nonrigid objects; therefore, the area does not have to be recalculated for each frame.
【This paragraph explains where the data for parent (higher-level) nodes comes from.】
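A small Python sketch of the bottom-up propagation: each parent averages the positions and normals of its children and sums their areas, one level at a time. The helper names `make_parent` and `build_levels` are hypothetical, renormalizing the averaged normal is an assumption, and grouping consecutive list entries stands in for grouping elements whose vertices are neighbors in the mesh.

```python
import math

def make_parent(children):
    """Combine child elements, each a (position, normal, area) tuple."""
    count = len(children)
    position = tuple(sum(c[0][k] for c in children) / count for k in range(3))
    normal = [sum(c[1][k] for c in children) / count for k in range(3)]
    # Renormalizing the averaged normal is an assumption of this sketch.
    length = math.sqrt(sum(n * n for n in normal)) or 1.0
    normal = tuple(n / length for n in normal)
    area = sum(c[2] for c in children)  # parent area is the sum of child areas
    return position, normal, area

def build_levels(leaves, group_size=2):
    """Build the hierarchy one level per step, from the leaves up.

    group_size must be at least 2 so that every level is smaller than the last.
    """
    levels = [leaves]
    while len(levels[-1]) > 1:
        current = levels[-1]
        levels.append([make_parent(current[i:i + group_size])
                       for i in range(0, len(current), group_size)])
    return levels

# Four unit-area elements in a row along x, all facing +z.
leaves = [((float(x), 0.0, 0.0), (0.0, 0.0, 1.0), 1.0) for x in range(4)]
levels = build_levels(leaves)
root = levels[-1][0]
```

Each iteration of the loop corresponds to one shader pass over one level of the hierarchy. Because areas change little during animation, only the position and normal averages would normally need to be recomputed each frame.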
The ambient occlusion fragment shader appears in Example 14-1.
【The complete shader follows.】
Example 14-1. Ambient Occlusion Shader
14.3 Indirect Lighting and Area Lights
We can add an extra level of realism to rendered images by adding indirect lighting caused by light reflecting off diffuse surfaces (Tabellion 2004). We can add a single bounce of indirect light using a slight variation of the ambient occlusion shader. We replace the solid angle function with a disk-to-disk radiance transfer function. We use one pass of the shader to transfer the reflected or emitted light and two passes to shadow the light.
【The direct-lighting and indirect-lighting results are combined using a single shader.】
For indirect lighting, first we need to calculate the amount of light to reflect off the front face of each surface element. If the reflected light comes from environment lighting, then we compute the ambient occlusion data first and use it to compute the environment light that reaches each vertex. If we are using direct lighting from point or directional lights, we compute the light at each element just as if we are shading the surface, including shadow mapping. We can also do both environment lighting and direct lighting and sum the two results. We then multiply the light values by the color of the surface element, so that red surfaces reflect red, yellow surfaces reflect yellow, and so on. Area lights are handled just like light-reflective diffuse surfaces except that they are initialized with a light value to emit.
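【How the results are combined: first obtain the direct-lighting result and the OSAO result. Direct lighting is computed with the usual lighting methods, including shadow mapping. The light value is the sum of the two results, and it is then multiplied by the color of the surface element. Area lights are treated as light-emitting surfaces.】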
Here is the fragment program function to calculate element-to-element radiance transfer:
【Code fragment for element-to-element radiance transfer.】
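A minimal Python sketch of an element-to-element transfer function, offered as an illustration and not as the chapter's fragment program. It assumes the common disk-to-disk approximation A cos θE cos θR / (π r² + A) with the visibility term left out. The name `form_factor`, the convention that `v` is the unit vector from receiver to emitter, and the choice that light leaves the emitter's front face are assumptions of this sketch.

```python
import math

def saturate(x):
    """Clamp x to the range [0, 1]."""
    return max(0.0, min(1.0, x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def form_factor(v, d_squared, receiver_normal, emitter_normal, emitter_area):
    """Approximate fraction of the emitter's light that reaches the receiver.

    v is the unit vector from the receiver to the emitter (assumed convention)
    and d_squared is the squared distance between the two elements.
    """
    area_over_pi = emitter_area / math.pi
    # Light leaves the emitter's front face, so its normal must point back
    # toward the receiver, which is the direction -v.
    cos_emitter = saturate(-dot(emitter_normal, v))
    cos_receiver = saturate(dot(receiver_normal, v))
    # A cos(E) cos(R) / (pi r^2 + A), with numerator and denominator
    # both divided by pi.
    return area_over_pi * cos_emitter * cos_receiver / (d_squared + area_over_pi)

up = (0.0, 0.0, 1.0)
down = (0.0, 0.0, -1.0)
# A disk of area pi two units above the receiver, facing down toward it.
facing = form_factor((0.0, 0.0, 1.0), 4.0, up, down, math.pi)
# The same disk facing away from the receiver transfers no light.
facing_away = form_factor((0.0, 0.0, 1.0), 4.0, up, up, math.pi)
```

Adding the area term to the denominator keeps the result below one even when the two elements are very close together, which a plain inverse-square falloff would not.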
$$\frac{A \cos\theta_E \cos\theta_R}{\pi r^2 + A}$$
Equation 14-2 Disk-to-Disk Form Factor Approximation
We calculate the amount of light transferred from one surface element to another using the geometric term of the disk-to-disk form factor given in Equation 14-2. We leave off the visibility factor, which takes into account blocking (occluding) geometry. Instead we use a shadowing technique like the one we used for calculating ambient occlusion—only this time we use the same form factor that we used to transfer the light. Also, we multiply the shadowing element’s form factor by the three-component light value instead of a single-component accessibility value.
【The formula above gives the light transferred from one element to another. In other words, the OSAO approach is reused here for light propagation.】
We now run one pass of our radiance-transfer shader to calculate the maximum amount of reflected or emitted light that can reach any element. Then we run a shadow pass that subtracts from the total light at each element based on how much light reaches the shadowing elements. Just as with ambient occlusion, we can run another pass to improve the lighting by removing double shadowing. Figure 14-8 shows a scene lit with direct lighting plus one and two bounces of indirect lighting.
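【First, one pass of the radiance-transfer shader computes the light emitted and reflected between elements, giving the total light at each element. Then a shadow pass subtracts its result from that total. Multiple overlapping shadows are handled with additional passes and weighting, as explained above. The figure below shows the result.】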
Figure 14-8 Combining Direct and Indirect Lighting
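The first of these passes, the unshadowed gather of reflected or emitted light, can be sketched in Python as below. The shadow passes that follow it are deliberately left out. The element layout `(position, normal, area, light_rgb)`, the function names, and the disk-to-disk approximation A cos θE cos θR / (π r² + A) are assumptions of this illustration, and `light_rgb` is taken to be the light leaving the element's front face, already multiplied by the surface color.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def saturate(x):
    return max(0.0, min(1.0, x))

def form_factor(v, d2, n_recv, n_emit, area):
    # Assumed disk-to-disk approximation, without the visibility term.
    a = area / math.pi
    return a * saturate(-dot(n_emit, v)) * saturate(dot(n_recv, v)) / (d2 + a)

def transfer_pass(elements):
    """Return the unshadowed incoming RGB light at every element."""
    incoming = []
    for i, (p_r, n_r, _, _) in enumerate(elements):
        total = [0.0, 0.0, 0.0]
        for j, (p_e, n_e, area, light) in enumerate(elements):
            d = [e - r for e, r in zip(p_e, p_r)]
            d2 = dot(d, d)
            if i == j or d2 == 0.0:
                continue  # an element does not light itself
            r = math.sqrt(d2)
            v = [c / r for c in d]  # unit vector from receiver to emitter
            f = form_factor(v, d2, n_r, n_e, area)
            # Three-component light value instead of a scalar accessibility.
            total = [t + f * l for t, l in zip(total, light)]
        incoming.append(tuple(total))
    return incoming

# An orange emitter at the origin facing up, and a dark receiver above it
# facing down. Both are disks of area pi.
elements = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi, (1.0, 0.5, 0.0)),
    ((0.0, 0.0, 2.0), (0.0, 0.0, -1.0), math.pi, (0.0, 0.0, 0.0)),
]
incoming = transfer_pass(elements)
```

The receiver picks up a fifth of the emitter's light, tinted by the emitter's color, while the emitter receives nothing from the dark element. This result is the upper bound that the shadow passes would then reduce.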
14.4 Conclusion
Global illumination techniques such as ambient occlusion and indirect lighting greatly enhance the quality of rendered diffuse surfaces. We have presented a new technique for calculating light transfer to and from diffuse surfaces using the GPU. This technique is suitable for implementing various global illumination effects in dynamic scenes with deformable geometry.
【Nothing to add here.】
14.5 References
Landis, Hayden. 2002. “Production-Ready Global Illumination.” Course 16 notes, SIGGRAPH 2002.
Pharr, Matt, and Simon Green. 2004. “Ambient Occlusion.” In GPU Gems, edited by Randima Fernando, pp. 279–292. Addison-Wesley.
Sloan, Peter-Pike, Jan Kautz, and John Snyder. 2002. “Precomputed Radiance Transfer for Real-Time Rendering in Dynamic, Low-Frequency Lighting Environments.” ACM Transactions on Graphics (Proceedings of SIGGRAPH 2002) 21(3), pp. 527–536.
Tabellion, Eric, and Arnauld Lamorlette. 2004. “An Approximate Global Illumination System for Computer Generated Films.” ACM Transactions on Graphics (Proceedings of SIGGRAPH 2004) 23(3), pp. 469–476.