All

网格生成优化

基于Greedy Mesh算法来生成网格,减少网格点的数量。

https://github.com/roboleary/GreedyMesh

https://github.com/barraudf/Unity-Voxel

 
 

方法的效果视频:

https://www.youtube.com/watch?v=0OZxZZCea8I

 
 

 
 

当前的网格生成渲染流程如下图所示,明显的就是两个方法,首先是生成网格,然后是渲染网格。

 
 

NormalBlockMeshGen.cpp 里面实现了基于体素块生成网格的方法。

void NormalBlockMeshGen::GenMeshImpl(AGE::uint16 *normalblocks, AGE::uint32 *pointlightblocks, const Vector3i& tileid, const Vector3 positionOffset, CubeChunk *pCubeChunk, bool isSecction)

 
 

风险在于贴图映射的时候,由于点的缺失,是否还能保留乐高的特征。

 
 

先来看一下算法实现:

https://0fps.net/2012/06/30/meshing-in-a-minecraft-game/

 
 

 
 

The last post I wrote on Minecraft-like engines got a lot of attention, and so I figured it might be interesting to write a follow up.  This time I’ll try to tackle a different problem:【因为上一篇获得了很多关注,才有了这一篇文章,我们来讲一不同地问题。】

 
 

Meshing

One of the main innovations in Infiniminer (and Minecraft) is that they use polygons instead of raycasting to render their volumes.  The main challenge in using polygons is figuring out how to convert the voxels into polygons efficiently.  This process is called meshing, and in this post I’m going to discuss three different algorithms for solving this problem.   But before doing this, we first need to say a bit more precisely what exactly meshing is (in the context of Minecraft maps) and what it means to do meshing well.  For the sake of exposition, we’ll consider meshing in a very simplified setting:【类似我的世界的体素游戏的创新点在于,他们采用了多边形来替代光线投射来渲染体素,这里带来的一个技术难点就是怎样将体素有效地转化为多边形。这个过程叫做meshing,这里我们将将三种不同的算法来实现这一步,但在这之前我们首先需要解释一下什么叫做好的meshing。为了便于说明,我们将考虑在非常简化的环境中进行网格划分:

  • Instead of having multiple block types with different textures and lighting conditions, we’ll suppose that our world is just a binary 0-1 voxel map.我们假设我们的世界只是一个二进制0-1体素图,而不是具有不同纹理和光照条件的多个块类型。
  • Similarly we won’t make any assumptions about the position of the camera, nor will we consider any level of detail type optimizations.同样,我们不会对相机的位置做任何假设,也不会考虑任何级别的细节类型优化。
  • Finally, we shall suppose that our chunks are stored naively in a flat bitmap (array).最后,我们假设我们的块被天真存储在一个平面位图(数组)中。

 
 

I hope you are convinced that making these assumptions does not appreciably change the core of the meshing problem (and if not, well I’ll try to sketch out in the conclusion how the naive version of these algorithms could be extended to handle these special cases.)【我希望你确信做出这些假设,这就没有明显改变这里讲的meshing核心问题

In a typical Minecraft game chunks do not get modified that often compared to how frequently they are drawn.  As a result, it is quite sensible to cache the results from a mesher and only ever call it when the geometry changes.  Thus, over the lifetime of a mesh, the total amount of computational work it consumes is asymptotically dominated by the cost of rendering.【在体素游戏中,相比于块的渲染,块的修改是没有那么频繁的,因此一个普遍考虑的做法就是需要存储meshing的结果,这个结果只有在集合数据发生变化的时候变化。因此在网格的生命周期中,其消耗的计算量主要集中在渲染开销。】

Recall that at a high level polygon rendering breaks down into two parts: 1.) primitive assembly, and 2.) scan conversion (aka polygon filling).  In general, there really isn’t any way to change the number of pixels/fragments which need to be drawn without changing either the resolution, framerate or scene geometry; and so for the purposes of rendering we can regard this as a fixed cost (this is our second assumption). That leaves primitive assembly, and the cost of this operation is strictly a function of the number of faces and vertices in the mesh.  The only way this cost can be reduced is by reducing the total number of polygons in the mesh, and so we make the following blanket statement about Minecraft meshes:【回想一下,在高级别多边形渲染分为两部分:primitive assembly 和 scan conversion。通常实际上没有任何方法可以在不改变分辨率,帧速率或场景几何的情况下改变需要绘制的 pixels/fragments 的数量,因此scan conversion优化空间有限,可以看作是渲染的固定成本。而 primitive assembly 的成本严格地取决于网格中的面和顶点的数量,因此降低此成本的唯一方法是减少网格中多边形的总数,因此有了网格评估的第一标准:

Criteria 1: Smaller meshes are better meshes.【标准一:mesh越少越好】

Of course, the above analysis is at least a little naive.  While it is true that chunk updates are rare, when they do happen we would like to respond to them quickly.  It would be unacceptable to wait for more than a frame or two to update the displayed geometry in response to some user change.  Therefore, we come to our second (and final) criteria for assessing mesh algorithms:【虽然chunk的更新频率不高,但也是必定存在也不可忽略的。用户对chunk改变的等待超过一两帧是不可接受的,因此我们得出了评估meshing好坏的第二个标准:】

Criteria 2: The latency of meshing cannot be too high.【标准二:meshing的延迟要低】

Speed vs. Quality

Intuitively, it seems like there should be some tradeoff here.  One could imagine a super-mesher that does a big brute force search for the best mesh compared to a faster, but dumber method that generates a suboptimal mesh.  Supposing that we were given two such algorithms, it is not a-priori clear which method we should prefer.  Over the long term, maybe we would end up getting a better FPS with the sophisticated method, while in the short term the dumb algorithm might be more responsive (but we’d pay for it down the road).【速度和质量是一个悖论,一方面想要获得最佳的网格质量需要大开销的蛮力搜索,一方面快速算法只能获得有瑕疵的网格质量。假设我们给出了两个这样的算法,那么我们应该优先选择哪种方法。 从长远来看,也许我们最终会使用复杂的方法获得更好的FPS,而在短期内,快速算法可能会更适合(但我们会为此付出代价)。

Fortunately, there is a simple solution to this impasse.  If we have a fast, but dumb, mesher; we can just run that on any meshes that have changed recently, and mark the geometry that changed for remeshing later on.  Then, when we have some idle cycles (or alternatively in another thread) we can run the better mesher on them at some later point.  Once the high quality mesh is built, the low quality mesh can then be replaced.  In this way, one could conceivably achieve the best of both worlds.【幸运的是这个问题有一个简单的解决方案,首先我们有快速的次优解决方案 ,当发生改变的时候我们首先使用它得到结果;然后当我们有一些空闲周期(或者在另一个线程中)时,我们可以运行更好的解决方案,一旦获得了高质量的解决方案结果,就可以替换低质量的结果。

Some Meshing Algorithms

 
 

The conclusion of this digression is that of the two of these criteria, it is item 1 which is ultimately the most important (and unfortunately also the most difficult) to address.  Thus our main focus in the analysis of meshing algorithms will be on how many quads they spit out.  So with the preliminaries settles, let’s investigate some possible approaches:【首先声明一下的是,上面讲述的两个原则,第一条始终是最重要的。因此我们主要关注在算法将生成的网格数量,下面是一些可行的meshing方法:】

The Stupid Method:

The absolute dumbest way to generate a Minecraft mesh is to just iterate over each voxel and generate one 6-sided cube per solid voxel.  For example, here is the mesh you would get using this approach on a 8x8x8 solid region:【最愚蠢的生成方法就是对于每个体素生成一个立方体,比如下图所示是一个8*8*8的体素单元生成的立方体数量。】

 
 

The time complexity of this method is linear in the number of voxels.  Similarly, the number of quads used is 6 * number of filled voxels.  For example, in this case the number of quads in the mesh is 8*8*8*6 = 3072.  Clearly this is pretty terrible, and not something you would ever want to do.  In a moment, we’ll see several easy ways to improve on this, though it is perhaps worth noting that for suitably small geometries it can actually be workable.【这个方法的时间复杂度与体素数量呈线性关系,也就是体素数量的六倍即是网格数量。比如8*8*8的网格对应的就是8*8*8*6的面数,很明显这个方法相当的可怕,稍后我们将介绍一些方法来改进它。但是对于体素数量不多的情况下这种方法也可以说是可行的。】

The only thing good about this approach is that it is almost impossible to screw up.  For example, I was able to code it up correctly in minutes and it worked the first time (compared to the culling method which took me a few iterations to get the edge cases right.)  Even if you don’t plan on using this approach it can still be handy to have it around as a reference implementation for debugging other algorithms.【这种方法的好处就是实现简单且不易出错,它可以作为我们后续实现方法的参考和基准。】

Culling:

Clearly the above method is pretty bad.  One obvious improvement is to just cull out the interior faces, leaving only quads on the surface.  For example, here is the mesh you get from culling the same 8x8x8 solid cube:【上面的方法的一个很明显的可改进方法就是剔除内部的面,只保留体素整体外部的面,如上面的8x8x8网格提出结果如下图所示:】

 
 

Practically speaking, this method is pretty much the same as the previous, except we have to do a bit of extra book keeping when we traverse the volume.  The reason for this is that now we not only have to read each voxel, but we we also have to scan through their neighbors.  This requires a little more thought when coding, but time complexity-wise the cost of generating this mesh is asymptotically the same as before.  The real improvement comes in the reduction of quads.  Unlike the stupid method, which generates a number of quads proportional to the boundary, culling produces quads proportional to the surface area.  In this case, the total number of faces generated is 8*8*6 = 384 — a factor of 8x better!【实际上这个方法与上面的第一种方法做法几乎相同,区别就是额外做了一遍遍历来记录哪些块是内部的。但是这种方法在时间空间复杂度的度量上面差别不大,这种方法可以有效的减少需要被网格表示的体素数量,例如上面的8*8*8网格,我们得到的面数是8*8*6 = 384,效率提升了8倍。】

It is not hard to see that the culled mesh is strictly smaller than the stupid mesh (and often by quite a lot!).  We can try to estimate how much smaller using a dimensional analysis argument:  assuming that our geometry is more-or-less spherical, let n be its radius.  Then, in the stupid method we would expect to generate about O(n^3) quads, while in the culled  version we’d get only O(n^2).  This gives an improvement of a factor of about O(n), which is pretty good!  So, if our chunks were say 16x16x16, then we might expect the culled volume to have about 16x fewer faces than the naive method (heuristically).  Of course, spherical geometry is in some sense the best possible case.  If your world is really jagged and has lots of holes, you wouldn’t expect much improvement at all.  In the worst case (namely a checkerboard world) the size of the meshes produced would be identical!  In practice, one would usually expect something somewhere in between.【这段解释一下剔除网格的复杂度较小,且生成网格数量严格小于等于第一种方法,废话较多不翻译了。】

Greedy Meshing:

The previous two approaches are probably the most frequently cited methods for generating Minecraft-like meshes, but they are quite far from optimal.  The last method that I will talk about today is a greedy algorithm which merges adjacent quads together into larger regions to reduce the total size of the geometry.  For example, we could try to mesh the previous cube by fusing all the faces along each side together:前两种方法可能是最常被引用的用于生成类似Minecraft的网格的方法,但它们远非最佳。 我今天要讨论的最后一种方法是一种贪婪算法,它将相邻的四边形合并为更大的区域,以减小几何数据的总量。 例如,在立方体网格化的时候我们可以尝试将每侧的所有面融合在一起成为一个面,得到的结果将如下图所示:

 
 

This gives a mesh with only 6 quads (which is actually optimal in this case!)  Clearly this improvement by a factor of 64 is quite significant.  To explain how this works, we need to use two ideas.  First, observe that that it suffices to consider only the problem of generating a quad mesh for some 2D cross section.  That is we can just sweep the volume across once in each direction and mesh each cross section separately:【这个例子的结果将只剩下6个面,这里改进了64倍,为了解释着如何做到的,我们要使用两个想法:首先我们观察到所有的合并仅发生在同一个2d横截面,也就是说我们可以分别对每个横截面来扫描做合并。】

 
 

 
 

This reduces the 3D problem to 2D.  The next step (and the hard one!) is to figure out how to mesh each of these 2D slices.  Doing this optimally (that is with the fewest quads) is quite hard.  So instead, let’s reformulate the problem as a different type of optimization.  The idea is that we are going to define some type of total order on the set of all possible quadrangulations, and then pick the minimal element of this set as our mesh.  To do this, we will first define an order on each quad, which we will then extend to an order on the set of all meshes.  One simple way to order two quads is to sort them top-to-bottom, left-to-right and then compare by their length.【这将3D问题降维到了2D,下一步要做的就是怎样将这2d切片网格化,这是非常难的,因此我们可以将问题转化为不同类型的优化。我们的想法是在所有可能的四边形集合上定义某种类型的总顺序,然后选择该集合的最小元素作为我们的网格。顺序的定义的一种简单方法是从上到下,从左到右排序,然后按长度进行比较。】

 
 

【公式化,我们定义一个四元组(xywh),其中x,y是左上角的点的位置,w,h是这个体素块的宽高,这意思的公式定义如第一个式子。这里我们定义S是一组Q的集合,那么这些Q的顺序可以由上面的第二个公式来确定,这个相对应的伪代码就如上面的最后一个函数所示。】

 
 

 
 

【我们要做的下一件事是将Q的这种排序扩展到网格上的排序,我们以一种非常简单的方式做到这一点。
已有两个排好顺序的Q,顺序规则如上面截图中的表述所示,我们就可以简单的按照字典顺序比较。同样这个新的顺序实际上是一个总顺序(total order),这意味着它具有唯一的最小元素。
在贪婪的网格划分中,我们将把这个最小元素作为我们的网格,我们称之为贪婪网格。 以下是这些按字典顺序排列的最小网格之一的示例:】

 
 

 
 

You may be skeptical that finding the least element in this new order always produces a better quad mesh.  For example, there is no guarantee that the least element in this new order is also the mesh with the fewest quads.  Consider what you might get when quad meshing a T-shape using this method:【你可能会怀疑新顺序中最小元素总会产生更好的四边形网格这件事。
例如,无法保证此新顺序中的最小元素也是具有最少四边形的网格(这里就是说新序列中会产生更好的,但不能确定是最好的)。 考虑使用此方法对T形四边形网格化时可能会得到的结果:】

 
 

 
 

However, we can say a few things.  For example, it always true that each quad in the greedy mesh completely covers at least one edge on the perimeter of the region.  Therefore, we can prove the following:但是我们可以确定的是。
例如,贪婪网格中的每个四边形总是完全覆盖该区域周边的至少一个边缘。 因此我们可以证明以下内容:

 
 

Proposition: The number of quads in the greedy mesh is strictly less than the number of edges in the perimter of the region.【命题:贪婪网格中的四边形数量严格小于该区域的边界中的边数。】

 
 

This means that in the worst case, we should expect the size of the greedy mesh to be roughly proportional to the number of edges.  By a heuristic/dimensional analysis, it would be reasonable to estimate that the greedy mesh is typically about O(Sqrt(n)) times smaller than the culled mesh for spherical geometries (where n is the radius of the sphere as before).  But we can be more precise about how good the greedy mesh actually is:这意味着在最坏的情况下,我们应该期望贪婪网格的大小与边缘的数量大致成比例。
通过启发式/维度分析,估计贪婪网格通常比球形几何的剔除网格小约OSqrtn))倍是合理的(其中n是球体的半径,如前所述)。 但我们可以更精确地了解贪婪网格实际上有多好:

Theorem: The greedy mesh has no more than 8x as many quads than the optimal mesh.定理:贪婪网格的四边形数量不超过最优网格数量的8倍。

That’s within a constant factor of optimality! To prove this, we’ll need to introduce a bit of nomenclature first. We’ll call a vertex on the perimeter of a region convex if it turns to the right when walking around the mesh clockwise, and otherwise we’ll call it concave. Here is a picture, with the convex vertices colored in red and the concave vertices colored in green:这是一个恒定的最优因素!
为了证明这一点,我们首先需要引入一些命名法。 我们将在一个区域的周边调用一个顶点凸起,如果它顺时针绕网格向右转,否则我们将其称为凹面。 这是一张图片,凸顶点为红色,凹顶点为绿色:

 
 

 
 

 
 

There’s a neat fact about these numbers.  It turns out that if you sum them up, you always get the same thing:【根据这些内容你可以总结到的是】

 
 

【对于任何的有凹凸顶点的连通区域,凸点凹点 = 4

 
 

This can be proven using the winding number theorem from calculus. But we’re going to apply this theorem to get a lower bound on the number of quads in an arbitrary mesh:【这可以使用微积分的匝数定理来证明。
但是我们将应用这个定理来获得任意网格中四边形数量的下限:】

 
 

【任何有E条边的连通区域G至少需要
(E+2G+4/2 四边形网格】

 
 

【详细证明】

 
 

To prove our main theorem, we just combine these two lemmas together, using the fact that G >= 0.  In summary, we have the following progression:为了证明我们的主要定理,我们只是将这两个引理组合在一起,并且使用了
G >= 0 这个事实。总之我们有以下进展:

【最优网格
优于等于 贪婪网格 优于等于 Culled网格 优于等于 Stupid网格】

 
 

 
 

So now that I’ve hopefully convinced you that there are some good theoretical reasons why greedy meshing outperforms more naive methods, I’ll try to say a few words about how you would go about implementing it. The idea (as in all greedy algorithms) is to grow the mesh by making locally optimal decisions. To do this, we start from an empty mesh and the find the quad which comes lexicographically first. Once we’ve located this, we remove it from the region and continue. When the region is empty, we’re done! The complexity of this method is again linear in the size of the volume, though in practice it is a bit more expensive since we end up doing multiple passes. If you are the type who understands code better than prose, you can take a look at this javascript implementation which goes through all the details.【所以现在我希望你确信理论上为什么贪婪的网格划分优于前面的方法,我会试着谈谈你将如何实现它。
这个想法(如同所有贪婪算法一样)是通过制定局部最优决策来增长网格。 为此,我们从一个空网格开始,找到首先按字典顺序排列的四边形。 一旦我们找到了这个,我们将其从该地区删除并继续。 当该地区空无一人时,我们就完成了! 这种方法的复杂性在体积大小上也是线性的,但实际上由于我们最终做了多次传递,因此它有点贵。 你可以看看这个javascript实现,它贯穿所有细节。】

 
 

Demo and Experiments

 
 

To try out some of these different methods, I put together a quick little three.js demo that you can run in your browser:【这边做了个demo】

 
 

Here are some pictures comparing naive meshing with the greedy alternative:【这边结果对比】

 
 

As you can see, greedy meshing strictly outperforms naive meshing despite taking asymptotically the same amount of time.  However, the performance gets worse as the curvature and number of components of the data increase.  This is because the number of edges in each slice increases.  In the extreme case, for a collection of disconnected singleton cubes, they are identical (and optimal).正如您所看到的,尽管采用渐近相同的时间,但贪婪网格划分严格优于天真网格划分。
然而,随着数据的曲率和组件数量的增加,性能变差。 这是因为每个切片中的边缘数量增加。 在极端情况下,对于断开连接的单个立方体的集合,它们是相同的(并且是最佳的)。

 
 

 
 

Conclusion and some conjectures:

 
 

 
 

Well, that’s it for now.  Of course this is only the simplest version of meshing in a Minecraft game, and in practice you’d want to deal with textures, ambient occlusion and so on (which adds a lot of coding headaches, but not much which is too surprising).  Also I would ask the reader to excuse the fairly rough complexity analysis here.  I’d actually conjecture that the true approximation factor for the greedy algorithm much less than 8, but I’ve spent enough time writing this post so I guess I’ll just leave it as is for now.  I think the real weakness in my analysis lies in the computation of the upper bound on the size of greedy mesh.  I’d actually guess the following is true:好吧现在就是这样。
当然这只是Minecraft游戏中最简单的网格划分版本,在实践中你需要处理纹理,环境遮挡等等(这会增加许多编码问题,但不会太多,这太令人惊讶了)。 另外,我想请读者原谅这里相当粗略的复杂性分析。 我实际上猜想贪婪算法的真正近似因子远小于8,但我已经花了足够的时间写这篇文章,所以我想我现在就把它留下来。 我认为我的分析中真正的弱点在于计算贪婪网格大小的上限。 我实际上猜测以下是真的:

 
 

【贪婪网格大小的上限是边数量的一半】

 
 

 
 

If this conjecture holds true, then it would reduce the current bounds on the approximation factor to 4x instead of 8x.  It’s also not clear that this is the best known greedy approximation.  I’d be interested to learn how Minecraft and other voxel engines out there solve this problem.  From my time playing the game, I got the impression that Minecraft was doing some form of mesh optimization beyond naive culling, though it was not clear to me exactly what was going on (maybe they’re already doing some form greedy meshing?)如果这个猜想成立,那么它会将近似因子的当前界限减小到4x而不是8x。 目前还不清楚这是最好的贪婪近似。 我有兴趣了解Minecraft和其他体素引擎如何解决这个问题。 从我玩游戏的那一刻起,我得到的印象是Minecraft正在进行某种形式的网格优化,而不是简单的剔除,虽然我不清楚到底发生了什么(也许他们已经做了一些形式贪婪的网格划分?)

 
 

It is also worth pointing out that the time complexity of each of these algorithms is optimal (ie linear) for a voxel world which is encoded as a bitmap.  However, it would be interesting to investigate using some other type of data structure.  In the previous post, I talked about using run-length encoding as an alternative in-memory storage of voxels.  But greedy meshing seems to suggest that we could do much better:  why not store the voxels themselves as a 3D cuboid mesh?  Doing this could drastically reduce the amount of memory, and it is plausible that such a representation could also be used to speed up meshing substantially.  Of course the additional complexity associated with implementing something like this is likely to be quite high.  I’ll now end this blog post with the following open problem:还值得指出的是,对于被编码为位图的体素世界,这些算法中的每一个的时间复杂度是最佳的(即线性的)。
但是,使用其他类型的数据结构进行调查会很有趣。 在上一篇文章中,我谈到了使用行程编码作为体素内存存储的替代方法。 但贪婪的网格化似乎表明我们可以做得更好:为什么不将体素自身存储为3D长方体网格? 这样做可以大大减少内存量,并且这种表示也可以用于大幅加速网格划分,这似乎是合理的。 当然,与实现这样的事情相关的额外复杂性可能非常高。 我现在将结束这篇博文,其中包含以下未解决的问题:

 
 

Open Problem: Do there exist efficient data structures for dynamically maintaining greedy meshes?【开放问题:是否存在用于动态维护贪婪网格的高效数据结构?】

 
 

 
 

https://0fps.net/2012/07/07/meshing-minecraft-part-2/【第二部分】