Fade2D is a numerically robust and extremely fast C++ Delaunay triangulation library. Certainly you can verify that on your own hardware with the below Delaunay triangulation benchmark. You can find this code in examples2D/ex1_benchmark.cpp.
Benchmark Code
// * Init * std::vector<std::pair<std::string,int> > vNumPoints; vNumPoints.push_back(make_pair("numPoints: 1k",1000)); vNumPoints.push_back(make_pair("numPoints: 50k",50000)); vNumPoints.push_back(make_pair("numPoints: 100k",100000)); vNumPoints.push_back(make_pair("numPoints: 500k",500000)); vNumPoints.push_back(make_pair("numPoints: 1 mio",1000000)); vNumPoints.push_back(make_pair("numPoints: 10 mio",10000000)); //vNumPoints.push_back(make_pair("numPoints: 50 mio (10.7 GB)",50000000)); //vNumPoints.push_back(make_pair("numPoints: 100 mio (21.3 GB)",100000000)); //vNumPoints.push_back(make_pair("numPoints: 250 mio (53 GB)",250000000)); //vNumPoints.push_back(make_pair("numPoints: 500 mio (109 GB)",500000000)); // * Test * for(size_t i=0;i<vNumPoints.size();++i) { std::string label(vNumPoints[i].first); int numPoints(vNumPoints[i].second); // * Step 1 * Create a Fade object, make performance settings Fade_2D dt; dt.setFastMode(false); // Use 'true' for raster points (performance!) int numUsedCPU=dt.setNumCPU(0); // 0 means: autodetect // * Step 2 * Prepare a vector of random points vector<Point2> vInPoints; generateRandomPoints(numPoints,0,100,vInPoints,1); // * Step 3 * Measure the triangulation time cout<<"\n"<<label<<": start"<<endl; double elapsed=dt.measureTriangulationTime(vInPoints); cout<<"Elapsed time (cores: "<<numUsedCPU<<"): "<<elapsed<<endl<<endl; }
- Step 1 in the above code creates an empty Fade_2D object and makes two important performance settings:
Fade_2D::setFastMode(bool b)
enables/disables a triangulation mode that greatly speeds up the triangulation of raster points.Fade_2D::setNumCPU(int numCPU)
sets the number of CPU cores to be used. The special valuenumCPU=0
means autodetect and use the available cores.
- Step 2 prepares a vector of random points
- Step 3 inserts the points and measures the triangulation time.
“The fast triangulation mode avoids computationally expensive predicates, which are especially frequent for points of a regular grid. These are rare for other settings and thus only triangulation of raster data is accelerated by this mode.”
“By default Fade runs single-threaded. This avoids nested multithreading in parallel applications. Thus, if you want multithreading, you need to activate it explicitly using
Fade_2D::setNumCPU(..)
.”
Single- and multithreaded Delaunay triangulation benchmark
The below diagram shows benchmark results for up to 250 million points. This is equivalent to 500 million output triangles. The observed runtime is almost linear to point cloud’s size. More precisely, the upper bound on the algorithm’s runtime is O(n*log(n)). The multithreading-speedup on a 6-core CPU is 4.39.

Raspberry PI2 – 1 Million Points on an ARM Cortex A7
Fade2D also runs on Raspberry PI computers. The RPI2 uses a Broadcom BCM2836 SoC, which contains a 32-bit ARM Cortex-A7 CPU. This small system gets by with 3 watts of power. The more astonishing is that this small board could triangulate one million points (2 million triangles) in 7 seconds.

Get true Delaunay Benchmark Results
- Compile the Delaunay triangulation benchmark as x64 application in Release mode. For better performance and because 32-bit applications suffer from the (MSVC 32-bit) 2GB memory limit.
- Run the test directly in the command prompt, not in a debug window of your IDE.
- Do you have enough RAM for the benchmark? On 64-bit systems you need 212 MB per million points which correspond to 2 million triangles.
- Prevent cpu throttling: Disable energy saving options during the test. Furthermore, use the Performance Governor under Linux and if you use a laptop, connect the power cord.
- Fade still supports very old environments (VS2010, CentOS6.4) but multithreading is disabled there.
The next article Example2 – Traversing teaches you the important basics you need to access specific elements in a triangulation.
2 replies on “2D Delaunay Triangulation Benchmark – Example1”
Hey there!
Have you ever thought about parallelizing your algorithm on the gpu?
Making Fade2D even faster would be nice. But it would also add dependencies and thereby worsen the usability and compatiblity. Another thought is that Fade2D creates already ~15 mio. triangles per second (on recent hardware) and I think only very few projects have so much data that they require a new algorithm.