Comparing performance of GDAL and Proj.NET

There are two major free/open source projection libraries available for use under .NET: GDAL and Proj.NET.

GDAL has a long history and is written in C++, so it is compiled to machine code. The C# bindings are generated by a tool (SWIG). Calling native code from C# uses a mechanism called Platform Invoke (“P/Invoke”), which adds a certain amount of overhead to each call (between 10 and 30 instructions, before any marshalling of arguments).

Proj.NET is written entirely in managed code, so there is almost no overhead per call. It’s a much younger project, but has an active development community.

I wanted to compare the performance of these two projection libraries because my application needs to do spatial transforms, so I ran the following experiment.

Preliminaries

  • I wrote a wrapper library in C# with appropriate interfaces that could call Proj.NET, GDAL, and some of my own hand-rolled projection classes.
  • I downloaded free topographic data for Canada from GeoGratis and GeoBase.
    • The data consists of 20 meter contours, roads, lakes, rivers, cities and other basic geographic features for all of Canada.
    • Each file has a variable number of points, lines and polygons of various sizes
  • The data is in geographic (latitude/longitude) coordinates, EPSG:4326. For display I projected it into EPSG:3857 (Web Mercator).
  • I wrote a C# program to load and convert a subset of 166 files containing 6.8 million items and 211 million points
    • each file contained from a few hundred to 4.2 million items
      • mean number of points: 1.2 million per file
      • mean number of items: 41,000 per file
    • timing was done on a per-file basis
      • time-per-item derived as an average on a per-file basis
    • the timer was started just before projection began and stopped right after it ended
      • file load times were not included
    • This gives a time/item value for each file
      • mean time/file was 6.7 seconds
  • I used the System.Diagnostics.Stopwatch class to do the timing.
    • the timer's resolution (ticks, far finer than a millisecond when a high-resolution counter is available) is much smaller than the measured durations
    • projections took from 1000 to 6000 milliseconds
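
The timing scheme described in the list above can be sketched roughly as follows. The original program was written in C# and used System.Diagnostics.Stopwatch; this sketch is in Python, with time.perf_counter standing in for the stopwatch. The names time_projection and project_item, and the layout of items as lists of points, are hypothetical and used only for illustration.

```python
import time


def time_projection(items, project_item):
    """Time the projection of one file whose items are already loaded.

    items: list of items, each a list of (x, y) points (assumed layout).
    project_item: callable that projects all points of one item in one call.
    Returns (elapsed_seconds, seconds_per_item).
    """
    # Start the clock just before projection begins, so that file
    # loading is not part of the measurement.
    start = time.perf_counter()
    for item in items:
        project_item(item)
    elapsed = time.perf_counter() - start

    # Time per item is derived as an average over the whole file.
    per_item = elapsed / len(items) if items else 0.0
    return elapsed, per_item
```

Running this once per file and per library gives one time/item value for each file, which is the quantity compared in the results.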

The data represents a real-world problem: projecting a file from one SRS (spatial reference system) to another. The number of items in each file, the type of item, their spatial variability, and the number of points per item all come from a real-world map.
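
For readers unfamiliar with the transform being timed, here is a minimal sketch of the forward conversion from EPSG:4326 longitude/latitude to EPSG:3857, which is the spherical Mercator formula on a sphere of radius 6378137 m. It is written in Python for brevity and is not how GDAL or Proj.NET implement it; the function name to_web_mercator is my own.

```python
import math

# EPSG:3857 treats the Earth as a sphere whose radius equals the
# WGS 84 semi-major axis.
EARTH_RADIUS_M = 6378137.0


def to_web_mercator(lon_deg, lat_deg):
    """Project one longitude/latitude pair (degrees) to EPSG:3857 metres.

    Valid for latitudes strictly between -90 and 90 degrees.
    """
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y
```

Each point costs one tangent and one logarithm, which is why the work is dominated by floating-point math rather than memory access.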

For the GDAL library, I loaded all the points for a single item into an array and marshalled it for P/Invoke. When executing a platform invoke, it’s obviously better to have fewer calls with more marshalled data, but I exerted no special effort to optimize this. I called the projection function on a per-item basis, so whether an item has one point or 1 million points, it is a single function call. The same is true of the Proj.NET library.
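
To put the batching in perspective, here is a back-of-the-envelope sketch in Python using the totals from this data set and the 10 to 30 instructions of call overhead quoted earlier. It only counts the fixed per-call cost and ignores the cost of marshalling the data itself, so treat the numbers as rough bounds rather than measurements.

```python
ITEMS = 6_800_000       # total items in the 166 files
POINTS = 211_000_000    # total points in the 166 files
OVERHEAD_INSTRUCTIONS = (10, 30)  # quoted per-call overhead range


def call_overhead(calls, per_call=OVERHEAD_INSTRUCTIONS):
    """Return (low, high) total overhead in instructions for `calls` calls."""
    low, high = per_call
    return calls * low, calls * high


# One call per item (what the experiment did) versus one call per point.
per_item = call_overhead(ITEMS)
per_point = call_overhead(POINTS)
mean_points_per_item = POINTS / ITEMS

print(f"per-item batching:  {per_item[0]:,} to {per_item[1]:,} instructions")
print(f"per-point calls:    {per_point[0]:,} to {per_point[1]:,} instructions")
print(f"mean points per item: {mean_points_per_item:.1f}")
```

With about 31 points per item on average, batching per item cuts the number of calls by roughly that factor compared with calling once per point.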

Results

This graph isn’t too surprising. GDAL has better performance overall, as you would expect from a library that does numerically intensive work in native machine code. At first glance it may seem that the difference in performance grows with the number of points, but the graph shows an absolute difference in time.

This chart shows the difference in time between Proj.NET and GDAL as a ratio. When expressed this way you can see that GDAL’s advantage is variable, but always positive (GDAL is faster). In fact, the mean performance increase is 20.9%.
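
A relative difference of this kind can be computed per file along the following lines. This is a Python sketch, and it rests on an assumption: the difference is normalized by the Proj.NET time, which the post does not state explicitly. The function names are hypothetical.

```python
def relative_difference(t_projnet, t_gdal):
    """Relative time difference for one file.

    Assumption: the difference is divided by the Proj.NET time.
    Positive means GDAL was faster; negative means Proj.NET was faster.
    """
    return (t_projnet - t_gdal) / t_projnet


def mean_relative_difference(timings):
    """Average the per-file relative differences.

    timings: list of (t_projnet, t_gdal) pairs, one pair per file.
    """
    ratios = [relative_difference(p, g) for p, g in timings]
    return sum(ratios) / len(ratios)
```

Under this definition, a file that takes 5.0 seconds with Proj.NET and 4.0 seconds with GDAL has a relative difference of 0.2, or 20%. Dividing by the GDAL time instead would give 25% for the same file, so the choice of denominator matters when reading the percentages.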

I suspect that the variability in performance for smaller numbers of points is attributable to the number of points per item. As mentioned above, the points are batched for projection per item. When charted, there appears to be more variability for items with 15 to 25 points. Above this number, the % difference in speed between GDAL and Proj.NET appears to be fairly stable. I am not too interested in the cause of the variability since it is all positive (GDAL faster), with some projections happening almost twice as fast as with Proj.NET.

Conclusion

The performance of GDAL and Proj.NET was tested against a real-world data set. I projected 166 files with 6.8 million items and a total of 211 million points, and compared the time to project points on a per-item basis. The comparison shows that GDAL is always faster than Proj.NET, by about 20% on average, with some projections being much faster for items that have between 15 and 25 points.

More info:

The entire data set and all charts are available here.
