The Fallacies of Free Data

Because we publish GIS data libraries, you might expect us to favor paid subscriptions for maintained data libraries. You would be right, and there are important behind-the-scenes reasons for this. Customers often try to compile the data themselves at first, but they are frustrated by the variety of formats and coordinate systems, integration problems, cybersecurity issues and a lack of seamlessness. Moreover, data schemas can drift from month to month, making data updates a Sisyphean task.
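To make the schema drift problem concrete, here is a minimal sketch of the kind of check a data consumer might run before loading a new monthly delivery. It is an illustration only, not a description of any particular production workflow. The field names, the type labels and the `schema_drift` helper are all hypothetical.

```python
def schema_drift(expected, observed):
    """Compare two {field_name: type_label} mappings and report the differences."""
    expected_names = set(expected)
    observed_names = set(observed)
    shared = expected_names & observed_names
    return {
        # Fields we relied on that are no longer delivered.
        "missing": sorted(expected_names - observed_names),
        # Fields that appeared without notice.
        "added": sorted(observed_names - expected_names),
        # Fields that kept their name but changed type.
        "retyped": sorted(name for name in shared if expected[name] != observed[name]),
    }


# Hypothetical well-data schemas from two consecutive monthly downloads.
last_month = {"API_NUMBER": "str", "WELL_NAME": "str", "TOTAL_DEPTH": "float"}
this_month = {"API_NUMBER": "str", "WELLNAME": "str", "TOTAL_DEPTH": "str"}

drift = schema_drift(last_month, this_month)
print(drift)
```

In this made-up example, a renamed field shows up as one missing field plus one added field, and a depth column that silently became text is flagged as retyped. Each of these would break a load script that assumed last month's layout.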

We are completely in favor of government organizations publishing all of their data as widely and as freely as possible. However, we find that download methodologies come and go with disturbing frequency, particularly as resources are redeployed during the pandemic. WhiteStar spends time and effort keeping track of more than 3,142 counties and county-equivalents in the USA, as well as dozens of federal government websites. We want customers to be able to reliably consume GIS data in a predictable format, coordinate system and database schema.

We also cheerfully research issues in the authoritative source data we provide. Customers often want to know the history and data collection processes used to compile the data. For example, why are attribute fields in well data not fully populated in some cases? How do the X/Y coordinates within the data relate to longitude/latitude? How was the raster map georeferenced, and to which base? Customers love having access to WhiteStar personnel willing to liaise with authoritative source data providers. Do you have better data internally? We can incorporate it into the master data for you so that you do not have to manage it.

Public data often have integration issues across jurisdictions as well. For example, a state’s authority stops at its border, and its data may not transition cleanly into the adjacent state(s). As shown in the accompanying graphic, we find gaps and overlaps in land survey data that must be inspected and resolved before the GIS data can be used for project map generation.

At the Texas, Arkansas and Louisiana triple point, land survey data sets from government sources (light green) do not come together. The red lines show the integration work that WhiteStar performs to make the data seamless.
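As a rough illustration of how such border mismatches can be detected, the sketch below compares the vertices of one jurisdiction's border line against those of its neighbor and flags any vertex with no counterpart within a tolerance. It is a simplified vertex-to-vertex check under stated assumptions, not a description of WhiteStar's integration process. The `edge_mismatches` helper, the coordinates and the one-meter tolerance are hypothetical, and the points are assumed to be in a shared projected coordinate system measured in meters.

```python
import math


def edge_mismatches(border_a, border_b, tolerance):
    """Return (vertex, distance) for each vertex of border_a whose nearest
    vertex in border_b is farther away than the tolerance."""
    flagged = []
    for vertex in border_a:
        nearest = min(math.dist(vertex, other) for other in border_b)
        if nearest > tolerance:
            flagged.append((vertex, nearest))
    return flagged


# Hypothetical shared border as published by two neighboring states.
state_a_border = [(0.0, 0.0), (0.0, 100.0), (0.0, 200.0)]
state_b_border = [(0.0, 0.0), (0.4, 100.0), (12.0, 200.0)]

mismatches = edge_mismatches(state_a_border, state_b_border, tolerance=1.0)
for vertex, distance in mismatches:
    print(f"Vertex {vertex} is {distance:.1f} m from the neighboring data set")
```

A check like this only says where the two sources disagree. Deciding whether a flagged location is a gap or an overlap, and which source to trust, still requires inspection of the geometry and the survey records.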

For better or for worse, governmental organizations publish data in formats ranging from PDF files to Access tables to Esri File Geodatabases. WhiteStar knows that customers want to consume data in a curated and consistent format, using a consistent coordinate system, and ideally as a web stream that can easily be added to any GIS or CAD system.

Robert C. White, Jr.
President and CEO
WhiteStar Corporation
