I'll point out that centroid records can have negative impacts on research. Last year I went along on a 2-day field trip with a researcher who had extracted Arizona specimen records from SCAN online in order to map out the itinerary. At four of the first six sites we visited, we failed to find even a trace of proper habitat, or even road access, so I finally asked him to show me the records from which he had derived the extremely precise georeferences (to 8 decimal places) he was using. I saw immediately that the verbatim label data for the four troublesome spots gave nothing other than the name of the county. Clearly, these georeferences were county centroids, and they accounted for nearly half of the 15 data points he was using. We wasted half a day and a lot of gas driving to places that had nothing at all to do with the insects we were looking for.

Yes, a more astute researcher who knew about databasing would have noticed that the verbatim label data for many of those records were skimpy, but there was also no big red warning flag on any of those records to alert people to their untrustworthiness.

I'll also note that simply using a large error radius for a georeferenced point is not enough, especially when some online sources don't include an error radius field. In the database I manage, we keep the verbatim label data in one field (which we do NOT upload) and the actual location data in a separate field (which we DO upload). For any specimen record whose label gives only a county (or province, or state, or country), the location field is left COMPLETELY BLANK, so when the data are served online, it is immediately obvious that there is no genuine location data, despite there being a georeference.
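
For anyone consuming data from a source that follows a blank-location convention like the one described above, here is a minimal Python sketch of the check a researcher could run before planning an itinerary. It is only an illustration: the field names (`locality`, `decimalLatitude`, `decimalLongitude`) are borrowed from Darwin Core as an assumption about what an export might contain, the function names are made up for this example, and the sample records are invented, not real SCAN data.

```python
def has_value(value):
    """True if a field holds something other than None or blank text."""
    return value is not None and str(value).strip() != ""


def lacks_genuine_locality(record):
    """Flag a record that carries coordinates but no locality text.

    Assumes (hypothetically) that the data provider leaves `locality`
    blank whenever the label gives only a county, state, or country.
    """
    has_coords = has_value(record.get("decimalLatitude")) and has_value(
        record.get("decimalLongitude")
    )
    return has_coords and not has_value(record.get("locality"))


# Invented records for illustration only.
records = [
    {"id": "A1", "county": "Pima", "locality": "",
     "decimalLatitude": 32.1, "decimalLongitude": -111.8},
    {"id": "A2", "county": "Cochise", "locality": "canyon trailhead, 5 mi W of town",
     "decimalLatitude": 31.9, "decimalLongitude": -109.3},
    {"id": "A3", "county": "Pima", "locality": None,
     "decimalLatitude": None, "decimalLongitude": None},
]

# IDs of records whose coordinates should not be trusted as collecting sites.
flagged = [r["id"] for r in records if lacks_genuine_locality(r)]
print(flagged)
```

A check like this only works if the provider actually leaves the location field blank for county-only labels; for sources that copy the county name into the location field, it would miss exactly the records that caused the trouble on that field trip.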

There are ways to be more proactive about preventing the misapplication of data by data consumers.


