Section Three: Sources of Error in Data

Creating new data and manipulating existing data are both potential sources of error, in field data collection and in the GIS alike. Now that we have looked at the types of potential error, let’s examine the actions that can create those errors.

8.3.1: Source of Data Error - Data Creation

Creating data from scratch, as we saw in Chapter Six, is performed in the field as well as on a computer, and every step of the data creation process is a potential source of error. While data completely free from error is nearly impossible to achieve, being aware of the sources of error will help minimize the problems that may arise at a later date.

Data from Field Collection

Source | Result
Measurement Error | Measurement error results from misuse of field equipment. If the data collection equipment is out of calibration, accurate measurements cannot be made, and this initial-phase error will cascade to the final product.
Data Acquisition Error | When data is collected by inexperienced technicians, or when experienced techs are sloppy or lazy with their measurements, the resulting data will be inaccurate. The degree of error is also extremely variable, since the skilled user might be more “on it” for one measurement than for another, or the inexperienced technician might get lucky with a measurement.
Attribution Error | Attribution error results from field technicians not recording values properly, either on the data sheet or in an application such as ArcPad. When using technology, most software has a means to control values to reduce this error. Field attribution error is also a source of consistency and completeness errors.
Natural Variation | Natural variation error comes from exactly that: natural variation. Some equipment will change calibration with changes in elevation, temperature, and humidity, so understanding how to properly use equipment and knowing when to recalibrate is crucial to accurate and precise data collection.
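
The value controls mentioned under Attribution Error can be pictured as a simple check run on each record before it is saved. The sketch below is only an illustration in plain Python, not how ArcPad or any other application implements it. The field names (surface, width_m), the list of allowed values, and the width range are all hypothetical.

```python
# Hypothetical value controls for a trail survey. Every name and limit here
# is an assumption made for illustration.
ALLOWED_SURFACES = {"paved", "gravel", "dirt"}   # list of permitted values
WIDTH_RANGE_M = (0.3, 10.0)                      # permitted numeric range


def check_record(record):
    """Return a list of attribution problems found in one field record."""
    problems = []

    surface = record.get("surface")
    if surface in (None, ""):
        problems.append("surface is missing (completeness)")
    elif surface not in ALLOWED_SURFACES:
        problems.append(f"surface '{surface}' is not an allowed value (consistency)")

    width = record.get("width_m")
    if width is None:
        problems.append("width_m is missing (completeness)")
    elif not WIDTH_RANGE_M[0] <= width <= WIDTH_RANGE_M[1]:
        problems.append(f"width_m {width} is outside the allowed range")

    return problems


records = [
    {"surface": "gravel", "width_m": 1.5},   # passes both checks
    {"surface": "Gravle", "width_m": 1.5},   # typo in a text value
    {"surface": "dirt", "width_m": 150},     # likely centimeters entered as meters
    {"surface": "", "width_m": None},        # left blank in the field
]

for number, record in enumerate(records, start=1):
    print(number, check_record(record) or "ok")
```

Checks like these catch values that are impossible or missing, but they cannot catch a value that is allowed and still wrong, such as "gravel" recorded for a paved trail.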

Data Collected in the GIS

Projection Error | We learned in Chapter Two that choosing the correct projection for a project is based upon the idea of a “best fit” representation. We would not use a global projection when we are looking for local accuracy, and vice versa. We know that a global projection may be several meters off in a local area. Thus, when we digitize features in the wrong projection, the accuracy of our data is off in equal proportion. ArcGIS has a “Project” tool that will mathematically move the nodes from one projection to another when a layer is input and the two projections are defined.
Data Acquisition Error | Similar to the data acquisition errors resulting from inexperienced or half-hearted field technicians, the same errors can be present when digitizing data. A new GIS technician may not yet have the skills to digitize properly, or an experienced technician may rush through their work, making accuracy errors as they go.
Attribution Error | Attribution error at the computer is similar to the error from field collection. While field collection error may result more from challenges in the process of entering values into the table, computer-based error comes from mistyping values (you can use Field Calculator to help reduce error) or inserting values into the wrong cell.
Data Source Error | We’ve mentioned the idea of “cascading error” a few times now. Errors resulting from the use of lower-quality data as the source are an example of cascading error. Whether you are creating data from external vector or raster layers, when the source data is unreliable, the resulting data will be unreliable too.
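
To put a rough number on the projection error described above, consider a spherical Mercator projection, used here as an assumed example of a global projection. On that projection, distances are stretched by a factor of 1/cos(latitude), so a length measured in map units overstates the ground length everywhere except at the equator. The function names below are hypothetical, and the sketch ignores the ellipsoid, so treat the results as approximate.

```python
import math


def mercator_scale_factor(latitude_deg):
    """Scale factor of a spherical Mercator projection at a given latitude."""
    return 1.0 / math.cos(math.radians(latitude_deg))


def ground_length_m(map_length_m, latitude_deg):
    """Approximate ground length of a short line measured in Mercator map units."""
    return map_length_m / mercator_scale_factor(latitude_deg)


# A fence line that measures 100 map meters, digitized at several latitudes.
for latitude in (0, 20, 40, 60):
    ground = ground_length_m(100.0, latitude)
    print(f"latitude {latitude:2d}: 100.0 map m is about {ground:.1f} m on the ground")
```

The same digitized length corresponds to a different ground length at each latitude, which is why a projection chosen for the whole globe is a poor fit for local measurement.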

8.3.2: Source of Data Error - Data Manipulation

So far, we’ve looked at the error which results from the collection process, both in the field and during the digitizing process. But the sources of error don’t stop there. Every time you interact with your data by editing or running any sort of geoprocessing tool, you create an opportunity to introduce error.

In an earlier chapter, we defined a fundamental change to data in ArcGIS as any time the data itself is actually changed (attributes, coordinate systems, features within a single feature class, etc.), as opposed to a visual change in the data (changing the symbology, changing the coordinate system of a data frame to examine data projected on the fly, etc.). Fundamental changes are usually accompanied by a confirmation box making sure the user understands that the change they are about to make is a permanent one. When you change the coordinate system of a data frame, it only changes the way you view your data; no permanent change occurs. When you run the Project tool, by contrast, the output layer being created is stored, not just displayed, in a different coordinate system. When you are editing your data, either the attributes or the features, you are making a fundamental change to that data, and any time a fundamental change is made to any sort of data, you create an opening for error to creep in.

For example: you want to run the Project tool to create a new layer in a different projection. The last time you ran the tool, the geographic transformation box automatically populated, but this time, it’s not doing that. You don’t really know what a “geographic transformation” is, and you don’t really feel like looking it up, so you just pick the first one in the list. I mean, if ArcGIS put it first in the list, it must be the most correct option. You run the tool, it completes successfully, and you move on with your project. Months later, when the field team heads out to install the fence that you mapped, they arrive and find the actual location to be roughly 10 meters away from where the map shows it - the map you spent weeks working on. Was the error from the geographic transformation? You will probably never know, but what you do know is your map put the whole project on hold and cost the city thousands of dollars (another example of cascading error). Understanding what a geographic transformation is and utilizing one correctly when needed is a key skill of a good GIS technician.

A geographic transformation is a means to change data from a projection that uses one geographic coordinate system, such as World Loximuthal, which uses WGS 84, to a projection that uses another, such as State Plane Colorado Central, which uses NAD27. Think of the geographic transformation as the way to tell the software exactly which set of automagical calculus equations to use. And there is a way to look it up: there is a PDF in the “Documents” folder that was loaded onto the computer when ArcGIS was installed. ArcGIS doesn't list the "right" one first; it filters out the geographic transformations which could never be the right choice and presents the user, in alphabetical order, a list of the remaining possibilities.
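
To see why the choice matters, the sketch below applies one simple kind of geographic transformation, a three-parameter geocentric translation, to a point in Colorado. It is not what ArcGIS runs internally, and the function names are hypothetical. The shift values are commonly published mean parameters for the contiguous United States, used here as an assumption for illustration only; they are not necessarily the right choice for any particular project area.

```python
import math

# Ellipsoid parameters: semi-major axis (m) and flattening.
CLARKE_1866 = (6378206.4, 1 / 294.9786982)   # ellipsoid used by NAD27
WGS_84 = (6378137.0, 1 / 298.257223563)      # ellipsoid used by WGS 84

# Assumed X, Y, Z shift (m) from NAD27 to WGS 84: published mean values for
# the contiguous United States. Illustrative only.
NAD27_TO_WGS84_SHIFT = (-8.0, 160.0, 176.0)


def geodetic_to_geocentric(lat_deg, lon_deg, height_m, ellipsoid):
    """Convert latitude/longitude/height to Earth-centered X, Y, Z (m)."""
    a, f = ellipsoid
    e2 = f * (2 - f)
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + height_m) * math.sin(lat)
    return x, y, z


def geocentric_to_geodetic(x, y, z, ellipsoid):
    """Convert Earth-centered X, Y, Z back to latitude/longitude (degrees)."""
    a, f = ellipsoid
    e2 = f * (2 - f)
    p = math.hypot(x, y)
    lon = math.atan2(y, x)
    lat = math.atan2(z, p * (1 - e2))  # first guess, refined below
    for _ in range(10):
        n = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)
        height = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1 - e2 * n / (n + height)))
    return math.degrees(lat), math.degrees(lon)


def nad27_to_wgs84(lat_deg, lon_deg):
    """Shift a NAD27 position (height assumed 0) onto WGS 84."""
    x, y, z = geodetic_to_geocentric(lat_deg, lon_deg, 0.0, CLARKE_1866)
    dx, dy, dz = NAD27_TO_WGS84_SHIFT
    return geocentric_to_geodetic(x + dx, y + dy, z + dz, WGS_84)


def offset_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance between two nearby lat/lon pairs."""
    earth_radius = 6371000.0
    north = math.radians(lat2 - lat1) * earth_radius
    east = math.radians(lon2 - lon1) * earth_radius * math.cos(math.radians(lat1))
    return math.hypot(north, east)


nad27_point = (39.0, -105.0)  # an arbitrary point in Colorado
wgs84_point = nad27_to_wgs84(*nad27_point)
print(f"NAD27 {nad27_point} -> WGS 84 ({wgs84_point[0]:.6f}, {wgs84_point[1]:.6f})")
print(f"offset if the transformation is skipped: {offset_m(*nad27_point, *wgs84_point):.0f} m")
```

The same latitude and longitude numbers land tens of meters apart depending on which datum they are read in. Skipping the transformation, or picking one meant for a different region, leaves an offset of this kind baked into the output layer.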

Figure 8.2: Geographic Transformations
The NAD 83 to WGS 84 geographic transformation list includes several versions, labeled only by a number. The geographic transformation list notes the areas in which each transformation should be used. This graphic illustrates only some of the areas for which each transformation is correct.