Section Three: Proximity Analysis

Like Overlay Analysis, Proximity Analysis applies to both vector and raster datasets. Its tools are designed to examine spatial distribution relationships between datasets, answering questions such as: “How far is this feature from this other feature?”; “How many homes fall within this particular fire protection district?”; “What is the most cost efficient delivery area based upon our delivery fee?”; and “How long does it take to drive from Denver to Eldora Ski Resort?”. Together, the Overlay tools and the tools found in the Proximity Analysis toolbox make up the majority of tools GIS technicians use per project.

7.3.1: Vector Proximity Tools

As we did for the Overlay Analysis tools, we are going to take a look at some examples of Proximity Analysis tools, not with the intention of memorizing how each tool works, but instead to better understand the category of these tools. By understanding the categories of tools, you are better prepared to use tools that are not specifically introduced in class.

Buffer

Buffer is one of the easiest tools to understand and one of the most commonly used. It is a quick and easy way to determine both whether features are found within a certain distance of another feature and how many. For example, if you were trying to figure out how many homes fall within 10 miles of an ambulance bay, you could use a combination of Buffer and Select by Location to answer the question.

The Buffer tool works with an input layer and a defined “buffer distance”. It measures the designated distance away from each feature, marks a point, then connects all the points together to create a new polygon output layer (which, like the output of any tool in GIS, must be renamed before the tool is run).

Figure 7.4: The Inner Workings of the Buffer Tool
Input feature. In this case a line feature.
Measure away from each vertex the designated distance.
“Connect the dots” to produce the new output polygon layer.
The Buffer Tool is available for points, polylines, and polygons.
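
The buffer idea can be sketched in plain Python without any GIS software. The sketch below is only an illustration under stated assumptions: coordinates are planar and in miles, the road and home locations are made up, and the function names `point_segment_distance` and `within_buffer` are hypothetical. Rather than building the output polygon the way the Buffer tool does, it asks the equivalent question directly: is a point no farther than the buffer distance from the line?

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # The segment is a single point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its two end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def within_buffer(point, polyline, buffer_distance):
    """True if the point lies inside the buffer drawn around the polyline."""
    return any(
        point_segment_distance(point, a, b) <= buffer_distance
        for a, b in zip(polyline, polyline[1:])
    )

# Made-up data: one line feature and three homes, coordinates in miles.
road = [(0, 0), (10, 0), (10, 10)]
homes = {"A": (5, 3), "B": (14, 5), "C": (20, 20)}

inside = [name for name, xy in homes.items() if within_buffer(xy, road, 5)]
print(inside)
```

Homes A and B fall within 5 miles of the line; home C does not. This is the same answer a Buffer followed by Select by Location would be expected to give for this small dataset.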

Near and Generate Near Table

One common question we ask in GIS is “What is near what?”, and often more specifically, “How many linear units is one feature from another?”. While Buffer finds all the features which fall within a distance of the input features, unless we use the measure tool to measure and manually record the distance, we do not know the exact distances between those features. The Near tool compares two layers and adds new fields to the attribute table of the input layer, expressing the exact distance between features (identified by the FID of the input and near features). The Near tool has a limitation: it can only find the single nearest feature in the other dataset, meaning that there is a one-to-one relationship in the result. In contrast, the Generate Near Table tool can find the distance between all of the input and near features. Since the Generate Near Table tool finds so many distances, the output is not a layer but a table, as the result in layer form would be clogged and confusing. The table allows the user to examine the FIDs of the input features and the near features and find the distances they are looking for. Once they have the data in table form, they can use their SQL skills within that table to find the FID(s) of the features they are looking for, then take those FIDs back to the feature layer. The Generate Near Table tool can be more powerful than the Near tool, but the result takes a touch more work.
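
To make the difference between the two tools concrete, here is a minimal sketch in plain Python. It is not how either tool is implemented; the function names `near` and `generate_near_table`, the FIDs, and the coordinates are all made up for illustration, and distances are simple straight-line distances on a plane.

```python
import math

def near(input_features, near_features):
    """One result per input feature: the FID of, and distance to, its nearest feature."""
    result = {}
    for in_fid, in_xy in input_features.items():
        near_fid = min(near_features,
                       key=lambda fid: math.dist(in_xy, near_features[fid]))
        result[in_fid] = (near_fid, math.dist(in_xy, near_features[near_fid]))
    return result

def generate_near_table(input_features, near_features):
    """One row per (input FID, near FID) pair, with the distance between them."""
    rows = [
        (in_fid, near_fid, math.dist(in_xy, near_xy))
        for in_fid, in_xy in input_features.items()
        for near_fid, near_xy in near_features.items()
    ]
    return sorted(rows)

# Hypothetical FIDs and planar coordinates (miles).
homes = {0: (0, 0), 1: (6, 8)}
bays = {0: (3, 4), 1: (6, 11)}

nearest = near(homes, bays)               # one-to-one
table = generate_near_table(homes, bays)  # every home paired with every bay

# The table can then be filtered, much like a SQL "WHERE distance <= 5".
within_5 = [row for row in table if row[2] <= 5]
print(nearest)
print(within_5)
```

The one-to-one result has two entries (one per home), while the table has four rows (two homes times two bays) that can be filtered afterwards.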

As we learn how more geoprocessing tools work, we can begin to develop a workflow, a series of tools used to answer the spatial question at hand. To find all of the homes within a 10 mile radius of an ambulance bay, the first step of the workflow would be to buffer all the bays at 10 miles, creating the output polygon layer. To eliminate all the homes outside that 10-mile buffer, we could Clip (next section; opposite of “Erase”, previous section) the Houses layer, then use the Generate Near Table tool to determine the exact distance between the ambulance bay and the homes.

Figure 7.5: The Near Tool
Use Near to calculate exact distances between the near and input features.

Make Route Layer

So far in our ambulance bay/homes workflow example, we have buffered the ambulance bays at 10 miles, produced the buffer polygon layer, clipped the houses layer to include only the homes inside the 10-mile buffer, and determined the exact distance between each remaining home and the ambulance bay.

Let’s say you’ve found an old folks’ home within the buffer that has several calls a week to a particular ambulance bay at various times of the day. The attribute table also shows the response times, and you notice that when the ambulance responds at 9 am and 5 pm, the response time is 14 minutes, compared to 2 pm when the response time is only 9 minutes. Knowing that minutes save lives, you decide to determine if there is a faster route the ambulance can take for certain times of the day. (Heck, because you’re a GIS whiz kid, you decide to analyze the travel times for all minutes of the day and provide a best-route analysis to your supervisor. Do I hear “promotion”?)

In conjunction with some other tools, Make Route Layer will use a variety of inputs such as traffic flows, travel time per segment (a section of road that is independent of other sections, such as a change in speed limit or a change in same-direction lane count), and time of day to calculate the best-time route. After your analysis, you discover the ambulance can easily shave minutes off its travel time during peak traffic if it cuts down just a block earlier. Good job! Thanks, GIS.

    • When you plug an address into your GPS navigation unit of choice, and the turn-by-turn directions return an option for “fastest” or “shortest”, these are two different types of route the software is calculating for you. Sometimes the route is the same whether it is optimized for time or for distance, and sometimes it is not.

Figure 7.6: Make Route Layer
Use Make Route Layer to create the best-factor route based upon the desired inputs, such as time, traffic, saving gas, etc.
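
The “fastest” versus “shortest” distinction can be sketched with a tiny road network. The example below uses Dijkstra's algorithm as a stand-in for the route solver and should be read as an illustration only: the network, the mileages, the peak-traffic minutes, and the function name `best_route` are all hypothetical. The same search is run twice, once minimizing miles and once minimizing minutes.

```python
import heapq

def best_route(graph, start, goal, cost):
    """Dijkstra's algorithm; `cost` names the edge attribute to minimize."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, attrs in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(
                    queue, (total + attrs[cost], neighbor, path + [neighbor])
                )
    return None  # no route exists

# Hypothetical one-way segments with a length and a peak-traffic travel time.
roads = {
    "bay":  {"main": {"miles": 1.0, "minutes": 6.0},
             "side": {"miles": 1.5, "minutes": 3.0}},
    "main": {"home": {"miles": 1.0, "minutes": 8.0}},
    "side": {"home": {"miles": 1.5, "minutes": 4.0}},
}

shortest = best_route(roads, "bay", "home", "miles")
fastest = best_route(roads, "bay", "home", "minutes")
print(shortest)
print(fastest)
```

In this made-up network the shortest route runs along the main road (2 miles, 14 minutes at peak), while the fastest route cuts down the side street (3 miles, 7 minutes). Swapping in a different set of travel times for each time of day would change which route wins.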

Other Vector Proximity Tools

Tool | What it Does
Buffer | Creates new feature data with feature boundaries at a specified distance from input features.
Near | Adds attribute fields to a point feature class containing distance, feature identifier, angle, and coordinates of the nearest point or line feature.
Generate Near Table | Calculates distances and other proximity information between features in one or more feature classes or layers. Unlike the Near tool, which modifies the input, Generate Near Table writes results to a new stand-alone table and supports finding more than one near feature.
Select by Location | Selects features from a target feature class within a given distance of (or using other spatial relationships with) the input features.
Create Thiessen Polygons | Divides the area covered by the input point features into Thiessen or proximal zones. These zones represent full areas where any location within the zone is closer to its associated input point than to any other input point.
Make Closest Facility Layer | Makes a closest facility network analysis layer and sets its analysis properties. A closest facility analysis layer is useful in determining the closest facility or facilities to an incident based on a specified network cost.
Make Route Layer | Makes a route network analysis layer and sets its analysis properties. A route analysis layer is useful for determining the best route between a set of network locations based on a specified network cost.

7.3.2: Raster Proximity Tools

While most of the raster proximity tools are beyond the scope of a GIS 101 class, we will discuss them in a general manner. The first concept to understand when looking at raster proximity tools is that, as mentioned above, rasters are a series of evenly-spaced, identically-sized grid cells. This predetermined structure allows additional proximity tools to be run. If each cell is identical, the center-to-center distance between adjacent cells will also be identical (and the same as the length of one side of the pixel).

To illustrate this idea, let’s look at a raster with a spatial resolution of 30 meters. If each cell is 30 meters on each side, the distance from the center of one cell to the center of the adjacent cell will also be 30 meters. This assumed relationship cannot be applied to adjacent vector features, since vector features do not have to exist in any defined geometric relationship. Even if your vector layer resembles a raster layer, the assumption still cannot be made.
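
The 30-meter example can be checked with a few lines of Python. This is a minimal sketch, assuming cells are addressed by (row, column) and measured in meters from the raster's top-left corner; the helper names `cell_center` and `center_distance` are hypothetical.

```python
import math

CELL_SIZE = 30.0  # meters, the spatial resolution used in the example above

def cell_center(row, col, cell_size=CELL_SIZE):
    """Center of a cell, in meters from the raster's top-left corner."""
    return ((col + 0.5) * cell_size, (row + 0.5) * cell_size)

def center_distance(cell_a, cell_b, cell_size=CELL_SIZE):
    """Straight-line distance between the centers of two (row, col) cells."""
    return math.dist(cell_center(*cell_a, cell_size),
                     cell_center(*cell_b, cell_size))

side_neighbor = center_distance((0, 0), (0, 1))    # cell directly to the right
three_over = center_distance((0, 0), (0, 3))       # three cells to the right
diagonal_neighbor = center_distance((0, 0), (1, 1))
print(side_neighbor, three_over, diagonal_neighbor)
```

Neighbors that share a side are exactly one cell size apart (30 meters), and three cells over is 90 meters. A diagonal neighbor is farther, about 42.4 meters, because that distance is the hypotenuse of a 30-by-30 right triangle.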

Figure 7.7: Review of Raster Structure
The basic structure of a raster spatial file allows for assumptions to be made that are otherwise not possible with vector spatial files.

Euclidean Distances

The Elements, a series of 13 books on geometry written by Euclid of Alexandria around 300 BCE, is built on a series of axioms (statements or propositions regarded as established, accepted, or self-evidently true), which serve as the basis for all of its theorems (general propositions that are not self-evident but are proved by a chain of reasoning). These theorems create the base for all the ideas presented in the series, including Euclidean distance.

Euclidean distance is best understood by defining it as “the shortest distance between two points in a straight line”. We established in Chapter Two that our understanding of place and location in the world is based upon geographic coordinates and the Cartesian Coordinate System. If we add Euclidean distance and use the distance formula, we see the familiar Pythagorean Theorem. The short version of the story is that Pythagoras defined the theorem, and Euclid proved it and then applied it to distance on a two-dimensional plane and in the three-dimensional world.
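
Written out, the connection looks like this. The symbols are chosen here for illustration: two points $(x_1, y_1)$ and $(x_2, y_2)$ on a plane, with the horizontal and vertical differences between them acting as the two short sides of a right triangle and the distance $d$ as the long side.

```latex
a = x_2 - x_1, \qquad b = y_2 - y_1, \qquad c = d

a^2 + b^2 = c^2
\quad\Longrightarrow\quad
d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}
```

As a quick check, for the points $(0, 0)$ and $(3, 4)$ the formula gives $d = \sqrt{3^2 + 4^2} = \sqrt{25} = 5$.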

Figure 7.8: Euclidean Distance and The Pythagorean Theorem
Euclidean Distance is most easily defined as the distance between two points in a straight line. The Pythagorean Theorem states that, for a right triangle, Short Side A squared (the value times itself) plus Short Side B squared equals Long Side C squared (the hypotenuse, or the angled side).

Euclidean Distances and GIS

We have established that the assumed structure of any raster image in the GIS is defined by square cells, so the distance from the geometric center of each cell to its right-angle neighbor is equal to the length of a cell side (that is to say, raster pixels are equal in height, width, and center-to-center spacing). We also understand that Euclidean distance is just the measured distance between two points in a geometric coordinate system (or in GIS, a geographic coordinate system). With both ideas in hand, we can begin to apply them to raster proximity analysis tools.

Euclidean Distance, Euclidean Allocation, and Euclidean Direction Tools

Since we know that raster layers can show features such as buildings, roads, rivers, etc., we can measure from a given feature to all the other cells in the image. The Euclidean Distance tool set is used for just this purpose. After defining the feature from which to measure, the tool uses the known cell measurement and returns a new raster layer with a measurement value associated with each cell.

    • Use the Euclidean Distance tool to find the distance from a feature (or features) to other places in the image.

    • Use the Euclidean Allocation tool to assign all the cells in a raster to features based on closest proximity.

      • In other words, the distance of all the cells in a raster is measured from all the designated features in the same raster. Cells are then assigned to a feature based on the shortest distances, or closest proximity.

      • If a distance is defined in the Euclidean Allocation tool (vs. allowing the tool to just do its thing), it is analogous to the vector buffer tool.

    • Use the Euclidean Direction tool to find the direction of each cell from a feature (or features) in a raster.
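
The three tools above can be imitated on a tiny grid with a brute-force sketch in plain Python. This is an illustration under assumptions, not the tools' actual implementation: the function name `euclidean_rasters` is hypothetical, each source occupies a single cell, ties go to whichever source is listed first, and direction is reported as a compass bearing from the cell toward its nearest source (0 = north, 90 = east, with a source cell itself reported as 0), which may differ from the convention a given GIS package uses.

```python
import math

def euclidean_rasters(n_rows, n_cols, sources, cell_size):
    """Return (distance, allocation, direction) grids.

    `sources` maps a source id to the (row, col) of its cell.
    """
    distance, allocation, direction = [], [], []
    for r in range(n_rows):
        dist_row, alloc_row, dir_row = [], [], []
        for c in range(n_cols):
            # Find the closest source cell, measured center to center.
            src_id, (sr, sc) = min(
                sources.items(), key=lambda item: math.dist((r, c), item[1])
            )
            dist_row.append(math.dist((r, c), (sr, sc)) * cell_size)
            alloc_row.append(src_id)
            # Rows increase southward, so "north" is a decreasing row number.
            bearing = math.degrees(math.atan2(sc - c, r - sr)) % 360
            dir_row.append(bearing)
        distance.append(dist_row)
        allocation.append(alloc_row)
        direction.append(dir_row)
    return distance, allocation, direction

# A 3 x 3 raster of 30 m cells with two source cells in opposite corners.
sources = {1: (0, 0), 2: (2, 2)}
distance, allocation, direction = euclidean_rasters(3, 3, sources, 30.0)

for row in allocation:
    print(row)
```

Each output has the same shape as the input raster: one grid of distances in meters, one grid of source ids, and one grid of bearings in degrees.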

Cost Distance

Cost Distance is used to find the cost of travel over a distance, whether that cost is time, fuel, or money. For example, if you are put in charge of finding the delivery area for the mom ‘n pop pizza shop where you work, you can use the Cost Distance tool to solve your spatial problem. If the delivery fee is $2.50 and you need to pay for your driver’s gas and still make a small profit, the Cost Distance tool will create a series of areas where the fee is most profitable, marginally profitable, break even, and not profitable. Overlaying this output onto a streets layer will visually define the area for you and your order takers.
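
The pizza example can be sketched on a small cost raster. The code below is a simplified illustration, not the tool itself, and several things in it are assumptions: the fuel costs per cell are made up, movement is limited to the four side-sharing neighbors, a move is charged the average of the two cells it crosses, and the $1.00 cutoff between “most” and “marginally” profitable is arbitrary. The names `cost_distance` and `classify` are hypothetical.

```python
import heapq

def cost_distance(cost_grid, source):
    """Least accumulated cost from `source` to every cell (4-neighbor moves)."""
    n_rows, n_cols = len(cost_grid), len(cost_grid[0])
    best = [[float("inf")] * n_cols for _ in range(n_rows)]
    best[source[0]][source[1]] = 0.0
    queue = [(0.0, source)]
    while queue:
        total, (r, c) = heapq.heappop(queue)
        if total > best[r][c]:
            continue  # a cheaper way to this cell was already found
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                # Assumed rule: a move costs the average of the two cells.
                step = (cost_grid[r][c] + cost_grid[nr][nc]) / 2
                if total + step < best[nr][nc]:
                    best[nr][nc] = total + step
                    heapq.heappush(queue, (total + step, (nr, nc)))
    return best

def classify(cost, fee):
    """Label a cell by how much of the delivery fee is left after fuel."""
    margin = fee - cost
    if margin >= 1.0:  # arbitrary cutoff chosen for this example
        return "most profitable"
    if margin > 0:
        return "marginally profitable"
    if margin == 0:
        return "break even"
    return "not profitable"

# Made-up fuel cost (dollars) to cross each cell; the shop is at row 0, col 0.
fuel_cost = [
    [0.5, 0.5, 2.0],
    [0.5, 1.0, 2.0],
    [0.5, 0.5, 0.5],
]
DELIVERY_FEE = 2.50

accumulated = cost_distance(fuel_cost, (0, 0))
zones = [[classify(cost, DELIVERY_FEE) for cost in row] for row in accumulated]
for row in zones:
    print(row)
```

Notice that the cheapest way to reach the bottom-right cell goes around the expensive cells rather than straight through them, which is the difference between cost distance and plain Euclidean distance.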

Other Raster Proximity Analysis Tools

Tool | What it Does
Euclidean Distance | Calculates the distance to the nearest source for each cell.
Euclidean Allocation | Gives each cell the identifier of the closest source.
Euclidean Direction | Calculates the direction to the nearest source for each cell.
Cost Distance | Calculates the distance to the nearest source for each cell, minimizing cost specified in a cost surface.
Cost Allocation | Gives each cell the identifier of the closest source, minimizing cost specified in a cost surface.
Cost Path | Calculates the least-cost path from a source to a destination, minimizing cost specified in a cost surface.
Cost Back Link | Identifies for each cell the neighboring cell that is on the least-cost path from a source to a destination, minimizing cost specified in a cost surface.
Path Distance | Calculates the distance to the nearest source for each cell, minimizing horizontal cost specified in a cost surface, as well as the terrain-based costs of surface distance and vertical travel difficulty specified by a terrain raster and vertical cost parameters.
Path Distance Allocation | Gives each cell the identifier of the closest source, minimizing horizontal cost specified in a cost surface, as well as the terrain-based costs of surface distance and vertical travel difficulty specified by a terrain raster and vertical cost parameters.
Path Distance Back Link | Identifies for each cell the neighboring cell that is on the least-cost path from a source to a destination, minimizing horizontal cost specified in a cost surface, as well as the terrain-based costs of surface distance and vertical travel difficulty specified by a terrain raster and vertical cost parameters.
Corridor | Calculates the sum of accumulative cost for two input cost distance rasters. The cells below a given threshold value define an area, or corridor, between sources where the two costs are minimized.
Surface Length | Calculates the length of line features across a surface, accounting for terrain.