Section Six: Select by Attribute Part One - Understanding SQL Expressions

When you visit a Denny’s diner, the experience tends to be the same each time. After you arrive, you’re greeted and shown to a table. The host provides you with a menu and informs you that the server will be right with you. You take some time to look over the menu, examining the pictures and reading the descriptions of each dish. If you’re super hungry, you’re more likely to pick a dish based on the pictures, and if you’re less hungry, you are more likely to take more time reading the descriptions. After a few minutes, the server greets you and asks you the name of the dish you’d like, to which you answer with the name of the dish (or if you're really, really, really hungry, you just point at the picture and grunt in an affirming way).  They write down the choice for each seat at the table on the ticket, presents the ticket to the kitchen, and return after a bit with your scrumptious choice.

Much like the menu at a diner, spatial data is a combination “pictures” (the points, polylines, and polygons we see in data view), item names (the layer name), and descriptions (the attributes). We add data to an ArcMap session, it is with the intent of examining the data and/or performing some sort of analysis of that data. It is also most likely that we need to examine the attributes for specific features in order to complete that analysis. Like we do when we order food at the diner, in GIS we look over the features, sometimes selecting a feature based on the picture (using Interactive Selection to select the feature's geometry on the map) and other times, selecting a feature based on the description  (using Select by Attribute to select features based on the attributes). In this section, we will begin to look at the process of how we go about picking, or selecting, those specific features in the attribute table.

Back at the diner, when it came time for the server to turn the food ticket over to the kitchen, they didn’t just write down just anything on the ticket. The diner has an established manner of taking orders so the kitchen, which is always in a rush, can understand exactly what each person ordered, included special requests. If each server used their own made-up shorthand for menu items, it would cause chaos in the kitchen, leading to frequent mistakes.

We learned last chapter that the key to the definition of a geodatabase is the ability to quickly and efficiently retrieve spatial data utilizing the established relationships shared by the data. Since all the data we encounter in GIS follows the structure of (non-spatial data tables; shapefiles), or is already a member of a geodatabase (feature classes), the rules which apply to databases apply to all GIS data.

Figure 5.11: Select by Attribute Option in the Table Options Menu
table_options_selections-display
Select by Attribute (and other related selection tools, like Clear Selection, Switch Selection, and Select All) are found in the Table Options menu of the attribute table from which you would like to make the selection.

Much like the diner uses an established “language” to take and make order quickly and correctly, GIS utilizes a database language called Structured Query Language, or SQL (pronounced: See-Kwel) for short to accomplish the “quick and efficient data access” part. Simply put, SQL is a computer language used to query, or ask questions of, a database via an established structure of expressions.

In GIS, we refer to the process of creating SQL queries to select features in the attribute table as Select by Attribute, or using the attributes in a table to build expressions to return desired values. Since the structure used has been established, it is our job as GIS technicians to learn the basics of creating proper query expressions to most efficiently do our job.

While GIS uses SQL, it is not a language unique to GIS databases. This universal database language is beneficial to GIS because it allows a technician to work with any RDMS written in SQL across several GIS and non-spatial software suites. Since SQL is a huge language with lots of rules written for numerous spatial and non-spatial applications, we will only focus on the rules and syntax (the established structure of a computer language needed to execute commands) needed to complete our job as introduction to GIS students.

Select by Attribute can be launched in three places in ArcGIS, with all three initiations resulting in the same dialog box.
✔ No need to memorize these right now, just be aware there are several places to launch the tool, which will be practiced in lab.
From the Table Options menu of the attribute table to be queried
    • Makes the assumption that the user is attempting to query the table from which the tool dialog box was launched from.  For example, if the user was after state names, they would launch the Select by Attribute dialog box from the "States" layer's attribute table, and not the "Waterways" layer.
SBA_table_options_menu-display
From the Select by Attribute Table Action button of the attribute table to be queried
    • A shortcut to the option in the Table Options menu
SBA_table_action_button-display
From the Selection Menu at the top of the ArcMap window
    • Opens a version of the tool dialog box where the user will need to pick the desired layer from a dropdown.
SBA_selection_menu-display

5.6.2: Simple SQL Expressions and the Select by Attributes Tool Dialog Box

All simple SQL table expressions follow the format: <using one of the four selection types> SELECT * FROM <layer/non-spatial data table name> WHERE: <field name><comparison operator><value> to query a table and return features.  Within ArcGIS, the Select by Attribute dialog box automatically populates the SELECT * FROM <layer/non-spatial data table name> WHERE: portion of the SQL equation, effectively eliminating all other SQL table query types in ArcGIS. This means that as a GIS technician, the only portions of a simple SQL expression you need to complete (and memorize) are <using one of the four selection types> and <field name><comparison operator><value>

Like many ArcGIS tool dialog boxes, the Select by Attribute dialog box is designed to be read and populated from top to bottom.  This makes it easy for new technicians to fill in the boxes properly, as the required information is presented in a logical and expected format.  In future classes, you will learn how to bypass the dialog boxes and create Python expressions to really amp up your GIS abilities in terms of efficiency throughout a project, but for now, we will focus on the user-friendly (I promise) dialog boxes.

Figure 5.12: An Expression in the Select by Attribute Dialog Box
SBA_built_in_display_half
In the screenshot, we see that ArcGIS fills in the first part of a simple SQL expression for us (SELECT * FROM <layer or table name> WHERE: and it's our job to populate the rest. In this specific screenshot, only the basic form of the SQL expression is shown (<field name> <comparison operator> <value>) vs showing a proper expression, like we will see coming up.

Knowing what we do so far (what is a spatial layer, the fact that SQL is like ordering off an established menu, the result of queries are selected/highlighted features on the map and in the table), let's look at the different parts of the Select by Attribute dialog box, not to memorize the steps, but to explore what goes into the tool and what the expected result of the tool might be.  As we work our way through this section, each part of the tool dialog box will be equated back to the Denny's diner example to help better explain the purpose.

Layer:

When launching the dialog box from the Selection menu, the first line allows you to select a Layer from all of the layers added to your MXD. If your goal is to select data from several attribute tables sequentially, this method is faster and more efficient than opening each table individually and using the table action button or table options menu. Otherwise, while you are still learning to software, using the table action button or table option menu within each individual table to launch a version of Select by Attribute that applies to only that particular table will result in less frustration and fewer mistakes.

✔ Non-spatial data tables will not be shown in this list. The Select by Attribute dialog box must be launched from the opened table launched from the table action buttons or the table options menu.  Non-spatial table look and act the same as attribute tables, just without "attached" features on the map. 
The SQL Diner: Layer
Occasionally, Denny’s will have special promotional menus separate from the regular menu that are provided by the host when you first sit down, showing limited time (often themed) dishes. In order to pick something to eat, you need to first decide which menu you’re going to look at: the classics or the promotional items.
SBA_Layer-display

Only show selectable layers in this list:

Now that we’ve looked at how to list layers in the Table of Contents by Selection, we can understand the basis of the Only show selectable layers in this list check box. To reduce the "Layer" list size, layers which do not need to participate in a Select by Attribute can be set to non-selectable, and when used in combination with the “only show...” check box, the list can be dramatically reduced in size.

The SQL Diner: Only Show Selectable Layers in this List
We all have “that food” that we will not eat - eggs, meat, dairy .... When you are looking over the menu, you automatically skip those dishes as not viable choices. They are set to “not selectable” simply because the main ingredient is something you will not eat.  Just like your automatic filter when scanning the menu, we have the ability to filter the layer list by only those layers we designate as important during a particular phase of a GIS project.
SBA_selectable_layers-display

Method:

The Selection Method dialog box offers us four options when it comes to selecting data:

selection_type_SBA-display

Create new selection

Selects only the feature for which the values fit the query criteria and clears any other features which may be currently selected

Most of the time, we are performing new selections, and that is what the dialog box defaults to.

Add to current selection

Selects additional features by means of another query and adds it to the list of selected features. Add to current selection can be used an endless amount of times within one table, well, at least until all the features are selected and there are no more to add to the list.

Remove from current selection

If there is a current selection (features highlighted in the table), remove from current selection will deselect any features which match a new query. This method would be used when you need to select some features, examine them, and decide which ones to move forward with by using another query. Multiple queries can be used sequentially to reduce the pool of selected features until only the final selection is left.

Select from current selectionSimilar and opposite to remove from current selection, Select from Current Selection re-selects features you do want based on a new query instead of removing them from the list.
The SQL Diner: Selection Method

When you are first seated, there is nothing to eat or drink on the table - nothing is selected.  You're presented with the menu, which you take a minute to look over while waiting for the server.  They will come over and initially take a drink order, a new selection you've made.

The sever brings out the drinks, and asks if you are ready to order the meal. You inform them that you are not quite ready to make the meal choice, but you’d like to order some mozzarella sticks. When the mozzarella sticks come out, they are placed in the center of the table, in addition to the drinks. The server doesn’t take the drinks away, they simply add the mozzarella sticks to your current selection, the drinks, and takes your order for the main dish.

Before the main dish arrives at the table, you’ve finished the mozzarella sticks and the server takes the plate away, removing the appetizer from the current selection, leaving just the drinks.

After the main dish arrives, which often is a plate consisting of two or three things, such as hash browns on the side of the omelet. Since it’s hardly possible (or polite) to eat everything at once, you need to pick what to eat first, or select something from the current selection. 

Field Name:

We have already learned that columns in GIS are called fields, and the column headers are called the Field Name. The structure of a simple query is <field name><comparison operator><value>, which means SQL requires us to state to which field our query applies.  The Select by Attribute tool dialog box addresses this portion of the expression with a list of all the field headers for the table, as defined by the "Layer" dropdown or from whatever table the dialog box was launched from, as we learned earlier that if the tool is launched from the table itself, the assumption is the query will apply to that table.

The SQL Diner: Field Name

Denny’s has defined sections in the menu: omelets, pancakes, burgers, sandwiches, etc. When you first look at it, you decide which category you feel like at this visit - eggs or a burger.

SBA_fields-display

Comparison Operator Buttons

Sticking with the structure for a simple SQL query, <field name><comparison operator><value>, we have looked at where the field name list comes from within the Select by Attribute box (from the available fields within a specific attribute or non-spatial data table), so the next part is the comparison operator.  Comparison operators are found in the Select by Attribute box as a series of buttons, along with the Logical Operators, a set of operators used when we create complex SQL expressions, the topic of the next section of the text.

With SQL expressions, comparison operators compare what you've asked the query to evaluate with what is actually in the table, aka look for some designated word within the table, and are broken into two main categories - those which query for exact values and those which query for approximate values.  For example, if you were looking for a feature in a State layer's attribute table with the exact name of Colorado (SQL: "Name" = 'Colorado'), you would be asking the Select by Attribute tool to query the table (as to query is to ask a question, the question you are asking is "hey table, do you have a feature by the exact name of Colorado?"), looking for a value in the 'Name' field which is spelled exactly like you typed it - C-o-l-o-r-a-d-o.

Approximate comparison operators look for, well, approximate values.  When you're not too sure of all the available values, but you're mostly sure of what might be in the table, instead of scrolling through the whole table and carefully looking for anything that might meet what you are looking for, you can use a wild card comparison operator instead.  For example, if you had a table with 19,000 records of historic sites, and you are only interested in those which might have the word "Roman" in the feature name, you could use a wild card SQL expression such as "Historic_Site_Name" LIKE '%Roman%', which would return all the Roman Villas and the Historic Roman Sites. 

OperatorButtonPurpose
EqualsSBA_equals-displayReturns all features with a value in the specified field which exactly matches the value defined in the expression. Record values are case sensitive and spelling dependent, meaning “california” is not the same value as “California” nor “Kalifornia”. An attribute table will see these as three separate values and each value must participate in a different expression. Equals works for numeric, text, and date/time type fields.
Not Equal ToSBA_not_equal-displayReturns all values except the defined value. The opposite of Equals. For numeric field types, the expression looks for values which are not equal the defined value; for text type fields, not equal to looks for all words that are not the defined word; for date/time type fields, not equal to looks for dates or times which occur any time other than the defined date/time.
Greater ThanSBA_greater_than-displayFinds values greater than the defined value not including the defined value. Works in numeric type fields, text type fields where greater than is based upon alphabetical order, for example, State Name greater than ‘California’ will return all the features with a state name Colorado to Wyoming, alphabetically; and date/time fields.
Greater Than or Equal ToSBA_greater_than_equals-displayFinds all the values greater then AND equal to the defined value. In our example, State Name Greater Than or Equal To ‘California’ will return California to Wyoming, alphabetically. Works for numeric, text, and date/time type fields.
Less ThanSBA_Less_then-displayFinds all values less then the defined value but not including the defined value. For example, State Name Less Than ‘California’ will return Alabama to Arkansas, alphabetically. Works for numeric, text, and date/time type fields.
Less Than or Equal ToSBA_Less_then_equals-displayFinds all the values less then AND equal to the defined value. For example, State Name Less Than or Equal To ‘California’ will return Alabama to California, alphabetically. Works for numeric, text, and date/time type fields.
LIKESBA_like-displayLIKE is used to find values within an attribute which are similar to, instead of exactly like, the value defined in the expression and is used in conjunction with the wild card values _ or %. In order to find all values that end in “alifornia”, an expression such as - "State_Name" LIKE ‘_alifornia’ - will return the features where the State Name attribute is “california”, “California”, or “Kalifornia”.  Another example might be - "Owner_Name" LIKE ‘Br%’ - which will return the values “Brett”, “Brent”, “Brandon”, “Brendan”, and “Brant”
Wildcard (One Character)SBA_wildcard_single_char-displayThe underscore _ is used as a wild card value when building SQL expressions, meaning it can replace a single character such as “C” when attempting to locate “california”, “California”, or “Kalifornia”.
Wildcard (Many Characters)SBA_wildcard_many_char-displayThe % sign is used as a wild card value for variables that contain two or more characters, meaning Br% can replace the the “ent” or “ant” when attempting to locate the man’s name “Brent” or “Brant”
SBA_operators-display
The SQL Diner: Comparison Operators

When you tell the server what you’d like to eat, it’s usually in some sort of English sentence form - "May I have the Swiss-Mushroom Burger?" - you really don't just look at them and say “Swiss-mushroom burger” and not another word.  You might get what you want that way, but the result may be a burger with pickles when you are allergic to pickles and a server who is way less likely to come over to your table and fill your drink, ask if you need dessert, and rush you along if they are busy.  SQL isn't really any different, except when your sentence structure isn't exactly like it should be, you get nothing in return.  When you tell your SQL speaking server "Swiss-Mushroom Burger", they reply with "There was a syntax error with your expression", and the server just stares at you.

Get Unique Values Button:

When we looked at the structure of attribute tables in an earlier section, we learned that for each cell in the attribute table (or non-spatial data table), there is only one value (or record).  This means what you see is what you get.  If a cell in the "Breakfast Items" field contains the value "Moons Over MyHammie", the only value in that cell is "Moons Over MyHammie", and nothing else.  There are no hidden values, no way to make the table show anything else in that cell, and no alternate spellings in that cell (although, another cell may contain an alternate spelling).  We call this value a unique value, meaning that it appears as that exact spelling, including capitals and spaces, at least once.

The Get Unique Values button looks for these unique spellings within a single field and lists them out for you in the Unique Values box.  Since the button is getting unique values, even if a value appears more than once in a field, it only appears once in the unique values box.  For example, if we were looking at the City_Name field for a USA cities layer, the city of Santa Fe would appear four times, since there are four Santa Fe's in the United States. However, if I had the Select by Attributes tool dialog box open and clicked the Get Unique Values button in relation to the City_Name field, the value "Santa Fe" would only appear once in the unique values box since the spelling, spacing, and capitalization is exactly the same for all four of the Santa Fe's in the United States.  One unique value, four locations in the United States.

To simplify searching though possibly thousands of unique values, the Unique Value search box (labeled "Go To:" and seen to the right of the Get Unique Values button) will jump to anything you type. For example, if you were looking at the unique values for US State names and entered a “C” in the search box, the list will jump to “California”, and if you type “Co”, it will jump to “Colorado”, and if you type “Con” it will jump to “Connecticut”. You can search by typing as many letters as you need to get to the desired value or just the first letter and scrolling through the list.

The SQL Diner: Unique Values

The job of the menu is to show exactly what you can get at that restaurant - you limited to those choices. However, everyone at your table could order the same thing. Just because the Swiss-mushroom burger is only listed one time in the menu, it available to be made more than once.  The menu lists the unique values one time, even though the kitchen is capable of making it an unlimited amount of times.

SBA_get_uniqe_values_window-display