
ArcGIS Tutorial – Georeferencing Imagery

One of the limitations of computer mapping technology is that it is new. There is little historical imagery and data available as a result, although this has started to change. The integration of paper and imaged maps into computer mapping technology is possible, and this tutorial will walk through the process of georeferencing.

Georeferencing is the process of placing an image into two dimensional space. In essence, georeferencing pins a scanned map to particular geographical coordinates.

This tutorial will georeference a map of Durham County from 1955. In addition to the scanned map, we will use two current layers as referents: the Durham roads layer, and the Durham county boundary. Note that because the layers are more recent than the historical map, many roads will not exist in the image. Georeferencing historical imagery requires familiarity with geographic characteristics and changes.

 

Step 1: Enable Georeferencing

First, under the “Customize” Menu Bar option, navigate to “Toolbars” and select Georeferencing. The figure to the right displays the Georeferencing toolbar.

 

Step 2: Add Data and Image Layers

Next, add the shapefiles that you will use as referents for the image.

Once this is done, add the image to be georeferenced.  Note that you will almost certainly not see that image, as it lacks spatial coordinates. However, the image will appear in the Table of Contents.

In this example, I have added Durham County (blue polygon) and the Durham roads layer (blue lines).

 

Step 3: Fitting the Image to the Layers

The next step will relocate the image to the center of your current window and will expand the image only to the point where the entire image is visible. In this case, Durham County is taller than it is wide, so vertical space will be maximized.

First, it is a good idea to zoom, if necessary, so that your current view roughly matches where the image will be placed. In this case, zooming to the full extent of the Durham county boundary will accomplish this.

Second, under the Georeferencing toolbar, click “Georeferencing” and select “Fit to Display.” The image should be roughly aligned to the data layers, though if not, this is not problematic.
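The fitting behavior described in this step can be illustrated with a minimal Python sketch. This is not ArcGIS code: the function name fit_to_display, the image size, and the view extent are assumptions chosen for illustration. The image is scaled by the smaller of the width and height ratios so that all of it is visible, and is then centered in the view.

```python
def fit_to_display(image_w, image_h, view):
    """Return the extent (xmin, ymin, xmax, ymax) of an image scaled to fit a view.

    image_w, image_h: image size in pixels (hypothetical values below).
    view: current display extent as (xmin, ymin, xmax, ymax) in map units.
    """
    xmin, ymin, xmax, ymax = view
    view_w = xmax - xmin
    view_h = ymax - ymin

    # Use the smaller ratio so the whole image stays inside the view.
    scale = min(view_w / image_w, view_h / image_h)
    fitted_w = image_w * scale
    fitted_h = image_h * scale

    # Center the scaled image in the view.
    center_x = (xmin + xmax) / 2
    center_y = (ymin + ymax) / 2
    return (
        center_x - fitted_w / 2,
        center_y - fitted_h / 2,
        center_x + fitted_w / 2,
        center_y + fitted_h / 2,
    )


# A scan that is taller than it is wide, placed in a square view.
fitted = fit_to_display(3000, 4000, (0, 0, 1000, 1000))
print(fitted)
```

Because the sample image is taller than it is wide, the height ratio is the limiting one: the fitted image fills the view vertically and leaves a margin on the left and right.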

As you can see from the image to the right, there is some distance between the county boundaries of today (red lines) and the hand-drawn county boundaries located in the image (white lines).

 

Step 4: Adjusting the Map

ArcGIS georeferences images through the addition of control points. The control points tool (to the right) operates through two mouse clicks: the first mouse click selects a point on the image, and the second mouse click pins that point to a location within a data layer.

For example, in the image to the right, I have selected a major intersection that likely has not changed in the last 60 years. After my first click, where I’ve selected a point near the top of the intersection, a green crosshair is placed. As I move the mouse, ArcGIS will pin my current crosshair to a proximate layer, in this case, the Durham roads layer.

Once you click a second time, the map will move to conform to the new control points. Control points work in combination, so as you add new control points, your image will (ideally) match more closely to your referents.

There is a limit to how much each subsequent control point will improve fit as more points are added. Generally, it’s a good idea to zoom in to improve accuracy and to create control points across the extent of the image.

After about 15 control points, we can compare the image to the included shapefiles. As you can see, if we assume that major roads have not changed, the green lines correspond well to the image, while the county boundary does to a lesser extent.

 

Step 5: Statistics and Transformations

Before saving the results, it is also a good idea to evaluate the results. Open the Table of Points to see each of your control points and the root mean squared error of all control points.

The Root Mean Square (RMS) error provides a rough guide to how consistent your control points are with one another with reference to the map.  Note that a low value does not necessarily mean that you’ve georeferenced the image well; it means that you’ve georeferenced consistently.  A high RMS error indicates that your control points are less consistent with one another.  One way to address this issue is to identify especially problematic control points and either replace or remove them.  However, always reevaluate how well your image maps to the referent shapefiles.

You may delete control points or add new points at this stage. In addition, you may also try different transformations, although second- or third-order transformations are rarely needed.
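To make the RMS error more concrete, here is a minimal Python sketch, independent of ArcGIS, that fits a first-order (affine) transformation to a set of control points by least squares and reports each point's residual and the overall RMS error. The function names and the sample coordinates are assumptions made up for illustration; ArcGIS performs its own calculation internally, and this sketch only shows the general idea. A first-order fit needs at least three control points that do not lie on a single line.

```python
import math


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, 3))
        solution[r] = (aug[r][3] - known) / aug[r][r]
    return solution


def fit_affine(image_pts, map_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    if len(image_pts) < 3:
        raise ValueError("a first-order fit needs at least 3 control points")
    rows = [(x, y, 1.0) for x, y in image_pts]
    # Normal equations: (A^T A) p = A^T target, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * m[0] for r, m in zip(rows, map_pts)) for i in range(3)]
    aty = [sum(r[i] * m[1] for r, m in zip(rows, map_pts)) for i in range(3)]
    return solve3(ata, atx), solve3(ata, aty)


def residuals(image_pts, map_pts, coeffs):
    """Distance between each transformed image point and its map point."""
    (a, b, c), (d, e, f) = coeffs
    return [
        math.hypot(a * x + b * y + c - mx, d * x + e * y + f - my)
        for (x, y), (mx, my) in zip(image_pts, map_pts)
    ]


def rms_error(values):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical control points: (image x, image y) -> (map x, map y).
# The last map point is deliberately misplaced by 5 units in x.
image_pts = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
map_pts = [(100, 500), (120, 500), (100, 480), (120, 480), (115, 490)]

coeffs = fit_affine(image_pts, map_pts)
res = residuals(image_pts, map_pts, coeffs)
rms = rms_error(res)
worst = max(range(len(res)), key=res.__getitem__)

print("residuals:", res)
print("RMS error:", rms)
print("least consistent control point:", worst)
```

In this made-up example the misplaced point has the largest residual, which is the kind of control point you would consider replacing or removing. Note that the fit spreads some of its error onto the other points, which is why the image should be rechecked against the referent layers after each change.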

 

Step 6: Saving the Results

Under the Georeferencing tab of the Georeferencing toolbar, select “Update Georeferencing.” Spatial information is saved in new files that MUST accompany the image: typically an auxiliary file (“.aux.xml”) and, depending on the image format, a world file (for example, “.tfwx” for a TIFF).

 

General Tips

– Zoom close to the layer resolution in order to improve accuracy

– Use more than 1 referent if possible. In this example, the county boundary provided a rough guide with respect to how far off the image initially is, but was not used to actually georeference the image.

– Georeference to accurate features. In this example, the county boundary was hand-drawn on the image and is not as precise as photographed features, like roads.

What’s new in ArcGIS 10?

Basemaps

Would you like to add aerial photography or a topographic map underneath map layers for visual appeal or context? With ArcGIS 10, you can add a basemap to your map project.

A basemap is a link to an online imagery data source. You must be connected to the Internet in order to see a basemap.

Basemaps contain imagery at different levels of detail. When zooming in or out, new imagery will replace old imagery, which provides an appropriate level of detail at any zoom level and improves performance by limiting the amount of information to be downloaded and displayed.

 

Export Map Packages

Sharing maps and shapefiles with others can be a pain when a map is composed of many shapefiles and layers.  A map package bundles all shapefiles, layers, and map documents into a single file that can be opened by others with ArcGIS 10.

 

Background Processing

In ArcGIS 10, ArcToolbox tools default to background processing.  This allows you to continue to work while the tool processes your data.

To disable background processing, navigate to the “Geoprocessing Options…” choice under the Geoprocessing Menu Bar, and uncheck the “Enable” box.

 

Search Toolbox Feature

Got a tool you want to use but can’t remember what toolbox it’s in?  With the Search feature, you can easily locate what you need. Your search term can be the tool name or a close approximation of what you wish to do.

 

Easy to Use Time Data

Time series data became easier to use with ArcGIS 10. Version 10 recognizes time series data with the addition of a single time field.

For example, suppose you have annual precipitation for US cities.  Your data will contain an ID field, a point field, a time field containing the year, and a field containing the precipitation amount.

For more information, see this blog post.

 

How Do I Label Individual Items?

Have you ever wanted to label individual items on a map, and avoid the cluttered appearance of labels for all features, such as that shown to the right?

ArcGIS 10 hides the tool that you use to label individual items, but it’s easy to get back.

  1. Turn on the “Labeling” toolbar under the Customize Menu Bar.
  2. At the top right corner of the toolbar, click the arrow pointed downward and click “Customize…”
  3. Select the “Commands” tab and select the “Label” category (left panel).
  4. In the right panel, drag the “Label” tool and drop it into any toolbar that you wish.

Time Series Visualizations in ArcGIS – An Introduction

Introduction

ArcGIS 10 makes it easy to manage and visualize time-series data to identify trends and create compelling visualizations.  Creating a visualization of time-series data requires only a few additional steps beyond those needed to produce any map.

Step 1: Data Formatting

Time-series data contains records, each of which is specific to both an individual and to a single point in time.  The following example uses employment data for the textile industry in North Carolina from 2000 through 2009.

In this example, “fips” corresponds to each county’s unique FIPS code, “industry” corresponds to the textile industry’s NAICS code, and “t” denotes the year.  Establishments, employment, and annual pay, our data items, are stored in the fields “est”, “emp”, and “pay_ann”.  All missing values were coded ‘-1’.

Tip: Make sure each record has a value.  Records without values will not be drawn in ArcGIS.

Tip: Do not name the time field “year,” as it is a reserved name in ArcGIS.

In our experience, storing the data in a Microsoft Access database provides the greatest degree of reliability.
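Before loading a table like this into ArcGIS, it can help to check the formatting rules above programmatically. The following is a minimal Python sketch, not part of the original workflow. The field names follow the example in this step, but the sample rows, the NAICS code 313, and the function names are assumptions made up for illustration.

```python
import csv
import io

# Made-up sample rows in the layout described above (one row per county-year).
SAMPLE_CSV = """fips,industry,t,est,emp,pay_ann
37001,313,2000,12,850,27000
37001,313,2001,11,-1,-1
37003,313,2000,3,120,25000
"""

MISSING_CODE = "-1"  # the code used for missing values in this example


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by field name."""
    return list(csv.DictReader(io.StringIO(text)))


def missing_records(rows, id_field="fips", time_field="t"):
    """List (id, time) combinations that have no record at all."""
    ids = sorted({row[id_field] for row in rows})
    times = sorted({row[time_field] for row in rows})
    present = {(row[id_field], row[time_field]) for row in rows}
    return [(i, t) for i in ids for t in times if (i, t) not in present]


def coded_missing(rows, value_fields=("est", "emp", "pay_ann")):
    """List (fips, t, field) entries whose value is the missing-value code."""
    return [
        (row["fips"], row["t"], field)
        for row in rows
        for field in value_fields
        if row[field] == MISSING_CODE
    ]


rows = load_rows(SAMPLE_CSV)
gaps = missing_records(rows)
flagged = coded_missing(rows)
uses_reserved_name = any(name.lower() == "year" for name in rows[0])

print("county-years with no record:", gaps)
print("values coded as missing:", flagged)
print("time field named 'year':", uses_reserved_name)
```

A county-year that appears in the first list would not be drawn for that time step, so it is worth adding a record for it (with the missing-value code) before joining.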

Step 2: Add Data to Map in ArcGIS

Once the data is formatted, join the data to a geographic layer.  For help in finding a geographic layer, please consult the Perkins Data and GIS Services Department.

Tip: When joining layers, it is good practice to Verify the join selection before approving.  The program will inform you of any errors.
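As a rough illustration of what a join check looks for, the Python sketch below compares the key values in a geographic layer with those in a data table and reports which ones fail to match. It is not the ArcGIS tool itself; the function names and key values are hypothetical, and treating FIPS codes as five-character, zero-padded text is an assumption of this sketch.

```python
def normalize_fips(value):
    """Convert a FIPS code to five-character text, padding with zeros."""
    return str(value).strip().zfill(5)


def validate_join(layer_keys, table_keys):
    """Compare join keys and report matches and mismatches on each side."""
    layer_set = {normalize_fips(k) for k in layer_keys}
    table_set = {normalize_fips(k) for k in table_keys}
    return {
        "matched": sorted(layer_set & table_set),
        "layer_only": sorted(layer_set - table_set),
        "table_only": sorted(table_set - layer_set),
    }


# Hypothetical keys: the layer stores FIPS codes as text, the table as numbers.
layer_keys = ["37001", "37003", "37005"]
table_keys = [37001, 37003, 37999]

report = validate_join(layer_keys, table_keys)
print(report)
```

Keys that appear only in the layer will have no data to draw, and keys that appear only in the table will be dropped from the map, so both lists are worth reviewing before approving a join.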

Step 3: Enabling Time

Once the data are joined to a layer, enter the layer properties by right-clicking the layer name in the Table of Contents pane.

Navigate to the Time tab and check the box.  ArcGIS will want to know which field contains time information, as well as the format.  If the join was successful, you will see the fields that represent the data joined to the geographic layer.  In this example, the time field is labeled “t”.

You must also specify the date/time format.  Available time formats are listed to the right.

Finally, you will have to enable time on the data table as well.  To do this, right-click the data table in the Table of Contents pane.  Follow the same steps as presented for the geographic layer.

Step 4: Enable Time Display

Now that ArcGIS understands the data structure, you may enable time visualization.  The “Tools” toolbar, which contains the most commonly used tools, contains the button highlighted below, “Open Time Slider Window”.  Select this button.

The time slider window (left) will appear.  The slider spans the time range of the data, identifies what point in this range is currently displayed on the map, and allows for access to a variety of playback and recording options.  To access these options, click the options button.

This button is the equivalent of “Play.”  It will display the data from the first time point to the last.

Buttons with both arrows and vertical lines are one-step increments.  This particular button moves forward one time increment, the other one moves back.

This button exports the display to video.  This is the final step.

Step 5: Configure Options and Visual Display

Before you export to video, you will want to configure the appearance of the map.  This example will focus on new options that come with time series data.

First, select “Options” in the Time Slider toolbar.  Under the “Time Display” tab, you can alter the format of the displayed date to conform to your data.  In this example, I selected 2011 (yyyy) because we are using annual data.

Second, under the “Playback” tab, you can specify a length of time for playback.  This example contains 10 years of data.  If I specify 5 seconds of playback, each data year will be displayed for one-half second.  If I specify 10 seconds, each year will be visible for 1 second.
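The playback arithmetic is simply the total playback length divided by the number of time steps. A minimal Python sketch of the calculation, with function names invented for illustration, follows.

```python
def seconds_per_time_step(playback_seconds, n_time_steps):
    """How long each time step stays on screen for a given playback length."""
    if n_time_steps <= 0:
        raise ValueError("n_time_steps must be positive")
    return playback_seconds / n_time_steps


def playback_for_step_duration(seconds_per_step, n_time_steps):
    """Total playback length needed to show each time step for a fixed duration."""
    return seconds_per_step * n_time_steps


# Ten years of annual data, as in this example.
five_second_video = seconds_per_time_step(5, 10)
ten_second_video = seconds_per_time_step(10, 10)
needed_length = playback_for_step_duration(0.5, 10)

print(five_second_video, ten_second_video, needed_length)
```

The second function works in the other direction: if you decide how long each year should be visible, it gives the playback length to enter.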

Third, I will display the year in order to make clear to the viewer the time point that is visible.  To do this, I will go to “Insert” > “Dynamic Text” > “Data Frame Time.”

Tip: Alternatively, you can insert the data frame time into the title or other display object by including the following in the text of the object: <dyn type="dataFrame" name="Layers" property="time" emptyStr="[off]"/>

After some trial and error, I successfully integrated the time currently visible into the title.  The image to the left shows its appearance.

Step 6: Export to Video

Once the appearance of the map is satisfactory, you can export the map to video or to sequential images.  Click the “Export to Video” button on the time slider window.

Tip: maximize the ArcGIS window, switch to Layout View, zoom the layout to 100%, and clear any toolbars that may obstruct the layout view to improve video appearance.

First, you will be asked for a file or folder location and the export format.  Videos are exported as AVI files, while sequential images are exported to a folder as either bitmaps or JPEGs.

Second, if you exported to video, you will be asked to select a codec, which essentially encodes and compresses the output video.  The codec selection depends on the individual machine, and some codecs work with ArcGIS better than others.

Finally, you may have to produce a video several times before it comes out as expected.  Be sure to watch for missing time points, as this frequently happens.  Fixing the video length to a specific play duration per time point (one-half second or one second) helps you watch for these missing time points.

The following example is a 5-second video that displays employment in the textiles industry in North Carolina from 2000 through 2009.  Note that declining employment is signified by colors that change from dark to light.

Where There’s Smoke …

A team of Duke undergraduates participating in the Global Health Capstone course was awarded the “Outstanding Capstone Research Project” for their examination of state and congressional district characteristics that might influence the outcome of legislative efforts to raise cigarette excise taxes in North Carolina, South Carolina, and Mississippi.  Sarah Chapin and Gregory Morrison used GIS mapping tools in the Library’s Data & GIS Services Department to illuminate the relationships between county demographics and state legislators’ votes for or against cigarette tax hikes. Brian Clement, Alexa Monroy, and Katherine Roemer were other members of the research group.  Congratulations!

Regional Focus
The recent cigarette excise tax increases in Mississippi (2009), North Carolina (2009), and South Carolina (2010) served as case studies from which to draw components of successful strategies to develop a regional legislative toolkit for those wishing to increase cigarette excise taxes in the Southeast.  In all of these states, the tax increase was controversial. The Southeast in general is tax averse, which presents a systemic challenge to those who advocate raising taxes on cigarettes.

Senate Votes & Poverty by County

The researchers examined state characteristics which might influence the outcome of efforts to raise excise taxes, such as coalitions for and against proposed increases, the facts each side brought to bear and the nature of the discourse mobilized by different groups, the economic impact in each state of both smoking and the proposed excise taxes, and local political realities. The students restricted the area of interest to the Southeast because this region has a shared history and, consequently, similar challenges when it comes to race, poverty, and rural populations. These states are also, broadly speaking, politically similar and have had a similar experience with both tobacco use and government regulation.

This multi-disciplinary analysis provides a reference point for state legislators or interest groups wishing to pass cigarette tax increases.  The deliverable provided a model of past voting trends, suggestions for framing political dimensions of the issue, and strategies to overcome opposition in state legislatures.

Comparing Legislative Districts and County Data
Senate Votes & Party Affiliation

The bulk of the research involved mapping the political landscape surrounding cigarette tax legislation.  In doing so, researchers looked at voting records, interest group politics, campaigns, and state ideology. Broadly, the research entailed charting the electoral geography by overlaying state house and senate districts with county-level data.  Districts were coded based on voting history, party affiliation, smoking rates, and constituent demographics.  State legislature websites were used to find representatives’ voting histories, allowing the researchers to match legislators by county when constructing a GIS dataset.  County party affiliations are available through the state board of elections.  Finally, county demographics came from the 2010 Census data.

Senate Votes & Percent Black by County

Overcoming Ideology
Besides using GIS mapping to illustrate these relationships, the researchers analyzed the involvement of major interest groups, specifically, lobbying expenditures and campaign contributions to map the involvement of both pro- and anti-tobacco interest groups.  Additionally, they examined the impact of state ideology on the framing of political dimensions, looking at editorials, opinion pieces, newspapers, and committee markups, as well as interviews (both previous interviews and ones they conducted) with state legislators and interest groups.  Overcoming state ideology, both political and social, is a major factor in passing cigarette excise tax legislation, especially in a region with such dominant tobacco influence.

Again, the purpose of the research is not merely to understand the political landscapes surrounding the passage of cigarette tax bills, but to apply these findings to the creation of a legislative toolbox for representatives or interest groups concerned with pushing similar legislation.

Swimming in a Sea of Data

This post comes from Erika Kociolek, a second year Master of Environmental Management student at the Nicholas School.  The Data and GIS staff want to congratulate Erika on successfully defending her project!

For about 4 months, I’ve been swimming in a proverbial sea of data related to hypoxia (low dissolved oxygen concentrations) and landings in the Gulf of Mexico brown shrimp fishery.  I’m a second year master of environmental management (MEM) student at the Nicholas School, focusing on Environmental Economics and Policy.  I’ve been working with my advisor, Dr. Lori Bennear, to complete my master’s project (MP), an analysis attempting to estimate the effect of hypoxia  on landings and other economic outcomes of interest.

To do this, we are using data from the Southeast Monitoring and Assessment Program (SEAMAP), NOAA/NMFS, and a database of laws and policies related to brown shrimp that I compiled in Fall 2010.  By running regressions that difference out all variation in catch except for that attributable to hypoxia, we can isolate its effect on economic outcomes of interest.  I’ve found that catch, revenue, catch per unit effort, and revenue per unit effort are all larger in the presence of summer hypoxia.  However, if we look at catch for different sizes of shrimp, we see that in the presence of summer hypoxia, catch of larger shrimp decreases and catch of smaller shrimp increases significantly.

Getting to the point of discussing results has required a bunch of data analysis, cleaning, management, and visualization.  I used R, Stata, ArcGIS, and even video editing software to make dynamic graphics representing my results, which have improved my own understanding of the raw data.  As an example, the video below, showing the change in hypoxia over time (1997-2004), was created using ArcGIS 10.

Note: The maps in the video above use data from the Southeast Monitoring and Assessment Program (SEAMAP).

Hypoxia is a dynamic and complex phenomenon, varying in severity, over time, and in space; hypoxia in Gulf waters is more severe and widespread in summer.  The model I’m using actually takes advantage of this variation to obtain an estimate of the effect of hypoxia on catch and other economic outcomes.  To show people the source of variation I’m exploiting, I created this video.  These maps draw on dissolved oxygen concentration data and display the measurements spatially.

We have dissolved oxygen measurements for most of the Gulf in the summer (June) and fall (December).  Each subarea-depth zone (see related map) that changes from salmon shading (not hypoxic) to red (hypoxic), or vice-versa, is variation in hypoxia that the models I’m running use to get an estimate of the hypothesized effect.

Many thanks are due to my advisor, Dr. Bennear, as well as to the helpful folks at the Data/GIS lab, who have provided invaluable assistance with the data management and data visualization components of this project!

This research was funded by NOAA’s National Center for Coastal Ocean Science, Award #NA09NOS4780235.

Wrangle, Refine, and Represent

Data visualization and data management represented the core themes of the 2011 Computer Assisted Reporting (CAR) Conference that met in Raleigh from February 24-27.  Bringing together journalists, computer scientists, and faculty, the conference united a number of communities that share a common interest in gathering and representing empirical evidence online (and in print).

While the conference featured luminaries in data visualization (Amanda Cox, David Huynh, Michal Migurski, Martin Wattenberg) who gave sage advice on how to best represent data online, web-based data visualization tools provided a central focus for the conference.

Notable tools that may be of interest to the Duke research (and teaching) community include:

DataWrangler – An interactive data cleaning tool much like Google Refine (see below)

Google Fusion Tables – “manage large collections of tabular data in the cloud” – Fusion Tables provides convenient access to Google’s data visualization and mapping services.  The service also allows groups to annotate data online.

Google Refine – Refine is primarily a data cleaning tool that simplifies the process of cleaning data for further processing or analysis.  While users of existing data management tools may not be convinced to leave their current data management tool, Refine provides a rich suite of tools that will likely attract many new converts.

Many Eyes – One of the premier online visualization tools hosted by IBM.  Visualizations range from pie charts to digital maps to text analysis.  Many Eyes’ versatility is one of its key strengths.

Polymaps – Billed as a “javascript library for image- and vector-tiled maps” – Polymaps allows the creation of custom lightweight map services on the web.

SIMILE Project (Semantic Interoperability of Metadata and Information in unLike Environments) – The SIMILE Project is a collection of different research projects designed to “enhance inter-operability” among digital assets.  At the conference, the Exhibit Project received particular attention for its ability to produce data-rich visualizations with very little coding required.

Timeflow – Presented by Sarah Cohen and designed by Martin Wattenberg, Timeflow provides a convenient application for visualizing temporal data.

SimplyMap! – Census and business data made easier

Online mapping and data access have become even easier with the launch of SimplyMap 2.0.  A longtime favorite of Economics and Public Policy courses (and faculty) at Duke, this web-based mapping and data extraction application provides a straightforward interface that lets users create thematic maps and reports using US census, business, and marketing data.

SimplyMap 2.0 map interface

Version 2.0 includes improvements designed to make it easier to find and analyze data and create professional-looking GIS-style thematic maps.

Significant changes include:

  • A new multi-tab interface to allow you to easily switch between your projects
  • Interactive wizards to guide you through making maps and reports
  • An option to automatically select the geographic unit displayed on a map based on the zoom level
  • Easier searching and browsing to choose data variables
  • Assign keyword tags to organize your maps and reports
  • Share your work with other users of SimplyMap (send a URL that lets them open a copy of your map or report)
  • Data filters (greater than, less than, etc.) can now be applied to both maps and reports
  • More export options: Data: Excel, DBF, CSV;  Maps: GIF, PDF, Shapefiles (boundaries only, no attributes)
  • Faster performance

Give SimplyMap 2.0 a try and let us know what you think.  Support is always available in Perkins Data and GIS.

Policy Paradox: Mapping Residential Restrictions

Do residential restrictions placed on convicted sex offenders serve to protect the public?  Duke Economics Ph.D. candidate Songman Kang has been using the analytical capabilities of geographic information software to help determine the extent to which the restrictions affect residential locations of sex offenders: computing the area covered by a restriction and determining which offenders had to relocate due to a restriction.

According to Kang, the residential restrictions are designed to reduce recidivism among sex offenders and prevent their presence near places where children regularly congregate.  Neither of these claims has been found consistent with empirical evidence though, and it is unclear whether the restrictions have been successful in reducing the rates of repeat sex offenses.  On the other hand, the restrictions severely limit residential location choices, and may force offenders to relocate away from employment opportunities and supportive networks of family and friends.  As a result of the deteriorated economic conditions, the offenders who had to relocate may become more likely to commit non-sex offenses.

The following maps illustrate some of the restricted zones in Miami and in the Triangle area of North Carolina studied by Mr. Kang.

Figure 1: Residential Restricted Zones in Miami

Figure 2: Triangle Restricted Residences

Making Data Flow

As water quality and questions of water supply have grown more salient in the Triangle, Duke researchers have tried to contribute to the growing debate over water quality using the latest digital mapping (GIS) tools.  In the fall of 2009, Data and GIS Services in Perkins Library provided GIS analysis support for a stream and watershed assessment project that developed strategies to reverse the impact of poor urban stormwater management, degraded water quality, and the loss of natural habitats on the Duke campus.

Data/GIS helped the researchers access critical spatial data for the characterization of the contributing watershed’s current land use patterns.  This data enabled the students to analyze the watershed’s area of impervious surface and hydrologic flow paths, and helped inform the understanding of the water quality issues faced at the stream site.

The GIS map below illustrates how digital mapping tools can be used to summarize a large amount of complex data into a compelling presentation.

Special thanks to the interdisciplinary team of environmental and civil engineers, biology and environmental science majors, and a Nicholas MEM student who shared their project results: Alicia Burtner, Matt Ball, Nari Sohn, Avni Patel, Will Bierbower, Adam Nathan, Mike Schallmo, Justine Jackson-Ricketts, and Jai Singh.