All posts by Joel Herndon, Ph.D.

Data Management Planning Advice – DMPTool @ Duke

Data and GIS Services is happy to announce the launch of a new service designed to provide detailed data management planning help online.  As an increasing number of granting agencies require a data management plan as part of the grant application process, the DMPTool provides “an open source, web application that assists researchers in producing data management plans and delivering them to funders.” For Duke researchers, the tool provides constantly updated advice about how to complete a data management plan while simultaneously highlighting Duke resources available from a variety of data support providers for the planning, maintenance, and sharing of research data.

We hope that the DMPTool will streamline the grant writing process and help researchers make the appropriate connections to resources available both at Duke and beyond for data management planning.  We welcome your comments and suggestions on this resource.

DMPTool

Data and GIS Spring Semester News

New workshops for Spring 2013
http://library.duke.edu/data/news/index.html

Clean your data with Google Refine.  Use digital maps to explore the present and past.  Analyze data with R or Stata. Visualize your research with one of our data visualization courses.  The Data and GIS Workshops offer a range of research strategies for data-based questions. Register online for our courses or schedule a session for your course by emailing askdata@duke.edu.

Visualize This (and win a $500 technology prize)!
http://blogs.library.duke.edu/data/2012/12/04/2013-data-visualization-contest/

Are you a current Duke University undergraduate or graduate student? Have you used data visualization in a past or current research project to help solve a problem, tell a story, or highlight an interesting trend? Write up a short description and you’ll have a submission for the contest and a chance to win a $500 technology prize.

New Data Lab
http://library.duke.edu/data/about/lab.html

As mentioned in the fall, the new Data and GIS lab, with 12 workstations with dual 24″ monitors and 16 gigs of memory, is ready to take on the most challenging statistical, mapping, and visualization research projects. The new lab also features a flatbed scanner for projects moving from print to digital data. Lab hours are the same hours as Perkins Library (almost 24/7).

Get help with Data Management Planning
http://library.duke.edu/data/guides/data-management/index.html

Puzzled by data management planning?  Not sure what to include in your grant’s data management plan?  Data and GIS has launched a guide that supports researchers looking for advice on data management plans now required by several granting agencies.  The guide provides examples of sample plans, key concepts involved in writing a plan, and contact information for groups on campus providing data management advice.

Get Data Help
http://library.duke.edu/data/about/staff.html

Come visit us in Perkins 226 for a consultation or contact us online (email: askdata@duke.edu or twitter: duke_data OR duke_vis).  Our consultants are available weekdays 8-5 by appointment, and we offer drop-in hours as well.  We look forward to working with you on your next data driven project.

 

Data and GIS Back to School – Fall 2012

Visualize your data, analyze your results, map your statistics, and find the data you need!  Come visit us in Perkins 226 (second floor Perkins) for a consultation or contact us online (email: askdata@duke.edu or twitter: duke_data OR duke_vis).  We look forward to working with you on your next data driven project.

New Data Lab Opens – August 2012

http://library.duke.edu/data/about/lab.html

With 12 workstations with dual 24″ monitors and 16 gigs of memory, the new Data and GIS lab is ready to take on the most challenging statistical, mapping, and visualization research projects.  The new lab also features a flatbed scanner for projects moving from print to digital data.  Lab hours are the same hours as Perkins Library (almost 24/7).

Visualize This!  New Data Visualization Program

Perkins Library is proud to introduce Angela Zoss, our new Data Visualization Coordinator. Schedule a consultation, attend a workshop, or learn more about research in Data Visualization at Viz Forum this fall.

New workshops for Fall 2012

http://library.duke.edu/data/news/index.html

Learn about data management planning. Apply text mining strategies to understand your documents.  Visualize your data with Tableau Public, or map your results using ArcGIS or Google Earth Pro.  A new series of workshops connects traditional statistical, geospatial, and visualization tools with web based options.  Register online for our courses or schedule a session for your course by emailing askdata@duke.edu

Bloomberg Professional News and Financial Data

http://blogs.library.duke.edu/data/2011/08/29/bloomberg-has-arrived/

If you missed last fall’s Bloomberg announcement, Duke Libraries is pleased to announce the installation of three Bloomberg financial terminals in the Data and GIS Lab in 226 Perkins.  The terminals provide the latest news and financial data and include an application that makes it easy to export data to Excel.  Access is restricted to current Duke affiliates.  Training on Bloomberg is currently being planned for the last week of September.  Please email askdata@duke.edu to reserve a space at the training session.

Get help with Data Management Planning

http://library.duke.edu/data/guides/data-management/index.html

Data and GIS has launched a new guide that provides guidance for researchers looking for advice on data management plans now required by several granting agencies.  The guide provides examples of sample plans, key concepts involved in writing a plan, and contact information for groups on campus providing data management advice.  In addition, we offer individual consultations with researchers on data management planning.

New Collections for Fall 2012

http://library.duke.edu/data/collections/new.html

Contact Us! – askdata@duke.edu

 

Data and GIS Winter Newsletter 2012

Data-driven teaching and research at Duke keeps growing, and Perkins Data and GIS continues to increase support for researchers and classes employing data, GIS, and data visualization tools.  Whether your discipline is in the Humanities, Sciences, or Social Sciences, Perkins Data and GIS seeks to support researchers and students using numeric and geospatial data across the disciplines.

New Website for 2012
http://library.duke.edu/data/

You can find:

  • Online data or digital maps that you need for your project
  • A workshop on the latest software packages and digital tools

New workshops for 2012
http://library.duke.edu/data/news/index.html
Clean your data with Google Refine. Learn about data management planning. Visualize your data with Tableau Public, or map your results using ArcGIS or Google Earth Pro.  A new series of workshops connects traditional statistical, geospatial, and visualization tools with web based options.  Register online for our courses or schedule a session for your course by emailing askdata@duke.edu

  • Stata Review (Statistics / Data Management)
  • Introduction to ArcGIS (Geographic Information Systems / Data Visualization)
  • Data Management Planning (Data Management / Grants)
  • Geocommons (Geographic Information Systems / Data Visualization)
  • Google Earth (Pro) (Geographic Information Systems / Data Visualization)
  • Google Refine (Data Management / Descriptive Statistics)
  • Tableau Public (Data Visualization)

Bloomberg (terminals) have arrived
http://blogs.library.duke.edu/data/2011/08/29/bloomberg-has-arrived/

Duke Libraries is pleased to announce the installation of three Bloomberg financial terminals in the Data and GIS Lab in 226 Perkins.  The terminals provide the latest news and financial data and include an application that makes it easy to export data to Excel.  Access is restricted to current Duke affiliates.

Get help with Data Management Planning
http://library.duke.edu/data/guides/data-management/index.html

Data and GIS has launched a new guide that provides guidance for researchers looking for advice on data management plans now required by several granting agencies.  The guide provides examples of sample plans, key concepts involved in writing a plan, and contact information for groups on campus providing data management advice.

New Collections
http://library.duke.edu/data/collections/new.html
Explore the Indonesian Village Potential Statistics (PODES), look at household economic behavior in the Indian National Sample Survey, or explore historical digital maps of Europe.  The Data and GIS collection gathers research data sets and maps of interest to the Duke community covering a wide range of topics.

Support for Restricted Data Contracts and Restricted Data Licensing
Perkins Library has partnered with the Social Science Research Institute (SSRI) to support restricted data licensing with Paul Pooley as a restricted data specialist.  Paul is available to work with researchers licensing restricted data and negotiating restricted data management plans.  Please contact Paul (paul.pooley@duke.edu) or askdata@duke.edu for more details.

Contact Us! askdata@duke.edu – twitter: duke_data – http://library.duke.edu/data/hours.html

Joel Herndon
Head, Data and GIS Services
919-660-5946
Location: Room 227 Perkins
joel.herndon@duke.edu

Mark Thomas
Economics/GIS Librarian
919-660-5853
Location: Room 233 Perkins
mark.thomas@duke.edu

Teddy Gray
Biological Sciences Librarian
919-660-5971
Location: Room 233 Perkins
teddy.gray@duke.edu

Swimming in a Sea of Data

This post comes from Erika Kociolek, a second year Master in Environmental Management student at the Nicholas School.  The Data and GIS staff want to congratulate Erika on successfully defending her project!

For about 4 months, I’ve been swimming in a proverbial sea of data related to hypoxia (low dissolved oxygen concentrations) and landings in the Gulf of Mexico brown shrimp fishery.  I’m a second year master of environmental management (MEM) student at the Nicholas School, focusing on Environmental Economics and Policy.  I’ve been working with my advisor, Dr. Lori Bennear, to complete my master’s project (MP), an analysis attempting to estimate the effect of hypoxia  on landings and other economic outcomes of interest.

To do this, we are using data from the Southeast Monitoring and Assessment Program (SEAMAP), NOAA/NMFS, and a database of laws and policies related to brown shrimp that I compiled in Fall 2010.  By running regressions that difference out all variation in catch except for that attributable to hypoxia, we can isolate its effect on economic outcomes of interest.  I’ve found that catch, revenue, catch per unit effort, and revenue per unit effort are all larger in the presence of summer hypoxia.  However, if we look at catch for different sizes of shrimp, we see that in the presence of summer hypoxia, catch of larger shrimp decreases and catch of smaller shrimp increases significantly.

Getting to the point of discussing results has required a bunch of data analysis, cleaning, management, and visualization.  I have used R, Stata, ArcGIS, and even video editing software to make dynamic graphics representing my results, which have improved my own understanding of the raw data.  As an example, the video below, showing the change in hypoxia over time (1997-2004), was created using ArcGIS 10.

Note: The maps in the video above use data from the Southeast Monitoring and Assessment Program (SEAMAP).

Hypoxia is a dynamic and complex phenomenon, varying in severity, over time, and in space; hypoxia in Gulf waters is more severe and widespread in summer.  The model I’m using actually takes advantage of this variation to obtain an estimate of the effect of hypoxia on catch and other economic outcomes.  To show people the source of variation I’m exploiting, I created this video.  The maps take dissolved oxygen concentration data and display it spatially.

We have dissolved oxygen measurements for most of the Gulf in the summer (June) and fall (December).  Each subarea-depth zone (see related map) that changes from salmon shading (not hypoxic) to red (hypoxic), or vice-versa, is variation in hypoxia that the models I’m running use to get an estimate of the hypothesized effect.
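To make the idea of "variation the models use" concrete, here is a minimal Python sketch of a within-zone (fixed effects) estimate. It is an illustration only and not the model used in this project: the function name `within_estimate`, the zone labels, and all numbers are made up, and the real analysis involves more variables and controls. The sketch subtracts each zone's own average from its hypoxia indicator and its catch, so only zones whose hypoxia status changes contribute to the estimate.

```python
from collections import defaultdict


def within_estimate(zones, hypoxic, catch):
    """Slope of catch on a 0/1 hypoxia flag after removing each zone's own average."""
    # Group observation indices by zone.
    groups = defaultdict(list)
    for i, z in enumerate(zones):
        groups[z].append(i)

    # Demean the hypoxia flag and catch within each zone.
    x_dm = [0.0] * len(zones)
    y_dm = [0.0] * len(zones)
    for idx in groups.values():
        x_mean = sum(hypoxic[i] for i in idx) / len(idx)
        y_mean = sum(catch[i] for i in idx) / len(idx)
        for i in idx:
            x_dm[i] = hypoxic[i] - x_mean
            y_dm[i] = catch[i] - y_mean

    # Least-squares slope on the demeaned values.
    sxx = sum(x * x for x in x_dm)
    if sxx == 0:
        raise ValueError("no zone changes hypoxia status, so nothing can be estimated")
    sxy = sum(x * y for x, y in zip(x_dm, y_dm))
    return sxy / sxx, x_dm


# Hypothetical toy data: zones A and B switch status, zone C never does.
zones = ["A", "A", "B", "B", "C", "C"]
hypoxic = [0, 1, 1, 0, 0, 0]
catch = [10.0, 13.0, 25.0, 22.0, 40.0, 41.0]

beta, x_dm = within_estimate(zones, hypoxic, catch)
print(beta)      # 3.0
print(x_dm[4:])  # [0.0, 0.0] (zone C adds no identifying variation)
```

In this toy example, zone C has much higher catch than the others, but because it is never hypoxic its demeaned indicator is zero and it drops out of the slope entirely. That is the sense in which only zones that switch between shadings on the maps inform the estimate.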

Many thanks are due to my advisor, Dr. Bennear, as well as to the helpful folks at the Data/GIS lab, who have provided invaluable assistance with the data management and data visualization components of this project!

This research was funded by NOAA’s National Center for Coastal Ocean Science, Award #NA09NOS4780235.

Wrangle, Refine, and Represent

Data visualization and data management represented the core themes of the 2011 Computer Assisted Reporting (CAR) Conference that met in Raleigh from February 24-27.  Bringing together journalists, computer scientists, and faculty, the conference united a number of communities that share a common interest in gathering and representing empirical evidence online (and in print).

While the conference featured luminaries in data visualization (Amanda Cox, David Huynh, Michal Migurski, Martin Wattenberg) who gave sage advice on how to best represent data online, web based data visualization tools provided a central focus for the conference.

Notable tools that may be of interest to the Duke research (and teaching) community include:

DataWrangler – An interactive data cleaning tool much like Google Refine (see below)

Google Fusion Tables – “manage large collections of tabular data in the cloud” – Fusion Tables provides convenient access to Google’s data visualization and mapping services.  The service also allows groups to annotate data online.

Google Refine – Refine is primarily a data cleaning tool that simplifies the process of cleaning data for further processing or analysis.  While users of existing data management tools may not be convinced to leave their current data management tool, Refine provides a rich suite of tools that will likely attract many new converts.

Many Eyes – One of the premier online visualization tools hosted by IBM.  Visualizations range from pie charts to digital maps to text analysis.  Many Eyes’ versatility is one of its key strengths.

Polymaps – Billed as a “javascript library for image- and vector-tiled maps” – Polymaps allows the creation of custom lightweight map services on the web.

SIMILE Project (Semantic Interoperability of Metadata and Information in unLike Environments) – The SIMILE Project is a collection of different research projects designed to “enhance inter-operability” among digital assets.  At the conference, the Exhibit Project received particular attention for its ability to produce data rich visualization with very little coding required.

Timeflow – Presented by Sarah Cohen and designed by Martin Wattenberg – Timeflow provides a convenient application for visualizing temporal data.

Rolling with R in 2011

Interest in the open source statistical package R has grown over the last few years as researchers discover its powerful graphic capabilities, a suite of packages that extend its functionality, and its data import capabilities.  While several courses use R to teach introductory statistics, most researchers arrive at R with some statistical experience.  The following selected resources represent a growing number of books and websites designed to help orient users to the capabilities of R.

Quick-R Homepage
This website tries to provide a quick overview of basic data management and statistical capabilities of R for current SAS, SPSS, Stata, and Systat users.  The stress is on providing a brief overview of R commands for common data analysis needs.

R for Stata Users
A comprehensive guide for getting started in R using Stata as a point of reference.

R for SAS and SPSS Users
Similar in concept to R for Stata Users.

Using R for Introductory Statistics
John Verzani’s Using R for Introductory Statistics is one of several introductions to using R for basic statistics. Examples are available as an R package.

Do you have other sources that you like for R?  Let us know in the comments.

Welcome!

Welcome to the Perkins Data and GIS blog!  Our goal is to highlight Duke research, collections, policies, and tools surrounding empirical data and digital maps of interest to the research community.  We hope that this blog will serve as a catalyst to link researchers and resources across the Duke community and beyond!