Category Archives: Duke research

CDVS Chat or Zoom for Online Data Advice

As students and classes moved online in the spring of 2020, the Center for Data and Visualization Sciences realized that it was time to expand our existing email (askdata@duke.edu) and lab-based consultation services to meet the data demands of online learning and remote projects. Six months and hundreds of online consultations later, we have developed a new appreciation for the online tools that allow us to partner with Duke researchers around the world. Whether you prefer to chat, Zoom, or email, we hope to work with you on your next data question!

Chat

 

Ever had a quick question about how to visualize or manage your data, but weren’t sure where to get help? Having trouble figuring out how to get the data software to do what you need for class/research? CDVS offers roughly thirty hours of chat support each week.  Data questions on chat cover our full range of data support. If we cannot resolve a question in the chat session, we will make a referral for a more extended consultation.

Zoom

We’re going to be honest… we miss meeting Duke students and faculty in the Brandaleone Lab in the Edge and consulting on data problems! However, virtual data consultations over Zoom have some advantages over in-person data consultations at the library. With Zoom features such as screen sharing, multiple participants, and chat, we can reach both individuals and project teams in a format where everyone can see the screen and sharing resource links is simple. As of October 1st, we have used Zoom to consult on questions ranging from creating figures in the R programming language to advising Bass Connections teams on the best way to visualize their research. We are happy to schedule Zoom consultations via email at: askdata@duke.edu.

Just askdata@duke.edu

Even with our new data chat and video chat services, we are still delighted to advise on questions over email at askdata@duke.edu. As the days grow shorter this fall and project deadlines loom, we look forward to working with you to resolve your data challenges!

Fall 2020 – CDVS Research and Education During COVID-19

The Center for Data and Visualization Sciences is glad to welcome you back to a new academic year! We’re excited to have friends and colleagues returning to the Triangle and happy to connect with Duke community members who will not be on campus this fall.

This fall, CDVS will expand its existing online consultations with a new chat service and new online workshops for all members of the Duke community. Since mid-March, CDVS staff have redesigned instructional sessions, constructed new workflows for accessing research data, and built new platforms for accessing data tools virtually. We look forward to connecting with you online and working with you to achieve your research goals.

In addition to our expanded online tools and instruction, we have redesigned our CDVS-Announce data newsletter to provide a monthly update of data news, events, and workshops at Duke. We hope you will consider subscribing.

Upcoming Virtual CDVS Workshops

CDVS continues to offer a full workshop series on the latest strategies and tools for data-focused research. Upcoming workshops for early September include:

R for data science: getting started, EDA, data wrangling
Thursday, Sep 1, 2020 10am – 12pm
This workshop is part of the Rfun series. R, together with the Tidyverse, is a data-first coding language that enables reproducible workflows. In this two-part workshop, you’ll learn the fundamentals of R, everything you need to know to quickly get started. You’ll learn how to access and install RStudio, wrangle data for analysis, practice Exploratory Data Analysis (EDA), and generate reports, and you’ll get a brief introduction to visualization.
Register: https://duke.libcal.com/event/6867861

Research Data Management 101
Wednesday, Sep 9, 2020 10am – 12pm
This workshop will introduce data management practices for researchers to consider and apply throughout the research lifecycle. Good data management practices pertaining to planning, organization, documentation, storage and backup, sharing, citation, and preservation will be presented using examples that span disciplines. During the workshop, participants will also engage in discussions with their peers on data management concepts as well as learn about how to assess data management tools.
Register: https://duke.libcal.com/event/6874814

R for Data Science: Visualization, Pivot, Join, Regression
Wednesday, Sep 9, 2020 1pm – 3pm
Register: https://duke.libcal.com/event/6867914

ArcGIS StoryMaps
Thursday, September 10, 2020 1pm – 2:30pm
This workshop will help you get started telling stories with maps on the ArcGIS StoryMaps platform. This easy-to-use web application integrates maps with narrative text, images, and videos to provide a powerful communication tool for any project with a geographic component. We will explore the capabilities of StoryMaps, share best practices for designing effective stories, and guide participants step-by-step through the process of creating their own application.
Register: https://duke.libcal.com/event/6878545

Assignment Tableau: Intro to Tableau work-together
Friday, September 11, 2020 10am – 11:30am
Work together over Zoom on an Intro to Tableau assignment. Tableau Public (available for both Windows and Mac) is incredibly useful free software that allows individuals to quickly and easily explore their data with a wide variety of visual representations, as well as create interactive web-based visualization dashboards. Attendees are expected to watch Intro to Tableau Fall 2019 online first, or have some experience with Tableau. This will be an opportunity to work together on the assignment from the end of that workshop, plus have questions answered live.
Register: https://duke.libcal.com/event/6878629

Boost Your Energy

Energy at Duke

With the launch of the Duke University Energy Initiative (EI) several years ago, the Center for Data and Visualization Sciences (CDVS) has seen an increased demand for all sorts of data and information related to energy generation, distribution, and pricing. The EI is a university-wide, interdisciplinary hub that advances an accessible, affordable, reliable, and clean energy system. It involves researchers and students from the Pratt School of Engineering, the Nicholas School of the Environment, the Sanford School of Public Policy, the Duke School of Law, the Fuqua School of Business, and departments in the Trinity College of Arts & Sciences.

The creation of the EI included development of an Undergraduate Certificate in Energy and Environment and an Undergraduate Minor in Energy Engineering in the Pratt School. An Energy Data Analytics PhD Student Fellows program is affiliated with the EI’s Energy Data Analytics Lab, and Duke’s Bass Connections program includes several Energy & Environment teams led by the Energy Initiative.

The EI website provides links to energy-related data sources, particularly datasets that have proven useful in Duke energy research projects. Below, we discuss some additional key sources for finding energy-related data.

Energy resources and potentials

The sources for locating energy data will vary depending on the type of energy and the spot on the source-to-consumption continuum that interests you.

The US Department of Energy’s (DoE’s) Energy Information Administration (EIA) has a nice outline of energy sources, with explanations of each, in their Energy Explained web pages. These include nonrenewable sources such as petroleum, gas, gas liquids, coal, and nuclear.  The EIA also discusses a number of renewable sources such as hydropower (e.g., dams, tidal, or wave action), biomass (e.g., waste or wood), biofuels (e.g., ethanol or biodiesel), wind, geothermal, and solar. Hydrogen is another fuel source discussed on these pages.

Besides renewability, you might be interested in a source’s carbon footprint. Note that some of the sources the EIA lists as renewables may be carbon creating (such as biomass or biofuels), and some non-renewables may be carbon neutral (such as nuclear). Any type of energy source clearly has environmental implications, and the Union of Concerned Scientists has a discussion of the Environmental Impacts of Renewable Energy Technologies.

The US Geological Survey’s Energy Resources Program measures resource potentials for all types of energy sources.  The Survey is a great place to find data relating to their traditional focus of fossil fuel reserves, but also for some renewables such as geothermal.  The EIA provides access to GIS layers relating to energy, not only reserves and renewable potentials, but also infrastructure layers.

The DOE’s Office of Scientific and Technical Information (OSTI) is well known as a repository of technical reports, but it also hosts the DOE Data Explorer. This includes hidden gems like the REPLICA database (Rooftop Energy Potential of Low Income Communities in America), which has geographic granularity down to the Census Tract level.

For more on renewables, check out the NREL (National Renewable Energy Laboratory), which disseminates GIS data relating to renewable energy in the US (e.g., wind speeds, wave energy, solar potential), along with some international data. The DoE’s Open Data Catalog is also particularly strong on datasets (tabular and GIS) relating to renewables.  The data ranges from very specific studies to US nationwide data.

REexplorer, showing wind speed in Kenya

For visualizing energy-related map layers from selected non-US countries, the Renewable Energy Data Explorer (REexplorer) provides an online mapping tool. Most layers can be downloaded as GIS files. The International Renewable Energy Agency (IRENA) also has statistics on renewables. Besides downloadable data, summary visualizations can be viewed online using Tableau Dashboards.

Price and production data

The US DOE “Energy Economy” web pages will introduce you to all things relating to the economics of energy, and their EIA (mentioned above) is the main US source for fossil fuel pricing, from both the production and the retail standpoint.

Internationally, the OECD’s International Energy Agency (IEA) collects supply, demand, trade, production and consumption data, including price and tax data, relating to oil, gas, and coal, as well as renewables. In the OECD iLibrary, go to the Statistics tab to find many detailed IEA databases as well as PDF book series such as World Energy Balances, World Energy Outlook, and World Energy Statistics. For more international data (particularly in the developing world), you might want to try Energydata.info. This includes geospatial data and a lot on renewables, especially solar potential.

Finally, a good place to locate tabular data of all sorts is the database ProQuest Statistical Insight. It indexes publications from government agencies at all levels, IGOs and NGOs, and trade associations, usually providing the data tables or links to the data.

Infrastructure (Generation, Transportation/Distribution, and Storage)

ArcGIS Pro using EPA’s eGRID data

Besides the EIA’s GIS layers relating to energy, mentioned above, another excellent source for US energy infrastructure data is the Homeland Infrastructure Foundation-Level Data (HIFLD), which includes datasets on energy infrastructure from many government agencies. These include geospatial data layers (GIS data) for pipelines, power plants, electrical transmission and more. For US power generation, the Environmental Protection Agency has their Emissions & Generation Resource Integrated Database (eGRID).  eGRID data includes the locations of all types of US electrical power generating facilities, including fuel used, generation capacity, and detailed effluent data. For international power plant data, the World Resources Institute’s (WRI’s) Global Power Plant Database includes data on around 30,000 plants, and some of WRI’s other datasets also relate to energy topics.

Energy storage can include the obvious battery technologies, but also pumped hydroelectric systems and even more novel schemes.  The US DoE has a Global Energy Storage Database with information on “grid-connected energy storage projects and relevant state and federal policies.”

Businesses

For data or information relating to individual companies in the energy sector, as well as for more qualitative assessments of industry segments, you can begin with the library’s Company and Industry Research Guide. This leads to some of the key business sources that the Duke Libraries provide access to.

Trade Associations

Trade associations that promote the interests of companies in particular industries can provide effective leads to data, particularly when you’re having trouble locating it from government agencies and IGOs/NGOs. If they don’t provide data or much other information on their websites, be sure to contact them to see what they might be willing to share with academic researchers. Most of the associations below focus on the United States, but some are global in scope.

These are just a few of the sources and strategies for locating data on energy.  For more assistance, please contact the Center for Data and Visualization Sciences: askdata@duke.edu

Visualization Exhibit and Events

This semester, Duke is proud to host the Places & Spaces: Mapping Science exhibit, visiting from Indiana University. Places & Spaces is a 10-year effort by Dr. Katy Börner (director of the Cyberinfrastructure for Network Science Center) to bring focus to visualization as a medium of scholarly communication.

The exhibit includes 100 maps from various disciplines and cultures and highlights myriad visualization techniques that have been used to communicate science to a broader public. The maps are divided among three spaces on campus: The Edge (newly opened on the first floor of Bostock Library), Smith Warehouse (on the second floor of Bay 11), and Gross Hall (on the third floor).

To celebrate the opening, Dr. Börner will visit Duke on January 21st and 22nd. She will give a keynote presentation on Wednesday, January 21, at 4pm, in the Edge. A reception will follow.

Additional events next week and throughout the semester will celebrate the exhibit and promote ongoing visualization work at Duke.  All events are open to the public!

Upcoming events

Wednesday, January 21

Thursday, January 22

Friday, January 23

More information about the exhibit and related events is available at:
http://sites.duke.edu/scimaps/ and
http://scimaps.org/duke

Please contact Angela Zoss (angela.zoss@duke.edu) with any questions or suggestions.  We hope you can join us in celebrating and enjoying this exhibit!

Demystifying Data & GIS Services

Staff Expertise Chart

Confused about Data & GIS Services? Not sure what questions you should be asking us or what kind of services we provide? Here’s one handy chart we’ve come up with to explain what exactly we cover in our consultations and workshops.

When it comes to picking what day to stop by our walk-in hours or knowing how much of the data life cycle our consultants cover, this graphic might be your first stop.  Whether it’s finding data, processing or analyzing that data, or mapping and visualizing that data, we have staff with expertise to help!

Still not sure who to approach or what kind of help you might need?  Just email askdata@duke.edu to get in touch with all of us at once.  Some questions can be answered quickly over email, but we’re also happy to schedule an appointment to talk in person.

Data and GIS Services Spring 2014 Workshop Series

Explore network analysis, text mining, online mapping, data visualization, and statistics in our spring 2014 workshop series. Our workshops provide a chance to explore new tools or refresh your memory on effective strategies for managing digital research. Interested in keeping up to date with workshops and events in Data and GIS? Subscribe to the dgs-announce listserv or follow us on Twitter (@duke_data).

Currently Scheduled Workshops

 Thu, Jan 9 2:00 PM – 3:30 PM  Data Management Plans – Grants, Strategies, and Considerations

 Mon, Jan 13 2:00 PM – 3:30 PM Webinar: Social Science Data Management and Curation

 Mon, Jan 13 3:00 PM – 4:00 PM Google Fusion Tables

 Tue, Jan 14 3:00 PM – 4:00 PM Open (aka Google) Refine 

 Wed, Jan 15 1:00 PM – 3:00 PM Stata for Research

 Thu, Jan 16 3:00 PM – 5:00 PM Analysis with R

 Tue, Jan 21 1:00 PM – 3:00 PM Introduction to ArcGIS

 Wed, Jan 22 1:00 PM – 3:00 PM ArcGIS Online

 Wed, Jan 22 3:00 PM – 4:00 PM Open (aka Google) Refine 

 Mon, Jan 27 2:00 PM – 3:30 PM Introduction to Text Analysis

 Wed, Jan 29 1:00 PM – 3:00 PM Analysis with R

 Thu, Jan 30 2:00 PM – 4:00 PM Stata for Research

 Mon, Feb 3 1:00 PM – 2:00 PM  Data Visualization on the Web

 Mon, Feb 3 2:00 PM – 3:00 PM  Data Visualization on the Web (Advanced)

 Tue, Feb 11 2:00 PM – 4:00 PM Using Gephi for Network Analysis and Visualization

 Wed, Feb 12 1:00 PM – 3:00 PM Introduction to ArcGIS

 Tue, Feb 18 2:00 PM – 3:30 PM Introduction to Tableau Public 8

 Tue, Feb 25 1:00 PM – 3:00 PM ArcGIS Online

 Thu, Feb 27 1:00 PM – 3:00 PM Historical GIS

 Mon, Mar 3 2:00 PM – 3:30 PM  Designing Academic Figures and Posters

 Tue, Mar 4 1:00 PM – 3:00 PM  Useful R Packages: Extensions for Data Analysis, Management, and Visualization

Announcing the 2014 Student Data Visualization Contest

Data & GIS Services will soon be accepting submissions to its 2nd annual student data visualization contest. If you have a course project that involves visualization, start thinking about your submission now!

The purpose of the contest is to highlight outstanding student data visualization work at Duke University. Data & GIS Services wants to give you a chance to showcase the hard work that goes into your visualization projects.

Data visualization here is broadly defined, encompassing everything from charts and graphs to 3D models to maps to data art.  Data visualizations may be part of a larger research project or may be developed specifically to communicate a trend or phenomenon. Some are static images, while others may be animated simulations or interactive web experiences.  Browse through last year’s submissions to get an idea of the range of work that counts as visualization.

The Student Data Visualization Contest is sponsored by Data & GIS Services, Perkins Library, Scalable Computing Support Center, Office of Information Technology, and the Office of the Vice Provost for Research.

For more details, see the 2014 Student Data Visualization Contest page.   Please address all additional questions to Angela Zoss (angela.zoss@duke.edu), Data Visualization Coordinator, 226 Perkins Library.

Data Management Planning Advice – DMPTool @ Duke

Data and GIS Services is happy to announce the launch of a new service designed to provide detailed data management planning help online. As an increasing number of granting agencies require a data management plan as part of the grant application process, the DMPTool provides “an open source, web application that assists researchers in producing data management plans and delivering them to funders.” For Duke researchers, the tool provides constantly updated advice about how to complete a data management plan while simultaneously highlighting Duke resources available from a variety of data support providers for the planning, maintenance, and sharing of research data.

We hope that the DMPTool will streamline the grant writing process and help researchers make the appropriate connections to resources available both at Duke and beyond for data management planning.  We welcome your comments and suggestions on this resource.

DMPTool

Data and GIS Back to School – Fall 2012

Visualize your data, analyze your results, map your statistics, and find the data you need!  Come visit us in Perkins 226 (second floor Perkins) for a consultation or contact us online (email: askdata@duke.edu or twitter: duke_data OR duke_vis).  We look forward to working with you on your next data driven project.

New Data Lab Opens- August 2012

http://library.duke.edu/data/about/lab.html

With 12 workstations with dual 24″ monitors and 16 gigs of memory, the new Data and GIS lab is ready to take on the most challenging statistical, mapping, and visualization research projects.  The new lab also features a flatbed scanner for projects moving from print to digital data.  Lab hours are the same hours as Perkins Library (almost 24/7).

Visualize This!  New Data Visualization Program

Perkins Library is proud to introduce Angela Zoss, our new Data Visualization Coordinator. Schedule a consultation, attend a workshop, or learn more about research in Data Visualization at Viz Forum this fall.

New workshops for Fall 2012

http://library.duke.edu/data/news/index.html

Learn about data management planning. Apply text mining strategies to understand your documents.  Visualize your data with Tableau Public, or map your results using ArcGIS or Google Earth Pro.  A new series of workshops connects traditional statistical, geospatial, and visualization tools with web based options.  Register online for our courses or schedule a session for your course by emailing askdata@duke.edu

Bloomberg Professional News and Financial Data

http://blogs.library.duke.edu/data/2011/08/29/bloomberg-has-arrived/

If you missed last fall’s Bloomberg service – Duke Libraries is pleased to announce the installation of three Bloomberg financial terminals in the Data and GIS Lab in 226 Perkins. The terminals provide the latest news and financial data and include an application that makes it easy to export data to Excel. Access is restricted to current Duke affiliates. Training on Bloomberg is currently being planned for the last week of September. Please email askdata@duke.edu to reserve a space at the training session.

Get help with Data Management Planning

http://library.duke.edu/data/guides/data-management/index.html

Data and GIS has launched a new guide that provides guidance for researchers looking for advice on data management plans now required by several granting agencies.  The guide provides examples of sample plans, key concepts involved in writing a plan, and contact information for groups on campus providing data management advice.  In addition, we offer individual consultations with researchers on data management planning.

New Collections for Fall 2012

http://library.duke.edu/data/collections/new.html

Contact Us! – askdata@duke.edu

 

Where There’s Smoke …

A team of Duke undergraduates participating in the Global Health Capstone course was awarded the “Outstanding Capstone Research Project” for their examination of state and congressional district characteristics that might influence the outcome of legislative efforts to raise cigarette excise taxes in North Carolina, South Carolina, and Mississippi.  Sarah Chapin and Gregory Morrison used GIS mapping tools in the Library’s Data & GIS Services Department to illuminate the relationships between county demographics and state legislators’ votes for or against cigarette tax hikes. Brian Clement, Alexa Monroy, and Katherine Roemer were other members of the research group.  Congratulations!

Regional Focus
The recent cigarette excise tax increases in Mississippi (2009), North Carolina (2009), and South Carolina (2010) served as case studies from which to draw components of successful strategies to develop a regional legislative toolkit for those wishing to increase cigarette excise taxes in the Southeast. In all of these states, the tax increase was controversial. The Southeast in general is tax averse, which presents a systemic challenge to those who advocate raising taxes on cigarettes.

Senate Votes & Poverty by County

The researchers examined state characteristics which might influence the outcome of efforts to raise excise taxes, such as coalitions for and against proposed increases, the facts each side brought to bear and the nature of the discourse mobilized by different groups, the economic impact in each state of both smoking and the proposed excise taxes, and local political realities. The students restricted the area of interest to the Southeast because this region has a shared history and, consequently, similar challenges when it comes to race, poverty, and rural populations. They are also, broadly speaking, politically similar and have had a similar experience with both tobacco use and government regulation.

This multi-disciplinary analysis provides a reference point for state legislators or interest groups wishing to pass cigarette tax increases.  The deliverable provided a model of past voting trends, suggestions for framing political dimensions of the issue, and strategies to overcome opposition in state legislatures.

Comparing Legislative Districts and County Data
Senate Votes & Party Affiliation

The bulk of the research involved mapping the political landscape surrounding cigarette tax legislation. In doing so, researchers looked at voting records, interest group politics, campaigns, and state ideology. Broadly, the research entailed charting the electoral geography by overlaying state house and senate districts with county-level data. Districts were coded based on voting history, party affiliation, smoking rates, and constituent demographics. State legislature websites were used to find representatives’ voting histories, allowing the researchers to match legislators by county when constructing a GIS dataset. County party affiliations are available through the state board of elections. Finally, county demographics came from the 2010 Census data.
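
For readers who want to try a similar approach, the matching step described above can be sketched as a simple table join in Python. This is only a minimal illustration, not the students’ actual workflow: the county names, field names, and values are all hypothetical, and a real project would also need to handle districts that span several counties before bringing the joined table into GIS software.

```python
# Hypothetical sample records; none of these values come from the study.
votes_by_county = [
    {"county": "Alpha", "senator": "Senator A", "party": "D", "vote": "yes"},
    {"county": "Beta", "senator": "Senator B", "party": "R", "vote": "no"},
    {"county": "Gamma", "senator": "Senator C", "party": "R", "vote": "yes"},
]

demographics_by_county = {
    "Alpha": {"poverty_rate": 0.21, "smoking_rate": 0.24},
    "Beta": {"poverty_rate": 0.12, "smoking_rate": 0.19},
    # "Gamma" is deliberately missing to show how unmatched counties surface.
}


def join_votes_to_counties(votes, demographics):
    """Attach county demographics to each vote record, keyed on county name."""
    joined, unmatched = [], []
    for record in votes:
        county_data = demographics.get(record["county"])
        if county_data is None:
            # Keep track of counties with no demographic row for later review.
            unmatched.append(record["county"])
            continue
        joined.append({**record, **county_data})
    return joined, unmatched


joined_rows, unmatched_counties = join_votes_to_counties(
    votes_by_county, demographics_by_county
)
for row in joined_rows:
    print(row["county"], row["vote"], row["poverty_rate"])
print("Unmatched:", unmatched_counties)
```

Reporting the unmatched counties separately, rather than silently dropping them, makes it easier to catch spelling differences between sources before mapping the results.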

Senate Votes & Percent Black by County

Overcoming Ideology
Besides using GIS mapping to illustrate these relationships, the researchers analyzed lobbying expenditures and campaign contributions to map the involvement of major interest groups, both pro- and anti-tobacco. Additionally, they examined the impact of state ideology on the framing of political dimensions, looking at editorials, opinion pieces, newspapers, and committee markups, as well as interviews (both previous interviews and ones they conducted) with state legislators and interest groups. Overcoming state ideology, both political and social, is a major factor in passing cigarette excise tax legislation, especially in a region with such dominant tobacco influence.

Again, the purpose of the research is not merely to understand the political landscapes surrounding the passage of cigarette tax bills, but to apply these findings to the creation of a legislative toolbox for representatives or interest groups concerned with pushing similar legislation.