As we begin our summer in Data and GIS Services, we spend this post reflecting on some of the services, software, and tools that made data work this spring more productive and more visible. We proudly present our top 10 list for the Spring 2014 semester:
10. DMPTool
While we enjoy working directly with researchers crafting data management plans, we realize that some data management needs arise outside of consultation hours. Fortunately, the Data Management Planning Tool (DMPTool) is there 24/7 to provide targeted guidance on data management plans for a range of granting agencies.
9. Fusion Tables
A database in the cloud that allows you to query and visualize your data, Fusion Tables has proven a powerful tool for researchers who need database functionality but don’t have time to set up a full-featured database. We’ve worked with many groups to map their data in the cloud; see the Digital Projects blog for an example. We offer a regular Fusion Tables workshop in Data and GIS.
8. Open Refine
You could learn the UNIX command line and a scripting language to clean your data, but Open Refine opens data cleaning to a wider audience that is more concerned with simplicity than syntax. We also offer a regular Open Refine workshop in Data and GIS.
7. R and RStudio
A programming language that excels at statistics and data visualization, R offers a powerful, open-source solution for analyzing and charting complex data. RStudio provides a clean, full-featured development environment for R that greatly enhances the analysis process.
6. Tableau Public
Need a quick, interactive data visualization that you can share with a wide audience? Tableau Public excels at producing dynamic data visualizations from a range of different datasets and provides intuitive controls for letting your audience explore the data.
5. ArcGIS Online
ArcGIS has long been a core piece of software for researchers working with digital maps. ArcGIS Online extends the rich mapping features of ArcGIS into the cloud, allowing a wider audience to share and build mapping projects.
4. Pandas
A library that adds data analysis and modeling tools to the Python scripting language, Pandas brings the ease and power of Python to a range of data management and analysis challenges.
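For a taste of what that looks like in practice, here is a minimal sketch of a clean-then-summarize workflow in Pandas. The table, its column names (`department`, `dataset_gb`), and its values are invented purely for illustration; your own data would normally come from a file rather than being typed in.

```python
import pandas as pd

# Hypothetical data, made up for this example only.
records = pd.DataFrame(
    {
        "department": ["History", "Biology", "History", "Biology", "Economics"],
        "dataset_gb": [1.5, 20.0, 2.5, None, 8.0],
    }
)

# Data management step: drop rows where the dataset size is missing.
cleaned = records.dropna(subset=["dataset_gb"])

# Analysis step: average dataset size per department.
summary = cleaned.groupby("department")["dataset_gb"].mean()

print(summary)
```

Two short lines cover both the cleanup and the summary, which is a large part of the library's appeal for everyday data wrangling.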
3. Raw
Paste in your spreadsheet data, choose a layout, drag and drop your variables… and your visualization is ready. Raw makes it easy to go from data to visualization using an intuitive, minimal interface.
2. Stata 13
Another core piece of software in the Data and GIS Lab (and at Duke), Stata 13 brought new features and flexibility (automatic memory management — “hello big data”) that were greatly appreciated by Duke researchers.
1. R Markdown
While many librarians tell people to “document your work,” R Markdown makes it easy to document your research data, explain results, and embed your data visualizations using a minimal markup language that works in any text editor and ties nicely into the R programming language. For pulling it all together, R Markdown is number one in our top ten list!
We hope you’ve enjoyed the list! If you are interested in these or other data tools and techniques, please contact us at firstname.lastname@example.org!