Now the program manager for the USDA’s Forest Service Research and Development Research Data Services, David Rugg grew up in the village of Buffalo Grove, a suburb northwest of Chicago. Growing up, he enjoyed catching and raising tadpoles and observing insects. He watched nature shows on public television and read popular books on animal behavior by authors like Jane Goodall and George Schaller. Today, he spends his free time helping research dolphin communication, co-authoring scientific articles on topics unrelated to Forest Service work, reading and listening to music.
What do you do in the Forest Service and when did you start working here?
I started working for Research and Development (R&D) in 1983 as a mathematical statistician. I had about 16 fun years doing that before being asked to build a research data publishing program for what was then the North Central Research Station, which led to an R&D-wide pilot project. Since 2010 I’ve managed R&D’s Research Data Services program, the largest component of which is the Forest Service Research Data Archive. The Archive formally publishes research data. Just as we have research articles from over 100 years ago, with luck our successors will be distributing these data publications in 100 years.
What is your favorite part of your job?
Creating and improving R&D’s scientific data publishing capability and seeing it become more successful than any of us expected. We expect the data from research articles to be less popular than the articles themselves. However, our top data publications are downloaded roughly as often as our top research articles in Treesearch.
How has your education, background, or personal experiences prepared you for the work that you do now?
My Master of Science (MS) degree in Ecology provides a good background for writing and reviewing metadata – a set of data that describes and gives information about other data. My MS in Statistics gives me a good feel for data structures. After all, as a statistician I had to get such information from scientists all the time. Authoring a couple dozen scientific journal articles has also been helpful. I also have some background knowledge in computer science, which has been handy for designing the Archive’s IT infrastructure.
Describe a recent, current, or upcoming project that you’re working on.
We’re doing a major data archaeology project at four long-term research sites managed in our Flagstaff, Arizona, office. The objective is to bring scientific data and documentation that are over 100 years old from the original paper files into modern digital formats. As we progress, the data will support the work of agency scientists at these sites – and will be available to scientists, educators and the public around the world. This is part of a larger project to make all historical and modern long-term data from R&D’s experimental forests, ranges and watersheds digitally accessible.
Describe a professional or personal achievement that you are particularly proud of.
In 2004, I developed software useful for writing data documentation, then published it through the Station in case it would be useful for anyone else in R&D. It became one of the Station’s most popular products and enjoyed worldwide use until the metadata standard it was designed for finally became obsolete.
Why do you think your field is important?
It’s a new and shiny object in science. Plus, when research data publishing is done right, it supports transparency in science and facilitates re-use and re-purposing of scientific data for modeling; analysis using new statistical techniques; and coupling with other research data to create insights and tools that no single study could accomplish. All of these are great for moving science forward, for applying science to management problems, and for extracting maximum return on the taxpayer investment in the original study.
What are some of the greatest challenges confronting your field?
It’s rather new, so figuring out how to do it well is a serious challenge. It’s easy to do this work poorly, but the result is nearly useless content. Of course, scientists have shared data privately for a long time. It’s like being at the dawn of scientific journals, which replaced letters between individual scientists as a primary means of communicating results.
What are some of the most promising strategies used by the Forest Service to address these challenges?
One key strategy is treating data publications as core scientific outputs rather than a box to be checked. We borrow and adapt ideas and technology from the journal community, we use a lot of open-source software, and we keep up with advances in our field through activities like membership in the Research Data Alliance. Of course, two of our key staff members are statisticians, so we have a fundamental belief that data are important.
How would you like the public to perceive the work we do at the Forest Service?
We create some of the best land management science in the world across a surprisingly diverse array of disciplines…and increasingly, the best scientific data as well.