Purdue soil scientist digs into data with the help of ITaP Research Computing

A Purdue professor who studies how the physical and chemical properties of soils change across landscapes is using a new ITaP Research Computing data analysis tool to speed up computations for his research.

Jason Ackerson, an assistant professor of agronomy, uses techniques such as spectroscopy – a measurement of how the soil reflects light – to study the properties of different soils. He generates soil maps that can be fed into simulations such as climate models or hydrologic models. To clean up his data and estimate the accuracy of his measurements and of the resulting soil maps, he needs to run computationally intensive models as well.

That’s where ITaP’s Data Workbench tool comes in. Data Workbench is an interactive computing environment that provides access to web-based data analysis tools such as JupyterHub and RStudio Server.

Unlike many of the researchers using Purdue’s Community Cluster Program supercomputers, Ackerson doesn’t technically need more computational power than he already has on his personal computer. But instead of leaving his machine running over the weekend and crossing his fingers that his computations will have finished by Monday morning, he can now complete them in a few hours using Data Workbench.

“It’s taken something that was a real chore, and it’s now a trivial task,” Ackerson says. “It’s made the science a lot faster.” 

That decreased time to science is something that comes in handy not just during his initial data-crunching phase, but also during the paper submission and review process. When the reviewers of one of Ackerson’s papers wanted him to re-specify the models he used, which required re-running all the computations, he relied on Data Workbench to do it in a timely manner. 

Data Workbench is user-friendly, even for Ackerson’s students who are just learning how to use RStudio. They can work out their code on a smaller data set on their laptops, and then easily move it to Data Workbench to do the computational heavy lifting. Ackerson himself appreciates that he could use the same R environment he was already familiar with, so “the start-up costs were non-existent. If I’d had to sit down and figure out how to re-do everything in Linux or Python, I never would have done it.”

Ackerson is also grateful for the ability to connect to Data Workbench remotely from anywhere in the world. That’s given him the flexibility to double-check his analysis just before presenting at conferences, for example.

Access to Data Workbench is available for approximately $300 annually for each lab or research group.

To learn more about Data Workbench and other Research Computing resources, contact Preston Smith, director of research services and support for ITaP, at psmith@purdue.edu or 49-49729.

Writer: Adrienne Miller, science and technology writer, Information Technology at Purdue (ITaP), 765-496-8204, mill2027@purdue.edu

Last updated: March 4, 2019