When exercise science student Josh Zushi swabbed dogs’ mouths to collect DNA from the bacteria living in their saliva, the data was so complex that analyzing it nearly destroyed his laptop.
The College of Science’s High-Performance Computing Committee is looking at options to improve access to supercomputing so researchers like Zushi can more easily process large sets of data.
The amount of information in a single DNA strand is enormous. Comparing and sorting many strands is more than an average computer can handle.
“I just had to leave my computer running for 17 hours, plugged in and not touched,” said Zushi, who graduated in April 2020. “It could potentially have fried my computer for good.”
At UVU, researchers wrangle large data sets to study topics from NASA satellite images to the shape of molecules. The need for high-performance computing has increased as new faculty with backgrounds in bioinformatics are hired and students look for research mentors who know these techniques.
Students and computational research
Reagan Dodge, a junior majoring in botany, sees the scientific world moving toward computing. That’s why she chose to work with Geoffrey Zahn, an assistant professor of biology, to study how populations of slime molds and bacteria change with rising temperatures.
“I really wanted to have a mentor that pushed coding and learning computer languages and understanding data analysis through computers,” Dodge said.
Thanks to his computer data analysis skills, Zushi is already an author on one academic paper, and he’s preparing to publish his research on bacteria in dogs.
“Being able to literally do every single step in this process of planning, and implementing, collecting data, analyzing the data, interpreting and publishing – it’s helped me,” he said.
Individual research labs in the College of Science have found their own ways to increase computing power using local systems or partners off campus. The committee was formed in 2019 to find a larger-scale solution for the entire college.
One option is to pay for access to the University of Utah supercomputer, which links together the processing power of multiple elements to do tasks quickly. Currently, UVU researchers have free access to the system, but their jobs must wait in a queue while the computer processes the jobs of paying users.
That wait can be a problem. “If I were to teach a class where I had all of my students needing to do this to log into the U of U server and submit jobs, it may be a day or two before their project gets run,” Zahn said. “So it’s not something we could do in class as it currently stands.”
By paying for space on the supercomputer, UVU researchers would bypass the line. But even connecting to the system can be an infrastructure challenge, as information still has to be uploaded, downloaded and stored.
Another option is to maintain a high-performance computer system at UVU.
Meanwhile, Tony Nwabuba, area IT director for the College of Science, helps faculty find ways to meet their computing needs, whether that’s connecting to the University of Utah or solving the challenges of other ideas they bring to him. The committee tracks which systems faculty use, and that information will guide its decision on how to move forward.
“I think one of the reasons we didn’t do this until now at UVU is that computational research, even though it sounds cheap, is actually really, really expensive,” said Cyrill Slezak, an associate professor of physics. The price covers not only equipment but also staff to manage it and ongoing upgrades.
The investment would have educational benefits. “I’m thinking about my students,” said Zahn. “If they graduate without being able to run genomic computational jobs on a remote server, they’re going to have a hard time finding a job in biology.”
“This is a necessity at the university and they need to put resources behind it,” said Heath Ogden, an associate professor of biology and head of the HPC Committee. “Our dean has said that he’s willing to put some resources behind this, but I would like to see some buy-in from the president and the academic vice president.”
The need for a push toward high-performance computing isn’t limited to science. Fields as different as literature and business also handle large data sets.
Learning supercomputing skills
Isaac Wilson, who graduated in April 2020 with a certificate in GIS, used supercomputing to analyze satellite images of the Great Basin in a class taught by Justin White, an assistant professor of earth science. “It’s very fun to plug in these really long complicated processes or these massive downloads that a normal computer would take months and to see that complete in 12, 18, 24 hours,” he said. “I hope that more students, specifically in scientific fields, will have the opportunity to work with supercomputing and high-performance data processing. Because you can do things you’ve never been able to do before.”
Students can learn about high-performance computing through classes as well as mentored research. Some courses taught in the College of Science are open to students with no computer experience, including courses in R (a programming language) and data analysis. A new class that teaches how to use remote supercomputers (Bioinformatics Data Skills, BIOL 490R-005) will be offered for the first time in spring 2021.