Programming can be similar to a crossword. Writing in wrong answers can lead one astray, changing the entire outcome of the puzzle.
Kent State University computer science Professor Jonathan Maletic, Ph.D., and computer science alumnus Michael Decker, Ph.D. '17, may just have solved the puzzle.
As Maletic explains it, tracking changes through lines of code can be an impossible task. Changing a line early in the sequence can spark a chain reaction, destroying the entire system.
“You have millions of lines of source code in a system. These are big, with multiple people working on them,” Maletic said. “Somebody changes something then they submit that change to the system. And it breaks something. The system fails. Okay, so what did they change?”
Decker and Maletic are developing a new software solution, creating a world where the computer science community can effectively track changes and improve group code development.
SrcML, previously developed by Maletic, acts as infrastructure to explore, analyze and manipulate source code.
Maletic and Decker’s new software, srcDiff, is a scalable syntactic-aware differencing tool for source code that describes changes to the code in a way that is easy for a programmer to understand. SrcDiff has been built on srcML’s foundation.
This three-year project is being funded by a $750,000 grant from the National Science Foundation.
This grant differs from many others awarded by the National Science Foundation. Typically, NSF grants fund work that produces research papers. This one supports the construction of infrastructure that future research will depend on. A portion of the funding will be used to hire graduate and undergraduate students to work on the project. Maletic hopes that increasing the number of people who interact with the software will build a broader understanding of its purpose and the many fields in which it can be used.
“We are building a differencing tool that actually understands what syntax is,” Maletic said.
Lines of code are connected: changing a line early in the code can break lines further down in the block. Previous tools were incapable of identifying these connections, but srcDiff understands the context of these lines and how their arrangement affects the entire block of code.
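To make the distinction concrete, here is a minimal sketch of the difference between line-based and syntax-aware comparison. It is not srcDiff's algorithm, and it does not use srcML. It uses Python's standard `difflib` and `ast` modules as stand-ins, and the sample function `total` and the helper names are invented for illustration. The idea is only that a line-based tool reports a change when code is merely reformatted, while a comparison of syntax trees does not.

```python
import ast
import difflib

# Hypothetical sample code, invented for this illustration.
OLD = """\
def total(prices):
    result = 0
    for p in prices:
        result += p
    return result
"""

# Same program, only the layout of the loop differs.
NEW_LAYOUT = """\
def total(prices):
    result = 0
    for p in prices: result += p
    return result
"""

# Same layout as OLD, but the starting value (and so the behavior) differs.
NEW_LOGIC = OLD.replace("result = 0", "result = 1")


def line_changes(old: str, new: str) -> int:
    """Count the lines a plain line-based diff marks as added or removed."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def syntax_changed(old: str, new: str) -> bool:
    """Compare syntax trees instead of text (a stand-in for syntax-aware diffing)."""
    return ast.dump(ast.parse(old)) != ast.dump(ast.parse(new))


if __name__ == "__main__":
    print("layout edit, line-based changes:", line_changes(OLD, NEW_LAYOUT))
    print("layout edit, syntax changed:", syntax_changed(OLD, NEW_LAYOUT))
    print("logic edit, syntax changed:", syntax_changed(OLD, NEW_LOGIC))
```

In this sketch the layout-only edit shows up as changed lines in the text diff but leaves the syntax tree identical, while the one-character logic edit is flagged by the tree comparison. A real syntax-aware tool goes further by describing what changed in terms a programmer would use, which this example does not attempt.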
This tool will help programmers better identify changes in code that are impacting the entire program. SrcDiff understands the relationship between lines of code and how the entire structure functions.
This software concept originated during Decker’s time at the University of Akron, but blossomed when he started his dissertation at Kent State University under Maletic’s teaching.
Time passed by and the duo continued to consider creating this software. Decker finally asked Maletic to dive in.
“We've been working on the idea for 10 years,” Maletic said. “We probably had a rough prototype for the past three or four, but we haven't made any movement on improving it, because it takes time.”
The pair are eager to improve the current infrastructure, developing an easy way to track changes when programming.
“We're really interested in having this work and looking at changes over the entire history of a system and having a bit better, more programmer-centric view of the changes,” Maletic said.
So how useful is this? Maletic believes this tool will serve a broad global audience.
“Open-source is neat because it allows the public to use this software for free,” Maletic said. “They can install it on their system, they can even alter it if they want.”
Open-source tools are usable, modifiable and distributable by all, making them ideal for educational purposes.
"Mainly, we built this for researchers,” Maletic said. “Practitioners are using it internally in their organizations, and students, mainly graduate students, are using it to support their research endeavors.”
“We started building this and then people started saying, ‘Hey, I can use this here?’ and ‘I'm using it!’ Our srcML open-source system gets about 1,000 downloads a year,” Maletic said.
SrcDiff has yet to receive many downloads, but Maletic expects it to be widely used once released. He continues to incorporate this software into his classes, with each of his 22 doctoral students using it. He has also introduced it to his undergraduate students.
“In my freshman-level class, they have a project that actually uses some of this. It kind of makes for a neat project because they're using some external tools,” Maletic said. “It also fills the needs of the actual kind of learning objectives of the class. In fact, many people across Europe and the U.S. have used it as part of their class for doing programming analysis. There are a lot of people who have used it internationally in their undergrad and grad classes.”
In addition to using srcDiff in college-level courses, the computer science community is all in on improving it, according to Maletic.
“We’re building infrastructure that supports research, and getting feedback from the community is essential as we continue developing our tool,” Maletic said. “In addition to creating awareness of our tool, we’re helping researchers understand how they can use srcDiff as part of their research.”
Decker and Maletic are eager to see the impact of their differencing tool.
“I'm excited to take this tool from the prototype that it is and make it into a full-fledged tool that others besides myself can use,” Decker said. “I want to see and grow it into something that the software community finds value in. And that would make me feel good because I've contributed to that.”
Photo Caption: Jonathan Maletic smiling for a group photo with his former Ph.D. students and their Ph.D. students