Rakesh Kumar, a researcher at the Coordinated Science Laboratory (CSL), and University of Minnesota Assistant Professor John Sartori, a former CSL graduate student, have received a $300,000, three-year grant from the National Science Foundation and the Semiconductor Research Corporation to research the use of software canaries in detecting hardware failures.
Kumar, an assistant professor of electrical and computer engineering at Illinois, said the idea of a software canary can be explained by the analogy of canaries in mines. Miners would bring the birds underground to detect toxic gases such as carbon monoxide. Because canaries are more sensitive to these gases than humans are, the birds would show signs of distress first, warning the miners to evacuate before they were harmed.
"The idea of a canary, in the context of processors, is that if you can build something inside a processor that will fail before your program fails, then it’s a good detector," Kumar explained. "It means that maybe you can dial down the speed, or you can dial up the voltage so that your programs can work correctly."
Traditionally, the industry has used hardware canaries to check for problems. However, “when you’re building a hardware warning system to check a different piece of hardware, you know it’s a question of who checks the checker,” Kumar said. Because a hardware canary suffers from the same kinds of issues as the hardware it monitors, such a system has to be very conservative.
A software canary, by contrast, can run on the same processor as the actual program, rather than alongside it as separate hardware, and is therefore less conservative and can potentially save power and enhance performance, Kumar said.
This grant is part of the NSF/SRC Joint Initiative in Failure Resistant Systems.
“It’s a big program,” Kumar said. “They encourage impactful research in failure resistant systems … and essentially they are looking for cross-cutting solutions.” Recently, Oracle Corp. has also taken an interest in this research and is offering its machines for testing the software canaries.
Kumar and Sartori worked on projects together when Sartori was a graduate student in Kumar's group. Kumar said he is glad that this project gives them the opportunity to keep working together.
“I’m also very happy with the fact that (Sartori) has something to support his research as soon as he has started,” he said.
__________________
Contact: Rakesh Kumar, Department of Electrical and Computer Engineering, 217/333-5955.
Kim Gudeman, Coordinated Science Laboratory, 217/493-1618.
Writer: Elise King, CSL Communications
If you have any questions about the College of Engineering, or other story ideas, contact Rick Kubetz, editor, Engineering Communications Office, University of Illinois at Urbana-Champaign, 217/244-7716.