Illinois IFP finishes best in the U.S. in global ImageNet Large Scale Visual Recognition Challenge
Amid a landscape of sweeping advances in computer vision, image processing and artificial intelligence, leading researchers in government, academia and industry compete each year for the top honors in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). This year, a group of electrical and computer engineering (ECE) graduate students at the University of Illinois’ Beckman Institute was the top U.S. finisher and third overall in one category of this prestigious global competition.
Held each year since 2010, the challenge added video and scene classification categories this year. Illinois’ Image Formation and Processing team (UIUC-IFP) claimed third in the video category, behind only universities from Hong Kong and South Korea, while another team formed by IFP and Microsoft Research Redmond (UIUCMSR) placed 11th overall in the very large-scale scene classification competition.
On the heels of the results, both teams, UIUC-IFP and UIUCMSR, were invited to take part in a poster presentation on Dec. 17 at the biennial International Conference on Computer Vision in Santiago, Chile. Team member Pooya Khorrami is representing the team at the conference.
Thomas Huang, a professor emeritus in ECE and the Coordinated Science Lab and a research professor at Beckman, is a founding figure in computer vision and image processing and has advised more than 100 PhD students over the years. Professor Huang’s graduate students led the efforts. Honghui Shi coordinated the teams and contributed to both. Pooya Khorrami, Tom Le Paine, and Wei Han led the UIUC-IFP team for the video competition, with help from Prajit Ramachandran and Mohammad Babaeizadeh. Meanwhile, Yingzhen Yang led the UIUCMSR team, with help from Wei Han and Shiyu Chang, as well as former IFP student Jianchao Yang, now a principal researcher at Snapchat, and Nebojsa Jojic, a principal researcher at Microsoft Research.
The ILSVRC, which has been sponsored by industry giants such as Google, Facebook, and Nvidia, gained prominence in 2012, when the field of image recognition and artificial intelligence first began using deep learning neural networks and supercomputing to detect objects more quickly and accurately.
“Using these neural networks has allowed us to train our models which understand the objects and images deeply,” Shi said. “Naturally, when humans look at an object more than once it has a better chance to stick in their minds. The same holds true for neural networks.”
ImageNet’s Object Detection from Video competition (VID), where Illinois placed third, tasked competitors with building models that learn to detect objects from approximately 4,000 videos. This is difficult because videos tend to include a larger variety of poses, motion blur, and occlusion. In the Scene competition, each team was required to classify 381,000 images into 401 scene categories, given 8.1 million training images. For each image, the algorithm produces a list of five scene categories in descending order of confidence. The quality of the predicted list is evaluated based on how well it matches the ground-truth scene category for the image.
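To make the scene-classification scoring concrete, here is a minimal Python sketch of a top-5 error calculation. It assumes that a five-item list counts as correct whenever the ground-truth category appears anywhere in it, which is the usual reading of this kind of evaluation; the article does not spell out the exact rule. The function name `top5_error` and the sample category labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top5_error(predictions: Sequence[Sequence[str]],
               ground_truth: Sequence[str]) -> float:
    """Fraction of images whose true category is missing from the top five guesses.

    Assumption (not stated in the article): a prediction list is counted as
    correct if the ground-truth label appears in any of its first five entries.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("at least one image is required")

    # Count images where the true label is absent from the first five guesses.
    misses = sum(
        1
        for guesses, truth in zip(predictions, ground_truth)
        if truth not in list(guesses)[:5]
    )
    return misses / len(ground_truth)


# Hypothetical example: two images, each with five guesses ordered by confidence.
sample_predictions = [
    ["beach", "coast", "harbor", "lake", "pier"],
    ["kitchen", "pantry", "bar", "cafe", "diner"],
]
sample_truth = ["coast", "office"]

sample_error = top5_error(sample_predictions, sample_truth)
```

In this made-up sample the first image is a hit, because "coast" is among its five guesses, and the second is a miss, so the error is one miss out of two images. A lower value means the predicted lists match the ground truth more often.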
“This is an interesting time for artificial intelligence,” Paine said. “Now we have machine models which can recognize objects in pictures more accurately than humans. This challenge helps drive research throughout the computer vision and artificial intelligence community.”
As the team members continue to work on their algorithms and as the models continue to train, the group is seeing even more accurate results. “We feel great about what we have been able to accomplish with a relatively small team,” Khorrami said. “I think the competition is a good way to showcase our research in deep learning on a global stage.”