Computer vision began with the goal of building machines that can see like humans and provide perception for robots, but the field has grown far broader than that. Applications such as image database search on the World Wide Web, computational photography, biological imaging, vision for graphics, GIS, biometrics, and vision for nanotechnology were unanticipated, and new applications keep arising as computer vision technology develops. Areas such as document analysis and medical image analysis have developed rapidly and now have their own conferences. As our computers achieve even a crude understanding of video imagery, computer vision will profoundly change our lives: visual sensors are becoming increasingly ubiquitous and will enable us to transcend current human limitations. Rapid developments in supporting technologies, such as digital cameras and computers, ensure that computer vision systems will become increasingly capable and affordable. Moreover, the field of robotics itself has enormous potential to revolutionize manufacturing, provide service through assistive robots, and perform medical surgery; all of these applications require perceptual input from computer vision systems. In addition, there are many applications in defense, homeland security, and the intelligence community.

We propose a workshop to address these issues and to explore the frontiers of computer vision. The goals of the workshop are to (1) identify the future impact of computer vision on the economic, social, and security needs of the nation; (2) outline the scientific and technological challenges to be addressed; and (3) draft a roadmap for addressing those challenges and realizing the benefits.

More details are given in the current white paper.

The final report from the 1991 NSF workshop on computer vision, "Challenges in Computer Vision Research; Future Directions of Research," is also available: 1991 final report, 1991 appendices.

A forum is now open for discussing 10 proposed key objectives and for proposing others.


Organizers

  • Alan Yuille, UCLA
  • Aude Oliva, MIT

Advisory Board

  • David Forsyth, UIUC
  • William Freeman, MIT
  • Martial Hebert, CMU
  • Anil Jain, MSU
  • Daniel Kersten, UMN
  • Daphne Koller, Stanford
  • Yann LeCun, NYU
  • Jitendra Malik, UC Berkeley
  • Antonio Torralba, MIT
  • Rick Szeliski, Microsoft


Participants

  • Edward Adelson, MIT
  • Narendra Ahuja, UIUC
  • J. K. Aggarwal, U. Texas
  • Paul Bello, ONR
  • Serge Belongie, UCSD
  • Alex Berg, Stony Brook U.
  • Tamara Berg, Stony Brook U.
  • Rama Chellappa, U. Maryland
  • David Cooper, Brown U.
  • Jason Corso, SUNY at Buffalo
  • Gary Cottrell, UCSD
  • John Cozzens, NSF
  • Liyi Dai, ARL/ARO
  • Trevor Darrell, UC Berkeley
  • James DiCarlo, MIT
  • James Donlon, DARPA
  • Fredo Durand, MIT
  • Alexei Efros, CMU
  • Pedro Felzenszwalb, U. Chicago
  • Rob Fergus, NYU
  • Jack Gallant, UC Berkeley
  • Donald Geman, JHU
  • Mike Geertsen, DARPA
  • Polina Golland, MIT
  • Shlomo Gortler, Harvard
  • Kristen Grauman, U. Texas at Austin
  • James Hays, Brown U.
  • Aaron Hertzmann, U. Toronto
  • Derek Hoiem, UIUC
  • Qiang Ji, Rensselaer Polytech. Inst.
  • Behzad Kamgar-Parsi, ONR
  • Benjamin Kimia, Brown U.
  • Svetlana Lazebnik, UNC Chapel Hill
  • Erik Learned-Miller, U Mass. Amherst
  • Fei-Fei Li, Stanford
  • Ce Liu, Microsoft
  • David Lowe, UBC
  • David Martin, Google
  • Scott McCloskey, Honeywell ACS Labs
  • Dimitri Metaxas, Rutgers
  • Nasser Nasrabadi, ARL
  • Predrag Neskovic, Booz Allen Hamilton
  • Andrew Ng, Stanford
  • Tristan Nguyen, AFOSR
  • Pietro Perona, Caltech
  • Tomaso Poggio, MIT
  • Deva Ramanan, UC Irvine
  • Visvanathan Ramesh, Siemens
  • Anand Rangarajan, U. Florida
  • Ruth Rosenholtz, MIT
  • Guillermo Sapiro, UMN
  • Silvio Savarese, U. Michigan
  • Harpreet Sawhney, Sarnoff
  • Cordelia Schmid, INRIA
  • Steve Seitz, U. Washington
  • Eitan Sharon, Videosurf
  • Eero Simoncelli, NYU
  • Wesley Snyder, ARO
  • Erik Sudderth, Brown U.
  • Stefano Soatto, UCLA
  • Rong Yan, Facebook
  • Seth Teller, MIT
  • Josh Tenenbaum, MIT
  • Sinisa Todorovic, Oregon State
  • Ken Whang, NSF
  • Yair Weiss, Hebrew U. of Jerusalem
  • Jie Yang, NSF
  • Ming-Hsuan Yang, UC Merced
  • Kai Yu, NEC
  • Song-Chun Zhu, UCLA
  • Andrew Zisserman, Oxford
Frontiers in Computer Vision Workshop
Funded by
National Science Foundation, CISE, Computer Vision
and U.S. Army Research Office

August 21-24, 2011

Computer Science and Artificial Intelligence Laboratory
Stata Center
Patil/Kiva Room, 32-G449
Massachusetts Institute of Technology
Cambridge, MA 02139

This meeting brings together experts in computer vision and related disciplines from academia and industry. The goal of the workshop is for the community to develop and promote a unified agenda for computer vision research and development across US agencies, universities, and industry (while recognizing that research thrives in a flexible environment). We seek to address questions such as: What are the open computer vision tasks? What technical and scientific barriers must we overcome to solve them? And which strategies (scientific, organizational, funding) are most likely to lead to the greatest progress in addressing these challenges?

The schedule below is also available in PDF.

Click on the blue names to download slides.

SUNDAY 8/21
12:30 - 12:50   Coffee Prologue
12:50 - 1:00    Opening Remarks: Introduction to the Workshop. Yuille / Oliva
1:00 - 3:15     Talks: Relationship of Computer Vision to Studies of Biological Vision. Chair: Rosenholtz. Speakers: Cottrell, DiCarlo, Gallant, Poggio, Simoncelli, Weiss
3:15 - 3:45     Coffee Break
3:45 - 5:30     Talks: Historical Perspective on Computer Vision. Chair: Chellappa. Speakers: Adelson, Ahuja, Lowe, Malik, Seitz, Zhu

MONDAY 8/22
Beginning at 8:30am: Poster set-up
8:15 - 8:45     Breakfast (in poster area)
8:45 - 10:15    Talks: Taxonomy of Computer Vision: Hilbert Problems for Vision. Chair: Malik. Speakers: Adelson, Chellappa, Perona, Soatto, Zisserman
10:15 - 10:45   Coffee Break
10:45 - 12:00   Panel/Talk: Humans and Machines Collaborating on Vision Tasks. Chair: Belongie. Panelists: T. Berg, Geman, Grauman, Perona
12:00 - 1:15    Lunch
1:15 - 2:30     Panel/Talk: Applications of Machine Vision to Science. Chair: Perona. Panelists: Belongie, Fergus, Golland, Poggio
2:30 - 3:45     Talks: Cross-fertilization with Other Disciplines. Chair: Freeman. Speakers: Durand, Hebert, Hertzmann, Metaxas
3:45 - 4:15     Coffee Break
4:15 - 6:00     Panel: Foundations and Core. Chair: Yuille. Panelists: Szeliski, Sawhney, Forsyth, Liu, Soatto, Sapiro, Zisserman
6pm onwards     Group beer at Cambridge Brewing Company

TUESDAY 8/23
8:00 - 8:30     Breakfast (in poster area)
8:30 - 10:30    Talks: Visual Representation. Chair: Zhu. Speakers: A. Berg, Corso, Darrell, Felzenszwalb, Hoiem, Kimia, Learned-Miller, Tenenbaum, Todorovic
                Posters: Corso, Learned-Miller
10:30 - 11:00   Coffee Break
11:00 - 12:30   Talks: Image, Video, and Scene Understanding. Chair: Oliva. Speakers: Efros, Hebert, Lazebnik, Savarese, Schmid, Yuille
                Posters: Hays, Grauman, Lazebnik, Savarese
12:30 - 1:45    Lunch
1:45 - 3:00     Talk/Panel: The Promise and Perils of Benchmark Datasets. Chair: Zisserman (1, 2). Panelists: Efros, Fei-Fei, Forsyth, Torralba
3:00 - 3:30     Coffee Break
3:30 - 5:30     Talks: The Role of Learning in Vision. Chair: Fergus. Speakers: Learned-Miller, LeCun, Ng, Ramanan, Sudderth, Yu, Yuille
                Posters: Q. Ji, M.-H. Yang
5:30 - 6:30     Reception: BCS, Building 46, Atrium
6:30 - 7:30     Panel: Reviewing and Evaluating Computer Vision. Chair: Rangarajan. Panelists: Belongie, Ji

WEDNESDAY 8/24
8:15 - 8:45     Breakfast
8:45 - 10:30    Panel: Relations between Academia and Industry. Chair: Lowe. Panelists: Adelson, Martin, Ramesh, Sharon, Szeliski, Yan
10:30 - 11:00   Coffee Break
11:00 - 12:00   Panel/Talks: Program Officers Session. Chair: Yuille. Panelists: Dai, Donlon, Kamgar-Parsi, Geertsen, Yang, TBD
12:00 - 12:30   Panel: Wrap-Up Session. Chair: Yuille. Panelists: Hebert, Perona, Szeliski
12:00 - 1:00    Lunch (on site)

Poster Presentations

Board Size: 4' tall x 8' wide

Each entry lists the presenter, the authors, the poster title, and the associated session.

1. Jason Corso: J. J. Corso, J. A. Delmerico, P. David, R. Alomari, V. Chaudhary, "Layered Models for Bridging from Low to High Level Vision" (Visual Representation)
2. Deva Ramanan: Y. Yang & D. Ramanan, "Articulated pose estimation using flexible mixtures of parts" (TBD)
3. Kristen Grauman: K. Grauman & D. Parikh, "Relative Attributes" (Image/Scene Understanding)
4. James Hays: J. Hays & G. Patterson, "SUN Attributes: A Large-Scale Database of Scene Attributes" (Image/Scene Understanding)
5. Eitan Sharon: E. Borenstein, A. Brandt, P. Srinivasan, S. Tran, M. Tek, A. Moshe, E. Sharon, "VideoSurf's video recognition technology in the connected devices race" (TBD)
6. Qiang Ji: Q. Ji, "Knowledge augmented visual learning" (The Role of Learning in Vision)
7. Ming-Hsuan Yang: Q. Wang & M.-H. Yang, "Learning to track objects" (The Role of Learning in Vision)
8. Aaron Hertzmann: M. de Lasa & I. Mordatch, "Full-Body Locomotion Control by Low-Dimensional Planning" (TBD)
9. Sinisa Todorovic: S. Todorovic, "Bridging the gap between pixels and compositional activities for video parsing" (TBD)
10. Svetlana Lazebnik: J. Tighe & S. Lazebnik, "Understanding Scenes on Many Levels" (Image/Scene Understanding)
11. Aude Oliva: P. Isola, J. Xiao, D. Parikh, A. Torralba, A. Oliva, "High-level attributes of images: how memorable is an image?" (Image/Scene Understanding)
12. Edward Adelson: K. Johnson & E. Adelson, "GelSight" (Academia and Industry)
13. Silvio Savarese: S. Y. Bao & S. Savarese, "Semantic Structure from Motion" (Image/Scene Understanding)
14. Erik Learned-Miller: E. Learned-Miller, L. Sevilla Lara, M. Narayana, E. Shelhamer, B. Mears, "Distribution Fields: A Unifying Representation for Low Level Vision" (Visual Representation)
15. Barbara Hidalgo-Sotelo: B. Hidalgo-Sotelo, T. Judd, K. Ehinger, F. Durand, A. Torralba, A. Oliva, "What Are You Looking At? Predictability and Patterns in Human Eye Movements" (Image/Scene Understanding)
16. Alexei A. Efros: T. Malisiewicz, A. Gupta, A. A. Efros, "Ensemble of Exemplar-SVMs for Object Detection and Beyond" (Image/Scene Understanding)
17. Anand Rangarajan: A. Rangarajan, "Distance transforms, wave functions and shape analysis" (TBD)

We gratefully acknowledge the McGovern Institute for Brain Research at MIT for its support.

A map of the MIT campus and Kendall Square area:

Workshop participants are lodging primarily at these hotels:

The Kendall Hotel
350 Main St
Cambridge MA 02139
For Google Maps directions between the Kendall Hotel and the workshop, click here.

Boston Marriott Cambridge Hotel
2 Cambridge Center
50 Broadway
Cambridge MA 02142
For Google Maps directions between the Marriott and the workshop, click here.

Directions to Patil/Kiva Seminar Room, 32-G449

The room is on the 4th floor of the Gates Tower in the Stata Center (Building 32).

If you enter Building 32 from the entrance on Vassar St, proceed straight ahead and elevators are on the right.
If you enter Building 32 from the entrance near the cafeteria, walk down the hallway (away from the cafeteria) toward Vassar street, and elevators are on your left.
Take the elevators to the 4th floor; exit to the left and then turn right at the end of the elevator bank. At the end of the short corridor bear to the left and continue around the R&D Dining Room. Patil/Kiva Seminar Room will be straight ahead.

Please allow extra time for the initial trip to the seminar room: the Stata Center is a unique building and can be challenging to navigate even for the most adept individuals. Look out for fliers for Frontiers in Computer Vision.

Street address of the building:
32 Vassar St
Cambridge MA 02139

For navigating around MIT campus in general, Whereis.mit.edu is a handy resource.

Directions to MIT from Logan Airport:

by subway - From any terminal at Logan Airport, take the Silver Line bus to South Station. At South Station, change to the Red Line subway (inbound toward Alewife) and get off at Kendall/MIT. Under normal conditions the ride takes about half an hour, and the fare is $1.70-$2.00. Purchase a CharlieCard from the kiosks outside the airport terminal before boarding the bus.

by taxi - Taxi fare from the airport is about $35–$40. During non-rush hour, the taxi ride will take about 15 minutes. During rush hour, the ride could take 30 minutes or more.

Taxi stands are conveniently located outside the Kendall T station (Main St) and outside the Marriott hotel (Broadway). Additionally, here are a few local taxi companies:

Cambridge Taxi Company 617-686-9690
Cab of Cambridge 617-621-2500
Ambassador Brattle Cab 617-492-1100

Food Hints

Want coffee, quick lunch places, or a dinner idea? Here are some nearby options for coffee, food, and drink.

Internet access

Visitors need to make sure their wireless card is on and enabled, and that the machine is configured for DHCP (obtaining an IP address automatically). If the machine is running firewall software, disable it until the registration process is complete.

Once their equipment is ready, visitors should open a web browser and point it to any web page.

After the visitor selects Visitor Registration, the returned page displays the MITnet Rules of Use, followed by a registration screen requesting the visitor's contact information, the number of days of connectivity, and the event for which they are on campus. Visitors can register for one to five (consecutive) days at a time, up to fourteen days per year. The network connection takes about ten minutes to activate and remains active for the number of days selected.

For any questions, please contact bhs@mit.edu.

To set the stage for the workshop and start the discussion, ten key objectives for computer vision are proposed. To discuss these objectives, sign in. You can also add a new objective in the "Other discussions" section.

To submit a white paper, sign in and add a discussion to the "White papers submissions" section below.