I am a Senior AI Research Scientist Manager at FAIR and an Affiliate Associate Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington.
Prior to joining FAIR, I was the Research Manager of the PRIOR team at the Allen Institute for AI. Before that, I was a Postdoctoral Researcher in the Computer Science Department at Stanford University. I received a Ph.D. in Computer Science from UCLA, advised by Alan Yuille. I obtained my Master's degrees from Simon Fraser University and the Georgia Institute of Technology, and my Bachelor's degree from Sharif University of Technology.
I have had the pleasure of working with the following students, Pre-doctoral Young Investigators (PYIs, also known as residents), and interns.
ADAPT: Actively Discovering and Adapting to Preferences for any Task.
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks.
From an Image to a Scene: Learning to Imagine the World from a Million 360° Videos.
Situated Instruction Following.
Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Robot Manipulation.
Controllable Human-Object Interaction Synthesis.
GOAT: GO to Any Thing.
GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation.
Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots.
Neural Priming for Sample-Efficient Adaptation.
HomeRobot: Open-Vocabulary Mobile Manipulation.
ENTL: Embodied Navigation Trajectory Learner.
Navigating to Objects Specified by Images.
Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second.
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics.
Unified-IO: A Unified Model for Vision, Language, and Multi-modal Tasks.
Neural Radiance Field Codebooks.
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation.
Ask4Help: Learning to Leverage an Expert for Embodied Tasks.
Benchmarking Progress to Infant-Level Physical Reasoning in AI.
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge.
Object Manipulation via Visual Target Localization.
Interactron: Embodied Adaptive Object Detection.
Continuous Scene Representations for Embodied AI.
What do navigation agents learn about their environment?
Simple but Effective: CLIP Embeddings for Embodied AI.
Multi-Modal Answer Validation for Knowledge-Based VQA.
Container: Context Aggregation Network.
RobustNav: Towards Benchmarking Robustness in Embodied Navigation.
Contrasting Contrastive Self-Supervised Representation Learning Pipelines.
Factorizing Perception and Policy for Interactive Instruction Following.
PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World.
Visual Room Rearrangement.
ManipulaTHOR: A Framework for Visual Object Manipulation.
Pushing it out of the Way: Interactive Visual Navigation.
Learning Generalizable Visual Representations via Interactive Gameplay.
Rearrangement: A Challenge for Embodied AI.
Learning About Objects by Learning to Interact with Them.
Visual Commonsense Graphs: Reasoning about the Dynamic Context of a Still Image.
ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects.
RoboTHOR: An Open Simulation-to-Real Embodied AI Platform.
Visual Reaction: Learning To Play Catch With Your Drone.
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks.
Learning to Learn How to Learn: Self-Adaptive Visual Navigation using Meta-Learning.
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge.
Visual Semantic Navigation using Scene Priors.
On Evaluation of Embodied Navigation Agents.
Who Let The Dogs Out? Modeling Dog Behavior From Visual Data.
SeGAN: Segmenting and Generating the Invisible.
AI2-THOR: An Interactive 3D Environment for Visual AI.
Visual Semantic Planning using Deep Successor Representations.
See the Glass Half Full: Reasoning about Liquid Containers, their Volume and Content.
Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning.
"What happens if..." Learning to Predict the Effect of Forces in Images.
ObjectNet3D: A Large Scale Database for 3D Object Recognition.
Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images.
A Task-oriented Approach for Cost-sensitive Recognition.
Human-Machine CRFs for Identifying Bottlenecks in Scene Understanding.
Complexity of Representation and Inference in Compositional Models with Part Sharing.
A Coarse-to-Fine Model for 3D Pose Estimation and Sub-category Recognition.
Monocular Multiview Object Tracking with 3D Aspect Parts.
Beyond PASCAL: A Benchmark for 3D Object Detection in the Wild.
The Role of Context for Object Detection and Semantic Segmentation in the Wild.
Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts.
Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs.
Bottom-up Segmentation for Top-down Detection.
Augmenting Deformable Part Models with Irregular-shaped Object Patches.
A Compositional Approach to Learning Part-based Models of Objects.
Graph-based Planning Using Local Information for Unknown Outdoor Environments.
Place Recognition-based Fixed-lag Smoothing for Environments with Unreliable GPS.
An Integrated Particle Filter and Potential Field Method Applied to Multi-Robot Target Tracking.
An Integrated Particle Filter & Potential Field Method for Cooperative Robot Target Tracking.
An Overview of a Probabilistic Tracker for Multiple Cooperative Tracking Agents.
Coordination of Multiple Agents for Probabilistic Object Tracking.
SharifCESR Small Size RoboCup Team.