Ruojin Cai

Research

My research focuses on 3D computer vision and world models, with the long-term goal of building spatial intelligence grounded in the real world. I study how to reconstruct and understand 3D scenes under challenging conditions, including sparse observations and ambiguities caused by repeated or symmetric structures. Building on this, I am interested in world models that reason over space, time, and physical plausibility, so that they can infer missing 3D structure and predict how a scene evolves beyond what is directly visible. More broadly, I aim to bridge 3D spatial understanding with the semantic reasoning capabilities of vision-language models, working toward AI systems that can understand and reason about complex real-world scenes.

Publications

Long-Tail Internet Photo Reconstruction

CVPR 2026

Yuan Li, Yuanbo Xiangli, Hadar Averbuch-Elor, Noah Snavely, Ruojin Cai

[Project page] [Paper] [Data]

Emergent Extreme-View Geometry in 3D Foundation Models

CVPR 2026

Yiwen Zhang, Joseph Tung, Ruojin Cai, David Fouhey, Hadar Averbuch-Elor

[Project page] [Paper] [Code] [Data]

ArchSym: Detecting 3D-Grounded Architectural Symmetries in the Wild

CVPR 2026

Hanyu Chen, Ruojin Cai, Steve Marschner, Noah Snavely

[Project page] [Paper] [Code]

Can Generative Video Models Help Pose Estimation?

CVPR 2025 (Highlight)

Ruojin Cai, Jason Y. Zhang, Philipp Henzler, Zhengqi Li, Noah Snavely, Ricardo Martin-Brualla

[Project page] [Paper]

Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features

CVPR 2025 (Highlight)

Yuanbo Xiangli, Ruojin Cai, Hanyu Chen, Jeffrey Byrne, Noah Snavely

[Project page] [Paper] [Code]

Extreme Rotation Estimation in the Wild

CVPR 2025

Hana Bezalel, Dotan Ankri, Ruojin Cai, Hadar Averbuch-Elor

[Project page] [Paper] [Code]

MegaScenes: Scene-Level View Synthesis at Scale

ECCV 2024

Joseph Tung*, Gene Chou*, Ruojin Cai, Guandao Yang, Kai Zhang, Gordon Wetzstein, Bharath Hariharan, Noah Snavely

[Project page] [Paper] [Code] [Data] [Web viewer]

Doppelgangers: Learning to Disambiguate Images of Similar Structures

ICCV 2023 (Oral)

Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, Noah Snavely

[Project page] [Paper] [Code]

Tracking Everything Everywhere All at Once

ICCV 2023 (Best Student Paper Award)

Qianqian Wang, Yen-Yu Chang, Ruojin Cai, Zhengqi Li, Bharath Hariharan, Aleksander Holynski, Noah Snavely

[Project page] [Paper] [Code]

Neural Scene Chronology

CVPR 2023

Haotong Lin, Qianqian Wang, Ruojin Cai, Sida Peng, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

[Project page] [Paper] [Code]

Extreme Rotation Estimation using Dense Correlation Volumes

CVPR 2021

Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

[Project page] [Paper] [Code]

CondenseNet V2: Sparse Feature Reactivation for Deep Networks

CVPR 2021

Le Yang*, Haojun Jiang*, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian

[Paper] [Code]

Learning Gradient Fields for Shape Generation

ECCV 2020 (Spotlight)

Ruojin Cai*, Guandao Yang*, Hadar Averbuch-Elor, Zekun Hao, Serge Belongie, Noah Snavely, Bharath Hariharan

[Project page] [Paper] [Code]

Order-Sensitive Deep Hashing for Multimorbidity Medical Image Retrieval

MICCAI 2018

Zhixiang Chen*, Ruojin Cai*, Jiwen Lu, Jianjiang Feng, Jie Zhou

[Paper]