This project focuses on 3D reconstruction from in-the-wild images using diffusion models and Neural Radiance Fields (NeRF).

  • Supervisor: Prof. Hengshuang Zhao
  • Member: Zeyu Jiang

Humans can infer the geometry and texture of a 3D object from just a few images, relying on strong prior knowledge built over a lifetime of visual experience. However, enabling machines to perform 3D reconstruction from limited input views remains an open challenge, with many applications across robotics, autonomous vehicles, augmented reality, and more.

The goal is to develop a scalable framework for 3D reconstruction that can leverage either single or multiple input images. Key objectives include:

  • Develop a model that can reconstruct 3D geometry from a single image.
  • Enable multi-view 3D reconstruction where additional images can incrementally improve the reconstruction.
  • Support input images without categorical priors, masks, or camera poses.
  • Quantitatively evaluate reconstruction quality against state-of-the-art methods on closed-world benchmarks.
  • Qualitatively assess the plausibility of completed geometry relative to other state-of-the-art methods.
  • Analyze trade-offs between single- and multi-view reconstruction in terms of accuracy and runtime.
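
For the quantitative evaluation objective above, one standard metric for comparing reconstructed geometry against ground truth on closed-world benchmarks is the Chamfer distance between point clouds. A minimal sketch (assuming NumPy arrays of shape (N, 3); the function name is illustrative, not part of the project code):

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3).

    For each point in one cloud, find the distance to its nearest
    neighbor in the other cloud; average both directions and sum.
    """
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    # Mean nearest-neighbor distance in each direction.
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

This brute-force version is O(N·M) in memory; benchmark-scale evaluation would typically use a KD-tree or GPU implementation instead, but the metric itself is the same.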

Project Documents

Project Plan

Check out our detailed project plan and milestones

Interim Report

Check out our interim report

Final Report

Check out our final report