Vision-Language Memory for Spatial Reasoning
A vision-language model with memory for long-horizon spatial reasoning from video input alone.
Zuntao Liu
Intern
Zuntao Liu is a master's student at Northeastern University (NEU). He is currently an intern at the SAIR Lab in the Department of Computer Science and Engineering (CSE), University at Buffalo (UB). His research interests include multimodal LLMs, 3D reconstruction, event-based vision, and robotics.