Vision-Language Models for Spatial Scenes
Multimodal models that parse social interactions, urban form, and built environments from imagery and street-view data.
Graph-based and geometric embeddings for trajectories, road networks, and neighborhood archetypes.
Integrating geospatial, sensor, and tabular data into LLM workflows for spatial analysis and diagnostic systems.

Liu Liu, Alexandra Schild, Marco Cipriano, Fatimeh Al Ghannam, Freya Tan, Gerard de Melo, Andres Sevtsuk
AAAI 2026

Rohit Sanatani, Richa Gupta, Freya Tan, Randall Davis, Takehiko Nagakura
HCI International 2026

Xinwei Zhuang, Freya Tan
Energy & Buildings, 2026