Open Scene Graphs for Open-World Object-Goal Navigation

National University of Singapore
ICRA 2024

*Indicates Equal Contribution
Overview of Explorer using Open Scene Graphs

Our Explorer system searches for a specified object class, given open-set instructions, across diverse embodiments and environments. This is enabled by our Open Scene Graph, which serves as the scene memory for the Explorer system, itself built entirely from foundation models (FMs).


Can we build a system to perform object-goal navigation (ObjectNav) and other complex semantic navigation tasks in the open world? Composing LLMs, which are strong semantic reasoners, with robotics foundation models that generalise across environments and embodiments seems a viable avenue. Selecting the right representations to connect them effectively is crucial, yet existing works often prompt LLMs with scene representations that are uninformative, unstructured, and constructed with methods that generalise poorly. To address this gap, we propose the Open Scene Graph (OSG), a rich, structured topo-semantic representation, along with an OSG mapper module composed fully from foundation models. We demonstrate that OSGs facilitate reasoning with LLMs, enabling a greedy LLM planner to outperform existing LLM approaches by a wide margin on ObjectNav benchmarks in diverse indoor environments. We take a step towards an open-world ObjectNav system by building the fully foundation model-based Explorer system from an LLM planner and a generalisable visuomotor policy, with OSGs built online by our mapper to connect them. We show that Explorer is capable of effective object-goal navigation in the real world across different robots and novel instructions.
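To make the idea of a topo-semantic scene memory concrete, here is a minimal, hypothetical sketch in Python. The class and field names (`OSGNode`, `OpenSceneGraph`, `to_prompt`, and the room/object node kinds) are illustrative assumptions, not the paper's actual data structures: the sketch only shows how a graph of rooms, open-vocabulary object labels, and topological connectivity could be serialised into text for an LLM planner to reason over.

```python
from dataclasses import dataclass, field

@dataclass
class OSGNode:
    """One node of a hypothetical Open Scene Graph (names are illustrative)."""
    node_id: str
    kind: str                 # e.g. "room" or "object"
    label: str                # open-vocabulary label, e.g. from a vision FM
    children: list = field(default_factory=list)  # ids of contained objects

@dataclass
class OpenSceneGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)     # topological room-room links

    def add_node(self, node: OSGNode) -> None:
        self.nodes[node.node_id] = node

    def connect(self, a: str, b: str) -> None:
        """Record traversability between two rooms."""
        self.edges.append((a, b))

    def to_prompt(self) -> str:
        """Serialise the graph into text an LLM planner could be prompted with."""
        lines = []
        for n in self.nodes.values():
            if n.kind == "room":
                objs = ", ".join(self.nodes[c].label for c in n.children)
                lines.append(f"{n.label}: contains {objs or 'nothing observed'}")
        for a, b in self.edges:
            lines.append(f"{self.nodes[a].label} is connected to {self.nodes[b].label}")
        return "\n".join(lines)
```

A greedy planner in this sketch would pass `to_prompt()` output to an LLM and ask which room most plausibly contains the goal object, exploiting the structure rather than a flat list of detections.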