Recent years have witnessed the rapid development of Neuro-Symbolic (NeSy) AI systems, which integrate symbolic reasoning into deep neural networks. However, most existing benchmarks for NeSy AI fail to provide long-horizon reasoning tasks with complex multi-agent interaction. Furthermore, they are usually constrained by fixed and simplistic logical rules over limited entities, making them inadequate for capturing real-world complexities.
To address these crucial gaps, we introduce LogiCity, the first simulator based on customizable first-order logic (FOL) for an urban-like environment with multiple dynamic agents. LogiCity models diverse urban elements using semantic and spatial concepts, such as IsAmbulance(X) and IsClose(X, Y). These concepts are used to define FOL rules that govern the behavior of various agents. Since the concepts and rules are abstractions, they can be universally applied to cities with any agent composition, facilitating the instantiation of diverse scenarios. Moreover, a key benefit of LogiCity is its support for user-configurable abstractions, enabling customizable simulation complexity for logical reasoning.
To explore various aspects of NeSy AI, we introduce two tasks: one featuring long-horizon sequential decision-making, and the other focusing on one-step visual reasoning, varying in difficulty and agent behaviors. Our extensive evaluation using LogiCity reveals the advantage of NeSy frameworks in abstract reasoning. Moreover, we highlight the significant challenges of handling more complex abstractions in long-horizon multi-agent reasoning scenarios or under high-dimensional, imbalanced data. With its flexible design, various features, and newly raised challenges, we believe LogiCity represents a pivotal step for advancing the next generation of NeSy AI.
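To make the idea of user-configurable abstractions concrete, the short Python sketch below declares concepts and rules as plain data and applies the same rule set to two differently composed cities. The variable names, the string-based rule format, and the example agent lists are illustrative assumptions and do not mirror LogiCity's actual configuration interface.

# Illustrative only: one possible way to declare user-configurable abstractions
# as plain data. These structures are hypothetical and do not mirror LogiCity's
# actual configuration interface.

# Semantic (unary) and spatial (binary) concepts, as described above.
concepts = {
    "unary": ["IsAmbulance", "IsOld", "IsPolice", "IsCar", "IsBus",
              "IsPedestrian", "IsAtInter", "IsInInter"],
    "binary": ["IsClose", "NextTo", "LeftOf", "RightOf", "HigherPri",
               "CollidingClose"],
}

# FOL rules are written over concepts only, never over concrete agents,
# so the same rule set applies to any agent composition.
rules = [
    "Stop(X) :- CollidingClose(X, Y)",
    "Stop(X) :- IsAtInter(X), IsInInter(Y)",
]

# The same abstractions instantiate cities of different sizes and make-ups.
city_a = ["ambulance_1", "car_1", "pedestrian_1"]
city_b = ["bus_1", "car_1", "car_2", "police_1", "pedestrian_1", "pedestrian_2"]
for city in (city_a, city_b):
    print(f"{len(city)} agents governed by {len(rules)} shared rules")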
Contributions
- We propose LogiCity, a pioneering abstraction-based dynamic city simulator for NeSy AI that is flexible and scalable across various reasoning tasks.
- We develop Sequential Decision Making and Visual Reasoning tasks in LogiCity to evaluate the compositional generalization capability of various models.
- We conduct exhaustive experiments and demonstrate the advantages of NeSy AI in compositional generalization, while also revealing the new reasoning challenges introduced by LogiCity.
LogiCity is an innovative simulator and benchmark designed for Neuro-Symbolic (NeSy) AI. It models dynamic urban environments with adaptable abstractions and includes an Inductive Abstract Reasoning task, in which a state-of-the-art language model (GPT-4o) performs below human level.
Simulation Examples
- An entity will Stop if it is CollidingClose with another entity, OR if it is AtIntersection with another entity InIntersection, OR if it is AtIntersection with another HigherPriority entity AtIntersection.
- An entity will Slow if it is a Tiro entity with another Pedestrian entity NextTo it.
- An entity will move Fast if it is a HigherPriority Reckless entity with another Car entity AtIntersection.
- An entity will Slow if it is a Police entity with at least two other Young entities NextTo each other, AND one of them is NextTo it (a Python grounding of this rule is sketched after the list).
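As a concrete illustration of how such natural-language rules can be grounded, the Python sketch below evaluates the last example over boolean predicate tables. The dictionary-based predicate representation and the helper function slow_police_rule are assumptions of this sketch, not LogiCity's internal API.

from itertools import permutations

# Illustrative grounding of the last example rule; the dict-based predicate
# tables are an assumption made for this sketch, not LogiCity's internal API.

def slow_police_rule(x, is_police, is_young, next_to, entities):
    """Slow(X) holds if X is Police and there exist two other Young entities
    that are NextTo each other, one of which is also NextTo X."""
    if not is_police[x]:
        return False
    others = [e for e in entities if e != x]
    for y, z in permutations(others, 2):
        if is_young[y] and is_young[z] and next_to[(y, z)] and next_to[(y, x)]:
            return True
    return False

entities = ["police_1", "young_1", "young_2"]
is_police = {"police_1": True, "young_1": False, "young_2": False}
is_young = {"police_1": False, "young_1": True, "young_2": True}
next_to = {(a, b): False for a in entities for b in entities}
next_to[("young_1", "young_2")] = next_to[("young_2", "young_1")] = True
next_to[("young_1", "police_1")] = True

print(slow_police_rule("police_1", is_police, is_young, next_to, entities))  # True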
Simulation Rendering
Leveraging foundation generative models, LogiCity supports a diverse range of rendering styles.
Applications
Two applications are supported: Safe Path Following (SPF) and Visual Action Prediction (VAP).
Safe Path Following (SPF)
Safe Path Following (SPF) requires an algorithm to control an agent in LogiCity, navigating it to its goal while maximizing the trajectory reward. Since the ego agent may encounter complex situations along the way and must plan intelligently to maximize the trajectory return, this task features long-horizon reasoning with multiple dynamic agents.
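The Python sketch below illustrates the episodic structure this task assumes: a policy issues actions, the simulator returns per-step rewards, and the objective is the discounted trajectory return. DummyCityEnv, the action names, and the reward values are placeholders for illustration only, not LogiCity's actual interface.

import random

# A stand-in environment used purely to illustrate the episodic SPF setup;
# DummyCityEnv is NOT LogiCity's real API, just a placeholder with the usual
# reset/step structure found in sequential decision-making benchmarks.
class DummyCityEnv:
    def __init__(self, horizon=20):
        self.horizon, self.t = horizon, 0

    def reset(self):
        self.t = 0
        return {"ego_position": (0, 0)}                 # placeholder observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == "Normal" else -0.1    # placeholder reward
        done = self.t >= self.horizon
        return {"ego_position": (self.t, 0)}, reward, done

def rollout(env, policy, gamma=0.99):
    """Run one episode and return the discounted trajectory return."""
    obs, done, ret, discount = env.reset(), False, 0.0, 1.0
    while not done:
        obs, reward, done = env.step(policy(obs))
        ret += discount * reward
        discount *= gamma
    return ret

policy = lambda obs: random.choice(["Stop", "Slow", "Normal", "Fast"])
print(rollout(DummyCityEnv(), policy))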
Hard mode visualization
The first-order logic (FOL) rules and a training example for the hard mode of the SPF task are shown below.
Stop(X):- Not(IsAmbulance(X)), Not(IsOld(X)), IsAtInter(X), IsInInter(Y).
Stop(X):- Not(IsAmbulance(X)), Not(IsOld(X)), IsAtInter(X), IsAtInter(Y), HigherPri(Y, X).
Stop(X):- Not(IsAmbulance(X)), Not(IsOld(X)), IsInInter(X), IsInInter(Y), IsAmbulance(Y).
Stop(X):- Not(IsAmbulance(X)), Not(IsPolice(X)), IsCar(X), Not(IsInInter(X)), Not(IsAtInter(X)), LeftOf(Y, X), IsClose(Y, X), IsPolice(Y).
Stop(X):- IsBus(X), Not(IsInInter(X)), Not(IsAtInter(X)), RightOf(Y, X), NextTo(Y, X), IsPedestrian(Y).
Stop(X):- IsAmbulance(X), RightOf(Y, X), IsOld(Y).
Stop(X):- Not(IsAmbulance(X)), Not(IsOld(X)), CollidingClose(X, Y).
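To make the clause semantics concrete, the Python sketch below evaluates two of the Stop clauses above (the first, intersection-based clause and the last, CollidingClose clause) together with their Not(IsAmbulance(X)), Not(IsOld(X)) guards. The boolean dictionaries and the stop function are illustrative assumptions, not LogiCity's internal state representation.

# Illustrative evaluation of two of the hard-mode Stop clauses above; the
# per-entity boolean dictionaries are an assumption of this sketch, not
# LogiCity's internal state representation.

def stop(x, entities, is_ambulance, is_old, is_at_inter, is_in_inter, colliding_close):
    if is_ambulance[x] or is_old[x]:
        return False                          # guards Not(IsAmbulance(X)), Not(IsOld(X))
    # First clause: X is at an intersection while some other Y is inside it.
    first_clause = is_at_inter[x] and any(is_in_inter[y] for y in entities if y != x)
    # Last clause: X is CollidingClose with some other entity Y.
    last_clause = any(colliding_close[(x, y)] for y in entities if y != x)
    return first_clause or last_clause

entities = ["car_1", "ped_1"]
is_ambulance = {"car_1": False, "ped_1": False}
is_old = {"car_1": False, "ped_1": False}
is_at_inter = {"car_1": True, "ped_1": False}
is_in_inter = {"car_1": False, "ped_1": True}
colliding_close = {(a, b): False for a in entities for b in entities}

print(stop("car_1", entities, is_ambulance, is_old,
           is_at_inter, is_in_inter, colliding_close))  # True via the first clause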
- An example training episode
- An example testing episode
Visual Action Prediction (VAP)
Visual Action Prediction (VAP) focuses on reasoning over high-dimensional data, requiring models to predict the actions of all agents from an RGB image.
The challenge lies in performing sophisticated abstract reasoning under perceptual noise from high-dimensional inputs.
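For reference, a minimal purely neural baseline for this setting could look like the sketch below: a small CNN encodes the frame and a linear head predicts one action distribution per agent slot. The fixed agent count, the four action classes, and the architecture are assumptions of this sketch, not the benchmark's reference model; in practice, the imbalanced action distribution noted earlier would also call for a class-weighted loss.

import torch
import torch.nn as nn

# Minimal illustrative baseline: encode the RGB frame with a small CNN and
# predict one action distribution per agent slot. The fixed agent count (8),
# the 4 action classes, and the architecture are assumptions of this sketch.
MAX_AGENTS, NUM_ACTIONS = 8, 4

class VAPBaseline(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                 # global image feature
        )
        self.head = nn.Linear(64, MAX_AGENTS * NUM_ACTIONS)

    def forward(self, image):
        feat = self.encoder(image).flatten(1)        # (B, 64)
        logits = self.head(feat)                     # (B, MAX_AGENTS * NUM_ACTIONS)
        return logits.view(-1, MAX_AGENTS, NUM_ACTIONS)

model = VAPBaseline()
frame = torch.randn(2, 3, 128, 128)                  # a batch of 2 RGB frames
print(model(frame).shape)                            # torch.Size([2, 8, 4])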
Publication
- LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation. Advances in Neural Information Processing Systems (NeurIPS), 2024.