Imperative learning (IL) is a self-supervised neural-symbolic learning framework for robot autonomy.
A prototype of IL first appeared in the iSLAM paper, and it was later formally defined in this article:

Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy. arXiv preprint arXiv:2406.16087, 2024.
This iSeries collects articles from the SAIR lab, named after the leading character “i” in “imperative learning”. In the iSeries collection, IL has been applied to various tasks, including path planning, feature matching, and multi-robot routing.
The list of iSeries articles:

iMatching: Imperative Correspondence Learning. European Conference on Computer Vision (ECCV), 2024.

iMTSP: Solving Min-Max Multiple Traveling Salesman Problem with Imperative Learning. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024.

iSLAM: Imperative SLAM. IEEE Robotics and Automation Letters (RA-L), 2024.

iA*: Imperative Learning-based A* Search for Pathfinding. arXiv preprint arXiv:2403.15870, 2024.

iPlanner: Imperative Path Planning. Robotics: Science and Systems (RSS), 2023.
This blog briefly explains IL from a high-level perspective; readers can find a more in-depth explanation in the paper.
Readers may also find a slide deck at this link, which provides a more interactive format.
IL is designed to alleviate the challenges of robot learning frameworks such as reinforcement learning and imitation learning.
Why do we need Neural-Symbolic AI?
 To combine the advantages of both neural and symbolic methods.
 To overcome the challenges of existing robot learning frameworks.
What is Neural-Symbolic AI?
 There is still NO consensus on Neural-Symbolic (NeSy) AI.
 We have a narrow and a broader definition, where the difference mainly lies in the scope of “symbols”.
Examples of existing Neural-Symbolic AI?
 Although many methods do not explicitly say so, they can be viewed as Neural-Symbolic AI.
Why do we need Imperative Learning?
 Imperative learning is a self-supervised neural-symbolic learning framework.
 It is designed to overcome four challenges with a single design based on bilevel optimization:
 limited generalization ability, black-box nature, label intensiveness, and suboptimality.
What is Imperative Learning?
 The framework of imperative learning (IL) consists of three primary modules: a neural perceptual network, a symbolic reasoning engine, and a general memory system.
 IL is formulated as a special bilevel optimization (BLO), enabling reciprocal learning and mutual correction among the three modules.
Denote the neural system as \(\boldsymbol z = f({\boldsymbol{\theta}}, \boldsymbol{x})\), where \(\boldsymbol{x}\) represents the sensor measurements, \({\boldsymbol{\theta}}\) represents the perception-related learnable parameters, and \(\boldsymbol z\) represents the neural outputs such as semantic attributes; the reasoning engine as \(g(f, M, {\boldsymbol{\mu}})\) with reasoning-related parameters \({\boldsymbol{\mu}}\); and the memory system as \(M({\boldsymbol{\gamma}}, {\boldsymbol{\nu}})\), where \({\boldsymbol{\gamma}}\) denotes perception-related memory parameters and \({\boldsymbol{\nu}}\) denotes reasoning-related memory parameters. Imperative learning (IL) is then formulated as a special BLO:
\[\begin{align} \min_{ \boldsymbol \psi \doteq [{\boldsymbol{\theta}}^\top,~{\boldsymbol{\gamma}}^\top]^\top} & U\left(f({\boldsymbol{\theta}}, \boldsymbol{x}), g({\boldsymbol{\mu}}^*), M({\boldsymbol{\gamma}}, {\boldsymbol{\nu}}^*)\right), \label{eq:highil} \\ \textrm{s.t.} \quad & \boldsymbol \phi^* \in \arg\min_{ \boldsymbol \phi \doteq [{\boldsymbol{\mu}}^\top,~{\boldsymbol{\nu}}^\top]^\top} L(f({\boldsymbol{\theta}}, \boldsymbol{x}), g({\boldsymbol{\mu}}), M({\boldsymbol{\gamma}}, {\boldsymbol{\nu}})), \label{eq:lowil} \\ &\textrm{s.t.} \quad \xi(M({\boldsymbol{\gamma}}, {\boldsymbol{\nu}}), {\boldsymbol{\mu}}, f({\boldsymbol{\theta}}, \boldsymbol{x})) = \text{ or } \leq 0, \label{eq:ilconstraint} \end{align}\] where \(\xi\) is a general constraint (either equality or inequality); \(U\) and \(L\) are the upper-level (UL) and lower-level (LL) cost functions; \(\boldsymbol \psi \doteq [{\boldsymbol{\theta}}^\top, {\boldsymbol{\gamma}}^\top]^\top\) are the stacked UL variables; and \(\boldsymbol \phi \doteq [{\boldsymbol{\mu}}^\top, {\boldsymbol{\nu}}^\top]^\top\) are the stacked LL variables. Alternatively, \(U\) and \(L\) are also referred to as the neural cost and symbolic cost, respectively.
 The term “imperative” is used to denote the passive nature of the learning process:
 Once optimized, the neural system \(f\) in the UL cost will be driven to align with the LL reasoning engine \(g\),
 e.g., a logical, physical, or geometrical reasoning process with constraint \(\xi\).
 Therefore, IL can learn to generate logically, physically, or geometrically feasible semantic attributes or predicates.
 In some applications, \(\boldsymbol \psi\) and \(\boldsymbol \phi\) are also referred to as neuron-like and symbol-like parameters, respectively.
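To make the bilevel coupling concrete, here is a minimal numerical sketch with scalar variables. The cost functions, the closed-form LL solution, and all constants are made-up stand-ins for illustration, not any actual IL instantiation, and the memory module is omitted.

```python
# Toy instantiation of the IL bilevel structure: theta is the UL
# (neuron-like) parameter, mu the LL (symbol-like) parameter.
# Made-up quadratics keep the LL argmin available in closed form.

def f(theta, x):                  # neural system: z = f(theta, x)
    return theta * x

def U(z, mu_star):                # upper-level (neural) cost
    return (z - 1.0) ** 2 + (mu_star - 1.0) ** 2

x, theta, lr = 2.0, 0.1, 0.05
for _ in range(200):
    z = f(theta, x)
    mu_star = z                   # LL argmin of L = (mu - z)^2, closed form
    # hypergradient of U w.r.t. theta; d(mu_star)/dz = 1 here
    dU_dz = 2 * (z - 1.0) + 2 * (mu_star - 1.0) * 1.0
    theta -= lr * dU_dz * x       # chain rule through z = theta * x
print(round(theta, 3), round(f(theta, x), 3))  # theta -> 0.5, z -> 1.0
```

Because the LL solution `mu_star` depends on `theta`, the UL update pushes the neural output toward values the symbolic cost agrees with, which is the reciprocal-correction effect described above.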
Self-supervised Nature
 Many symbolic reasoning engines, including geometric, physical, and logical reasoning, can be optimized or solved without labels.
 For example, A\(^*\) search, geometrical reasoning such as bundle adjustment (BA), and physical reasoning like model predictive control (MPC) can all be solved label-free.
 The IL framework leverages this property and jointly optimizes the three modules by bilevel optimization, which enforces the three modules to mutually correct each other.
 Consequently, all three modules can learn and evolve in a self-supervised manner by observing the world.
 Although IL is designed for self-supervised learning, it can easily adapt to supervised or weakly supervised learning by involving labels in the UL or LL cost functions, or both.
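As a tiny illustration of this label-free property, the sketch below solves a 1D pose-graph problem (a simple stand-in for geometric reasoning like BA) purely from relative measurements, with no ground-truth labels; the poses and measurement values are made up.

```python
import numpy as np

# Three poses on a line: relative odometry plus one loop closure.
# Each edge (i, j, z) measures x_j - x_i; no absolute labels exist.
edges = [(0, 1, 1.0), (1, 2, 1.1), (0, 2, 2.0)]

# Least squares: minimize sum_(i,j,z) (x_j - x_i - z)^2 with x_0 fixed at 0.
A = np.zeros((len(edges), 2))          # unknowns: x_1, x_2
b = np.zeros(len(edges))
for k, (i, j, z) in enumerate(edges):
    if i > 0: A[k, i - 1] -= 1.0
    if j > 0: A[k, j - 1] += 1.0
    b[k] = z
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 3))                  # consistent poses, recovered label-free
```

The inconsistent measurements (1.0 + 1.1 vs. 2.0) are reconciled by the optimization itself, which is exactly what makes such engines usable as label-free LL costs.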
Overcoming the Other Challenges
 The symbolic module offers better interpretability and generalization ability due to its explainable design.
 Optimality is achieved through bilevel optimization, compared to training the neural and symbolic modules separately.
Optimization Challenge
 The solution to IL mainly involves solving for the UL parameters \({\boldsymbol{\theta}}\) and \({\boldsymbol{\gamma}}\) and the LL parameters \({\boldsymbol{\mu}}\) and \({\boldsymbol{\nu}}\).
 Intuitively, the UL parameters, which are often neuron-like weights, can be updated with the gradients of the UL cost \(U\).
 Since \(U\), \(L\), \(M\), \(g\), and \(f\) are often well defined, the challenge is to compute the derivative of the lower-level (symbol-like) parameters w.r.t. the upper-level (neuron-like) parameters, \(\color{blue}\frac{\partial \boldsymbol \phi^*}{\partial \boldsymbol \psi}\).
 There are generally two ways to compute it, i.e., unrolled differentiation and implicit differentiation; see the paper for more details.
 Since \(\boldsymbol \phi \doteq [{\boldsymbol{\mu}}^\top, {\boldsymbol{\nu}}^\top]^\top\) are LL parameters, the solution depends on the specific LL tasks.
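The unrolled route can be sketched on a toy LL problem whose hypergradient is known exactly: \(\phi^*(\psi) = \arg\min_\phi (\phi - a\psi)^2\), so \(\partial\phi^*/\partial\psi = a\). The constants and the inner solver below are assumptions for illustration only.

```python
# Unrolled differentiation: run K inner gradient steps on L and carry
# the sensitivity d(phi_k)/d(psi) through each step via the chain rule.
a, psi, eta, K = 3.0, 0.7, 0.2, 50
phi, dphi_dpsi = 0.0, 0.0
for _ in range(K):
    grad = 2 * (phi - a * psi)        # dL/dphi at the current iterate
    phi -= eta * grad
    # differentiate the update phi <- phi - eta*2*(phi - a*psi) w.r.t. psi
    dphi_dpsi = dphi_dpsi - eta * 2 * (dphi_dpsi - a)
print(round(dphi_dpsi, 6))            # approaches the true value a = 3.0
```

Implicit differentiation would instead apply the implicit function theorem at the converged \(\phi^*\), avoiding storage of the inner iterates; both options are discussed in the paper.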
Applications and Examples
 The paper provides five distinct examples covering the different cases of LL tasks.
Path Planning
In the case where the LL task has a closed-form solution, we provide examples in both global and local path planning.
Global Path Planning
 A\(^*\) is widely used due to its optimality, but it often suffers from low efficiency due to its large search space.
 Therefore, we can leverage a neural module to predict a confined search space, improving overall efficiency.
 We take A\(^*\) as the symbolic reasoning engine and train the neural module in a self-supervised way based on IL.
 This results in a new framework, which is referred to as iA\(^*\).
 Due to the confined search space and the generalization ability inherited from A\(^*\), iA\(^*\) outperforms both classic and other learning-based methods.
 The following figure shows the qualitative results of path planning algorithms on datasets, including MP, Maze, and Matterport3D.
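The confinement idea can be sketched as follows: a mask restricts which grid cells A\(^*\) may touch. Here the mask is hand-crafted; in iA\(^*\) it would come from the neural module, and the 5x5 grid and unit costs are made up for illustration.

```python
import heapq

def astar(grid, mask, start, goal):
    """A* over a 4-connected grid, confined to cells where mask is 1."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    open_set, seen = [(h(start), 0, start)], {start: 0}
    while open_set:
        _, g, cur = heapq.heappop(open_set)
        if cur == goal:
            return g, len(seen)          # path cost, cells touched
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] or not mask[nxt[0]][nxt[1]]:
                continue                 # obstacle, or outside predicted region
            if g + 1 < seen.get(nxt, float("inf")):
                seen[nxt] = g + 1
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt))
    return None, len(seen)

grid = [[0] * 5 for _ in range(5)]       # empty 5x5 map
full = [[1] * 5 for _ in range(5)]       # no confinement: plain A*
band = [[1 if abs(r - c) <= 1 else 0 for c in range(5)] for r in range(5)]
cost_full, n_full = astar(grid, full, (0, 0), (4, 4))
cost_band, n_band = astar(grid, band, (0, 0), (4, 4))
print(cost_full, cost_band, n_full > n_band)  # same cost, fewer cells touched
```

The confined run finds the same-cost path while touching fewer cells; IL's job is to train the mask predictor so this holds on real maps.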
Local Path Planning
 Endtoend local path planning has recently attracted considerable interest, particularly for its potential to enable efficient inference.
 Reinforcement learningbased methods often suffer from sample inefficiency and difficulties in directly processing depth images.
 Imitation learningbased methods rely heavily on the availability and quality of labeled trajectories.
 To address these problems, we leverage a neural module to predict sparse waypoints, improving overall efficiency.
 The waypoints are then interpolated using a trajectory optimization engine based on a cubic spline.
 We use IL to train this new framework, which is referred to as iPlanner.
 The following figure shows a real-world experiment for local path planning using iPlanner with a legged robot.
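The waypoint-to-trajectory step can be sketched as below, using a Catmull-Rom cubic (a common cubic-spline variant) to densify a few made-up waypoints; iPlanner's actual spline formulation and waypoint predictions differ.

```python
import numpy as np

def catmull_rom(pts, samples_per_seg=10):
    """Interpolate sparse 2D waypoints into a dense, smooth trajectory."""
    pts = np.asarray(pts, dtype=float)
    p = np.vstack([pts[0], pts, pts[-1]])        # pad endpoints
    out = []
    for i in range(1, len(p) - 2):
        p0, p1, p2, p3 = p[i - 1], p[i], p[i + 1], p[i + 2]
        for t in np.linspace(0, 1, samples_per_seg, endpoint=False):
            out.append(
                0.5 * ((2 * p1) + (-p0 + p2) * t
                       + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                       + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3))
    out.append(p[-2])
    return np.array(out)

waypoints = [(0, 0), (1, 2), (3, 3), (5, 1)]     # sparse network outputs
traj = catmull_rom(waypoints)
print(traj.shape)                                # dense trajectory points
```

The spline passes through every waypoint, so gradients of a trajectory cost can flow back to the waypoint predictor, which is what lets IL train the planner end to end.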
Logical Reasoning
 In the case where the LL task needs first-order optimization, we provide an example in inductive logical reasoning.
 Existing works focus only on toy examples, such as Visual Sudoku, and binary vector representations in BlocksWorld.
 They cannot simultaneously perform grounding (of high-dimensional data) and rule induction.
 Based on IL, we use a neural network for concept and relationship prediction, and a neural logic machine (NLM) for rule induction.
 We denote this new framework as iLogic.
 In the following figure, iLogic conducts rule induction with the perceived groundings and the constraining rules shown on the right side, and finally obtains the accurate action prediction shown on the left side.
Optimal Control
 In the case where the LL task needs constrained optimization, we provide an example of UAV attitude control based on IMU measurements.
 Differentiable model predictive control (MPC) combines physics-based modeling with data-driven methods, enabling dynamic models and control policies to be learned in an end-to-end manner.
 However, many prior studies depend on expert demonstrations or labeled data for supervised learning.
 They often suffer from challenging conditions such as unseen environments and external disturbances.
 Based on IL, we use a neural network for IMU denoising and for predicting the hyperparameters of MPC.
 We denote this new framework as iMPC.
 We evaluate the control performance under wind disturbance to validate the robustness of the proposed approach.
Visual Odometry
 In the case where the LL task needs second-order optimization, we provide an example of simultaneous localization and mapping (SLAM).
 Existing SLAM systems have only a single one-way connection between the front-end odometry and the back-end pose graph optimization.
 This leads to suboptimal solutions, since there is no feedback from the back-end to the front-end.
 We propose to optimize the entire SLAM system based on IL, leading to self-supervised reciprocal correction between the front-end and the back-end.
 We refer to this new framework as iSLAM.
 With more training iterations, the front-end odometry keeps improving, as shown in the following figure.
Multi-agent Routing
 In the case where the LL task needs discrete optimization, we provide an example of the multiple traveling salesman problem (MTSP).
 Traditional methods for MTSP need combinatorial optimization, i.e., discrete optimization in a very large space.
 Classic MTSP solvers such as Google’s OR-Tools routing library struggle with large-scale problems (>500 cities).
 We introduce IL and use a neural network to allocate cities to agents, and then use single-agent TSP solvers for the resulting smaller problems.
 To compute the differentiation in the discrete space, we introduce a surrogate network that estimates the gradient based on a control variate.
 We refer to this new framework as iMTSP.
 Due to the generalization ability of IL, iMTSP outperforms both classic solvers and RL-based methods.
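The role of the control variate can be sketched on a one-parameter toy: the score-function (REINFORCE) gradient through a discrete choice keeps its mean but loses most of its variance once a baseline is subtracted. The Bernoulli policy and the cost function here are made-up stand-ins for iMTSP's allocation network and TSP solver.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.3                                   # logit of a Bernoulli policy
p = 1 / (1 + np.exp(-theta))

def cost(a):                                  # made-up downstream cost
    return 3.0 + a                            # note the large constant offset

def grad_sample(baseline):
    """One score-function gradient sample: (cost - b) * d log p / d theta."""
    a = rng.binomial(1, p)
    dlogp = a - p                             # d log Bern(a | sigmoid(theta)) / d theta
    return (cost(a) - baseline) * dlogp

plain = np.array([grad_sample(0.0) for _ in range(20000)])
cv = np.array([grad_sample(3.0 + p) for _ in range(20000)])  # baseline ~ E[cost]
print(round(plain.mean(), 2), round(cv.mean(), 2))  # same gradient on average
print(plain.var() > 10 * cv.var())                  # far lower variance
```

Subtracting a baseline leaves the estimator unbiased while cancelling the constant part of the cost; iMTSP's surrogate network plays the role of this baseline for the allocation decisions.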
Please refer to the iSeries articles for more technical details!