Updated: Mar 24, 2026 by Zitong Zhan

Bundle Adjustment (BA) is one of the core optimization problems in 3D vision, SLAM, and SfM. Yet modern learning systems are often split across two worlds: neural networks live in PyTorch, while BA is still handled by external C++ solvers. We introduce Bundle Adjustment in the Eager mode (BAE) to close this gap by bringing sparse, second-order BA directly into PyTorch eager mode. BAE shows that a PyTorch-native BA pipeline can be both flexible and fast.

BAE Efficiency Comparison: on the BAL and 1DSfM benchmarks, BAE achieves average GPU speedups of **18.5x**, **22x**, and **23x** over **GTSAM**, **g2o**, and **Ceres**, respectively, and approaches or exceeds **100x** on large-scale problems compared with traditional CPU optimizers.

Introduction

The key idea is simple: users should be able to define BA the same way they define any other PyTorch module. Instead of switching to a separate factor-graph system or leaving Python, they can express residuals directly in the computation graph and let the optimizer infer the sparse structure needed for efficient second-order optimization.

This matters because BA is not just another differentiable layer. Its Jacobian is highly sparse: each 2D reprojection residual depends on only one camera pose and one 3D point. Traditional BA solvers exploit that sparsity aggressively. Standard dense AutoDiff does not.
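To make the sparsity concrete, the sketch below counts the structural nonzeros of a BA Jacobian from the observation list alone. It assumes a 6-DoF pose per camera and 3 coordinates per point (no intrinsics being optimized), and the helper name `jacobian_density` is hypothetical rather than part of PyPose.

```python
def jacobian_density(camera_index, point_index, num_cameras, num_points,
                     pose_dim=6, point_dim=3, residual_dim=2):
    """Return (nonzeros, total_entries, density) of a BA Jacobian.

    Each observation contributes one residual_dim x pose_dim block and one
    residual_dim x point_dim block; every other entry in its rows is zero.
    """
    if len(camera_index) != len(point_index):
        raise ValueError("one camera index and one point index per observation")
    if any(not 0 <= c < num_cameras for c in camera_index):
        raise IndexError("camera index out of range")
    if any(not 0 <= p < num_points for p in point_index):
        raise IndexError("point index out of range")

    num_obs = len(camera_index)
    rows = residual_dim * num_obs
    cols = pose_dim * num_cameras + point_dim * num_points
    nonzeros = num_obs * residual_dim * (pose_dim + point_dim)
    total = rows * cols
    return nonzeros, total, nonzeros / total


# Toy problem: 3 cameras, each observing all 4 points (12 observations).
camera_index = [c for c in range(3) for _ in range(4)]
point_index = [p for _ in range(3) for p in range(4)]
nnz, total, density = jacobian_density(camera_index, point_index, 3, 4)
print(nnz, total, f"{density:.3f}")
```

The number of nonzeros per row stays fixed (9 under these assumptions) while the number of columns grows with every added camera and point, so the density shrinks quickly as the scene gets larger.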

PyTorch-Native Bundle Adjustment

BAE keeps the user-facing workflow close to standard PyTorch. A minimal BA example looks like this:

```python
import torch
import pypose as pp
from torch import nn
from pypose.optim import LM
from pypose.optim.solver import PCG
from pypose.optim.strategy import TrustRegion
from pypose.optim.scheduler import StopOnPlateau
from pypose.autograd.function import psjac

class ReprojErr(nn.Module):
    def __init__(self, poses, points):
        super().__init__()
        # sjac=True requests sparse Jacobian tracing for these parameters.
        self.poses = pp.Parameter(poses, sjac=True)
        self.points = pp.Parameter(points, sjac=True)

    @psjac
    def project(poses, points):
        # Transform points by the poses, then apply the perspective division.
        points = poses.Act(points)
        return -points[..., :2] / points[..., [2]]

    def forward(self, pixels, cidx, pidx):
        # Gather the pose and point that belong to each observation.
        poses = self.poses[cidx]
        points = self.points[pidx]
        return ReprojErr.project(poses, points) - pixels

# Toy problem: one camera observing eight random points.
torch.set_default_device("cuda")
num_points, poses = 8, pp.randn_SE3(1)
points = torch.randn(num_points, 3)
points[:, 2] += 4
camera_index = torch.zeros(num_points, dtype=torch.long)
point_index = torch.arange(num_points)
pixels = torch.randn(num_points, 2)
inputs = (pixels, camera_index, point_index)

model = ReprojErr(poses, points)
solver = PCG(tol=1e-4, maxiter=250)
strategy = TrustRegion(up=2.0, down=0.5**4)
optimizer = LM(model, solver, strategy, sparse=True)
scheduler = StopOnPlateau(optimizer, steps=5, verbose=True)

# Iterate until the scheduler reports that the loss has plateaued.
while scheduler.continual():
    loss = optimizer.step(inputs)
    scheduler.step(loss)
```

The optimization target is the usual one for BA: given camera poses, 3D points, and 2D observations, update the variables so that projected points align with measured pixels. The extra ingredients are minimal:

- sjac=True marks parameters whose Jacobians should be traced as sparse rather than dense.
- @psjac marks functions whose sparse Jacobian assembly should be parallelized.
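Written out, the optimization target of the example can be sketched as follows. The symbols are introduced here for illustration and do not appear in the code: \(T_{c_i}\) is the pose selected by `cidx`, \(p_{j_i}\) the point selected by `pidx`, \(u_i\) the measured pixel, and \(\theta\) the stacked poses and points. The projection \(\pi\) mirrors the `project` function, including its sign convention and the absence of camera intrinsics.

\[r_i(\theta) = \pi\left(T_{c_i} \cdot p_{j_i}\right) - u_i, \qquad \pi\left([x, y, z]^T\right) = -\frac{1}{z} \begin{bmatrix} x \\ y \end{bmatrix}, \qquad \min_{\theta} \sum_i \lVert r_i(\theta) \rVert_2^2.\]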

Once the model is defined, the optimizer still looks like a standard PyPose or PyTorch setup, with a configurable linear solver and trust-region strategy.
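The linear solver in the example is a preconditioned conjugate gradient (PCG) method. As a rough illustration of what such a solver does, here is a minimal Jacobi-preconditioned CG in plain Python. It is a generic textbook sketch on a small dense matrix, not PyPose's `PCG` implementation; only the `tol` and `maxiter` argument names are borrowed from the example, and their exact semantics in PyPose may differ.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(A, b, tol=1e-4, maxiter=250):
    """Solve A x = b for a symmetric positive-definite A.

    Uses the inverse diagonal of A as the (Jacobi) preconditioner and stops
    once the residual norm drops below tol times the norm of b.
    """
    inv_diag = [1.0 / A[i][i] for i in range(len(b))]
    x = [0.0] * len(b)
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for _ in range(maxiter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Small symmetric, diagonally dominant (hence positive-definite) system.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pcg(A, b, tol=1e-10)
max_residual = max(abs(bi - ai) for bi, ai in zip(b, matvec(A, x)))
print(max_residual)
```

The solver only ever touches the matrix through matrix-vector products, which is why it pairs naturally with a sparse system matrix.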

Why Sparse Jacobians Matter

The main obstacle is not whether PyTorch can differentiate BA. It can. The real challenge is whether PyTorch can differentiate BA with the right sparsity structure.

In large-scale BA, dense Jacobians are prohibitively expensive. In the Ladybug scene from the BAL dataset, a dense Jacobian in double precision would require about 5.2 TB of memory, while the corresponding sparse Jacobian needs only about 125 MB. That gap is the difference between an impractical prototype and a usable system.
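The gap can be reproduced with back-of-the-envelope arithmetic. The sketch below counts value storage only and ignores sparse index arrays. The problem sizes are an assumption: they are the counts I recall for the largest Ladybug problem in BAL (1723 cameras, 156502 points, 678718 observations) together with BAL's 9 parameters per camera, none of which is stated in the text above, so they should be checked against the dataset files.

```python
def jacobian_bytes(num_cameras, num_points, num_obs,
                   camera_dim=9, point_dim=3, residual_dim=2, itemsize=8):
    """Bytes of value storage for a dense vs. block-sparse BA Jacobian."""
    rows = residual_dim * num_obs
    cols = camera_dim * num_cameras + point_dim * num_points
    dense = rows * cols * itemsize
    # One camera block and one point block per observation.
    sparse = num_obs * residual_dim * (camera_dim + point_dim) * itemsize
    return dense, sparse


# Assumed sizes for the largest BAL Ladybug problem (not given in the text).
dense, sparse = jacobian_bytes(num_cameras=1723, num_points=156502,
                               num_obs=678718)
print(f"dense:  {dense / 1e12:.2f} TB")
print(f"sparse: {sparse / 2**20:.0f} MiB")
```

Under these assumptions the two results land in the same range as the figures quoted above, a ratio of roughly four orders of magnitude.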

To address this, the framework introduces a sparsity-aware autodiff mechanism that dynamically traces tensor operations and infers the Jacobian block structure directly from the computation graph. This allows PyTorch to compute only the derivatives that actually exist in the BA problem.
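The idea of reading sparsity off an indexing operation can be illustrated without any autodiff machinery. The sketch below is a deliberately simplified stand-in, not BAE's tracer: it assumes the only source of sparsity is a gather such as `self.points[pidx]`, so residual block `i` depends on parameter block `index[i]` and nothing else. The function names are hypothetical.

```python
def gather_block_pattern(index, num_blocks):
    """Block pattern of d(param[index]) / d(param) in compressed-row form.

    Row block i holds exactly one nonzero block, in column block index[i].
    Returns (crow_indices, col_indices).
    """
    for j in index:
        if not 0 <= j < num_blocks:
            raise IndexError(f"block index {j} out of range for {num_blocks} blocks")
    crow_indices = list(range(len(index) + 1))
    col_indices = list(index)
    return crow_indices, col_indices


def block_mask(crow_indices, col_indices, num_blocks):
    """Expand the compressed pattern into a dense 0/1 block mask for inspection."""
    mask = []
    for row in range(len(crow_indices) - 1):
        cols = set(col_indices[crow_indices[row]:crow_indices[row + 1]])
        mask.append([1 if c in cols else 0 for c in range(num_blocks)])
    return mask


# Two cameras, three points, every point seen by both cameras.
cidx = [0, 0, 0, 1, 1, 1]
pidx = [0, 1, 2, 0, 1, 2]
point_crow, point_col = gather_block_pattern(pidx, num_blocks=3)
camera_crow, camera_col = gather_block_pattern(cidx, num_blocks=2)
point_mask = block_mask(point_crow, point_col, num_blocks=3)
camera_mask = block_mask(camera_crow, camera_col, num_blocks=2)
for cam_row, pt_row in zip(camera_mask, point_mask):
    print(cam_row, pt_row)
```

Each printed row shows which camera block and which point block a residual touches; everything marked 0 never needs to be computed or stored.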

Sparse Second-Order Optimization in PyTorch

The sparse Jacobian is stored using native PyTorch sparse tensors, especially the sparse_bsr format, which is well suited to block-sparse matrices. This design avoids introducing a custom factor-graph data structure and keeps the programming model close to ordinary tensor code.
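For readers unfamiliar with the format, here is a small sketch of a block-sparse Jacobian built with PyTorch's `torch.sparse_bsr_tensor` constructor. It covers only the point part of the Jacobian with uniform 2x3 blocks and random placeholder values; this layout is an assumption chosen for illustration and is not necessarily how BAE arranges its Jacobians internally. It runs on the CPU.

```python
import torch

num_obs, num_points = 4, 3
point_index = torch.tensor([0, 2, 1, 2])   # point seen by each observation
# Placeholder 2x3 derivative blocks, one per observation.
blocks = torch.randn(num_obs, 2, 3, dtype=torch.float64)

# Row block i contains a single block, located in column block point_index[i].
crow_indices = torch.arange(num_obs + 1)
col_indices = point_index
J = torch.sparse_bsr_tensor(crow_indices, col_indices, blocks,
                            size=(2 * num_obs, 3 * num_points))

print(J.layout)
print(J.values().shape)

# Densify only to inspect this tiny example; this defeats the purpose at scale.
dense = J.to_dense()
print(dense.shape)
```

Only the `num_obs` blocks and two small index tensors are stored, regardless of how many zero blocks the full matrix would contain.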

BA is then solved with sparse Levenberg-Marquardt, which repeatedly forms and solves the normal equations

\[(J^T J + \lambda \, \mathrm{diag}(J^T J)) \Delta \theta = -J^T R.\]

That workflow depends on sparse matrix-matrix products, matrix-vector products, diagonal operations, and sparse linear solvers. Since PyTorch does not fully support all of the required sparse BSR operators, the system implements GPU sparse operators and registers them through the PyTorch dispatcher. As a result, users can still write expressions such as A = J.T @ J rather than relying on a separate, specialized solver API.
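To show the damped normal equations in action, the sketch below runs Levenberg-Marquardt on a tiny dense problem: fitting `y = a * exp(b * x)` with two unknowns. It is a toy stand-in for the sparse BA case. The damping update (`grow` on a rejected step, `shrink` on an accepted one) is a common heuristic assumed here for illustration and is not PyPose's `TrustRegion` strategy.

```python
import math


def residuals_and_jacobian(theta, xs, ys):
    """Residuals R_i = a * exp(b * x_i) - y_i and their 2-column Jacobian."""
    a, b = theta
    r, J = [], []
    for x, y in zip(xs, ys):
        e = math.exp(b * x)
        r.append(a * e - y)
        J.append((e, a * x * e))      # dR_i/da, dR_i/db
    return r, J


def lm_fit(theta, xs, ys, lam=1e-3, grow=10.0, shrink=0.1, max_iters=100):
    r, J = residuals_and_jacobian(theta, xs, ys)
    cost = sum(v * v for v in r)
    for _ in range(max_iters):
        # A = J^T J and g = J^T R for the two unknowns.
        a11 = sum(j[0] * j[0] for j in J)
        a12 = sum(j[0] * j[1] for j in J)
        a22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * v for j, v in zip(J, r))
        g2 = sum(j[1] * v for j, v in zip(J, r))
        # Solve (A + lam * diag(A)) delta = -g with Cramer's rule.
        d11, d22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        da = (-g1 * d22 + a12 * g2) / det
        db = (-d11 * g2 + a12 * g1) / det
        if max(abs(da), abs(db)) < 1e-12:
            break
        trial = (theta[0] + da, theta[1] + db)
        r_new, J_new = residuals_and_jacobian(trial, xs, ys)
        cost_new = sum(v * v for v in r_new)
        if cost_new < cost:           # accept the step and trust the model more
            theta, r, J, cost = trial, r_new, J_new, cost_new
            lam *= shrink
        else:                         # reject the step and damp more strongly
            lam *= grow
    return theta, cost


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]   # noise-free data, a = 2, b = 0.5
theta, cost = lm_fit((1.0, 0.1), xs, ys)
print(theta, cost)
```

In the BA setting the same three ingredients appear at scale: `J.T @ J` becomes a sparse matrix-matrix product, the diagonal damping becomes a sparse diagonal operation, and the 2x2 solve is replaced by a sparse linear solver.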

This combination of sparse-aware AutoDiff and GPU sparse linear algebra gives the framework three practical benefits:

- The flexibility of PyTorch eager execution
- The efficiency expected from traditional BA solvers
- Large-scale GPU acceleration for nonlinear optimization

Beyond Bundle Adjustment

Although the paper focuses on BA, the framework is more general than a single benchmark or a hand-written kernel. The same mechanism can also support other sparse optimization problems such as Pose Graph Optimization (PGO), where indexing patterns and Lie-group operations define similarly structured Jacobian blocks.

In that sense, this work is not only a BA implementation. It is a broader sparse second-order optimization infrastructure for PyTorch, making it easier to combine geometry, learning, and optimization inside one differentiable pipeline.

Resources

Full Example: PyPose BA example
Documentation: PyPose psjac API

Publication

Bundle Adjustment in the Eager Mode.
Zitong Zhan, Huan Xu, Zihang Fang, Xinpeng Wei, Yaoyu Hu, Chen Wang.
IEEE Transactions on Robotics (T-RO), 2026.

A GPU implementation achieving 20x speedup
```
@article{zhan2026bundle,
  title = {Bundle Adjustment in the Eager Mode},
  author = {Zhan, Zitong and Xu, Huan and Fang, Zihang and Wei, Xinpeng and Hu, Yaoyu and Wang, Chen},
  journal = {IEEE Transactions on Robotics (T-RO)},
  year = {2026},
  url = {https://arxiv.org/abs/2409.12190},
  code = {https://github.com/sair-lab/bae},
  website = {https://sairlab.org/bae/},
  cover = {/img/posts/2026-03-24-bae/bae.gif},
  video = {https://youtu.be/ONH7qYGRdFc},
  addendum = {A GPU implementation achieving 20x speedup}
}
```
