Thermal infrared cameras offer robustness to darkness, smoke, and illumination changes, making them attractive for robotic deployment in visually degraded environments. However, most existing thermal SLAM systems are geometric and produce sparse maps, while recent learning-based odometry and neural mapping methods are primarily developed for RGB imagery. We present TOM-GS, a monocular thermal odometry and mapping system that integrates learning-based dense tracking with depth-guided 3D Gaussian Splatting for dense reconstruction. Monocular depth priors are used to stabilize thermal odometry and supervise Gaussian geometry, improving scale consistency and robustness in low-texture scenes. Experiments on the RRXIO and VIVID benchmarks show that learned thermal odometry outperforms classical geometric pipelines and that thermal sensing provides improved robustness over RGB under degraded illumination. Furthermore, Gaussian Splatting enables high-quality dense thermal reconstruction in challenging environments.
Thermal odometry first enhances the input image, predicts dense depth priors, and optimizes camera motion with depth-aware bundle adjustment. Selected keyframes are then used to supervise 3D Gaussians with proxy depth and enhanced thermal appearance, enabling dense thermal rendering and depth reconstruction.
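As a rough illustration of the mapping stage described above, the sketch below combines an appearance term against the enhanced thermal keyframe with a depth term against the proxy depth. It is a minimal sketch under stated assumptions, not the TOM-GS objective: the L1 form of both terms, the `depth_weight` value, the convention that a proxy depth of zero marks an invalid pixel, and the function name `supervision_loss` are all hypothetical choices made for illustration.

```python
import numpy as np


def supervision_loss(rendered_img, rendered_depth, target_img, proxy_depth,
                     depth_weight=0.5):
    """Hypothetical keyframe loss: L1 appearance term plus L1 depth term.

    rendered_img, target_img: float arrays of shape (H, W), thermal intensity.
    rendered_depth, proxy_depth: float arrays of shape (H, W).
    Assumption: proxy_depth <= 0 marks pixels without a usable depth prior.
    """
    # Appearance term: compare the rendering with the enhanced thermal image.
    photometric = np.abs(rendered_img - target_img).mean()

    # Depth term: only evaluate pixels where the proxy depth is valid.
    valid = proxy_depth > 0
    if valid.any():
        depth = np.abs(rendered_depth[valid] - proxy_depth[valid]).mean()
    else:
        depth = 0.0

    return float(photometric + depth_weight * depth)
```

In a real system the rendered image and depth would come from a differentiable Gaussian rasterizer and the loss would be minimized with automatic differentiation; NumPy is used here only to show how the two supervision signals could be weighted and masked.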
Each sequence shows two aligned comparisons: RGB on top and thermal below.
TOM-GS supervision input vs rerendered output