Issue 2 (216), article 2

DOI:

Cybernetics and Computer Engineering, 2024, 2(216)

Smirnov A.O., PhD Student,
https://orcid.org/0009-0002-6509-4135,
e-mail: tonysmn97@gmail.com

International Research and Training Center
for Information Technologies and Systems
of the National Academy of Sciences of Ukraine
and the Ministry of Education and Science of Ukraine.
40, Acad. Glushkov av., 03187, Kyiv, Ukraine

CAMERA POSE ESTIMATION USING A 3D GAUSSIAN SPLATTING RADIANCE FIELD

Introduction. Accurate camera pose estimation is crucial for many applications, ranging from robotics to virtual and augmented reality. The process of determining an agent's pose from a set of observations is called odometry. This work focuses on visual odometry, which uses only camera images as the input data.

The purpose of the paper is to demonstrate an approach for small-scale camera pose estimation using 3D Gaussians as the environment representation. 

Methods. Given the rise of neural volumetric representations for environment reconstruction, this work relies on the Gaussian Splatting algorithm to obtain a high-fidelity volumetric representation of the scene.
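
To make this representation concrete, the sketch below (Python, with illustrative field names that are not taken from the paper) lists the parameters that define a single primitive in a Gaussian Splatting scene: a mean position, a covariance factored into a per-axis scale and a rotation, an opacity, and spherical-harmonics coefficients for view-dependent color.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Gaussian3D:
        """One primitive of a Gaussian Splatting scene (illustrative field names)."""
        mean: np.ndarray       # (3,)   center position in world space
        scale: np.ndarray      # (3,)   per-axis scale; covariance = R diag(scale)^2 R^T
        rotation: np.ndarray   # (4,)   unit quaternion encoding the orientation R
        opacity: float         #        blending weight used during alpha compositing
        sh_coeffs: np.ndarray  # (K, 3) spherical-harmonics color coefficients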

Results. Given a trained Gaussian Splatting model and a target image that was not seen during training, we estimate the camera pose of that image using differentiable rendering and gradient-based optimization. Gradients with respect to the camera pose are computed directly from an image-space per-pixel loss function via backpropagation.
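
As an illustration only, the following sketch shows what such a gradient-based pose-optimization loop can look like. It assumes a PyTorch-style autograd framework and a hypothetical differentiable render(gaussians, pose) function standing in for the Gaussian Splatting rasterizer; it is not the implementation used in the paper.

    import torch

    def estimate_pose(render, gaussians, target_image, init_pose,
                      n_iters=500, lr=1e-3):
        # Hypothetical setup: `render(gaussians, pose)` is a differentiable
        # rasterizer returning an H x W x 3 image; `init_pose` is a 6-vector
        # (3 rotation, 3 translation) initial guess for the camera pose.
        pose = init_pose.detach().clone().requires_grad_(True)
        optimizer = torch.optim.Adam([pose], lr=lr)
        for _ in range(n_iters):
            optimizer.zero_grad()
            rendered = render(gaussians, pose)  # differentiable rendering
            loss = torch.nn.functional.l1_loss(rendered, target_image)  # per-pixel loss
            loss.backward()                     # gradients w.r.t. the camera pose
            optimizer.step()
        return pose.detach()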

The choice of Gaussian Splatting as the representation is particularly appealing because it allows end-to-end estimation and removes several stages that are common in more classical algorithms. In addition, differentiable rasterization as the image-formation algorithm provides real-time performance, which facilitates its use in real-world applications.

Conclusions. This end-to-end approach greatly simplifies camera pose estimation: it avoids the compounding errors that are common in multi-stage algorithms and provides high-quality pose estimates.

Keywords: radiance fields, scientific computing, odometry, SLAM, pose estimation, Gaussian splatting, differentiable rendering


Received 29.03.2024