DDD++: Exploiting Density map consistency for Deep Depth estimation in indoor environments

Giovanni Pintore, Marco Agus, Alberto Signoroni, Enrico Gobbetti*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

We introduce a novel deep neural network designed for fast and structurally consistent monocular 360° depth estimation in indoor settings. Our model generates a spherical depth map from a single gravity-aligned or gravity-rectified equirectangular image, ensuring the predicted depth aligns with the typical depth distribution and structural features of cluttered indoor spaces, which are generally enclosed by walls, floors, and ceilings. By leveraging the distinctive vertical and horizontal patterns found in man-made indoor environments, we propose a streamlined network architecture that incorporates gravity-aligned feature flattening and specialized vision transformers. Through flattening, these transformers fully exploit the omnidirectional nature of the input without requiring patch segmentation or positional encoding. To further enhance structural consistency, we introduce a novel loss function that assesses density map consistency by projecting points from the predicted depth map onto a horizontal plane and a cylindrical proxy. This lightweight architecture requires fewer tunable parameters and computational resources than competing methods. Our comparative evaluation shows that our approach improves depth estimation accuracy while ensuring greater structural consistency compared to existing methods. For these reasons, it promises to be suitable for incorporation in real-time solutions, as well as a building block in more complex structural analysis and segmentation methods.
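The density map consistency idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything below is an assumption rather than the authors' implementation: the function names (`equirect_depth_to_points`, `horizontal_density`, `cylindrical_density`, `density_l1`) are hypothetical, the z-up axis convention, bin counts, and extents are arbitrary choices, and the L1 comparison between histograms is only one plausible way to measure consistency. A hard histogram is also not differentiable, so a training loss would likely need a soft binning scheme instead.

```python
import numpy as np

def equirect_depth_to_points(depth):
    """Convert an (H, W) equirectangular depth map to (H*W, 3) points.

    Assumes a gravity-aligned image with z pointing up (an assumption).
    """
    h, w = depth.shape
    # Angles at pixel centres: longitude in [-pi, pi), latitude in (-pi/2, pi/2)
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    x = depth * np.cos(lat) * np.cos(lon)
    y = depth * np.cos(lat) * np.sin(lon)
    z = depth * np.sin(lat)
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def horizontal_density(points, extent=5.0, bins=64):
    """Normalized 2D histogram of points projected onto the horizontal plane."""
    rng = [[-extent, extent], [-extent, extent]]
    hist, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=bins, range=rng)
    return hist / max(hist.sum(), 1.0)

def cylindrical_density(points, height=3.0, bins=64):
    """Normalized 2D histogram over (azimuth, height) on a vertical cylinder."""
    azimuth = np.arctan2(points[:, 1], points[:, 0])
    rng = [[-np.pi, np.pi], [-height, height]]
    hist, _, _ = np.histogram2d(azimuth, points[:, 2], bins=bins, range=rng)
    return hist / max(hist.sum(), 1.0)

def density_l1(pred_depth, gt_depth):
    """Sum of L1 differences between predicted and reference density maps."""
    p = equirect_depth_to_points(pred_depth)
    g = equirect_depth_to_points(gt_depth)
    horiz = np.abs(horizontal_density(p) - horizontal_density(g)).sum()
    cyl = np.abs(cylindrical_density(p) - cylindrical_density(g)).sum()
    return float(horiz + cyl)
```

In this sketch, walls show up as high-density curves in the horizontal projection, so a predicted depth map that blurs or bends a wall yields a density map that differs visibly from the reference even when per-pixel depth error is small.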

Original language: English
Article number: 101281
Journal: Graphical Models
Volume: 140
Publication status: Published - 22 Jul 2025

Keywords

  • Depth estimation
  • Indoor environments
  • Spherical images
  • Structural consistency
