Abstract: Omnidirectional and 360° images are becoming widespread in industry and among consumers, drawing attention to omnidirectional computer vision. Their wide field of view captures a large amount of information about the environment in a single image. However, the distortion of these images requires specific algorithms for their processing and interpretation. Moreover, learning-based computer vision algorithms need a large number of images to be trained correctly. In this paper, we present a tool for generating datasets of omnidirectional images with semantic and depth information. These images are synthesized from a set of captures acquired in a realistic virtual environment for Unreal Engine 4 through an interface plugin. We cover a variety of well-known projection models, such as equirectangular and cylindrical panoramas, different fish-eye lenses, catadioptric systems, and empirical models. Furthermore, our tool includes photorealistic non-central projection systems, such as non-central panoramas and non-central catadioptric systems. As far as we know, this is the first reported tool for generating photorealistic non-central images in the literature. Moreover, since the omnidirectional images are generated virtually, we provide pixel-wise semantic and depth information as well as perfect knowledge of the camera calibration parameters. This allows the creation of ground-truth information with pixel precision for training learning algorithms and testing 3D vision approaches. To validate the proposed tool, we test several computer vision algorithms: line extraction from dioptric and catadioptric central images, 3D layout recovery and SLAM using equirectangular panoramas, and 3D reconstruction from non-central panoramas.

HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View

Ivs-Net: Learning Human View Synthesis from Internet Videos

A Homography-Based Visual Inertial Fusion Method for Robust Sensing of a Micro Aerial Vehicle

A Geometric Approach to Obtain a Bird's Eye View from an Image

Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models

Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and Text Guidance

MvAV-pix2pixHD: Multi-view Aerial View Image Translation

3D-Assisted Image Feature Synthesis for Novel Views of an Object

HawkEye Conv-Driven YOLOv10 with Advanced Feature Pyramid Networks for Small Object Detection in UAV Imagery

SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image

Learning Knowledge-Rich Sequential Model for Planar Homography Estimation in Aerial Video

OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision

Unsupervised Deep Homography: A Fast and Robust Homography Estimation Model

Stereo Unstructured Magnification: Multiple Homography Image for View Synthesis

Sat2Vid - Street-view Panoramic Video Synthesis from a Single Satellite Image

View-Aware Image Object Compositing and Synthesis from Multiple Sources

A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose

Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes

Zero-1-to-3: Zero-shot One Image to 3D Object

PixelSynth: Generating a 3D-Consistent Experience from a Single Image

3D-free meets 3D priors: Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance