Annotated Objects for Visual Reasoning Dataset

Name: Annotated Objects for Visual Reasoning Dataset
Creator: Silvan Ferreira
Published: 2025-02-24T15:09:46.775Z
Keywords: Object Detection, Multimodality

Ferreira, Silvan; Martins, Allan; Costa, Daniel G.; Silva, Ivanovitch

doi:10.17632/bn5cbjts6j.1

Published: 24 February 2025 | Version 1 | DOI: 10.17632/bn5cbjts6j.1

Description

The AOVR-Dataset is a synthetic 3D dataset designed to facilitate research in visual reasoning and object detection. The dataset includes various 3D objects placed in different containers, each annotated with bounding boxes and natural language descriptions. The 3D models were created using Blender, and the captions were generated with a Large Language Model (LLM).

Steps to reproduce

The AOVR-Dataset was created using Blender for 3D scene generation and a Large Language Model (LLM) for natural language descriptions. Objects (cylinders, cubes, tori, cones, and spheres) were assigned randomized attributes such as color (e.g., blue, red, yellow), material (metal, rubber), and size (small, big), and were placed in various containers (e.g., shelf, table, crate, box). Bounding boxes were extracted using Blender’s Python API by converting each object's 3D coordinates into 2D image annotations. Captions were generated by an LLM, which received metadata about each scene and produced structured descriptions. The dataset can be reproduced with Blender for rendering, Python for automation, and an LLM for text generation.
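
The attribute randomization and the 3D-to-2D bounding box step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' pipeline: the record layout, the function names (`sample_object`, `project_point`, `bbox_2d`), and the simple pinhole camera model are all hypothetical, and the attribute lists contain only the values named in the description. Inside Blender, the projection would more likely go through the Python API (for example `bpy_extras.object_utils.world_to_camera_view`) than a hand-written camera model.

```python
import random

# Attribute vocabularies copied from the description; the dataset may use more values.
SHAPES = ["cylinder", "cube", "torus", "cone", "sphere"]
COLORS = ["blue", "red", "yellow"]
MATERIALS = ["metal", "rubber"]
SIZES = ["small", "big"]
CONTAINERS = ["shelf", "table", "crate", "box"]


def sample_object(rng):
    """Draw one random object description (hypothetical record layout)."""
    return {
        "shape": rng.choice(SHAPES),
        "color": rng.choice(COLORS),
        "material": rng.choice(MATERIALS),
        "size": rng.choice(SIZES),
        "container": rng.choice(CONTAINERS),
    }


def project_point(point, focal_px, width, height):
    """Pinhole projection of a camera-space point (x right, y up, z forward)
    to pixel coordinates with the origin at the top-left of the image."""
    x, y, z = point
    u = width / 2 + focal_px * x / z
    v = height / 2 - focal_px * y / z
    return u, v


def bbox_2d(corners, focal_px, width, height):
    """Axis-aligned 2D box (xmin, ymin, xmax, ymax) enclosing the projected
    3D corners, clamped to the image bounds."""
    points = [project_point(c, focal_px, width, height) for c in corners]
    us = [p[0] for p in points]
    vs = [p[1] for p in points]
    xmin = max(0.0, min(us))
    ymin = max(0.0, min(vs))
    xmax = min(float(width), max(us))
    ymax = min(float(height), max(vs))
    return (xmin, ymin, xmax, ymax)


# Example: a cube of side 2 centered 4 units in front of the camera.
cube_corners = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (3, 5)]
example_object = sample_object(random.Random(0))
example_box = bbox_2d(cube_corners, focal_px=300, width=640, height=480)
```

Taking the minimum and maximum over all eight projected corners gives a box that encloses the object's 3D bounding volume, which can be slightly looser than a box fitted to the rendered silhouette.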

Institutions

Universidade Federal do Rio Grande do Norte
