ChartDataset2023: Introducing a Synthetic Dataset Featuring Various Chart Types for Chart Identification and Visualization
Description
We introduce a curated synthetic chart dataset intended to support algorithm development in data visualization and chart interpretation. The dataset is designed for training and testing and spans diverse chart types, including Area, Bar, Box, Donut, Line, Pie, and Scatter charts. Data collection uses a fully automatic low-level algorithm for efficient extraction of graphical elements. 3D representations and overlapping elements are excluded, and the images are divided into training and testing subsets according to resolution and chart type. With a total of 25,894 images, the dataset serves both as a training and testing resource for deep models and as a benchmark for evaluating system performance. Its organized structure provides a standardized platform for assessing how well different systems understand and decompose visualizations.
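As a rough illustration of how the training and testing subsets might be inspected after download, the sketch below counts image files per chart type. It assumes a hypothetical layout in which each image sits under a folder named after its chart type (for example `train/Bar/`); the actual folder names, file extensions, and the function name `count_images_by_chart_type` are assumptions, not part of the dataset description.

```python
from collections import Counter
from pathlib import Path

# Chart types named in the description; folder names are an assumption.
CHART_TYPES = ("area", "bar", "box", "donut", "line", "pie", "scatter")
# Assumed image extensions; adjust to match the downloaded files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images_by_chart_type(root):
    """Count image files under root, grouped by the chart type of a parent folder."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Only folder names between root and the file are inspected.
        folders = [part.lower() for part in path.relative_to(root).parts[:-1]]
        label = next((t for t in CHART_TYPES if t in folders), "unknown")
        counts[label] += 1
    return counts
```

If the layout matches these assumptions, summing the returned counts over the whole dataset root gives a figure that can be compared against the 25,894 images stated above, and a large `unknown` count would suggest that the folder naming differs from what this sketch expects.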