GrocerySales10MDataset
Description
This is a large-scale synthetic transactional sales dataset designed to simulate real-world e-commerce or point-of-sale systems. It consists of 10 million transaction records, each corresponding to a single product purchase made by a user. Each record contains a unique transaction identifier, user and product identifiers, the product category, the quantity purchased, the unit price, the transaction timestamp, and the geographic region where the transaction occurred. The timestamps span the full 2023 calendar year, enabling temporal analysis of daily, monthly, and seasonal trends. The dataset covers five product categories (Electronics, Clothing, Books, Toys, and Home) and four regions (Dhaka, Chittagong, Sylhet, and Rajshahi), making it suitable for analyzing customer behavior, sales patterns, regional demand, and time-based purchasing trends. Because of its size, the dataset is particularly useful for evaluating big data processing frameworks, testing the scalability of data analytics pipelines, and benchmarking distributed data processing systems.
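As a rough illustration of the kind of analysis the fields above support, the following Python sketch aggregates revenue (quantity times unit price) by month, region, and category. The column names, the timestamp format, and the four sample rows are assumptions made up for this example; they are not taken from the dataset files, so check the actual header before adapting it.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical header and made-up rows for illustration only;
# the real file's column names and timestamp format may differ.
SAMPLE_CSV = """transaction_id,user_id,product_id,category,quantity,unit_price,timestamp,region
1,101,5001,Electronics,2,150.00,2023-01-15 10:30:00,Dhaka
2,102,5002,Books,1,12.50,2023-01-20 14:05:00,Sylhet
3,101,5003,Electronics,1,99.99,2023-02-03 09:00:00,Dhaka
4,103,5004,Toys,3,8.00,2023-02-11 18:45:00,Rajshahi
"""


def monthly_revenue(rows):
    """Sum quantity * unit_price per (month, region, category).

    Rows are consumed one at a time, so memory use depends on the
    number of distinct groups rather than the number of transactions.
    """
    totals = defaultdict(float)
    for row in rows:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        key = (ts.strftime("%Y-%m"), row["region"], row["category"])
        totals[key] += int(row["quantity"]) * float(row["unit_price"])
    return dict(totals)


totals = monthly_revenue(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for key in sorted(totals):
    print(key, round(totals[key], 2))
```

To run the same aggregation on a real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample. For benchmarking purposes, this single-process version can serve as a baseline against which a distributed framework is compared.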
Institutions
- Chittagong University of Engineering and Technology