Published: 30 May 2024 | Version 1 | DOI: 10.17632/mjzfskw9hh.1
Satyam Tomar


The car dataset is a collection of attributes describing cars, covering technical specifications, features, and performance metrics. It is suited to machine learning, data visualization, and statistical analysis. Key attributes include the car's make and model, manufacturing year, engine size in liters, number of cylinders, and fuel type (e.g., petrol, diesel, electric). It also records horsepower, transmission type (automatic or manual), drivetrain (FWD, RWD, AWD), fuel efficiency in miles per gallon, market price in USD, and total mileage. An example entry might describe a 2020 Toyota Corolla with a 1.8-liter engine and an automatic transmission. Sources of such data typically include government databases, automotive websites, and industry research studies.

The dataset supports several kinds of analysis: predicting car prices from other attributes, identifying factors that influence fuel efficiency, studying market trends and consumer preferences across years and models, and classifying cars into categories such as luxury, economy, or sports. In machine learning, it can be used for regression to predict continuous outcomes such as price or fuel efficiency, for classification to categorize cars, and for clustering to group similar cars for market segmentation. This range of attributes makes the dataset useful for both practical and theoretical work in automotive analysis.
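As a rough illustration of the regression use mentioned above, the following sketch fits a one-variable least-squares line that predicts price from horsepower. Everything specific in it is an assumption: the field names (`horsepower`, `price_usd`, and so on) are hypothetical and may not match the column names in the published files, and the four rows are invented placeholders, not records from the dataset. It uses only the Python standard library.

```python
from dataclasses import dataclass


@dataclass
class Car:
    # Hypothetical field names; the real dataset's column names may differ.
    make: str
    model: str
    year: int
    engine_size_l: float
    horsepower: float
    price_usd: float


# Invented placeholder rows for illustration only (NOT taken from the dataset).
CARS = [
    Car("MakeA", "ModelA", 2018, 1.5, 100.0, 15000.0),
    Car("MakeB", "ModelB", 2019, 2.0, 150.0, 20000.0),
    Car("MakeC", "ModelC", 2020, 2.5, 200.0, 25000.0),
    Car("MakeD", "ModelD", 2021, 3.0, 250.0, 30000.0),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


if __name__ == "__main__":
    slope, intercept = fit_line(
        [car.horsepower for car in CARS],
        [car.price_usd for car in CARS],
    )
    print(f"price_usd ~ {slope:.2f} * horsepower + {intercept:.2f}")
    print(f"predicted price at 180 hp: {predict(slope, intercept, 180.0):.2f}")
```

With real data, the same pattern would extend to several predictors (year, mileage, engine size), at which point a library such as scikit-learn is the more practical choice than hand-written formulas.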