Road Accidents Dataset Poland 1st Half 2023

Published: 10 November 2023 | Version 1 | DOI: 10.17632/t9h5bzmwc5.1
Igor Betkier


This dataset is curated for machine learning applications in the context of transportation accidents. It encompasses variables that describe the accident itself, the resulting damages and casualties, environmental and road conditions, vehicle specifics, and temporal aspects. Each accident entry records the number and type of casualties: immediate deaths, hospital deaths, severe injuries, and minor injuries. Lighting conditions at the time of the accident are encoded as binary variables indicating daylight, twilight, artificial light, or darkness. The dataset details the types of objects damaged in the accident, such as poles, buildings, barriers, viaducts, and the road surface itself. It also records the traffic consequences of the accident, including shuttle flow, detours, road blocks, lane blocks, and traffic regulated by the police. Location characteristics provide context on the presence of traffic signals and signs, road geometry (straight, curved, descent, ascent), environmental features (hilltop, embankment, crosswalk), infrastructure elements (bridges, tunnels), and road construction. Road surface condition is captured as binary variables indicating whether the surface was dry, wet, covered with puddles, icy, contaminated, or rutted. The vector of accident types covers various collision types, impacts with pedestrians, animals, barriers, and stationary vehicles, and other unspecified incidents. Weather conditions are also binary variables, indicating whether conditions were clear, cloudy, windy, rainy, snowy, or foggy, or whether there was glare, at the time of the accident. The dataset includes the number of vehicles involved and their types, such as two-wheelers, cars, buses, trucks without trailers, tractors, and others. Road characteristics are included as well: highway or expressway status, carriageway type, number of lanes, population density, average traffic, and speed limit.
Time-related variables describe the timing of the accident, response times, delays in reporting and clearing the accident, and daily and weekly temporal markers. Lastly, the dataset includes a target variable, the anticipated time required to clear the accident from the road, which is vital for traffic management and emergency response optimization.
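
As an illustration of the binary encoding described above, the following minimal Python sketch builds one accident record with 0/1 flags for lighting and weather and separates the clearance-time target from the features. All column names, the prefixes, the target name, its unit, and the example values are assumptions made for this sketch; the dataset's actual headers are not listed in this description and may differ.

```python
# Hypothetical category lists taken from the description; the real
# column names in the dataset may be different.
LIGHTING = ["daylight", "twilight", "artificial_light", "darkness"]
WEATHER = ["clear", "cloudy", "glare", "windy", "rainy", "snowy", "foggy"]
TARGET = "clearance_time_min"  # assumed name and unit for the target


def binary_flags(active, categories, prefix):
    """Encode the set of active categories as 0/1 flags, one per category."""
    unknown = set(active) - set(categories)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return {f"{prefix}_{c}": int(c in active) for c in categories}


def make_record(lighting, weather, vehicles, clearance_time_min):
    """Build one flat accident record (illustrative subset of variables)."""
    record = {"vehicles_involved": vehicles}
    record.update(binary_flags(lighting, LIGHTING, "light"))
    record.update(binary_flags(weather, WEATHER, "weather"))
    record[TARGET] = clearance_time_min
    return record


def split_features_target(record):
    """Separate the model inputs from the clearance-time target."""
    features = {k: v for k, v in record.items() if k != TARGET}
    return features, record[TARGET]


# Made-up example: a night-time accident in rain and wind, two vehicles.
example = make_record({"darkness"}, {"rainy", "windy"},
                      vehicles=2, clearance_time_min=45.0)
X, y = split_features_target(example)
```

Separate flags per category, rather than a single categorical code, allow several conditions (for example rain and wind) to be marked for the same accident.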



Wojskowa Akademia Techniczna im Jaroslawa Dabrowskiego


Machine Learning, Road Traffic Safety, Road Transportation, Road Safety