A multi-device and multi-operator dataset from mobile network coverage on Android devices
Description
As Internet access has become pervasive in today's society, the demand for mobile coverage with adequate signal quality has grown, and poor coverage has drawn criticism. At the same time, the deployment of 5G networks, including 5G NSA anchored on 4G LTE, has made the operating environment of mobile networks more complex. To evaluate the behavior of mobile networks in terms of signal quality and other metrics important for mobile telephony, we built a dataset of 33 radio parameters and 736,974 records, generated daily by smartphones and tablets. To create the dataset, we designed an application for the Android operating system in the Kotlin programming language; it collects data in real time and generates a CSV file.
The dataset has 10 samples collected from 9 cities located on the Amazon and Negro Rivers. The complete database covering all regions has 33 columns and 736,974 rows. In addition to the primary dataset, we divided the data into three regions: the metropolitan area of Manaus, the middle Solimões River, and the middle Amazonas River. During the scheduled trips, data were collected along the rivers and roads that provide access to the locations selected for the experiment. The data were processed, indexed, and organized into a comprehensive database, then categorized by location. This organization allows experiments on the entire dataset across all cities or on the data of an individual city.
To access the database and conduct initial experiments, Python scripts were developed alongside it to load the data and generate the histograms and charts needed for an initial investigation. In addition to the graph generation scripts, we also created heat maps based on the collected network variables.
The data is organized in a folder named “network_dataset,” which contains a list of datasets.
Each dataset is named according to the device ID concatenated with the timestamp at which it was collected. The raw dataset was stored on the mobile device and uploaded to the cloud after the preprocessing steps. The collected data contains mobile network variables such as Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), Signal-to-Noise Ratio (SNR), and Channel Quality Indicator (CQI), collected in real time and stored on the mobile device in comma-separated values (CSV) format. After completing the daily collection, the device automatically sends the file to the cloud.
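As a rough illustration of how one of these CSV files might be inspected, the sketch below computes the minimum, mean, and maximum of a few radio metrics. It is not one of the scripts shipped with the dataset. The column names (rsrp, rsrq, snr, cqi) and the function name summarize_metrics are assumptions made for the example, and the sample rows are synthetic; check the header of the real file before reusing it. Only the Python standard library is used.

```python
import csv
import io
from statistics import mean

# Assumed column names for illustration; the real CSV header may differ.
METRIC_COLUMNS = ["rsrp", "rsrq", "snr", "cqi"]


def summarize_metrics(csv_file, columns=METRIC_COLUMNS):
    """Return {column: (min, mean, max, count)} for numeric radio metrics."""
    values = {name: [] for name in columns}
    for row in csv.DictReader(csv_file):
        for name in columns:
            raw = (row.get(name) or "").strip()
            try:
                values[name].append(float(raw))
            except ValueError:
                continue  # skip empty or non-numeric cells
    # Columns with no usable values are left out of the summary.
    return {
        name: (min(v), mean(v), max(v), len(v))
        for name, v in values.items()
        if v
    }


# Tiny synthetic sample (not real measurements) so the sketch runs as-is.
SAMPLE = """rsrp,rsrq,snr,cqi
-95,-10,12,9
-105,-14,,7
-88,-9,18,11
"""

summary = summarize_metrics(io.StringIO(SAMPLE))
for name, (low, avg, high, count) in summary.items():
    print(f"{name}: min={low} mean={avg:.1f} max={high} n={count}")
```

To try it on a downloaded file, the in-memory sample can be replaced by a file object, for example one obtained with open("df_anonymizedManaus.csv", newline=""), provided the column names are adjusted to match the file.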
Files
Steps to reproduce
To use the dataset: 1- download the src and network_dataset folders; 2- open the network_dataset folder and copy the CSV file you want to test to the src folder; 3- open Jupyter Notebook and use the scripts to test the database.
The src folder contains the scripts to load the database, generate various graphs, and generate a heat map of the collected data based on mobile network parameters. For example, you can open load.ipynb in Jupyter Notebook with the df_anonymizedManaus.csv dataset to see the main properties of the dataset. You can also run the maps.ipynb script with the df_anonymizedItacoatiara.csv dataset, which displays a map of the Itacoatiara region, Amazonas, with the variables of interest, such as latitude, longitude, dBm, RSRQ, and CQI. Finally, you can run the heatmap.py script with the df_anonymizedManaus.csv dataset to visualize a heat map based on RSRP, CQI, and RSRQ.
The application is also available for specific collections, if necessary. To use it, open the armadeira_app folder and download the collection application; the user manual is provided with it. The current version of the application collects more data than was processed and made available in this published dataset.
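The idea behind a coverage heat map can be sketched as averaging one metric over small latitude/longitude cells. The example below is not the provided heatmap.py, whose implementation is not described here. The column names (latitude, longitude, rsrp), the function name grid_mean, and the 0.01 degree cell size are assumptions, and the sample rows are synthetic.

```python
import csv
import io
import math
from collections import defaultdict


def grid_mean(csv_file, value_col="rsrp", lat_col="latitude",
              lon_col="longitude", cell_deg=0.01):
    """Average value_col over square lat/lon cells of cell_deg degrees."""
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for row in csv.DictReader(csv_file):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
            value = float(row[value_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric fields
        # Integer cell index: floor of the coordinate divided by the cell size.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[cell][0] += value
        sums[cell][1] += 1
    return {cell: total / n for cell, (total, n) in sums.items()}


# Synthetic rows (not real measurements); the last row has no RSRP value.
SAMPLE = """latitude,longitude,rsrp
-3.105,-60.012,-90
-3.108,-60.018,-100
-3.121,-60.025,-110
-3.122,-60.026,
"""

cells = grid_mean(io.StringIO(SAMPLE))
for cell, avg in sorted(cells.items()):
    print(cell, round(avg, 1))
```

Each key of the result is a pair of integer cell indices, and each value is the mean of the metric in that cell. A plotting library could then color the cells. A smaller cell_deg gives finer spatial detail but fewer samples per cell.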
Institutions
Categories
Funding
Motorola (United States)