Large-Scale Synthetic Dataset for Q-Factor Prediction in Optical Communication Systems

Published: 19 May 2025 | Version 1 | DOI: 10.17632/6fcnwdjxt5.1
Contributors:
Ahmed Al-Dulaimi

Description

This dataset comprises 1,000,000 synthetic samples designed for machine learning-based prediction of the Q-Factor, a key quality metric in optical communication systems. Each sample simulates a fiber-optic transmission scenario using five numerical input features:

- OSNR (Optical Signal-to-Noise Ratio)
- Launch Power
- Fiber Length
- Chromatic Dispersion
- Nonlinear Effects

The target output is the Q-Factor, modeled as a nonlinear combination of the input features, incorporating quadratic, logarithmic, sinusoidal, and cubic terms to reflect realistic physical interactions. Gaussian noise is added to emulate the measurement variability found in real-world systems.

This dataset is ideal for:

- Training and benchmarking regression models (e.g., neural networks, XGBoost)
- Research in QoT (Quality of Transmission) estimation
- Educational use in optical communications and machine learning

The dataset is saved in CSV format and can be used directly in Python (e.g., via pandas), MATLAB, or any data analysis environment.
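
The description names the kinds of terms in the target model (quadratic, logarithmic, sinusoidal, cubic, plus Gaussian noise) but not the formula itself. The sketch below is a toy analogue under stated assumptions, not the dataset's actual generator: the feature ranges, the coefficients, the assignment of each term to a feature, and the names `make_synthetic_q` and `rmse_of_linear_fit` are all hypothetical. It generates a small sample with NumPy and fits a plain linear least-squares baseline to show why a nonlinear regressor is likely to be needed.

```python
import numpy as np


def make_synthetic_q(n_samples=10_000, noise_std=0.1, seed=0):
    """Toy Q-Factor generator. All ranges and coefficients are assumptions."""
    rng = np.random.default_rng(seed)

    # Hypothetical feature ranges (the dataset page does not state them).
    osnr = rng.uniform(10.0, 30.0, n_samples)           # dB
    launch_power = rng.uniform(-5.0, 5.0, n_samples)    # dBm
    fiber_length = rng.uniform(10.0, 100.0, n_samples)  # km
    dispersion = rng.uniform(0.0, 20.0, n_samples)      # arbitrary units
    nonlinear = rng.uniform(0.0, 1.0, n_samples)        # arbitrary units

    # One term of each kind named in the description; the pairing of
    # term and feature is illustrative only.
    q_clean = (
        0.5 * osnr
        - 0.05 * launch_power**2        # quadratic
        - 1.5 * np.log1p(fiber_length)  # logarithmic
        + 0.3 * np.sin(dispersion)      # sinusoidal
        - 2.0 * nonlinear**3            # cubic
    )
    # Additive Gaussian noise emulating measurement variability.
    q = q_clean + rng.normal(0.0, noise_std, n_samples)

    X = np.column_stack(
        [osnr, launch_power, fiber_length, dispersion, nonlinear]
    )
    return X, q


def rmse_of_linear_fit(X_train, y_train, X_test, y_test):
    """Fit ordinary least squares with an intercept; return test RMSE."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    residuals = A_test @ coef - y_test
    return float(np.sqrt(np.mean(residuals**2)))


X, q = make_synthetic_q()
split = int(0.8 * len(X))  # simple 80/20 hold-out split
baseline_rmse = rmse_of_linear_fit(X[:split], q[:split], X[split:], q[split:])
print(f"linear baseline RMSE: {baseline_rmse:.3f}")
```

In this toy setup the linear baseline cannot reach the noise floor (`noise_std`), because it has no way to represent the quadratic and sinusoidal terms. For the published CSV, the same split-and-evaluate pattern would apply after loading the file, for example with `pandas.read_csv`; the file name and column names are not stated on this page and would need to be checked against the download.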

Categories

Electrical Engineering, Optics, Artificial Intelligence, Electronic Engineering, Data Science, Applied Physics, Machine Learning, Photonics

Licence