Big Datasets for Intelligent Anomaly Detection, Mitigation, and Forecasting in Feeders to Secure Smart Grid
Description
Big Data (BD) offers new opportunities for scientists to detect both normal and anomalous events in the Smart Grid (SG). However, the diverse and complex nature of Internet of Things (IoT)-enabled Feeder (FED) datasets makes it difficult to analyze and visualize the real impact of those events. As a result, existing Centralized Protection, Control, and Monitoring (PCM) solutions often fail to uncover hidden patterns and cannot reliably distinguish genuine events from anomalous ones in FEDs. Stealthy anomalies can therefore bypass the PCM anomaly detection layers and reduce the overall resilience and security of the SG. Artificial Intelligence (AI) shows great potential for uncovering hidden correlations and patterns in the large datasets generated by FEDs in the SG. To exploit this potential, a Self-learning Hybrid Machine Learning (SHEL) model has been designed to process BD for detecting, mitigating, and forecasting anomalies caused by false data injection in the energy network. The BD from the FEDs first undergoes rigorous preprocessing, including noise reduction, temporal alignment, and statistical feature extraction. The SHEL model then learns the hidden patterns with high precision and exposes stealthy anomalies using both the real-time input BD stream and historical BD, in coordination with the Supervisory Control and Data Acquisition (SCADA) system. This gives the Distribution System Operator (DSO) deeper insight into cybersecurity threats that escape PCM mechanisms and could compromise SG security. Moreover, the regenerated BD can support further research in fault diagnosis and forecasting, enhancing the resilience and security of the SG.
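The three preprocessing stages named in the description (noise reduction, temporal alignment, statistical feature extraction) can be illustrated with a minimal pandas sketch. This is not the SHEL preprocessing code. The function name preprocess_feeder, the 100 ms grid, the rolling-median filter, the window length, and the channel names V and I are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def preprocess_feeder(df: pd.DataFrame, rate: str = "100ms", window: int = 5) -> pd.DataFrame:
    """Illustrative preprocessing for one feeder (hypothetical, not the SHEL pipeline).

    df must have a DatetimeIndex and one numeric column per measured channel.
    """
    # Temporal alignment: put every channel on a common, regular time grid
    # and fill any empty bins by time-weighted interpolation.
    aligned = df.sort_index().resample(rate).mean().interpolate(method="time")

    # Noise reduction: a centred rolling median suppresses isolated spikes
    # (assumed filter; the dataset description does not name one).
    smooth = aligned.rolling(window, center=True, min_periods=1).median()

    # Statistical feature extraction: rolling mean and standard deviation.
    feats = pd.concat(
        {
            "mean": smooth.rolling(window, min_periods=1).mean(),
            "std": smooth.rolling(window, min_periods=1).std().fillna(0.0),
        },
        axis=1,
    )
    # Flatten the (statistic, channel) column index to names like "V_mean".
    feats.columns = [f"{channel}_{stat}" for stat, channel in feats.columns]
    return feats


# Example with made-up measurements sampled every 50 ms.
rng = np.random.default_rng(0)
index = pd.date_range("2024-01-01", periods=200, freq="50ms")
raw = pd.DataFrame(
    {"V": 230 + rng.normal(0, 1, 200), "I": 10 + rng.normal(0, 0.1, 200)},
    index=index,
)
features = preprocess_feeder(raw)
```

The resulting frame has one row per grid step and two feature columns per channel. Any real use would need the filter, grid, and window tuned to the sampling rates of the actual FED recordings.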
Steps to reproduce
1. Running Python Code (SHEL) in VS Code
- Install Python (https://www.python.org) and ensure it is added to PATH.
- Install Visual Studio Code (https://code.visualstudio.com).
- Install the Python and Jupyter extensions from the VS Code Extensions Marketplace.
- Open a project folder in VS Code and create a virtual environment:
  Windows: python -m venv .venv && .venv\Scripts\activate
  macOS/Linux: python3 -m venv .venv && source .venv/bin/activate
- Select the Python interpreter in VS Code (Ctrl+Shift+P → "Python: Select Interpreter").
- Install dependencies: pip install numpy pandas matplotlib
- Open the Python file and run it with the "Run" button, or run python script.py in the terminal.
- For notebooks, use the Jupyter extension and run cells interactively.
-------------------------------------------------------------------------------
2. Modeling Substation and Feeders in RTDS/RSCAD
- Build the substation and feeder network in RSCAD Draft: include buses, transformers, lines/cables, loads, and DG if needed.
- Add IED and DFR interfaces:
  For IEDs: configure GOOSE/Sampled Values (SV) using GTNET cards.
  For DFR: use COMTRADE (IEEE C37.111) output for fault/event recording.
- Configure the SCADA connection: use IEC 60870-5-104 and IEEE 802.3 for real-time data.
- Set time synchronization: use PTP for synchronized timestamps across RTDS, IEDs, DFR, SCADA, and SHEL.
- Define signal mappings: map analogs (V, I, F, P, Q), binary signals (breaker status), and events to the SCADA and DFR datasets.
- Run the simulation: inject faults, switching events, or disturbances in RSCAD Runtime, and verify IED operation (trips, GOOSE messages).
- Collect data: export COMTRADE files from the DFR and confirm that SCADA receives real-time measurements and status updates.
-------------------------------------------------------------------------------
3. Reproducing or Regenerating Data
- Option A: Deterministic Replay. Use the original COMTRADE (.cfg + .dat) files for playback in RTDS or relay test tools. Align sampling rates and channel configurations with the original setup.
- Option B: Scenario Regeneration. Recreate the same network model, fault type, and conditions in RSCAD. Use the same simulation timestep and random seeds for comparable results.
- Option C: Synthetic Data Generation (Python). Use Python scripts to create synthetic voltage/current waveforms with fixed seeds. Export the results in CSV or COMTRADE format for testing analytics.
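As a rough illustration of Option C, the sketch below generates a seeded, repeatable voltage/current waveform with a crude fault and returns it as a table ready for CSV export. It is not one of the dataset's scripts. The function name synth_feeder_waveform, the 50 Hz fundamental, the 4 kHz sampling rate, the peak values, and the simple fault model (current multiplied, voltage sagged) are all assumptions made for illustration. COMTRADE export is not covered.

```python
import numpy as np
import pandas as pd


def synth_feeder_waveform(seed: int = 42, f0: float = 50.0, fs: float = 4000.0,
                          duration: float = 0.2, v_peak: float = 325.0,
                          i_peak: float = 14.0, fault_at: float = 0.1,
                          fault_gain: float = 5.0, noise_std: float = 0.01) -> pd.DataFrame:
    """Hypothetical single-phase waveform generator with a fixed random seed."""
    rng = np.random.default_rng(seed)      # fixed seed -> identical output on every run
    n = int(round(fs * duration))          # number of samples
    t = np.arange(n) / fs                  # time axis in seconds

    # Ideal sinusoids; the current lags the voltage by 30 degrees (assumed load angle).
    v = v_peak * np.sin(2 * np.pi * f0 * t)
    i = i_peak * np.sin(2 * np.pi * f0 * t - np.pi / 6)

    # Crude fault model (assumption): from fault_at onward the current is
    # multiplied by fault_gain and the voltage sags to 60 % of nominal.
    faulted = t >= fault_at
    i = np.where(faulted, fault_gain * i, i)
    v = np.where(faulted, 0.6 * v, v)

    # Additive Gaussian measurement noise, scaled to each channel's peak value.
    v = v + rng.normal(0.0, noise_std * v_peak, n)
    i = i + rng.normal(0.0, noise_std * i_peak, n)

    return pd.DataFrame({"t_s": t, "v_V": v, "i_A": i, "fault": faulted.astype(int)})


waveform = synth_feeder_waveform(seed=42)
# To export for analytics testing (file name is an example):
# waveform.to_csv("synthetic_feeder.csv", index=False)
```

Calling the function twice with the same seed returns identical samples, which is what makes runs comparable. Changing the seed changes only the noise, not the underlying waveform or the fault instant.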
Institutions
- Teknologian tutkimuskeskus VTT Oy
- Al-Ahliyya Amman University