Nonastreda: Multimodal Dataset for Identifying Tool Wear Condition
Description
# Nonastreda: 9 Multimodal Dataset Featuring Time Series and Image Data for Flank Tool Wear Classification and Regression

* Detailed description: 'Data in Brief' Journal (available soon)
* Repository: https://github.com/hubtru/Impala
* Repository: https://github.com/hubtru/Girape
* Notebooks for converting `forces_xyz_raw.mat` into spectrograms, scalograms or wavelets: https://github.com/hubtru/Girape/tree/main/scripts

# Overview: Nonastreda (Nona)

* 'Nona' from Latin "ninth"
* Dataset size: 512 samples (instances, observations)
* Modalities: 9 modalities
* Tasks:
  * Classification: 3 classes (sharp, used, dulled)
  * Regression: 3 targets (flank wear [µm], gaps [µm], overhang [µm])
* Additional subtasks:
  * Uni/Multi-Modal Classification
  * Multilabel Regression
  * Anomaly Detection
  * Remaining Useful Life (RUL) Estimation
  * Signal Drift Measurement
  * Zero-Shot Flank Tool Wear Classification
  * Diagnostic Feature Engineering
* Domain: flank wear of tools in an industrial milling machine
* Input (per sample):
  * Images: 1 tool image, 1 chip image, 1 workpiece image
  * Mel-Spectrograms: x, y, z axes (3 images)
  * Complex Morlet Scalograms: x, y, z axes (3 images)
  * Extra modality: raw (time-series) force signals in x, y, z axes
* Output:
  * Machine state classes: sharp, used, dulled
  * Regression targets: flank wear [µm], gaps [µm], overhang [µm]
* Evaluation metrics:
  * Classification: accuracy, precision, recall, F1-score, ROC curve
  * Regression: MAE, MSE, RMSE
* Data splitting:
  * Protocol: 10-fold cross-validation, with one tool held out per fold
  * Training and validation: data from 9 tools
  * Testing: data from the remaining (10th) tool
  * Results: accuracy averaged over the ten splits
* The dataset includes measurements from ten tools

# Extra Time-Series Modality

* Raw force signals along the x, y, and z axes are provided in the `forces_xyz_raw.mat` file.
* The `*.mat` file can be used with scripts from the Girape repository to generate spectrograms, scalograms, and wavelets.
* The source force signals (Fx, Fy, Fz) allow experimentation with new types of feature engineering and embeddings, such as Shannon, Daubechies, or Morlet wavelets.
* Sampling rate of the force signals: 1 kHz.
* Pipeline: `forces_xyz_raw.mat` + Girape/scripts -> spectrograms, scalograms, or wavelets

# Future Work

* Improving (zero-shot flank) tool wear classification and regression.
* Incorporating the raw force signals (Fx, Fy, Fz) into multimodal studies.
* Calculating new modalities from the raw force signals (Fx, Fy, Fz).
* Conducting experiments on:
  * Anomaly Detection
  * Remaining Useful Life (RUL) estimation
  * Signal Drift measurement
* Designing Diagnostic Feature Engineering.
* Modalities Correlation Analysis.

# Data Structure

    Nonastreda/
    ├── chip/
    ├── scal/
    │   ├── x/
    │   ├── y/
    │   └── z/
    ├── spec/
    │   ├── x/
    │   ├── y/
    │   └── z/
    ├── tool/
    ├── work/
    ├── labels.csv
    ├── labels_reg.csv
    └── forces_xyz_raw.mat
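The data-splitting protocol described in the overview (train and validate on nine tools, test on the held-out tenth, average over ten splits) amounts to leave-one-tool-out cross-validation. A minimal Python sketch follows, assuming a per-sample tool identifier is available. How that identifier is stored in the dataset is not stated here, so `tool_ids` is a synthetic, hypothetical stand-in, and `leave_one_tool_out` and `mean_accuracy` are illustrative helper names, not functions from the Impala or Girape repositories.

```python
from collections import defaultdict


def leave_one_tool_out(tool_ids):
    """Yield (held_out_tool, train_idx, test_idx), one fold per distinct tool."""
    by_tool = defaultdict(list)
    for idx, tool in enumerate(tool_ids):
        by_tool[tool].append(idx)
    tools = sorted(by_tool)
    for held_out in tools:
        # All samples of the held-out tool form the test set.
        test_idx = list(by_tool[held_out])
        # Samples of every other tool form the training/validation pool.
        train_idx = [i for t in tools if t != held_out for i in by_tool[t]]
        yield held_out, train_idx, test_idx


def mean_accuracy(fold_accuracies):
    """Average the per-fold accuracies into a single reported score."""
    return sum(fold_accuracies) / len(fold_accuracies)


# ASSUMPTION: synthetic tool assignment for 512 samples and 10 tools.
# The real per-sample tool ID has to be taken from the dataset itself.
tool_ids = [i % 10 for i in range(512)]
folds = list(leave_one_tool_out(tool_ids))

for held_out, train_idx, test_idx in folds:
    print(f"tool {held_out}: {len(train_idx)} train/val samples, {len(test_idx)} test samples")
```

Splitting by tool rather than by random sample keeps every observation of a given tool on one side of the split, so the test score reflects generalization to an unseen tool.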
Steps to reproduce
* Detailed description: see the upcoming publication in the 'Data in Brief' journal.

The Nonastreda dataset was obtained from a real industrial milling device processing material with a shaft milling tool. During the milling process, time-series and image data were collected to model industrial tool wear.

* **Time-Series Data**:
  - Three force signal sequences (Fx, Fy, Fz) were collected using:
    - Industrial dynamometer
    - Amplifier
    - Bus coupler
    - Industrial PC
* **Image Data**:
  - Images were captured using an industrial unit microscope.
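Once the force sequences (Fx, Fy, Fz) are recorded, they can be turned into time-frequency images. The sketch below is a hedged illustration of that idea, not the Girape scripts: it computes a plain Hann-windowed STFT power spectrogram with NumPy rather than the Mel-spectrograms or complex Morlet scalograms shipped with the dataset. The function name `stft_spectrogram`, the window length, and the hop size are assumptions chosen for illustration, and the input is a synthetic signal. The variable names inside `forces_xyz_raw.mat` are not documented here, so loading the real file is left out.

```python
import numpy as np

FS_HZ = 1000  # sampling rate of the force signals, as stated in the dataset description


def stft_spectrogram(signal, fs=FS_HZ, nperseg=256, hop=128):
    """Return (freqs_hz, times_s, power) for a 1-D signal via a Hann-windowed STFT."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(nperseg)
    n_frames = 1 + (len(signal) - nperseg) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop : i * hop + nperseg] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # shape: (n_frames, nperseg // 2 + 1)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + nperseg / 2) / fs  # frame centers in seconds
    return freqs, times, power.T  # power has shape (n_freqs, n_frames)


# ASSUMPTION: synthetic stand-in for one force axis (a 50 Hz tone plus noise, 2 s long).
rng = np.random.default_rng(0)
t = np.arange(2 * FS_HZ) / FS_HZ
fx = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)

freqs, times, sxx = stft_spectrogram(fx)
peak_hz = freqs[sxx.mean(axis=1).argmax()]  # dominant frequency bin, averaged over time
print(sxx.shape, peak_hz)
```

With a 1 kHz sampling rate the spectrogram covers 0 to 500 Hz, and a 256-sample window gives a frequency resolution of roughly 3.9 Hz. Applying the same function to each of the three axes would yield one image per axis.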