Build a SCADA Cyber-Range Dataset
Generate a complete IDS / SOC training dataset — realistic SCADA telemetry with overlaid MITRE ATT&CK ICS attacks, ground-truth labels, packet captures, and a cryptographic evidence bundle — in one ADS run.
What you get
| Artifact | Format | Contents |
|---|---|---|
| traffic.pcapng | Wireshark | All network flows (Modbus/OPC-UA/BACnet/MQTT/DNP3) |
| truth.ndjson | NDJSON | One line per malicious event: technique_id, tactic, affected_assets |
| signals.parquet | Columnar | Engineering-scaled PLC register values over time |
| alarms.ndjson | NDJSON | SCADA operator alarms with asserted/cleared flags |
| ics_security_events.json | JSON | Attack + benign events on same timeline (~1000:1 ratio) |
| defense_alerts.json | JSON | IDS alerts with severity, tactic, technique, confidence |
| plc_state.json | JSON | Per-minute PLC register + controller_mode snapshots |
| evidence/* | 9 files | BLAKE3-sealed provenance bundle |
Recipe goal
In the ADS new project modal, use a concrete goal:
Generate a 4-hour water-treatment cyber-range dataset. Include a setpoint_change attack on the main pump PLC at the 90-minute mark and a sensor_spoof_injection attack on the chlorine-dosing pump at 180 minutes. Produce IDS training data with Wireshark pcapng, truth labels, and signal parquet.
The planner translates this goal into an ics_attack + scada_sim step chain, together with the integration pipeline that merges both event streams onto a single timeline.
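To make "merging both event streams on a single timeline" concrete, here is a minimal sketch of the idea. It is not the ADS integration pipeline: the event dictionaries, the `t` field (seconds from scenario start), and the `merge_timelines` helper are all hypothetical, chosen only to illustrate a timestamp-ordered merge of two already-sorted streams.

```python
import heapq

def merge_timelines(process_events, attack_events):
    """Merge two time-sorted event lists into one list ordered by timestamp.

    Hypothetical event shape: a dict with a numeric "t" field (seconds).
    Each event is tagged with the step it came from so the origin survives.
    """
    tagged_process = ({**e, "source": "scada_sim"} for e in process_events)
    tagged_attack = ({**e, "source": "ics_attack"} for e in attack_events)
    # heapq.merge assumes each input is already sorted by the key.
    return list(heapq.merge(tagged_process, tagged_attack, key=lambda e: e["t"]))

# Illustrative data only: readings around the 90-minute mark (5400 s).
process = [
    {"t": 0, "kind": "reading"},
    {"t": 5400, "kind": "reading"},
    {"t": 5460, "kind": "reading"},
]
attack = [{"t": 5400, "kind": "setpoint_change"}]

timeline = merge_timelines(process, attack)
for event in timeline:
    print(event["t"], event["source"], event["kind"])
```

When two events share a timestamp, `heapq.merge` keeps the event from the first stream ahead of the one from the second, so ordering stays deterministic.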
Training your IDS
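The loading snippet in this section leaves the alignment step as a comment. A helper along the following lines could fill it in. This is a sketch under stated assumptions, not the documented schema: it assumes signals.parquet has `timestamp`, `asset_id`, and `value` columns, that truth.ndjson has a `timestamp` column and stores `affected_assets` as a list, and that a fixed 5-minute window after each event counts as "under attack". The asset names and the technique ID in the sample data are illustrative. Check the real column names in your own run before using it.

```python
import pandas as pd

def label_signals(signals, labels, window=pd.Timedelta(minutes=5)):
    """Flag signal rows that fall inside an attack window on the same asset.

    Assumed (hypothetical) columns:
      signals: timestamp, asset_id, value
      labels:  timestamp, technique_id, affected_assets (list of asset ids)
    """
    # Expand to one row per (event, asset) so each asset can be matched directly.
    events = labels.explode("affected_assets", ignore_index=True).rename(
        columns={"affected_assets": "asset_id", "timestamp": "attack_start"}
    )
    events["attack_end"] = events["attack_start"] + window

    out = signals.copy()
    out["is_attack"] = False
    out["technique_id"] = None
    for ev in events.itertuples(index=False):
        hit = (
            (out["asset_id"] == ev.asset_id)
            & (out["timestamp"] >= ev.attack_start)
            & (out["timestamp"] < ev.attack_end)
        )
        out.loc[hit, "is_attack"] = True
        out.loc[hit, "technique_id"] = ev.technique_id
    return out

# Tiny in-memory stand-ins for signals.parquet and truth.ndjson.
t0 = pd.Timestamp("2024-01-01 00:00:00")
sample_signals = pd.DataFrame({
    "timestamp": [t0 + pd.Timedelta(minutes=m) for m in (88, 90, 92, 96, 91)],
    "asset_id": ["plc-main-pump"] * 4 + ["plc-chlorine"],
    "value": [50.1, 72.4, 73.0, 50.3, 1.2],
})
sample_labels = pd.DataFrame({
    "timestamp": [t0 + pd.Timedelta(minutes=90)],
    "technique_id": ["T0836"],
    "affected_assets": [["plc-main-pump"]],
})

labeled = label_signals(sample_signals, sample_labels)
print(labeled[["timestamp", "asset_id", "is_attack", "technique_id"]])
```

The loop is linear in the number of attack events, which suits a dataset where attacks are rare relative to benign samples. For many events, an interval join would scale better.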
import pandas as pd
signals = pd.read_parquet("signals.parquet")
labels = pd.read_json("truth.ndjson", lines=True)
# Align signals to labels by asset_id + time window
# ...your IDS training pipeline here...
What's guaranteed
- Physics-accurate process data. Natural sensor drift, alarm deadbands, setpoint fluctuations.
- MITRE ATT&CK ICS-aligned attacks. Every malicious event carries the canonical technique_id.
- Tenant isolation. Your dataset is stored under your tenant's storage backend.
- Cryptographic reproducibility. Same seed + same scenario pack = byte-identical output.
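The byte-identical claim can be spot-checked by running the same seed and scenario pack twice and comparing the two output directories file by file. The sketch below uses SHA-256 from the Python standard library purely as a comparison hash. It does not read or verify the BLAKE3-sealed evidence bundle, and the `digest_tree` and `diff_runs` helpers are hypothetical names, not part of ADS.

```python
import hashlib
from pathlib import Path

def digest_tree(root):
    """Return {relative_path: sha256_hex} for every file under root."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        h = hashlib.sha256()
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so large captures do not load into memory at once.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                h.update(chunk)
        digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests

def diff_runs(run_a, run_b):
    """List relative paths that are missing from one run or differ in content."""
    a, b = digest_tree(run_a), digest_tree(run_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `diff_runs("run1", "run2")` on two output directories (placeholder paths) returns an empty list when every file matches byte for byte. Any path it returns is either missing from one run or differs in content.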
See the Virtual SCADA and ICS Security pages for protocol specifics and technique coverage.