Agentic Data Scientist

Autonomous multi-step pipeline agent with self-healing execution, human approval gates, and a cryptographic audit trail. It automatically selects the correct engine for each step.

Overview

The Agentic Data Scientist (ADS) is the multi-turn autonomous surface of the unified orchestration system. It plans multi-step pipelines from a natural language goal, executes each step as a real job, reads the signed evidence bundle after each step, and self-heals if quality thresholds are not met. Every decision is written into a tamper-evident cryptographic audit trail.

ℹ Available on all plans

The ADS is available on all plans, including Free. It is credit-limited, not feature-gated. Bringing your own (BYO) API key removes the AI request cap on any plan. The ADS works across all engines and automatically selects the correct engine for each step.

Project Lifecycle

Project state machine

planning               # LLM generates step plan
  |
  v
awaiting_approval      # User reviews and approves plan
  |
  v
running                # PlanExecutor iterates steps
  |-- complete         # All steps succeeded
  |-- blocked          # A step failed or timed out
  +-- (cancel)         # User cancels from any non-terminal state
failed                 # Planning LLM call itself threw an exception (entered from planning)
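
The diagram above can be read as a small transition table. The following is a minimal sketch of that table, not the platform's implementation. The `ProjectStatus` enum, the `cancelled` status name (the diagram only shows "(cancel)"), and the choice to treat `blocked` as terminal are all assumptions made for illustration.

```python
from enum import Enum


class ProjectStatus(str, Enum):
    PLANNING = "planning"
    AWAITING_APPROVAL = "awaiting_approval"
    RUNNING = "running"
    COMPLETE = "complete"
    BLOCKED = "blocked"
    CANCELLED = "cancelled"  # assumed name; the diagram only shows "(cancel)"
    FAILED = "failed"


S = ProjectStatus

# Transitions taken from the diagram. Cancel is allowed from every
# non-terminal state. Treating "blocked" as terminal is an assumption.
ALLOWED = {
    S.PLANNING: {S.AWAITING_APPROVAL, S.FAILED, S.CANCELLED},
    S.AWAITING_APPROVAL: {S.RUNNING, S.CANCELLED},
    S.RUNNING: {S.COMPLETE, S.BLOCKED, S.CANCELLED},
}


def can_transition(current: ProjectStatus, target: ProjectStatus) -> bool:
    """Return True if the diagram permits moving from current to target."""
    return target in ALLOWED.get(current, set())


def is_terminal(status: ProjectStatus) -> bool:
    """A status with no outgoing transitions is terminal in this sketch."""
    return status not in ALLOWED
```

A client could use `is_terminal` to decide when to stop polling or streaming a project.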

Key Capabilities

Capability	Description
Multi-step planner	Plans a coherent sequence of steps from a natural language goal
Autonomous executor	Creates real jobs, polls completion, and chains steps sequentially
Self-healing execution	Reads the sealed evidence after each step and replans if quality is below threshold
Human approval gates	Pauses for human approval on write actions, cost thresholds, quality uncertainty, low confidence, and unresolved constraints
Shadow execution	Parallel shadow run for comparison and divergence scoring
Session memory	Cross-session learning from past runs, fully tenant-isolated
Physics validator	Validates output against physical constraints per step
Budget enforcement	Credit-aware planning — will not exceed tenant budget
Per-step proof records	Cryptographic proof of planned versus executed state, with a quantitative risk score

Using the ADS

Create and execute a project

from radmah_sdk import RadMahClient

client = RadMahClient(api_key="sl_live_...")

# Create a project with a goal
project = client.agent.create_project(
    title="WWTP Dataset Generation",
    goal="Generate a SCADA dataset for a municipal wastewater treatment "
         "plant with attack scenarios and full evidence bundles"
)
# Status: planning → awaiting_approval

# Review the plan
plan = client.agent.get_project(project.id)
for step in plan.steps:
    print(f"Step {step.index}: {step.description} ({step.tool_name})")

# Approve and execute
client.agent.approve(project.id)
# Status: running → complete (or blocked on failure)
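
After approval, a caller typically waits for the project to reach a final status. The helper below is a hedged sketch of a polling loop. It takes any callable that returns the current status string, so it does not depend on SDK details. The `wait_for_project` name, the `.status` attribute in the usage comment, and the set of final statuses are assumptions, not documented SDK behavior.

```python
import time
from typing import Callable

# Assumed set of statuses at which polling should stop.
FINAL_STATUSES = {"complete", "blocked", "failed"}


def wait_for_project(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a final status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in FINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"project still '{status}' after {timeout_seconds}s")
        sleep(poll_seconds)


# Hypothetical usage with the SDK client shown above
# (assumes the project object exposes a .status attribute):
# final = wait_for_project(lambda: client.agent.get_project(project.id).status)
```

For long-running projects, the SSE stream endpoint listed under REST Endpoints avoids repeated polling.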

REST Endpoints

Method	Path	Description
POST	/v1/client/agent/projects	Create project and trigger planning
GET	/projects	List all projects for tenant
GET	/projects/{id}	Full project detail with steps and output
POST	/projects/{id}/approve	Approve plan and begin execution
POST	/projects/{id}/cancel	Cancel project
GET	/projects/{id}/output	Final output (artifacts, seal, narrative)
GET	/projects/{id}/stream	SSE live-stream progress
GET	/projects/{id}/verify-crypto-trail	Verify cryptographic audit trail integrity
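
The stream endpoint uses Server-Sent Events (SSE), a line-based text format. The sketch below parses standard SSE framing from any iterable of text lines, such as the line iterator of a streaming HTTP response. The event names and payloads that the ADS emits are not specified here, so the ones in the usage comment are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from Server-Sent Events lines."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event; dispatch it if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keepalive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space after the colon
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical usage: the event name "step" and the JSON payload are made up.
# for name, payload in parse_sse(["event: step\n", 'data: {"index": 1}\n', "\n"]):
#     print(name, payload)
```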

Self-Healing Execution

After each step completes, the ADS reads the signed evidence bundle and evaluates quality against thresholds. If quality is below the threshold, self-healing execution triggers:

1. Read the quality report from the evidence bundle
2. Diagnose the failure (distributional drift, constraint violation, etc.)
3. Generate a patch plan (adjust parameters, change engine, retry)
4. Present the patch decision to the user if a human approval gate is triggered; otherwise auto-apply
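
The steps above can be sketched as a single decision function. This is an illustration under stated assumptions, not the ADS implementation: the `QualityReport` fields, the 0.8 threshold, and the mapping from diagnosis to patch action are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QualityReport:
    """Hypothetical summary of the quality section of an evidence bundle."""
    score: float                 # assumed aggregate quality score in [0, 1]
    drift_detected: bool         # distributional drift flag
    constraint_violations: int   # number of violated constraints


@dataclass
class PatchPlan:
    diagnosis: str
    action: str
    needs_approval: bool


def plan_patch(
    report: QualityReport,
    threshold: float = 0.8,          # assumed threshold, for illustration only
    gate_triggered: bool = False,    # whether a human approval gate applies
) -> Optional[PatchPlan]:
    """Return None if quality is acceptable, else a patch plan."""
    if report.score >= threshold:
        return None  # quality met; no self-healing needed
    # Diagnose the failure; the diagnosis-to-action mapping is illustrative.
    if report.constraint_violations > 0:
        diagnosis, action = "constraint violation", "adjust parameters"
    elif report.drift_detected:
        diagnosis, action = "distributional drift", "change engine"
    else:
        diagnosis, action = "unclassified quality shortfall", "retry"
    # The plan is auto-applied unless a gate requires human sign-off.
    return PatchPlan(diagnosis, action, needs_approval=gate_triggered)
```

In this sketch, a returned plan with `needs_approval=True` would be shown to the user, while any other plan would be applied automatically.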

✦ Grounded in cryptographic proof

Self-healing is driven by measured evidence from the signed bundle, not language-model guesses. Evidence-grounded replanning is only possible because every job produces a signed evidence bundle.

Cryptographic Audit Trail

Every ADS decision (planning, step execution, self-healing, replan, approval) is written into a tamper-evident cryptographic audit trail. The trail is independently verifiable via the GET /projects/{id}/verify-crypto-trail endpoint.
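
This page does not describe how the trail is constructed. To illustrate what "tamper-evident" means in general, the sketch below uses a generic SHA-256 hash chain, in which each entry commits to the hash of the previous one. The entry layout and function names are assumptions and do not reflect the actual ADS trail format.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(trail: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append a decision record that is chained to the previous entry."""
    prev = trail[-1]["hash"] if trail else GENESIS
    trail.append({"payload": payload, "prev_hash": prev, "hash": entry_hash(prev, payload)})


def verify_trail(trail: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in trail:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on all earlier entries, changing one recorded decision invalidates every later hash, which is what makes this kind of structure independently checkable.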