
Agentic Data Scientist

Autonomous multi-step pipeline agent with self-healing execution, human approval gates, and a cryptographic audit trail. Works across all engines automatically.

Overview

The Agentic Data Scientist (ADS) is the multi-turn autonomous surface of the unified orchestration system. It plans multi-step pipelines from a natural language goal, executes each step as a real job, reads the signed evidence bundle after each step, and self-heals if quality thresholds are not met. Every decision is written into a tamper-evident cryptographic audit trail.

Available on all plans

The ADS is available on all plans, including Free. Access is credit-limited, not feature-gated. Bringing your own (BYO) API key removes the AI request cap on any plan. The ADS works across all engines and automatically selects the correct engine for each step.

Project Lifecycle

Project state machine
planning               # LLM generates step plan
  |
  v
awaiting_approval      # User reviews and approves plan
  |
  v
running                # PlanExecutor iterates steps
  |-- complete         # All steps succeeded
  |-- blocked          # A step failed or timed out
  +-- (cancel)         # User cancels from any non-terminal state
failed                 # Planning LLM call itself threw an exception
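
The diagram above can be read as a small transition table. The following is a minimal sketch of that reading, not the platform's implementation. The `cancelled` state name and the choice to treat `blocked` as still cancellable are assumptions: the diagram only shows "(cancel)" and does not say whether `blocked` is terminal.

```python
from enum import Enum


class ProjectState(str, Enum):
    PLANNING = "planning"
    AWAITING_APPROVAL = "awaiting_approval"
    RUNNING = "running"
    COMPLETE = "complete"
    BLOCKED = "blocked"
    FAILED = "failed"
    CANCELLED = "cancelled"  # assumed name; the diagram only shows "(cancel)"


S = ProjectState

# Allowed transitions as read from the diagram. States with no outgoing
# transitions (complete, failed, cancelled) are treated as terminal.
TRANSITIONS = {
    S.PLANNING: {S.AWAITING_APPROVAL, S.FAILED, S.CANCELLED},
    S.AWAITING_APPROVAL: {S.RUNNING, S.CANCELLED},
    S.RUNNING: {S.COMPLETE, S.BLOCKED, S.CANCELLED},
    S.BLOCKED: {S.CANCELLED},  # assumption: blocked projects can still be cancelled
}


def can_transition(current: ProjectState, target: ProjectState) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in TRANSITIONS.get(current, set())
```

A client could use a table like this to decide which actions to offer, for example showing an approve button only while a project is in `awaiting_approval`.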

Key Capabilities

| Capability | Description |
|---|---|
| Multi-step planner | Plans a coherent sequence of steps from a natural language goal |
| Autonomous executor | Creates real jobs, polls completion, and chains steps sequentially |
| Self-healing execution | Reads the sealed evidence after each step and replans if quality is below threshold |
| Human approval gates | Pauses for human approval on write actions, cost thresholds, quality uncertainty, low confidence, and unresolved constraints |
| Shadow execution | Parallel shadow run for comparison and divergence scoring |
| Session memory | Cross-session learning from past runs, fully tenant-isolated |
| Physics validator | Validates output against physical constraints per step |
| Budget enforcement | Credit-aware planning: will not exceed tenant budget |
| Per-step proof records | Cryptographic proof of planned versus executed state, with a quantitative risk score |
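
The human approval gates listed above name five trigger conditions. As a rough illustration of how such a gate might be evaluated, the sketch below checks each condition and returns the reasons that fired. Every field name, reason label, and default threshold here is hypothetical; the source does not document the actual values or data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepContext:
    """Hypothetical summary of a planned step; not an SDK type."""
    is_write_action: bool
    estimated_cost: float        # in credits (assumed unit)
    quality_uncertain: bool
    confidence: float            # assumed range 0.0 to 1.0
    unresolved_constraints: int


def approval_reasons(
    ctx: StepContext,
    cost_threshold: float = 100.0,   # illustrative default only
    min_confidence: float = 0.7,     # illustrative default only
) -> List[str]:
    """Return the gate conditions that fired; an empty list means no pause."""
    reasons = []
    if ctx.is_write_action:
        reasons.append("write_action")
    if ctx.estimated_cost > cost_threshold:
        reasons.append("cost_threshold")
    if ctx.quality_uncertain:
        reasons.append("quality_uncertainty")
    if ctx.confidence < min_confidence:
        reasons.append("low_confidence")
    if ctx.unresolved_constraints > 0:
        reasons.append("unresolved_constraints")
    return reasons
```

Returning the list of reasons rather than a bare boolean lets the approval prompt tell the reviewer why the run paused.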

Using the ADS

Create and execute a project
from radmah_sdk import RadMahClient

client = RadMahClient(api_key="sl_live_...")

# Create a project with a goal
project = client.agent.create_project(
    title="WWTP Dataset Generation",
    goal="Generate a SCADA dataset for a municipal wastewater treatment "
         "plant with attack scenarios and full evidence bundles"
)
# Status: planning → awaiting_approval

# Review the plan
plan = client.agent.get_project(project.id)
for step in plan.steps:
    print(f"Step {step.index}: {step.description} ({step.tool_name})")

# Approve and execute
client.agent.approve(project.id)
# Status: running → complete (or blocked on failure)
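
After approval, a script usually needs to wait until the project stops running. The helper below is a minimal polling sketch under stated assumptions: it takes any callable that returns the current status string, so it does not depend on SDK details beyond what is shown above. The `status` attribute mentioned in the comment is an assumption, as are the interval and timeout defaults.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    fetch_status: Callable[[], str],
    done: Iterable[str] = ("complete", "blocked", "failed"),
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until it returns one of the done states.

    Raises TimeoutError if no done state is seen before the timeout.
    """
    done_set = set(done)
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in done_set:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"project still {status!r} after {timeout}s")
        sleep(interval)


# Possible usage with the client above, assuming the returned project
# object exposes a `status` attribute (not confirmed by this page):
#   final = wait_for_status(lambda: client.agent.get_project(project.id).status)
```

For live progress rather than periodic checks, the SSE stream endpoint listed under REST Endpoints is the alternative.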

REST Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /v1/client/agent/projects | Create project and trigger planning |
| GET | /projects | List all projects for tenant |
| GET | /projects/{id} | Full project detail with steps and output |
| POST | /projects/{id}/approve | Approve plan and begin execution |
| POST | /projects/{id}/cancel | Cancel project |
| GET | /projects/{id}/output | Final output (artifacts, seal, narrative) |
| GET | /projects/{id}/stream | SSE live-stream progress |
| GET | /projects/{id}/verify-crypto-trail | Verify cryptographic audit trail integrity |
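
The stream endpoint is described as SSE (server-sent events). The sketch below parses the standard `text/event-stream` framing from an iterable of lines, which is how an HTTP client library would typically hand over a streamed response. It is simplified (it ignores the `id` and `retry` fields), and the event names and payload shape emitted by the ADS are not documented here, so the `step_completed` name in the usage note is invented for illustration.

```python
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from text/event-stream lines."""
    event, data = "message", []          # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                   # a blank line dispatches the pending event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):         # comment line, often used as keep-alive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):        # one leading space after the colon is dropped
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

Feeding it the lines `event: step_completed`, `data: {"index": 1}`, and a blank line would produce one pair whose data is the JSON text, ready for `json.loads`.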

Self-Healing Execution

After each step completes, the ADS reads the signed evidence bundle and evaluates quality against thresholds. If quality is below the threshold, self-healing execution triggers:

  1. Read the quality report from the evidence bundle
  2. Diagnose the failure (distributional drift, constraint violation, etc.)
  3. Generate a patch plan (adjust parameters, change engine, retry)
  4. Present the patch decision to the user if a human approval gate is triggered; otherwise apply the patch automatically
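
The four steps above can be sketched as a small control flow. This is an illustrative outline only: the report keys (`quality_score`, `constraint_violations`, `drift`), the drift cutoff, and the mapping from diagnosis to action are all hypothetical, and the real evidence bundle format is not described on this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PatchPlan:
    diagnosis: str   # e.g. "distributional_drift", "constraint_violation"
    action: str      # e.g. "adjust_parameters", "change_engine", "retry"


def diagnose(report: Dict[str, float], threshold: float) -> Optional[PatchPlan]:
    """Steps 1 to 3: read the report, diagnose, and propose a patch.

    Returns None when quality meets the threshold.
    """
    if report.get("quality_score", 0.0) >= threshold:
        return None
    if report.get("constraint_violations", 0) > 0:
        return PatchPlan("constraint_violation", "adjust_parameters")
    if report.get("drift", 0.0) > 0.1:   # hypothetical drift cutoff
        return PatchPlan("distributional_drift", "change_engine")
    return PatchPlan("unknown", "retry")


def self_heal(
    report: Dict[str, float],
    threshold: float,
    gate_triggered: Callable[[PatchPlan], bool],
    ask_user: Callable[[PatchPlan], bool],
    apply_patch: Callable[[PatchPlan], None],
) -> str:
    """Step 4: route the patch through the approval gate or apply it directly."""
    patch = diagnose(report, threshold)
    if patch is None:
        return "passed"
    if gate_triggered(patch) and not ask_user(patch):
        return "rejected"
    apply_patch(patch)
    return "patched"
```

Passing the gate check and the user prompt in as callables keeps the decision logic testable without a live approval interface.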

Grounded in cryptographic proof

Self-healing is driven by measured evidence from the signed bundle, not language-model guesses. Evidence-grounded replanning is only possible because every job produces a signed evidence bundle.

Cryptographic Audit Trail

Every ADS decision (planning, step execution, self-healing, replanning, approval) is written into a tamper-evident cryptographic audit trail. The trail is independently verifiable via the GET /projects/{id}/verify-crypto-trail endpoint.
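
This page does not specify how the trail is constructed. One common way to make a log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below shows that general technique under that assumption; it is not a description of the platform's actual scheme, and the entry fields are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: Dict) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(trail: List[Dict], payload: Dict) -> None:
    """Append a decision record linked to the previous entry's hash."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    trail.append(
        {"payload": payload, "prev": prev_hash, "hash": _digest(prev_hash, payload)}
    )


def verify_trail(trail: List[Dict]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in trail:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A bare hash chain only detects tampering if the latest hash is itself protected, for example by signing it or publishing it somewhere the writer cannot alter.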