Dev, Staging, Production: Environment Management for AI Workflows

DevOps solved this problem for software years ago. You don’t push code directly to production. You develop locally, test in staging, promote through gates, and deploy with reviews. The pattern works because it catches mistakes before they reach users.

AI workflows deserve the same discipline. A prompt change, a model swap, a new tool integration — these affect production output quality just as much as a code change affects application behavior. But most AI platforms treat every change as a production change. Edit a workflow, and it’s live. No review. No staging. No safety net.

JieGou’s environment management brings the dev-staging-prod pipeline to AI workflows.

Three environments

Every workflow exists independently across three environments. Each has its own configuration, its own approval requirements, and its own deployment history.

Environment	Approval Required	Minimum Promotion Role
Development	No	Member
Staging	Yes	Dept Lead
Production	Yes	Admin

Development is the sandbox. Anyone on the team can deploy here without approval. Test new prompts, swap models, add steps — iterate freely without risk.

Staging mirrors production but keeps changes away from live workloads. Promotion from dev to staging requires a Dept Lead to review and approve the changes. This is where you validate that a workflow behaves correctly before it handles real workloads.

Production is the live environment. Promotion from staging to production requires Admin approval. Changes here affect real users, real data, real outputs.

Independent settings per environment

Environments aren’t just permission tiers. Each one carries its own operational configuration:

MCP server instances — Sandbox Slack in dev, production Slack in prod. Test against mock integrations without triggering real side effects.
Default LLM provider and model: Use a cheaper, faster model in dev for rapid iteration. Use your highest-quality model in prod, where output quality matters most.
Approval gates — Different role requirements per environment, matching your organization’s risk tolerance at each tier.
Deploy webhooks — Notify different Slack channels or CI/CD systems per environment. Dev deployments ping #eng-dev, production deployments ping #ops-alerts.
Environment variables — Non-secret key-value pairs injected into step templates. Point API_BASE_URL at your test server in staging and your production server in prod.

This separation means a workflow in development can call a sandbox API, use a cheaper model, and skip expensive tool calls — while the same workflow in production uses real integrations, the best model, and full processing.
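As a rough illustration of the settings listed above, per-environment configuration can be modeled as one record per environment. This is a minimal sketch, not JieGou's actual schema: the class, the field names, the model placeholders, the example.com URLs, and the #eng-staging channel are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from string import Template


@dataclass(frozen=True)
class EnvironmentConfig:
    """Hypothetical per-environment settings record (not JieGou's schema)."""
    approval_required: bool
    min_promotion_role: str
    default_model: str           # placeholder names, not real model identifiers
    slack_mcp_server: str        # which MCP server instance the steps talk to
    deploy_webhook_channel: str  # where deploy notifications are sent
    variables: dict = field(default_factory=dict)  # non-secret key-value pairs


ENVIRONMENTS = {
    "development": EnvironmentConfig(
        approval_required=False,
        min_promotion_role="Member",
        default_model="cheap-fast-model",
        slack_mcp_server="slack-sandbox",
        deploy_webhook_channel="#eng-dev",
        variables={"API_BASE_URL": "https://test.example.com"},
    ),
    "staging": EnvironmentConfig(
        approval_required=True,
        min_promotion_role="Dept Lead",
        default_model="cheap-fast-model",
        slack_mcp_server="slack-sandbox",
        deploy_webhook_channel="#eng-staging",  # assumed channel name
        variables={"API_BASE_URL": "https://test.example.com"},
    ),
    "production": EnvironmentConfig(
        approval_required=True,
        min_promotion_role="Admin",
        default_model="high-quality-model",
        slack_mcp_server="slack-production",
        deploy_webhook_channel="#ops-alerts",
        variables={"API_BASE_URL": "https://api.example.com"},
    ),
}


def render_step_template(template: str, environment: str) -> str:
    """Inject the environment's variables into a step template string."""
    return Template(template).substitute(ENVIRONMENTS[environment].variables)
```

With this shape, the same step template, for example "GET $API_BASE_URL/items", resolves to the test server in staging and the production server in prod without any change to the workflow itself.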

The promotion pipeline

Promotion follows a strict sequence. No shortcuts.

A developer makes changes and deploys to dev. No approval needed — deploy as many times as you want.
The developer requests promotion from dev to staging. The system computes a version diff — a structural comparison of what changed between the current staging version and the proposed version.
A Dept Lead reviews the diff and approves or rejects the promotion.
On approval, the workflow version auto-deploys to staging. This is atomic — approval and deployment happen in one step. There’s no window where a promotion is approved but not yet deployed.
The same process repeats from staging to production, requiring Admin approval.

Self-approval prevention: the person who requests a promotion cannot approve it. This forces a second pair of eyes on every change that moves toward production.
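The rules above can be sketched as a small approval function. This is an illustrative model only: the names (User, PromotionRequest, approve, and so on) are hypothetical, the role ranking assumes that a higher role also satisfies a lower requirement, and a plain dictionary stands in for the deployment store.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ordering: a higher role also satisfies a lower requirement.
ROLE_RANK = {"member": 0, "dept_lead": 1, "admin": 2}
# Strict sequence: dev -> staging -> production, no skipping.
PROMOTION_PATH = {"development": "staging", "staging": "production"}
REQUIRED_ROLE = {"staging": "dept_lead", "production": "admin"}


class PromotionError(Exception):
    """Raised when a promotion request or approval breaks a rule."""


@dataclass
class User:
    name: str
    role: str


@dataclass
class PromotionRequest:
    workflow_id: str
    version: int
    source: str
    target: str
    requester: User
    status: str = "pending"
    approver: Optional[User] = None


def request_promotion(workflow_id: str, version: int, source: str,
                      requester: User) -> PromotionRequest:
    """Create a pending request for the next environment in the sequence."""
    target = PROMOTION_PATH.get(source)
    if target is None:
        raise PromotionError(f"cannot promote from {source!r}")
    return PromotionRequest(workflow_id, version, source, target, requester)


def approve(request: PromotionRequest, approver: User, deployed: dict) -> None:
    """Approve a request and deploy the version in the same step."""
    if request.status != "pending":
        raise PromotionError("request is no longer pending")
    if approver.name == request.requester.name:
        raise PromotionError("requesters cannot approve their own promotion")
    if ROLE_RANK[approver.role] < ROLE_RANK[REQUIRED_ROLE[request.target]]:
        raise PromotionError(f"{approver.role} cannot approve {request.target}")
    # All checks passed: record the approval and deploy together, so there
    # is no approved-but-not-deployed state.
    request.status = "approved"
    request.approver = approver
    deployed[(request.workflow_id, request.target)] = request.version
```

All checks run before any state changes, so a rejected approval leaves both the request and the deployment store untouched.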

Version diff engine

Reviewers don’t approve blind. They see exactly what changed.

The diff engine matches steps by ID across versions and compares 15+ properties, including:

Step type, label, model, and provider
Recipe assignment and task description
System prompt content
Condition logic (if/then/else configuration)
Loop configuration (iteration limits, break conditions)
Evaluation criteria and quality thresholds
Step dependencies
Nested steps within conditionals and loops

Changes are rendered as human-readable descriptions:

“Model changed from ‘claude-sonnet’ to ‘claude-opus’”
“Quality threshold changed from 0.7 to 0.9”
“System prompt modified”
“Step ‘Summarize’ added”
“Loop max iterations changed from 3 to 5”

The diff also detects input/output schema changes (fields added, removed, or modified) and execution mode changes (sequential vs. DAG). A reviewer sees the full picture: what steps changed, how they changed, and what structural shifts happened to the workflow as a whole.
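To make the matching-by-ID idea concrete, here is a simplified sketch of how such a diff could be computed. It is not the product's diff engine: the step dictionaries, the property keys, and the function name are assumptions, it covers only a handful of properties, and it ignores nested steps and schema changes.

```python
# Hypothetical property keys mapped to the wording used in descriptions.
PROPERTY_LABELS = {
    "type": "Step type",
    "label": "Label",
    "model": "Model",
    "provider": "Provider",
    "system_prompt": "System prompt",
    "quality_threshold": "Quality threshold",
    "max_iterations": "Loop max iterations",
}
# Long text is reported as "modified" instead of printing both values.
LONG_TEXT_PROPERTIES = {"system_prompt"}


def diff_steps(old_steps: list, new_steps: list) -> list:
    """Match steps by ID and describe what changed between two versions."""
    old_by_id = {step["id"]: step for step in old_steps}
    new_by_id = {step["id"]: step for step in new_steps}
    changes = []

    for step_id, before in old_by_id.items():
        after = new_by_id.get(step_id)
        if after is None:
            changes.append(f"Step '{before['label']}' removed")
            continue
        for key, label in PROPERTY_LABELS.items():
            old_value, new_value = before.get(key), after.get(key)
            if old_value == new_value:
                continue
            if key in LONG_TEXT_PROPERTIES:
                changes.append(f"{label} modified")
            else:
                changes.append(
                    f"{label} changed from {old_value!r} to {new_value!r}"
                )

    for step_id, after in new_by_id.items():
        if step_id not in old_by_id:
            changes.append(f"Step '{after['label']}' added")

    return changes
```

Matching on a stable step ID instead of list position means a reordered or relabeled step is still compared against its earlier self, not reported as one removal plus one addition.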

Deployment tracking

Every deployment creates a record:

Status — active, superseded, or rolled_back
Deployer — Who triggered the deployment
Approval info — Who approved the promotion, when, and the original requester
Diff summary — What changed in this deployment compared to the previous one
Timestamp — When the deployment went live

Deployment history is queryable per workflow, per environment. You can trace the full lifecycle of any workflow in any environment: what was deployed, when, by whom, and what changed each time.

Rollback

One click. Instant.

Rollback doesn’t re-execute anything. It flips deployment statuses: the current active deployment is marked rolled_back, and the previous superseded deployment becomes active again.

This is a status change, not a redeployment. The previous version is already there — it just needs to be reactivated. No build step, no promotion pipeline, no waiting for approval. When production breaks, you fix it in seconds, then investigate the root cause with time on your side.
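The status flip can be illustrated with a small in-memory model. This is a sketch under assumptions: the Deployment record and the deploy and rollback functions are hypothetical names, and a plain list stands in for the per-workflow, per-environment deployment history.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Deployment:
    version: int
    deployer: str
    status: str            # "active", "superseded", or "rolled_back"
    deployed_at: datetime


def deploy(history: list, version: int, deployer: str) -> Deployment:
    """Append a new active deployment and supersede the current one."""
    for record in history:
        if record.status == "active":
            record.status = "superseded"
    record = Deployment(version, deployer, "active", datetime.now(timezone.utc))
    history.append(record)
    return record


def rollback(history: list) -> Deployment:
    """Flip statuses only: nothing is rebuilt, re-executed, or re-approved."""
    current = next((d for d in reversed(history) if d.status == "active"), None)
    previous = next(
        (d for d in reversed(history) if d.status == "superseded"), None
    )
    if current is None or previous is None:
        raise ValueError("no earlier deployment to roll back to")
    current.status = "rolled_back"
    previous.status = "active"
    return previous
```

Because the earlier record is kept in the history and only relabeled, the rollback cost is two status updates regardless of how large the workflow is.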

Audit trail

Every operation is logged:

Configuration changes to environment settings
Deployments to any environment
Promotion requests (who requested, from which environment, to which)
Promotion approvals, rejections, and cancellations
Rollback operations

Each log entry includes full before/after snapshots. You can reconstruct the exact state of any environment at any point in time. This isn’t just operational hygiene — it’s a compliance requirement for organizations in regulated industries.
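One way to picture how before/after snapshots allow point-in-time reconstruction is sketched below. The AuditEntry fields and the state_at helper are hypothetical, not the product's log format; the sketch assumes each entry's "after" snapshot holds the complete environment state.

```python
import copy
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    actor: str
    operation: str     # e.g. "config_change", "deployment", "rollback"
    environment: str
    before: dict       # full snapshot before the operation
    after: dict        # full snapshot after the operation


def state_at(log: list, environment: str, moment: datetime) -> Optional[dict]:
    """Return the environment's state as of `moment`, or None if unknown."""
    state = None
    for entry in sorted(log, key=lambda e: e.timestamp):
        if entry.environment == environment and entry.timestamp <= moment:
            state = entry.after  # the latest snapshot at or before the moment
    # Copy so callers cannot mutate the logged snapshot.
    return copy.deepcopy(state) if state is not None else None
```

Storing full snapshots costs more space than storing deltas, but any past state can be read from a single entry without replaying the whole log.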

Integration with VPC agents

For organizations running hybrid deployments, VPC execution agents can be scoped to specific environments.

A production agent only handles production workflow runs. A dev agent only handles development runs. This provides data isolation at the infrastructure level — development experiments never touch production compute resources, and production data never flows through development infrastructure.

Combined with environment-specific MCP servers and environment variables, this creates full isolation from the integration layer down to the execution layer.
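The scoping rule amounts to a filter applied before a run is dispatched. The sketch below is illustrative only: the Agent record and eligible_agents function are hypothetical names, and real agent selection would involve more than an environment match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    agent_id: str
    environment: str   # the single environment this agent is scoped to


def eligible_agents(agents: list, run_environment: str) -> list:
    """Return only the agents scoped to the run's environment."""
    return [a for a in agents if a.environment == run_environment]
```

A production run is only ever offered to agents scoped to production, so a dev agent never receives production data even if it is the only agent with spare capacity.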

Availability

Environment management is available on Enterprise plans. It includes three-environment promotion pipelines, version diffing, approval gates, deployment tracking, rollback, and audit trails. Learn more about enterprise features or start your free trial.