Building AI automations shouldn’t require understanding JSON schemas, prompt engineering, or workflow DAGs. Users know what they want to automate — they should be able to describe it in plain English.
JieGou’s natural-language creation system lets you type a description like “summarize customer support tickets and flag urgent ones” and get a working recipe or workflow. But the interesting part isn’t the generation — it’s what happens before generation.
Suggest first, generate only if nothing matches
This is the key design decision. When you type a description and click Create, the system does not immediately call an LLM to generate a new recipe from scratch. Instead, it first checks JieGou’s library of 132 tested templates.
If a strong match is found (score > 0.6), a suggestion panel appears showing the matching template with a match percentage badge. You can adopt the tested template instantly or dismiss it and proceed to generation.
The philosophy: JieGou’s 132 tested templates are a moat. Every template has been generated, tested with synthetic inputs, evaluated by LLM-as-judge, and manually reviewed. Natural-language creation should amplify this moat, not bypass it.
This means most users get a battle-tested template in seconds rather than waiting for a freshly generated recipe that hasn’t been validated.
Template suggestion engine
The matching system uses two stages. No vector embeddings — for 132 items, fast keyword scoring plus conditional LLM reranking is enough.
Stage 1: Keyword scoring
The engine computes a Jaccard-like overlap between user intent tokens and template metadata tokens:
- Tokenization strips non-alphanumeric characters, lowercases everything, and removes 48 common stop words
- Intent coverage (70% weight) — what fraction of the user’s intent tokens appear in the template metadata
- Template coverage (30% weight) — what fraction of the template metadata tokens appear in the user’s intent
- Partial token matches (substring overlap) score 0.5 instead of 1.0
Department boost: templates from the user’s currently active department get a 20% score boost, capped at 1.0. If you’re working in the Sales department and describe something sales-related, Sales templates rank higher.
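The Stage 1 logic above can be sketched as follows. This is an illustrative reconstruction, not JieGou's actual code: the function names, the tiny stop-word list, and the partial-match rule (substring containment in either direction) are assumptions based on the description.

```typescript
// Illustrative Stage 1 scorer. Real stop-word list has 48 entries.
const STOP_WORDS = new Set(["a", "an", "the", "and", "or", "to", "of", "for", "with", "in"]);

function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)          // strip non-alphanumeric characters
    .filter((t) => t.length > 0 && !STOP_WORDS.has(t));
}

// Exact token match scores 1.0; a partial (substring) match scores 0.5.
function coverage(from: string[], against: string[]): number {
  if (from.length === 0) return 0;
  let total = 0;
  for (const t of from) {
    if (against.includes(t)) total += 1;
    else if (against.some((o) => o.includes(t) || t.includes(o))) total += 0.5;
  }
  return total / from.length;
}

function scoreTemplate(
  intent: string,
  templateMeta: string,
  sameDepartment = false
): number {
  const intentTokens = tokenize(intent);
  const metaTokens = tokenize(templateMeta);
  // 70% intent coverage, 30% template coverage.
  let score =
    0.7 * coverage(intentTokens, metaTokens) +
    0.3 * coverage(metaTokens, intentTokens);
  // Active-department templates get a 20% boost, capped at 1.0.
  if (sameDepartment) score = Math.min(1.0, score * 1.2);
  return score;
}
```

The asymmetric weights mean a template whose metadata fully covers the user's intent scores well even if the template describes extra capabilities the user didn't mention.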
Stage 2: LLM reranking
LLM reranking is conditional — it only fires when two conditions are met:
- The top keyword score is below 0.8 (high-confidence keyword matches don’t need LLM verification)
- At least 2 candidate templates scored above the minimum threshold
When it fires, the top 10 candidates are sent to Claude Haiku for fast reranking with structured output. The LLM sees the user’s intent and each candidate’s metadata, then returns a reranked list with scores.
If the LLM call fails for any reason, the system falls back to keyword-only ranking. Graceful degradation — the suggestion engine never blocks on an LLM failure.
Strong match threshold: any template scoring above 0.6 after both stages triggers the suggestion panel.
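The gating and fallback behavior described above can be sketched like this. The shape of `Candidate`, the `rankCandidates` name, and the `MIN_THRESHOLD` value are assumptions; the two gate conditions and the keyword-only fallback follow the description.

```typescript
interface Candidate {
  id: string;
  keywordScore: number;
}

const MIN_THRESHOLD = 0.3;    // assumed minimum candidate threshold
const HIGH_CONFIDENCE = 0.8;  // keyword scores at or above this skip the LLM

async function rankCandidates(
  candidates: Candidate[],
  rerank: (top: Candidate[]) => Promise<Candidate[]>
): Promise<{ ranked: Candidate[]; usedLlm: boolean }> {
  const sorted = [...candidates].sort((a, b) => b.keywordScore - a.keywordScore);
  const aboveMin = sorted.filter((c) => c.keywordScore >= MIN_THRESHOLD);

  // Rerank only when the top match is ambiguous AND there are
  // at least 2 candidates worth comparing.
  const shouldRerank =
    aboveMin.length >= 2 && (aboveMin[0]?.keywordScore ?? 0) < HIGH_CONFIDENCE;
  if (!shouldRerank) return { ranked: aboveMin, usedLlm: false };

  try {
    // Send the top 10 candidates to the reranking model.
    return { ranked: await rerank(aboveMin.slice(0, 10)), usedLlm: true };
  } catch {
    // Graceful degradation: keyword-only ranking on any LLM failure.
    return { ranked: aboveMin, usedLlm: false };
  }
}
```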
Recipe creation wizard
The recipe wizard walks through 4 steps, with template matching intercepting between steps 1 and 2.
Step 1: Intent
A textarea where you describe what the recipe should do. Six example intent chips provide starting points (“Summarize a document”, “Extract key data from emails”, etc.). If you’ve created recipes before, the wizard shows account history pattern hints based on your previous creations.
After you submit your intent, the template suggestion engine runs. If a strong match is found, a green suggestion panel appears with the template name and match percentage badge. You can adopt it (skipping straight to a pre-built, tested recipe) or dismiss it and continue to generation.
Step 2: Draft
If no template matched — or you dismissed the suggestion — the system sends your intent to an LLM. The draft response includes:
- A recipe name and description
- Suggested tags
- A plain-English explanation of what the recipe will do
- 2-3 clarifying questions to refine the recipe before full generation
Vague intent detection is built into this step. If the LLM determines the intent is too vague to produce a useful recipe (e.g., “do something with data”), the API returns HTTP 422 with a friendly message asking you to be more specific. This prevents low-quality generation at the source.
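A minimal sketch of this quality gate, assuming the draft model returns a vagueness flag (the field names and handler shape here are hypothetical, not JieGou's actual API):

```typescript
// Hypothetical draft result shape; `isVague` / `vagueReason` are assumed names.
interface DraftResult {
  isVague: boolean;
  vagueReason?: string;
  name?: string;
  description?: string;
}

function draftResponse(draft: DraftResult): { status: number; body: unknown } {
  if (draft.isVague) {
    // Reject with HTTP 422 instead of generating a low-quality recipe.
    return {
      status: 422,
      body: {
        message:
          draft.vagueReason ??
          "Please describe what the recipe should do in more detail.",
      },
    };
  }
  return { status: 200, body: draft };
}
```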
Step 3: Proposal
You answer the clarifying questions from Step 2. Answers use chip-style options — predefined choices you can tap rather than type. This keeps the interaction fast and constrains the output to well-defined paths.
Step 4: Generate
The system produces the full recipe specification:
- inputSchema — typed fields the recipe expects
- outputSchema — structured output the recipe produces
- promptTemplate — the complete prompt with variable placeholders
- sampleInput — realistic test data you can run immediately
A live preview renders the recipe before you save it. You can review the prompt, test it with the sample input, and iterate before committing.
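To make the four fields concrete, here is a hypothetical spec for the "summarize support tickets" example from the introduction. The interface and every sample value are invented for illustration; only the four field names come from the list above.

```typescript
// Illustrative shape of a generated recipe spec.
interface RecipeSpec {
  inputSchema: Record<string, { type: string; description: string }>;
  outputSchema: Record<string, { type: string; description: string }>;
  promptTemplate: string;
  sampleInput: Record<string, unknown>;
}

const ticketSummarizer: RecipeSpec = {
  inputSchema: {
    ticketText: { type: "string", description: "Raw support ticket body" },
  },
  outputSchema: {
    summary: { type: "string", description: "One-paragraph summary" },
    urgent: { type: "boolean", description: "True if the ticket needs escalation" },
  },
  promptTemplate:
    "Summarize the following support ticket and flag urgency:\n\n{{ticketText}}",
  sampleInput: {
    ticketText: "My invoice was charged twice and support has not replied in 3 days.",
  },
};
```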
Workflow creation wizard
Workflows follow the same 4-step pattern (intent, draft, proposal, generate) but produce richer output.
Draft differences
A workflow draft produces 2-10 steps, each assigned one of 8 step types:
| Step type | Purpose |
|---|---|
| recipe | Execute a reusable prompt template |
| llm | Direct LLM call without a saved recipe |
| eval | Evaluate output quality with LLM-as-judge |
| router | Route to different branches based on input |
| aggregator | Combine outputs from multiple steps |
| condition | Branch execution based on a boolean expression |
| loop | Iterate over a collection |
| approval | Pause for human review before continuing |
The LLM determines the execution mode — sequential or DAG (directed acyclic graph) — and provides a reason for the choice. Simple pipelines get sequential mode. Workflows with independent branches that can run in parallel get DAG mode.
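The step types and mode choice can be typed roughly as follows. The interface names and the sample draft are illustrative; the 8 step-type values and the 2-10 step range come from the description above.

```typescript
type StepType =
  | "recipe" | "llm" | "eval" | "router"
  | "aggregator" | "condition" | "loop" | "approval";

type ExecutionMode = "sequential" | "dag";

interface WorkflowStep {
  id: string;
  type: StepType;
  dependsOn?: string[];  // in DAG mode, ids of upstream steps
}

interface WorkflowDraft {
  steps: WorkflowStep[]; // 2-10 steps
  mode: ExecutionMode;   // chosen by the drafting LLM
  modeReason: string;    // the LLM's stated justification
}

// Two independent analysis steps feeding an aggregator: a DAG,
// since the analyses can run in parallel.
const sampleDraft: WorkflowDraft = {
  steps: [
    { id: "sentiment", type: "llm" },
    { id: "urgency", type: "recipe" },
    { id: "combine", type: "aggregator", dependsOn: ["sentiment", "urgency"] },
  ],
  mode: "dag",
  modeReason: "sentiment and urgency are independent and can run in parallel",
};
```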
Multi-agent pattern hints
The drafting LLM has access to 4 multi-agent pattern hints it can apply when appropriate:
- Critic-refiner — one agent generates, another critiques, the first revises
- Specialist-router — a router agent dispatches to domain-specific specialist agents
- Debate-consensus — multiple agents argue positions, a synthesizer extracts consensus
- Plan-execute-verify — a planner breaks down the task, an executor runs each part, a verifier checks results
These patterns produce workflows with 4-8 steps that follow established multi-agent architectures.
“Suggest from Recipes” button
If you already have recipes in your account, the Suggest from Recipes button generates a workflow that chains your existing recipes together. The system examines your recipe library and proposes a workflow that connects them in a logical sequence — no need to describe the workflow from scratch.
Two-phase save
Workflow saving uses a two-phase process:
- Phase 1 — Create any new recipes that the workflow references. Each recipe is saved to Firestore and assigned a real document ID.
- Phase 2 — Create the workflow, mapping placeholder recipe references to the real IDs from Phase 1.
This ensures referential integrity. The workflow never points to recipes that don’t exist.
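The two phases can be sketched as follows, with the Firestore writes abstracted behind injected functions (all names here are illustrative, not JieGou's actual code):

```typescript
interface NewRecipe { placeholderId: string; name: string }
interface WorkflowStepRef { stepId: string; recipeRef: string }

async function saveWorkflow(
  newRecipes: NewRecipe[],
  steps: WorkflowStepRef[],
  createRecipe: (r: NewRecipe) => Promise<string>,          // returns real doc ID
  createWorkflow: (steps: WorkflowStepRef[]) => Promise<string>
): Promise<string> {
  // Phase 1: persist every new recipe, recording placeholder -> real ID.
  const idMap = new Map<string, string>();
  for (const r of newRecipes) {
    idMap.set(r.placeholderId, await createRecipe(r));
  }
  // Phase 2: rewrite placeholder references, then persist the workflow.
  const resolved = steps.map((s) => ({
    ...s,
    recipeRef: idMap.get(s.recipeRef) ?? s.recipeRef, // existing recipes keep their IDs
  }));
  return createWorkflow(resolved);
}
```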
Department context injection
Both recipe and workflow drafting resolve department context as a non-blocking side channel. When you’re working within a department, the system fetches:
- The department pack’s name and description
- Up to 15 available recipe slugs from the department’s starter pack
- Suggested integrations relevant to the department
This context is injected into the LLM prompt, instructing it to prefer reusing tested pack recipes as workflow steps rather than generating new ones from scratch. A Sales workflow draft will reference the existing “Lead Enrichment” and “Competitor Analysis” recipes from the Sales pack instead of creating duplicates.
If department resolution fails (network error, missing data), generation proceeds without it. No hard dependency on context availability.
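A minimal sketch of this non-blocking resolution, assuming a context shape like the bullet list above (the interface and function names are invented):

```typescript
interface DepartmentContext {
  packName: string;
  packDescription: string;
  recipeSlugs: string[];  // capped at 15
  integrations: string[];
}

async function resolveDepartmentContext(
  fetchContext: () => Promise<DepartmentContext>
): Promise<DepartmentContext | null> {
  try {
    const ctx = await fetchContext();
    return { ...ctx, recipeSlugs: ctx.recipeSlugs.slice(0, 15) };
  } catch {
    // Any failure (network, missing data) degrades to no context;
    // generation proceeds without the department hints.
    return null;
  }
}
```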
Key technical decisions
No vector embeddings for template matching. With 132 templates, keyword scoring plus conditional LLM reranking is fast, accurate, and requires zero infrastructure. No embedding model to host, no vector database to maintain, no embedding drift to worry about. If the template library grows to 1,000+, this decision gets revisited.
Structured LLM output via Zod schemas. Every LLM call in the creation pipeline uses a Zod schema to validate the response. Draft responses, clarifying questions, recipe specs, workflow step definitions — all typed and validated. Malformed LLM output is caught immediately rather than producing subtle bugs downstream.
Vague intent detection. Rather than generating a mediocre recipe from a vague description, the system returns a 422 and asks for clarification. This is a deliberate quality gate. A recipe that does “something with data” helps no one.
Conditional LLM reranking. When the keyword engine produces a high-confidence match (score >= 0.8), the LLM reranking step is skipped entirely. This keeps suggestion latency low for obvious matches while reserving LLM intelligence for ambiguous cases.
Graceful degradation at every layer. LLM reranking fails? Fall back to keyword ranking. Department context fails? Generate without it. Template matching finds nothing? Proceed to generation. No single failure point blocks the creation flow.
Availability
Natural-language recipe and workflow creation is available on all plans — Free, Pro, and Enterprise. Explore all features or start your free trial.