reliable
customer-facing agents.
A system prompt can't hold your instructions consistently.
A routed intent graph can't hold a real-world conversation.
Parlant's conversational harness easily holds both.

By far the most elegant conversational AI framework I've come across.
Ship agents that behave well, consistently.
Parlant contextually matches your most relevant rules to each turn of the conversation, maximizing the language models' groundedness and instruction-following consistency.
# Create a hook for refund-related rules and tools
refunds = await agent.create_observation("discussing refunds")

find_order_id = await agent.create_guideline(
    condition="The refund request lacks an order ID",
    action="Find and confirm the order ID before continuing",
    # Allow the use of these tools when this guideline is active
    tools=[find_order_by_id],
    # Ignore lower-priority guidelines until this one is resolved
    priority=10,
    # This guideline can only activate in the scope of refunds
    dependencies=[refunds],
)
Customer message ingestion
Contextual matching
- 3 guidelines matched
- 1 journey matched
- 8 glossary terms matched
Optimized agent output
- Contextually narrowed-down tool selection
- Focused message generation request
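The matching step above can be pictured with a small standalone sketch. This is not Parlant's implementation: conditions in Parlant are natural-language strings, while this toy version substitutes hypothetical keyword sets so that it runs without a model. The `Guideline` dataclass, `match_guidelines`, and the sample rules are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Guideline:
    name: str
    keywords: set[str]  # hypothetical stand-in for a natural-language condition
    action: str
    priority: int = 0
    dependencies: list[str] = field(default_factory=list)


def match_guidelines(
    message: str, guidelines: list[Guideline], active: set[str]
) -> list[Guideline]:
    """Return only the guidelines relevant to this turn.

    `active` holds the names of observations already matched (e.g. "refunds").
    """
    words = set(message.lower().split())
    matched = [
        g
        for g in guidelines
        if g.keywords & words and all(dep in active for dep in g.dependencies)
    ]
    if not matched:
        return []
    # Keep only the highest-priority tier, so lower-priority rules wait.
    top = max(g.priority for g in matched)
    return [g for g in matched if g.priority == top]


guidelines = [
    Guideline(
        "find_order_id",
        {"refund"},
        "Find and confirm the order ID before continuing",
        priority=10,
        dependencies=["refunds"],
    ),
    Guideline("be_concise", {"refund", "order"}, "Keep replies short"),
    Guideline("shipping_eta", {"shipping"}, "Quote the delivery estimate"),
]

message = "I want a refund for my order"
in_refund_scope = [g.name for g in match_guidelines(message, guidelines, {"refunds"})]
outside_scope = [g.name for g in match_guidelines(message, guidelines, set())]
```

With the refunds observation active, only the high-priority `find_order_id` rule reaches the prompt; without it, the dependency is unmet and the general rule applies instead. The unrelated shipping rule never enters the context in either case.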
Building the best conversational AI framework on the planet.
Parlant's design goals optimize for safe and effective customer-facing agents at scale.
Maximum prevention of misaligned behaviors
Misalignment is treated as a core design problem. Constraints and control points are built into how the LLM is used, rather than bolted onto the final output after the fact.
A clean structure for your conversations.
Every conversation session is a timeline of events, not a fragile request-response loop.
Real conversations are not polite single-turn calls. Users add context progressively, tools return results, and the agent needs to keep up with it coherently.
Parlant sessions capture that flow as ordered events: customer messages, status updates, tool activity, frontend payloads, and message events, all with offsets and traces.
- Event timeline
- Messages, status updates, tool results, frontend events, and agent replies live in one ordered session.
- Resumable UI
- Custom frontends resume from the last offset instead of reconstructing state after reconnects.
- Traceable turns
- Completion events keep response data correlated to the logs, matched rules, tools, and turn state that produced it.
Session event stream
- 00 | hjk8173ac | message | customer | "I need to return this order"
- 01 | hjk8173ac | status | agent | message acknowledged
- 02 | hjk8173ac | status | agent | typing
- 03 | hjk8173ac | message | agent | [quick preamble message]
- 04 | hjk8173ac | status | agent | thinking
- 05 | hjk8173ac | tool | system | order lookup result
- 06 | hjk8173ac | message | agent | [streamed response]
- 07 | hjk8173ac | status | agent | ready (idle)
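To make the offset-based resume concrete, here is a minimal in-memory sketch of an ordered event log. It assumes nothing about Parlant's storage or client API; `Event`, `Session`, and `events_since` are hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    offset: int
    kind: str    # "message", "status", or "tool"
    source: str  # "customer", "agent", or "system"
    data: str


class Session:
    """An append-only, ordered event log for one conversation."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, kind: str, source: str, data: str) -> Event:
        # The offset is simply the event's position in the timeline.
        event = Event(len(self._events), kind, source, data)
        self._events.append(event)
        return event

    def events_since(self, min_offset: int) -> list[Event]:
        """Return events at or after `min_offset`, in order."""
        return self._events[max(min_offset, 0):]


session = Session()
session.append("message", "customer", "I need to return this order")
session.append("status", "agent", "typing")
session.append("tool", "system", "order lookup result")
session.append("message", "agent", "[streamed response]")

# A frontend that last saw offset 1 reconnects and asks only for what it missed.
last_seen = 1
missed = session.events_since(last_seen + 1)
```

Because every event carries its position, the client only has to remember one integer to catch up after a reconnect.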
Fits cleanly into your stack.
Use Parlant's REST API and native client SDKs directly from your frontend, or place it behind a BFF that owns auth, tenancy, and product policy.
Direct conversational API
Frontend talks to Parlant as the conversation endpoint.
Conversational microservice
Your BFF owns auth and policy. Parlant owns the conversation.
Powered by research.
Grounded in cutting-edge research on how to focus LLM attention and maximize instruction-following accuracy in real-world conversations.
Read the ARQ paper behind Parlant's accuracy
Why Parlant (1): System prompts fail to deliver
Most builders start the same way. Every rule, every fallback, every brand constraint, packed into one prompt. It works for the first five rules.
By the 10th rule, consistency breaks down silently. You fix one behavior and another regresses. You can't pinpoint why the agent said what it said, or what to change so it stops. The severity of mistakes is unbounded, and the agent can say anything when it's no longer grounded and focused.
What went wrong?
- Curse of Instructions: the more rules you stack, the worse the model follows any of them.
- Attention Dilution: irrelevant context crowds out what actually matters for the current turn.
- Lost in the Middle: rules buried in the middle of a long prompt are barely registered.
- Anchoring bias: the model over-weights its initial subset of instructions.
Customer message
The headphones I ordered aren't working!
System prompt instructions
Stay calm and concise
Ask for an order ID first
Route high-risk cases to a human
Avoid unsupported promises
Never invent exceptions
Clarify vague requests
Use regional constraints
Agent message
Absolutely, I can help you return your headphones. What's the order ID?
Why Parlant (2): Routed graphs fail to deliver
Customer message
I need to replace my headphones, but first change my delivery address to 13 Baker St. Does my warranty cover everything?
Routed graph
Router
- Sub agent: Replacement
- Sub agent: Shipping
- Sub agent: Warranties
Agent message
Yes, your warranty covers it. I have placed the replacement order and will update your delivery address.
The next move is a graph. You classify intents, route to nodes, narrow the context per branch. For linear workflows, it's a real improvement.
But real-world conversations aren't linear. Users mix topics, circle back, ask compound questions, and change their mind. Routing to node X means losing grounding context for topic Y, leading to misaligned, unexpected, and inconsistent agent behavior.
What went wrong?
- Context drift: routing to node X means losing grounding context for topic Y.
- Intent ambiguity: real users use vague language, mix topics, and interweave concerns.
- Coordination tax: different nodes take different approaches across the same conversation.
- Illusion of compliance: high-level constraints look like control, but the model can drift away from them at the moment it actually speaks.
What Parlant does instead
Behavioral rules as contextual guidelines, matched to each moment of the dialogue.
Canned responses at the moments where wording must be strictly exact.
Every decision traced end to end, accounting for every guideline that fired.
Procedures, not branches.
A journey is a state machine you write once. The agent follows it like a human follows an SOP: keeps the goal, adapts the wording, handles interruptions, comes back.
journey = await agent.create_journey(
    title="Wire Transfer",
    conditions=["customer requests a wire transfer"],
)

# Each state names what to do next, not how to route.
t1 = await journey.initial_state.transition_to(
    chat_state="Ask for recipient name and amount",
)

t2 = await t1.target.transition_to(
    condition="received name and amount",
    tool_state=verify_recipient_and_balance,
)
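The idea that a journey keeps its place through interruptions can be sketched in plain Python. This is only an illustration under stated assumptions, not how Parlant runs journeys: `State`, `JourneyRun`, and the string-based `advance_when` predicate are hypothetical, and the predicate stands in for a natural-language transition condition.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class State:
    instruction: str
    # Hypothetical predicate standing in for a natural-language condition.
    advance_when: Optional[Callable[[str], bool]] = None
    next_state: Optional["State"] = None


class JourneyRun:
    """Tracks where one conversation currently is in a journey."""

    def __init__(self, initial: State) -> None:
        self.current = initial

    def on_message(self, message: str) -> str:
        """Advance if the transition condition holds; otherwise stay put."""
        state = self.current
        if state.advance_when and state.next_state and state.advance_when(message):
            self.current = state.next_state
        return self.current.instruction


verify = State("Verify recipient and balance")
ask = State(
    "Ask for recipient name and amount",
    advance_when=lambda m: "$" in m and "to " in m.lower(),
    next_state=verify,
)

run = JourneyRun(ask)
after_digression = run.on_message("By the way, what are your opening hours?")
after_answer = run.on_message("Send $200 to Dana Levi")
```

The off-topic question leaves the journey in its current state, so the goal is still pending once the digression has been handled. The journey advances only when the customer supplies what the step needs.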
Bypass generation when it matters.
At a high-stakes moment, the agent does not improvise. It selects a canned response. If a required field is missing from the current turn, that response is automatically disqualified. The guardrail is structural.
await agent.create_guideline(
    condition="customer requests a refund",
    action="process the refund using their order ID",
    tools=[process_refund],
    composition_mode=p.CompositionMode.STRICT,
    canned_responses=[
        await server.create_canned_response(
            template="Refund processed. Transaction ID: {{transaction_id}}",
        ),
    ],
)
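The structural guardrail described above, where a template is disqualified when one of its fields is missing, can be illustrated with a short sketch. It assumes `{{field}}` placeholders as in the sample template; `required_fields`, `eligible`, `render`, and the transaction ID value are made up for this example and are not Parlant functions or data.

```python
from __future__ import annotations

import re

# Matches placeholders such as {{transaction_id}}.
FIELD = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def required_fields(template: str) -> set[str]:
    """Names of every placeholder the template needs."""
    return set(FIELD.findall(template))


def eligible(templates: list[str], context: dict[str, str]) -> list[str]:
    """Keep only templates whose fields are all present in the turn's context."""
    return [t for t in templates if required_fields(t) <= set(context)]


def render(template: str, context: dict[str, str]) -> str:
    return FIELD.sub(lambda m: context[m.group(1)], template)


templates = [
    "Refund processed. Transaction ID: {{transaction_id}}",
    "I'm looking into your refund now.",
]

# Before the refund tool has run, there is no transaction ID in context.
before_tool = eligible(templates, {})
# After the tool returns, the confirmation becomes selectable.
context = {"transaction_id": "TX-1042"}  # made-up value
after_tool = eligible(templates, context)
confirmation = render(after_tool[0], context)
```

The check is a set comparison rather than an instruction to the model, so a confirmation cannot be sent before the data it reports exists.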
Read the agent's mind.
Every turn comes with a trace. Which guidelines were considered, which fired, which canned response was selected, which tools ran and what they returned.
{
  "turn": 4,
  "matched_guidelines": [
    {"id": "require_order_id", "score": 0.94},
    {"id": "polite_acknowledge", "score": 0.71}
  ],
  "canned_response": "acknowledge_then_ask_order_id",
  "tools_called": ["lookup_order"],
  "latency_ms": 412
}
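A trace shaped like the sample above is easy to inspect with the standard library. The sketch below assumes the field names shown in that sample; whether Parlant's actual trace output uses this exact shape is not confirmed here, and the 0.8 cutoff in `strong_matches` is an arbitrary value picked for illustration.

```python
import json

# The sample trace from above, embedded so the example runs standalone.
trace_json = """
{
  "turn": 4,
  "matched_guidelines": [
    {"id": "require_order_id", "score": 0.94},
    {"id": "polite_acknowledge", "score": 0.71}
  ],
  "canned_response": "acknowledge_then_ask_order_id",
  "tools_called": ["lookup_order"],
  "latency_ms": 412
}
"""

trace = json.loads(trace_json)


def strong_matches(trace: dict, threshold: float = 0.8) -> list[str]:
    """IDs of matched guidelines at or above `threshold`, best first."""
    ranked = sorted(
        trace["matched_guidelines"], key=lambda g: g["score"], reverse=True
    )
    return [g["id"] for g in ranked if g["score"] >= threshold]


strong = strong_matches(trace)
all_ids = strong_matches(trace, threshold=0.0)
```

Filtering by score like this is one way to answer "why did the agent say that" for a given turn without reading the full log.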
What the pros say
From the teams running against real customers, at real volume, in domains where being wrong is expensive.
We tested Parlant extensively. The failure patterns that exist in our production logs, capturing them in Parlant takes minutes.
Vishal Ahuja, ML Tech Lead, JPMorgan Chase
Parlant isn't just a framework. It's a high-level software that solves the conversational modeling problem head-on.
Sarthak Dalabehera, Principal Engineer
Parlant dramatically reduces the need for prompt engineering and complex flow control. Building becomes like modeling.
Diogo Santiago, AI Engineer
We went live with a fully functional agent in one week. I'm impressed by how consistent Parlant is with its responses.
Arpit Parashar, Deputy Manager
Frequently asked
A harness performs automatic context engineering on top of a language model. Context engineering is the discipline of getting the right context, no more, no less, into the prompt at the right time. You can't cram all of your instructions, knowledge, and tools into one prompt regardless of the model or context window size. Models don't listen beyond a certain point. You need the right items in the prompt, and the wrong items out. Parlant is a conversational harness: it assembles only the relevant rules, data, and tools for each turn.
Prompt engineering is about getting a model to behave correctly when it already has the right context: optimizing the template, tuning temperature, tweaking reasoning. Context engineering is about what goes into that template in the first place. A well-engineered prompt with the wrong context still produces wrong results. The two are complementary: prompt engineering handles how, context engineering handles what and when.
LangChain and LangGraph are workflow tools: they manage how data flows between steps. Parlant works on a different level: it manages behavior. Instead of building a graph of nodes and edges, you define individual behavioral rules (guidelines) and the relationships between them. The framework figures out which rules apply at each turn. Changing your agent's behavior is a content change (add a guideline), not a structural change (restructure the graph).
The LLM still handles general conversation naturally. Guidelines define behavioral expectations for specific situations; everything else works as you'd expect from an LLM. If you don't need special handling for a scenario, you don't need to define any rules for it.
Use canned responses with strict composition mode for control of wording. Since each response can reference fields (from tool results, retrievers, or guidelines), if a required field isn't present in the current context, Parlant automatically disqualifies that response from being selected. The guardrails are structural and deterministic, not just prompt-based.
The real rigidity of traditional chatbots comes from tree-based flows, not from controlled wording. Even in strict canned-response mode, the agent still chooses when to use them based on the fluid nature of the interaction, just like a call-center rep does. The flow stays flexible; you get precise wording where it matters.
Parlant is LLM-agnostic: you can use any provider and model. When consistency and reliability matter, some models perform better than others. The officially recommended providers are Emcie, OpenAI (directly or via Azure), and Anthropic (directly or via AWS Bedrock).
Get started
Install. Define your first guideline. Talk to your agent.
$ pip install parlant