Product Release

A faster, more reliable in-product research agent

Ditto's in-product research agent has been rebuilt: streaming replies, required tool calls on retrieval prompts, live progress updates, conversation replay, and a versioned API playbook injected on every turn.

27 February 2026

Agent v2 · Feature
FishDog's AI Research Assistant in action — Sophie working through a request to find a persona named Lauren from Kirklees, with the research-context dock showing personas, groups, and direct questions.
DOCUMENT TYPE: Product Release Note
TOPIC: In-product research agent — rebuild on new microservice runtime
Release: A faster, more reliable in-product research agent, 2026-02-27
Version: Agent v2
Release type: Feature
Breaking change: No

Summary: Ditto's in-product research agent (the chat surface in the web app) has been rebuilt on a new microservice runtime. Replies now stream word by word, tool calls are far more reliable on retrieval prompts, long-running work shows live progress in chat, conversation state can be replayed via idempotency keys, and a versioned Ditto MCP playbook is injected on every turn so the agent reflects the current API surface without redeploys.

What changed:
  • Streaming replies via SSE replace the previous block-render behaviour.
  • tool_choice=required is enforced for retrieval-style prompts so the agent fetches before answering rather than answering from memory.
  • A contextual Ditto MCP playbook (tool_contracts.yaml, workflow_recipes.yaml, error_playbook.yaml) is selected by intent and injected per turn. Updates pick up automatically without restart.
  • Live progress messages appear in chat during long-running operations (recruitment, job polling, synthesis).
  • Conversation replay via Idempotency-Key reconstructs in-flight streams from an audited event log.
  • Failed tool calls now surface verbose error detail rather than generic agent-failure messages.

Architecture (for technical readers):
  • New microservice agent_service (FastAPI/SSE) with persistence in agent_conversations, agent_messages, agent_runs, agent_run_events tables.
  • Tool resolution is per-organisation per-request via a typed tool library (agent_tool_library, agent_tool_credentials), with request-level overrides constrained to the library allowlist.
  • OpenAI Responses API with MCP tool orchestration replaces the previous in-process LLM call site.
  • Worker queue: agent_chat queue, configurable timeout/result-TTL.

Why we built this: The previous chat surface was slow (block-render only), unreliable on tool calls (frequent factual hallucinations), and required a redeploy to update the agent's knowledge of the API. The rebuild fixes all three.

How to use: No action required. The new behaviour is live for all in-product chat users.

Migration impact: None. Conversational surface and API scopes unchanged.

Author: Phillip Gales, FishDog
Platform: FishDog (fish.dog)

Key Takeaways

  • Replies now stream word by word as they are generated, rather than landing as a single block after a pause.
  • Retrieval prompts ("what was the headline finding…") now enforce a tool call before answering, so the agent stops answering from memory.
  • A versioned Ditto MCP playbook (tool contracts, workflow recipes, error playbook) is injected on every turn — the agent always knows the current API surface.
  • Long-running work shows live progress updates in the chat ("recruiting panel… 7 of 12 confirmed"; "polling jobs… 4 of 10 finished").
  • Conversation replay via Idempotency-Key — refresh the page mid-turn and the agent picks up where it left off.

The research agent that lives inside the Ditto web app — the one you talk to from the overview chat — has been rebuilt from the foundations up. The visible changes are speed, reliability, and an agent that actually knows what the API can do. The architecture underneath has changed too; this note covers both, briefly.

What you'll notice

Replies stream. The agent's response now appears word by word as it's generated, rather than landing as a single block after a pause. For long answers — synthesis across studies, multi-step recruitment plans — the difference between waiting eight seconds for a wall of text and watching the answer build in real time is the difference between trusting the agent and reaching for the refresh button.
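
For a sense of what streaming looks like from the consuming side, here is a minimal sketch of reading an SSE token stream. The endpoint path and event framing are assumptions for illustration, not the documented agent API.

```python
# Minimal SSE consumer sketch. The route below is hypothetical; the
# real agent_service endpoint may differ.
import httpx

def stream_reply(conversation_id: str, api_key: str) -> None:
    url = f"https://fish.dog/api/agent/conversations/{conversation_id}/stream"
    headers = {"Authorization": f"Bearer {api_key}", "Accept": "text/event-stream"}
    with httpx.stream("GET", url, headers=headers, timeout=None) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            # SSE frames arrive as "data: <chunk>" lines; print each
            # chunk as it lands rather than waiting for the full reply.
            if line.startswith("data: "):
                print(line[len("data: "):], end="", flush=True)
```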

Tool calls are far more reliable. When you ask the agent for something concrete — "what was the headline finding from the price-sensitivity study last month?" — it now executes the relevant tool call before answering, rather than answering from memory and getting the number wrong. Internally this is tool_choice="required" for retrieval-style prompts, but the customer-visible effect is that the agent stops making things up.
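
In Responses API terms, the behaviour described above looks roughly like the sketch below. The intent heuristic and model name are illustrative stand-ins; only tool_choice="required" itself comes from this release.

```python
# Sketch: force a tool call on retrieval-style prompts so the model
# fetches real data before answering. The heuristic is illustrative only.
from openai import OpenAI

client = OpenAI()

RETRIEVAL_HINTS = ("what was", "what were", "how many", "show me", "find")

def answer(prompt: str, tools: list[dict]) -> str:
    retrieval = prompt.lower().startswith(RETRIEVAL_HINTS)
    response = client.responses.create(
        model="gpt-4.1",  # placeholder model name
        input=prompt,
        tools=tools,
        tool_choice="required" if retrieval else "auto",
    )
    return response.output_text
```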

The agent now reads its own documentation. Every turn, the agent receives a contextual block called the Ditto MCP playbook — a versioned set of tool contracts, workflow recipes, and an error playbook covering the most common failure modes. The result: the agent knows about the new endpoints we shipped last week, knows that state is a 2-letter code rather than a full name, knows the difference between request_id and job_id, and so on. We update the playbook; the agent picks up the change on the next turn without a redeploy.
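
A rough sketch of per-turn playbook injection, under the assumption that each YAML file maps entries to the intents they apply to; the file names come from this release, the selection logic here does not.

```python
# Hypothetical playbook loader: re-read the versioned YAML files on every
# turn and keep only entries tagged with the classified intent. Reading
# from disk per turn is what lets updates land without a redeploy.
from pathlib import Path
import yaml

PLAYBOOK_DIR = Path("playbooks")  # assumed location
PLAYBOOK_FILES = ["tool_contracts.yaml", "workflow_recipes.yaml", "error_playbook.yaml"]

def playbook_context(intent: str) -> str:
    sections = []
    for name in PLAYBOOK_FILES:
        doc = yaml.safe_load((PLAYBOOK_DIR / name).read_text()) or {}
        relevant = {k: v for k, v in doc.items() if intent in v.get("intents", [])}
        if relevant:
            sections.append(f"# {name}\n{yaml.safe_dump(relevant)}")
    return "\n\n".join(sections)
```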

Progress updates during long work. When the agent is doing something that takes more than a moment — recruiting a panel, polling jobs, running synthesis — it now posts short status messages in the chat as it works ("recruiting panel… 7 of 12 confirmed"; "polling jobs… 4 of 10 finished"). No more silent waits.
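
Conceptually, the progress messages are just periodic events emitted from inside the long-running worker. A hedged sketch, with `emit` standing in for whatever the agent service actually uses to append a status event to the conversation:

```python
# Illustrative worker loop: post a short status line after each unit of
# work, mirroring the "7 of 12 confirmed" messages shown in chat.
from typing import Callable

def recruit_panel(targets: list[str],
                  confirm: Callable[[str], bool],
                  emit: Callable[[str], None]) -> None:
    confirmed = 0
    for persona in targets:
        if confirm(persona):
            confirmed += 1
        emit(f"recruiting panel… {confirmed} of {len(targets)} confirmed")
```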

Conversation replay. If you refresh the page mid-turn, the agent picks up exactly where it left off using an Idempotency-Key rather than restarting. SSE replay reconstructs the in-flight stream from the audited event log.
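
A sketch of what replay might look like on the service side, assuming the audited events live in the agent_run_events table mentioned in the next section; the route, helper, and event framing are illustrative.

```python
# Hypothetical replay endpoint: re-emit the audited events for a run so a
# client that refreshed mid-turn reconstructs the in-flight stream.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

def fetch_run_events(idempotency_key: str) -> list[dict]:
    """Stand-in for a DB read of ordered rows from agent_run_events."""
    raise NotImplementedError

@app.get("/agent/replay/{idempotency_key}")
def replay(idempotency_key: str) -> StreamingResponse:
    def event_stream():
        for event in fetch_run_events(idempotency_key):
            yield f"event: {event['type']}\ndata: {event['payload']}\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```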

Under the hood

For the technically curious: this is a new microservice (agent_service) with FastAPI / SSE on the front and a clean separation between the chat orchestration layer and the persistence layer. Conversations, messages, runs, and run events are all audited to Postgres, which is what makes replay possible. Tools and credentials are resolved per-organisation per-request from a typed tool library; request-level overrides are constrained to a subset of the library allowlist so a misbehaving caller can't escalate.
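
The "can't escalate" property falls out of simple set intersection. A minimal sketch, assuming tool identity is a name in the organisation's library allowlist:

```python
# Request-level overrides can narrow the tool set but never extend it
# beyond the organisation's allowlist.
def resolve_tools(org_allowlist: set[str], requested: set[str] | None) -> set[str]:
    if requested is None:
        return org_allowlist
    return org_allowlist & requested

# e.g. resolve_tools({"search", "recruit"}, {"recruit", "delete_org"})
# -> {"recruit"}: the tool outside the allowlist is silently dropped.
```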

We've moved from the old in-process LLM call site to OpenAI's Responses API, with MCP tool orchestration on top. None of those acronyms matter to the user; the visible effect is the speed and reliability story above.
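
For readers who do care about the acronyms, the call shape for the Responses API with a remote MCP server attached looks roughly like this; the server label and URL are placeholders rather than Ditto's real endpoint.

```python
# Sketch of a Responses API call with MCP tool orchestration. Placeholder
# model, label, and URL; not the production configuration.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    input="What was the headline finding from the price-sensitivity study?",
    tools=[{
        "type": "mcp",
        "server_label": "ditto",
        "server_url": "https://example.com/mcp",
        "require_approval": "never",
    }],
)
print(response.output_text)
```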

What didn't change

The conversational surface — where you click to chat, what the agent looks like, the way you address it — is unchanged. The agent's name, its persona, and the URLs you talk to it from are all the same. Existing API key scopes are unchanged. The Slack agent (upgraded last week) shares the same underlying runtime improvements.

What this enables next

Two things, both due in the coming weeks:

  • One-off questions to a single persona or group, without the study scaffolding — landing this Friday.

  • Image and PDF attachments on questions, so the agent can show personas a logo or a one-pager and ask for reactions — also landing this week.

Strictly speaking, neither depends on this rebuild, but both are easier to build well on top of the new runtime.

One operational note

Failed tool calls are now far more verbose in the chat — when something genuinely goes wrong, you'll see the underlying error rather than a generic "agent failure" message. Most of the time this is helpful; occasionally it's noisier than the previous behaviour. We're tuning the line between "useful detail" and "wall of stack trace" over the next couple of releases. Feedback welcome.
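
The tuning in question amounts to a truncation policy. A sketch of one plausible shape, with an arbitrary cap rather than any documented limit:

```python
# Surface the underlying tool error verbatim, but cap its length so the
# chat shows useful detail rather than a full stack trace. 600 chars is
# an illustrative choice, not a documented value.
MAX_ERROR_CHARS = 600

def format_tool_error(tool_name: str, error: Exception) -> str:
    detail = str(error).strip() or error.__class__.__name__
    if len(detail) > MAX_ERROR_CHARS:
        detail = detail[:MAX_ERROR_CHARS] + "… (truncated)"
    return f"Tool `{tool_name}` failed: {detail}"
```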

Full reference for the new agent service is in the API docs.

Frequently Asked Questions

What changed for me as a user?

Three things you'll notice immediately: replies stream word by word rather than landing in a single block after a pause; retrieval prompts now enforce a tool call before answering so the agent stops making numbers up; and long-running work posts live progress updates in the chat as it happens.

What's the Ditto MCP playbook?

A versioned set of tool contracts, workflow recipes, and an error playbook for the Ditto API, injected as context on every agent turn. The result: the agent always knows the current API surface — endpoint shapes, filter formats, common failure modes — without needing to be redeployed when something changes.

Will this break my existing integrations?

No. The conversational surface, the agent's address, and existing API scopes are unchanged. The rebuild is internal — a new microservice runs the agent, but the customer-facing contract is the same.

What about the Slack agent?

The Slack agent (upgraded to v1 on 17th February) shares the same underlying runtime improvements. The two agents — in-product chat and Slack — now run on the same agent service.

What's coming next?

Direct one-off questions to a single persona or group without the study scaffolding, and image and PDF attachments on questions. Both land later this week.
