Stop Writing Prompt Loops: Why AI Agents Need a Platform, Not a Framework
Frameworks give you a loop around model.generate(). Production agents need ten more layers. Why agents are a platform problem, not a framework one.
Field notes from building Matrix — a multi-tenant agent platform that runs over chat, real-time voice, and autonomous tasks. Voice wire protocols, agent memory, RAG, MCP, cognitive architectures, access control, and the war stories behind each.
35 articles and counting.
Frameworks give you a loop around model.generate(). Production agents need ten more layers. Why agents are a platform problem, not a framework one.
Tenancy is the floor every query stands on, not a late addition. How Matrix isolates tenants from line one — JWT, TenantContext, orgId filtering, and BYOK encryption.
In Matrix there are no hardcoded personas. An agent is a configured record you create through a form or one POST — no code, no redeploy, persisted in Neo4j.
The ten layers of agent infrastructure every serious agent app ends up building — and how Matrix composes them into one runtime.
An honest, line-item accounting of what 'just build it' actually costs in engineering-months — and the maintenance tail nobody budgets for.
Matrix has no per-domain @Node classes. Organizations, agents, sessions, leads, memories — all of it is EntityType / EntityNode rows in one Neo4j graph.
A hands-on walkthrough: stand up a real-time voice tutor that knows a syllabus, remembers each student, and renders math on a blackboard — no agent code.
Build an AI recruiter agent that voice-calls a candidate list with a per-call objective — persona, campaign, CSV audience, paced dispatch, and dispositions.
Because everything is an entity, Matrix doubles as a lightweight AI CRM your agents write to directly — per-org custom fields and optional Zoho sync included.
A real-world topology for running an agent platform on Google Cloud — three Cloud Run services, a self-hosted Neo4j VM, and the gotchas that bite in production.
An agent in Matrix is assembled, not coded — mix four primitives and the runtime composes one coherent tool surface and prompt, with no per-feature wiring.
Every Matrix agent can opt into seven INTERNAL-transport built-in tools — web_search, fetch_url, bash, file_*, grep — each scoped to a per-(org, agent) sandbox. No keys, no quota.
Matrix speaks the Anthropic Agent Skills format, so any GitHub repo that publishes a SKILL.md is one POST away from a reusable skill on your agents.
Matrix sits on both sides of the Model Context Protocol — it exposes its own MCP server and its agents consume external ones, under one JWT auth model.
A first-person war story of bridging Exotel telephony to the consumer Gemini Live API — the wire-protocol traps, token bugs, and Cloud Run quirks, in the order they bit.
The exact wire knobs that make real-time voice AI work: barge-in, snake_case audio frames, the Constrained endpoint, and the message shapes that bite.
A round-trip tour of the audio plumbing behind an instant-feeling voice agent: 48kHz mic capture, 16kHz PCM chunks, 24kHz playback, and barge-in.
The same AI voice agent, reachable two ways: a server-held telephony bridge or a browser-direct WebSocket. Here's the architecture trade-off.
Run outbound voice campaigns with a real agent — memory, per-call objective, tools — instead of a dumb dialer. Here's the full pipeline and the one hazard to know.
The feature users feel: an agent that knows who you are whether you call or type. One memory pool per contact, joined across voice and chat.
How Matrix gives agents persistent vector memory by storing 768-d embeddings on the graph node and querying Neo4j's native HNSW index — no separate vector DB.
The CoALA memory taxonomy, mapped to concrete Matrix mechanisms — what's working memory, what survives a session, and what only a human can edit.
Naive agent memory rots into near-duplicates and stale facts. Matrix reconciles every semantic write — cosine-gated UPDATE/ADD with an optional LLM arbiter.
Facts about a contact change over time. Matrix gives every memory temporal validity so the latest fact wins recall while the old one is kept for history.
Standing up a RAG pipeline shouldn't be a project. In Matrix you create a corpus, drag in files, and your agent can search them — no plumbing.
Most RAG stacks make you hand-wire a retriever into the agent loop. In Matrix, attaching a Knowledge corpus is the only step — the search tool appears on its own.
Plain vector RAG retrieves isolated chunks and misses facts in the relationships between them. Matrix builds an entity graph from one flag.
Per-principal row, field, and type security on the entity model — enforced centrally so it covers dashboards, the API, find_records, and every agent tool.
An agent is a security principal, not a god-mode service account. Matrix scopes what an agent can read to who's driving it — via Agent.mode.
How Matrix AND-composes a principal's grant with every ancestor's up the reportsTo chain, making delegated admin safe by construction.
Most agents are a while-loop around a model. Matrix runs a real cognitive architecture — one CoALA decision cycle powering interactive and autonomous agents alike.
Interactive and autonomous agents look like two systems. In Matrix they're one decision cycle that swaps only the DECIDE seam by Agent.mode.
A greedy agent that runs its first thought is brittle. Matrix's autonomous planner proposes several candidate actions, scores them, then commits.
Letting an agent edit its own behaviour is powerful and dangerous. Matrix lets it propose changes to its procedural memory — a human approves before anything changes.
Today we shipped adaptive compute and closed the last four gaps between our agent runtime and the CoALA paper — dynamic search breadth and depth, reasoning-scored memory, and learning as a choice.