I recently published a post about AMCP.
After publishing it, I realized there was a more basic point I should have explained first: how agents actually handle context, and why they tend to become unstable as sessions grow longer.
Many people assume that if a model has a one-million-token context window, it can effectively remember everything. But when you use coding agents for long sessions, that is not really what happens.
In practice, as a session grows longer, you start to see familiar problems:
- the agent suddenly misses important context
- it asks again about configuration it seemed to know a moment ago
- it loses the working rhythm of the task
- eventually, you open a new session just to recover
Why does this happen?
1. The amount being sent keeps growing
Even if a conversation feels like a simple back-and-forth chat, what gets sent to the model is much larger than it appears.
For example:
- when you ask question 1, only question 1 is sent
- when you ask question 2, question 1 + answer 1 + question 2 are sent
- when you ask question 3, question 1 + answer 1 + question 2 + answer 2 + question 3 are sent
As the conversation continues, the system does not send only the latest question. It keeps sending the accumulated conversation bundle along with it.
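The accumulation above can be sketched in a few lines. This is an illustrative model, not any specific provider's API; the message shape is an assumption:

```python
# Illustrative sketch: each turn resends the entire accumulated history,
# so the payload grows with every question even though the user only
# typed one new line.
messages = []

def ask(question, answer):
    """Simulate one turn: the full history plus the new question is sent."""
    messages.append({"role": "user", "content": question})
    payload = list(messages)                      # everything sent this turn
    messages.append({"role": "assistant", "content": answer})
    return len(payload)

sent_per_turn = [ask(f"question {i}", f"answer {i}") for i in (1, 2, 3)]
print(sent_per_turn)  # → [1, 3, 5]: the payload grows every turn
```

Notice that the third question already carries five messages, and nothing in the loop ever removes anything.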
That means even a large context window can shrink quickly in practical use. One long error log in the middle of a session can consume a significant part of the available space.
The important point is this: having a large context window is not the same thing as reliably preserving the information that matters.
2. Coding-agent context usually has three layers
In practice, the context of a coding agent can be thought of as three layers.
Layer 1. The model provider's system prompt
This is the system-level instruction inserted by the model provider, such as Anthropic, OpenAI, or Google.
Users usually do not control this layer directly.
Layer 2. Persistent context the user considers important
For example:
- CLAUDE.md, AGENTS.md, .cursorrules
- team rules
- project structure notes
- preferred working style
This is the layer that tells the agent how the work should be done.
Layer 3. The growing pile of questions and answers in the current session
This is the layer that grows the fastest.
- questions
- answers
- error logs
- revision requests
- follow-up questions
- tool execution results
These keep accumulating throughout the session.
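The three layers can be pictured as one concatenated prompt. This is a mental model, not how any particular agent literally assembles its context; the contents are placeholders:

```python
# Hypothetical sketch of the three context layers joined into one prompt.
system_prompt = ["provider system prompt"]        # Layer 1: fixed, not user-controlled
persistent = ["CLAUDE.md", "team rules"]          # Layer 2: roughly stable per project
session = []                                      # Layer 3: grows the fastest

for turn in range(3):
    session += [f"question {turn}", f"answer {turn}", f"tool result {turn}"]

full_context = system_prompt + persistent + session
print(len(system_prompt), len(persistent), len(session))  # → 1 2 9
```

After only three turns, Layer 3 already dominates the context, and it is the only layer that keeps growing without bound.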
3. The parts users can actually control are layers 2 and 3
This is where the real problem begins.
What users can meaningfully control is:
- Layer 2: persistent context
- Layer 3: the current session's conversation
But models do not treat long input as if every part mattered equally. In long contexts, the beginning and the end often exert stronger influence, while information in the middle becomes easier to miss.
This is often described as "Lost in the Middle."
That is why you see patterns like these:
- the model responds well to the most recent question
- but important instructions sitting in the middle, such as CLAUDE.md, AGENTS.md, or .cursorrules, are easier to ignore
- even a simple question like "What was that environment variable name again?" still carries the cost of the full layered context
In other words, even a tiny question rides on top of the full context bill.
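To make the "full context bill" concrete, here is a back-of-the-envelope calculation. All the token counts are made-up estimates, and word count stands in for a real tokenizer:

```python
# Rough illustration: even a one-line question pays for every accumulated
# token. The history and persistent sizes below are assumed, not measured.
history_tokens = 80_000        # accumulated session so far (assumed)
persistent_tokens = 4_000      # CLAUDE.md, rules, etc. (assumed)
question = "What was that environment variable name again?"
question_tokens = len(question.split())   # crude stand-in for a tokenizer

total_sent = history_tokens + persistent_tokens + question_tokens
print(total_sent)  # ~84k tokens sent for a one-line question
```

The question itself contributes a vanishing fraction of what actually gets transmitted.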
As this continues:
- the context fills up quickly
- older information gets truncated or weakened through compression
- some information survives only in degraded form
- the model starts to drift
- eventually, you open a new session
4. But when you open a new session, continuity breaks
Opening a new session usually solves one problem:
the session becomes lighter again.
But it also creates a larger one:
important context from the previous session disappears.
For example:
- what architectural direction had already been chosen
- what constraints had been agreed on
- what approaches the user dislikes
- what failed once already
- what project rules matter most
These things should survive into the next session. But in most systems today, they vanish with the session or remain trapped inside the internal memory of one product.
5. What agents need is not just conversation, but memory
This is the distinction that matters.
Agents do not simply need longer chat logs. What they really need is a memory layer that extracts and preserves only what remains important over time.
For example:
- user preferences
- project conventions
- important decisions
- failed attempts and their fixes
- task context that should survive into the next session
These are not the same thing as the raw transcript. They are the durable, high-value information distilled from it.
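One way to see the difference between a transcript and a memory layer is to sketch what a distilled record might look like. The field names here are my illustration, not part of any specification:

```python
# A hypothetical shape for a distilled memory record, as opposed to raw
# transcript lines. Fields and kinds are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    kind: str                 # e.g. "preference", "decision", "failed_attempt"
    content: str              # the distilled fact itself
    source_session: str       # where it was learned
    tags: list = field(default_factory=list)

records = [
    MemoryRecord("preference", "user dislikes ORM-generated migrations", "s-41"),
    MemoryRecord("decision", "API layer stays synchronous for now", "s-41"),
]

# Unlike a transcript, records can be queried by what they mean:
decisions = [r for r in records if r.kind == "decision"]
print(len(decisions))  # → 1
```

Two short records here replace what might be thousands of transcript tokens, and they survive the session that produced them.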
So the goal is not:
- to resend the full conversation forever
Instead, the goal is:
- to structure and save what matters
- and bring it back when the next session or another agent needs it
6. Nexus exists to provide that memory layer
Nexus was built to solve exactly this problem.
The core idea is simple:
- structure and save important conversational context
- recall it even after a session changes
- carry it across clients, such as from Claude Code to Codex
- preserve continuity even when the runtime changes
In other words, Nexus turns context trapped inside a chat window into portable, user-controlled memory.
7. And that is why a standard is needed
At that point, a natural question appears:
"If memory is still trapped inside one product, doesn't that recreate the same problem?"
Yes. That is why memory storage and movement also need an open standard.
That is what AMCP, the Agent Memory Continuity Protocol, is trying to provide.
AMCP defines a minimal common contract for agent memory:
- what shape a record should have
- how recall should work
- what scopes mean
- how retention and deletion should be expressed
- how export and import should work across systems
With a standard like that, memory no longer has to remain locked inside one model provider or one product. It becomes possible to move it across implementations and runtimes.
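As a thought experiment, a minimal contract along those lines could be expressed as an interface. To be clear, the method names and record shape below are my own sketch, not AMCP's actual specification:

```python
# A sketch of what a minimal memory contract could look like. These method
# names and fields are illustrative assumptions, not AMCP's actual spec.
from typing import Protocol


class MemoryStore(Protocol):
    def save(self, record: dict, scope: str) -> str: ...          # returns a record id
    def recall(self, query: str, scope: str) -> list: ...
    def delete(self, record_id: str) -> None: ...
    def export_all(self, scope: str) -> list: ...                 # for cross-system moves


class InMemoryStore:
    """Toy implementation, only to show the contract in use."""

    def __init__(self):
        self._records = {}
        self._next = 0

    def save(self, record, scope):
        self._next += 1
        rid = f"{scope}:{self._next}"
        self._records[rid] = record
        return rid

    def recall(self, query, scope):
        return [r for rid, r in self._records.items()
                if rid.startswith(scope) and query in r.get("content", "")]

    def delete(self, record_id):
        self._records.pop(record_id, None)

    def export_all(self, scope):
        return [r for rid, r in self._records.items() if rid.startswith(scope)]


store = InMemoryStore()
store.save({"content": "prefers pytest over unittest"}, scope="project")
found = store.recall("pytest", scope="project")
print(len(found))  # → 1
```

The point of a shared contract is that two different implementations of this interface, in two different products, could exchange records via `export_all` without either side knowing the other's internals.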
8. In short
Agents do not become unstable simply because the model is unintelligent.
The problem is that:
- longer sessions keep increasing the amount of context being sent
- important instructions get buried inside long contexts
- and opening a new session breaks continuity
So what is needed is not just a larger conversation window. What is needed is a memory layer that separates, preserves, and restores what matters.
That is why we are building Nexus and proposing AMCP.
If you want to continue from here: