Nunchi AI's three-axis business in the age of latent reasoning

In one line

The reasoning paradigm changes every other week. The memory layer operates orthogonally above it. The stronger latent reasoning becomes, the more valuable verifiable external memory becomes.

The question Reddit brought back

A question keeps appearing on r/MachineLearning and other ML subreddits.

"Why do LLMs reason in natural language? Internally they are vectors, so wouldn't it be faster to reason directly in latent space?"

The intuition is attractive. And the answer already exists.

Meta's Coconut (December 2024), Chain of Continuous Thought. Instead of decoding the LLM's final hidden state into a token, Coconut feeds that hidden state directly back as the next input embedding. A single continuous thought can encode multiple possibilities at once and explore them like BFS. On logical reasoning tasks such as ProntoQA, it outperforms CoT.
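
A minimal sketch of that loop, using a hypothetical toy model rather than Meta's implementation (the ToyLM class and coconut_generate function are illustrative assumptions), shows the core move: the last hidden state is appended back into the input sequence as a continuous thought instead of being decoded into a token.

    import torch
    import torch.nn as nn

    class ToyLM(nn.Module):
        """Stand-in for an LLM: embeds tokens, runs a small transformer encoder,
        and projects the last position back to the vocabulary."""
        def __init__(self, vocab_size=100, d_model=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.lm_head = nn.Linear(d_model, vocab_size)

        def forward(self, inputs_embeds):
            hidden = self.encoder(inputs_embeds)            # (batch, seq, d_model)
            return hidden, self.lm_head(hidden[:, -1])      # hidden states + next-token logits

    def coconut_generate(model, prompt_ids, num_latent_thoughts=4):
        embeds = model.embed(prompt_ids)                    # start from the text prompt
        for _ in range(num_latent_thoughts):
            hidden, _ = model(embeds)
            latent_thought = hidden[:, -1:, :]              # last hidden state, never decoded to a token
            embeds = torch.cat([embeds, latent_thought], dim=1)  # fed back as the next input embedding
        _, logits = model(embeds)                           # only now return to token space
        return logits.argmax(dim=-1)

    prompt_ids = torch.randint(0, 100, (1, 8))              # dummy prompt of 8 token ids
    print(coconut_generate(ToyLM(), prompt_ids))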

So why is it not mainstream?

  1. Training breaks. Latent thoughts have no ground truth. They cannot be directly supervised, and even attaching RL such as GRPO makes training unstable.
  2. It gets worse at math. Logic improves, but math performance still lags CoT.
  3. Interpretability disappears. Humans cannot read the reasoning trace. Safety, debugging, and auditability all break.
  4. The ecosystem is built on text. Evaluation, debugging, RAG, and caching all assume tokens.

What happens next

Latent reasoning can absolutely enter the mainstream, at least partially, and especially for unattended agents. That is exactly why the memory layer becomes more necessary.

The core thesis is simple.

The reasoning paradigm and the memory layer are orthogonal.

The more opaque reasoning becomes, the more valuable it is to know what the model knew.

Even if humans cannot see what a model is thinking in latent space, they can still trace which atoms the model recalled. That asymmetry is the real position of this business.

Here is how it applies across Nunchi AI's three-axis business structure.

01. Nunchi Plan: people and AI see the same memory

B2C/SMB. Norfolk + Nexus. An alternative to Notion and Obsidian.

The PKM market is split into two camps, Notion and Obsidian, but both work the same way: people write and people read. Even when AI enters the workflow, trust does not accumulate if the reasoning trace cannot be seen.

Latent reasoning makes this problem worse. The tokens that explain "why the AI answered this way" disappear. Users see only the answer, and they cannot tell whether it came from their own notes or was hallucinated.

Norfolk's position becomes clear here.

We may not know what the AI was thinking, but we can trace which facts it relied on.

The pipeline from Citation Graph to provenance grading to automatic re-atomization becomes mandatory, not optional, in the age of latent reasoning. This is the position Notion and Obsidian cannot occupy. They are text stores, not traceable memory systems.
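
As a rough data-shaped illustration of what "which facts it relied on" could mean (the Atom fields and provenance_grade function are hypothetical, not Norfolk's actual schema or the real Citation Graph format), the traceability claim comes down to every recalled fact carrying its source and a grade:

    from dataclasses import dataclass, field

    @dataclass
    class Atom:
        atom_id: str
        text: str
        source: str                                         # the note or document it was atomized from
        cited_by: list[str] = field(default_factory=list)   # atoms that cite this one (Citation Graph edge)

    def provenance_grade(atom: Atom, user_sources: set[str]) -> str:
        # Grade by traceability: user-authored beats imported beats unknown origin.
        if atom.source in user_sources:
            return "user-authored"
        if atom.source:
            return "imported"
        return "unverified"

    note = Atom("atom-0042", "Launch review moved to Friday.", source="notes/2025-06-03.md")
    print(provenance_grade(note, user_sources={"notes/2025-06-03.md"}))   # -> user-authored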

02. Circuit Workflow: accountability for unattended agents

B2B. Conductor + Axon + Engram. Tech companies with 2-50 people.

This is the most direct line. The market for coding agents that generate PRs unattended is still open. Cursor and Devin are attended assistants. An agent that runs while the company sleeps is the position Circuit is aiming for.

The problem is that when an unattended agent writes code through latent reasoning, there is no answer afterward to the question, "Why did it write this?" Code review sees only the artifact. The decision trace disappears.

Circuit's answer is this.

Even if the reasoning trace is lost, the decision trace remains.

Conductor's branch selection, Engram's log, and the PRD atoms Axon recalled through AMCP all accumulate alongside the PR. Even if humans cannot see inside the model's head, an audit trail remains: "This PR came from yesterday's PRD decision #atom-3417 and two earlier meeting notes."
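
A minimal sketch of what one such audit trail entry could look like. The field names, the PR id, the branch name, and the extra atom ids are hypothetical illustrations, not Circuit's actual log format; only atom-3417 is carried over from the example above.

    from dataclasses import dataclass

    @dataclass
    class DecisionTraceEntry:
        pr_id: str
        branch_selected: str        # the branch Conductor chose
        recalled_atoms: list[str]   # PRD and meeting-note atoms pulled in through AMCP
        action: str                 # what the log records the agent as having done

    entry = DecisionTraceEntry(
        pr_id="PR-214",
        branch_selected="feature/retry-backoff",
        recalled_atoms=["atom-3417", "atom-2980", "atom-2991"],
        action="implemented retry backoff per PRD decision atom-3417",
    )
    print(entry)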

That is the first gate unattended agents have to pass in the enterprise market. Whether the company wants SOC 2 or ISO, the ability to explain what evidence an AI decision relied on becomes mandatory. Text CoT pretends to satisfy this requirement, but as latent reasoning grows, even that pretense stops working.

Circuit does not depend on the reasoning trace from the beginning. It depends on the trace of decisions and sources. That position gets stronger as the market moves toward latent reasoning.

03. MaaS: the domain where reasoning should be hidden

Platform. Synapsis Engine API. Character chat -> robots and buddy bots.

This line is the opposite. In character chat, buddy bots, and robots, the reasoning trace should stay hidden. Users do not want an answer to "why did you say that?" They want a natural response.

That makes this market the best fit for latent reasoning. Whether the model runs BFS inside its own head or follows some latent path, users only need to see the result.

But in this domain, the product value itself is memory. The character has to remember what the user said yesterday. The buddy bot has to keep a promise from a month ago. No matter how the model reasons, what the character or robot remembers has to be injected from outside.

This is the role of the Synapsis Engine API.

A single interface for injecting consistent memory, regardless of the model's reasoning paradigm.

Whether the model is CoT, Coconut-style, on-device Gemma, or GPT, the same atom structure provides the same memory. Model companies compete on reasoning. They do not compete on memory. That is the opening.
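
A hedged sketch of that interface idea in code. The MemoryStore protocol, the recall method, and the build_prompt helper are illustrative assumptions, not the Synapsis Engine API; the point it shows is that the injected atoms are assembled above whatever model consumes them.

    from typing import Protocol

    class MemoryStore(Protocol):
        def recall(self, character_id: str, query: str) -> list[str]: ...

    class StaticMemoryStore:
        """Toy store: a fixed set of atoms per character, returned for every query."""
        def __init__(self, atoms: dict[str, list[str]]):
            self.atoms = atoms

        def recall(self, character_id: str, query: str) -> list[str]:
            return self.atoms.get(character_id, [])

    def build_prompt(store: MemoryStore, character_id: str, user_message: str) -> str:
        # The same injected atoms work whether the downstream model is CoT, Coconut-style,
        # on-device Gemma, or GPT: memory lives outside the model, not inside it.
        facts = "\n".join(f"- {m}" for m in store.recall(character_id, user_message))
        return f"Known facts:\n{facts}\n\nUser: {user_message}"

    store = StaticMemoryStore({"buddy-01": ["Promised a park walk on the 14th.",
                                            "User dislikes early-morning calls."]})
    print(build_prompt(store, "buddy-01", "What did we plan for next week?"))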

Character chat is the first market. The real market is robots and buddy bots. Whether a robot OEM uses its own model or an external one, it is rational for memory to attach to an external standard, just as OAuth standardized authentication.

Why AMCP has to be the common standard across the lines

This is where the reason AMCP has to be the common standard across all three lines comes into focus.

Whatever the reasoning paradigm (CoT, latent, or hybrid), atoms are exposed through the same interface above it. Why AMCP is Apache 2.0 OSS, why the internal products speak the same protocol, and why that protocol has to spread as an external standard all converge on one thesis.

The way models think changes every other week.

The way companies, people, and characters remember has to be standardized above it.

Closing

The Reddit question, "Why don't LLMs reason in latent space?", already has an answer: latent reasoning is already arriving, partially. And it is not a threat to this business. It is an accelerator.

As models become smarter, reasoning becomes more opaque. The place that compensates for that opacity is the memory layer.

Synapsis -> Axon -> Engram. Structure -> transfer -> trace. That is why these three stages remain valuable regardless of the reasoning paradigm.

Land with Nunchi Plan, expand with Circuit Workflow.