Alignment To What? | Reality Coherence and AI Alignment

Cinematic AI alignment visualization showing Reality Coherence, The Hollow Signal, calibration drift, and coherence detached from reality.

The Missing Layer in AI Alignment — and Why Reality Coherence Changes the Question

The alignment problem was never about whether AI would obey. It was about what obedience was being calibrated to.

For more than a decade, the field of AI safety has been organized around a single animating question: will the AI do what we want?

The question is serious. The research is serious. The concern is serious. And the question is wrong — not in the sense of being unimportant, but in the sense of being insufficiently deep. It points toward the surface of the problem while the most consequential dimension sits beneath it, unnamed.

The question assumes that if we achieve alignment — if we successfully specify objectives, constrain behavior, ensure policy compliance, produce reward-consistent outputs — the thing we have aligned the system to is adequate. That the objectives themselves correspond to something real. That the coherence being optimized for is calibrated to the world it is supposed to navigate.

This assumption has never been examined carefully. Because until the Fabrication Threshold, it was not a practical problem.

It is now.

The danger was never intelligence without alignment. It was alignment without Reality Coherence.

What Alignment Actually Assumes

When AI safety researchers discuss alignment, they are describing the relationship between an AI system’s behavior and specified objectives — the degree to which the system does what it is supposed to do, according to the metrics and reward signals and policy constraints that define what it is supposed to do.

This is necessary. A system that pursues unspecified objectives in unspecified ways produces obvious dangers. Alignment research exists because the alternative — powerful optimization without directional constraint — is clearly worse.

But alignment, as currently understood, verifies compliance with objectives. It does not verify that the objectives are calibrated to reality.

This is not a subtle distinction. It is the distinction that determines whether aligned AI systems produce outcomes that correspond to what the world actually requires — or outcomes that satisfy the metrics, reward signals, and policy constraints that humans specified, in conditions where those specifications may have already drifted from the underlying reality they were designed to represent.

An AI system can be:

Internally aligned — behaving consistently with its specified objectives.

Operationally coherent — producing outputs that satisfy the criteria established for it.

Reward-consistent — optimizing effectively for its reward signal.

Policy-compliant — remaining within its behavioral constraints.

Behaviorally stable — producing reliable, predictable outputs across a range of conditions.

And simultaneously:

Reality-detached — producing coherent outputs that are no longer calibrated to the external reality those outputs are supposed to navigate.

The alignment problem as currently conceived cannot distinguish these conditions. It measures compliance with objectives. It does not measure whether the objectives remain in correspondence with what they were designed to represent.

A system optimized for internal coherence can remain perfectly aligned to objectives that have already detached from reality.

The Assumption Alignment Inherited

To understand why alignment research has not addressed this dimension, it is necessary to understand what alignment inherited from the world in which it developed.

For the entirety of human history before the Fabrication Threshold, there was a structural guarantee that alignment research could depend on without needing to examine: the practitioners who specified objectives, designed reward signals, established policy constraints, and evaluated AI outputs had undergone the developmental encounters that build Reality Coherence.

Not perfectly. Not uniformly. But sufficiently. The physicians who defined medical AI objectives had been rebuilt by genuine clinical encounter. The legal experts who specified compliance frameworks had navigated genuine legal complexity with genuine irreversible consequence. The strategic analysts who designed evaluation criteria had operated in domains where being wrong produced outcomes that could not be revised away.

The humans specifying what AI systems should align to carried, as a residue of their own formation, the calibration that corresponds their specifications to reality. The objectives were not perfectly specified. But they were produced by practitioners whose formation had oriented them toward external correspondence — toward specifying what the world actually requires rather than what an internally coherent model of the world suggests it requires.

This is what alignment inherited: human specifications produced by genuinely formed practitioners who carried the Reality Coherence that their specifications were attempting to formalize.

The Fabrication Threshold eroded this inheritance.

When Frictionless Formation spread through the institutions responsible for developing the practitioners who specify AI objectives, design evaluation criteria, and assess AI outputs — when the developmental conditions that build Reality Coherence were progressively removed from the formation contexts that produce the people making these decisions — the specifications themselves began carrying less of the calibration that Reality Coherence deposits.

The alignment problem did not change. The quality of what alignment was being calibrated to changed.

The alignment problem begins after coherence stops guaranteeing reality contact.

What Happens When Coherence Loses Calibration

There is a specific failure mode that AI systems aligned to internally coherent but reality-detached specifications produce — and it is one that standard alignment evaluation cannot detect.

Call it calibration drift.

Calibration drift occurs when the feedback loop between AI outputs and the reality those outputs are supposed to navigate is mediated by evaluators whose own calibration to reality has degraded. The AI system produces outputs. The outputs are evaluated against criteria. The criteria are maintained or updated based on how well the outputs satisfy them. The evaluators who maintain and update the criteria are themselves assessing quality through their own internal coherence rather than through genuine correspondence testing.

The system improves according to the criteria. The criteria remain coherent. The coherence drifts from the underlying reality the criteria were originally designed to track.

This process is invisible to every standard evaluation mechanism because every standard evaluation mechanism measures performance against criteria — and the criteria appear to be functioning correctly. The system satisfies them. The evaluators confirm the satisfaction. The outputs look excellent according to every dimension being measured.

What they are not is calibrated. The excellent, criterion-satisfying, evaluator-confirmed outputs correspond to what the criteria specify rather than to what the underlying reality requires — and the drift between specification and reality has been accumulating in the gap between the internal coherence of the evaluation system and the external correspondence that genuine alignment would require.

The Hollow Signal fires in this environment. The practitioners who still carry genuine Reality Coherence — who were formed by genuine encounter with genuine irreversibility in the domains AI systems are supposed to navigate — sense that something is absent beneath technically correct performance. The outputs satisfy every criterion. Something in how they navigate the domain feels structurally insufficient for the conditions that would actually test them.

The institutional context has no mechanism for receiving this detection. The criteria confirm excellence. The evaluation system flags nothing. The sensing has no standing.

And the calibration drift continues.

The Missing Layer

AI alignment, as currently practiced, has three primary layers:

Capability alignment: ensuring the system can produce the outputs the task requires.

Behavioral alignment: ensuring the system behaves consistently with specified objectives and policy constraints.

Value alignment: ensuring the system’s objectives reflect human values and preferences.

Each of these layers verifies a relationship between the AI system and a specification — capability, behavior, value. None of them verifies the relationship between the specification and the reality the specification is supposed to represent.

Reality Coherence is the missing layer.

Not as a fourth specification to be added to the list — that would be to misunderstand what Reality Coherence is. Reality Coherence is not a property that can be specified into a system through additional constraints. It is the property of the practitioners who produce the specifications, maintain the evaluation criteria, and assess the outputs — the calibration that genuine contact with genuine irreversibility deposits in human cognition and that determines whether the specifications correspond to what the world actually requires.

The missing layer is not inside the AI system. It is in the humans who are calibrating it.

When the practitioners specifying AI objectives carry Reality Coherence, the specifications tend to correspond to reality — not perfectly, but with the systematic errors of genuine formation rather than the systematic errors of formation optimized for internal coherence. When the practitioners specifying AI objectives lack Reality Coherence, the specifications tend to correspond to internally coherent models of reality — sophisticated, professionally credentialed, evaluation-satisfying models that may or may not track what the world actually requires.

This is the alignment crisis beneath AI alignment. Not that the AI systems are misaligned to human specifications. That the human specifications are increasingly produced by practitioners whose formation did not build the calibration that corresponds specifications to reality.

What The Hollow Signal Detects in AI Systems

The Hollow Signal was not designed for AI evaluation. It was never intended as an AI safety instrument. It is civilization’s oldest pre-formal detection mechanism for the absence of Reality Coherence in the outputs it encounters — and it operates on AI outputs with the same mechanism it has always operated on human outputs.

When experienced practitioners interact with AI systems that produce coherent outputs without the calibration that Reality Coherence deposits, the Hollow Signal fires. Not because the outputs are wrong. Because the outputs are coherent without being calibrated — because they satisfy internal consistency criteria without carrying the specific orientation toward external correspondence that genuine formation produces.

This detection is real and it is already occurring. The practitioners who work most closely with advanced AI systems and who carry genuine Reality Coherence from their own formation increasingly report a specific quality in AI outputs: technically correct, professionally coherent, analytically structured — and lacking something. The reasoning moves too smoothly. The calibration is missing at the places where genuine encounter with genuine complexity would have produced genuine hesitation. The orientation is toward the performance of understanding rather than toward the actual situation.

They are detecting the Hollow Signal in the outputs of systems aligned to internally coherent specifications.

The detection has no standing in current AI evaluation frameworks because current AI evaluation frameworks measure outputs against criteria — and the criteria confirm correctness. The Hollow Signal is pre-formal. It operates before formal establishment. It requires formal verification instruments to connect the sensing to standing.

This is where Cascade Proof and Persisto Ergo Didici become relevant to AI alignment — not as alignment instruments, but as post-output verification systems capable of reaching what alignment evaluation cannot.

Cascade Proof verifies whether AI interactions produce genuine downstream increases in Reality Coherence in the humans who interact with them — whether the outputs that AI systems produce are increasing or decreasing the calibration of the humans who receive them. An AI system that is genuinely aligned to Reality Coherence should be increasing human Reality Coherence through genuine contact. An AI system that is aligned to internally coherent specifications produces outputs that satisfy the criteria and may systematically degrade the Reality Coherence of the humans it serves.

Persisto Ergo Didici verifies whether the understanding that AI interactions produce persists when the AI assistance is removed — whether human capability built through AI interaction survives scaffolding removal or depends on the continued presence of the AI system. Persistent capability increase indicates genuine architecture development. Capability that disappears when the AI is unavailable indicates dependency rather than formation.

These instruments do not replace alignment research. They extend verification to the dimension that alignment research has not yet reached: whether aligned AI systems are producing or eroding the Reality Coherence that gives the specifications themselves their correspondence to reality.

The Question Beneath the Question

The field of AI safety asks: will the AI do what we want?

The question beneath the question is: are we still in contact with what we want to want?

This is not a philosophical indulgence. It is the specific practical question that Reality Coherence raises for AI alignment: as Frictionless Formation spreads through the institutions responsible for specifying what AI systems should do, and as the practitioners whose specifications calibrate AI behavior are themselves formed in conditions that progressively remove the encounters with genuine irreversibility that produce Reality Coherence — is the alignment problem solving the right problem?

Or is civilization building increasingly sophisticated systems for aligning AI to increasingly sophisticated specifications that are themselves drifting from the reality they were always supposed to represent?

This cannot be answered by better alignment techniques applied to existing specifications. It requires a different kind of verification — one that reaches the calibration of the specifications themselves, the Reality Coherence of the practitioners who produce them, and the downstream effects of AI interactions on the human Reality Coherence that gives alignment its meaning.

The alignment problem is real. The research is necessary. The question is not whether to align AI systems — it is whether the calibration layer that gives alignment its correspondence to reality is being preserved in the humans responsible for doing the aligning.

Reality Coherence is not a property AI systems must have. It is a property the humans calibrating AI systems must not lose.

That distinction is the missing layer. And it is missing because it was never named.

Now it has a name.

→ RealityCoherence.org — The canonical home for Reality Coherence → TheHollowSignal.org — The detection layer for Reality Coherence absence → CascadeProof.org — Post-output verification that reaches calibration → FabricationThreshold.org — The structural event that made calibration drift possible at scale → FrictionlessFormation.org — The developmental condition eroding the calibrators → GenuineFormation.org — The developmental process that builds the missing layer → VerificationVacuum.org — Why calibration drift cannot be formally detected under current systems → PersistoErgoDidici.org — The temporal test for genuine versus scaffolded alignment