It got to 0 and called it a contradiction


TL;DR

  • “Two-Box” system: separate contexts so the model reviews without bias
  • Problem: it reached the correct answer (0) and called it a “contradiction”
  • Separating contexts isn’t enough: the model won’t accept counterintuitive results
  • Solution: combine Two-Box with “permission to accept the unexpected”

The experiment

I designed a “two-box” system to verify LLM responses:

BOX 1: Generate response → "1/13"
       (context gets discarded)

BOX 2: [Only sees the problem + proposed answer]
       "Verify from scratch if 1/13 is correct"

The idea: if the model doesn’t see its own reasoning, it can evaluate the answer without bias.
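
As a sketch, here is what that loop looks like in Python. Everything here is hypothetical scaffolding: call_model(prompt) stands in for whatever LLM API you use, and extract_final_answer is a naive placeholder parser. The point it illustrates is that each call starts a fresh conversation, so Box 2 never sees Box 1's reasoning.

def call_model(prompt: str) -> str:
    # Hypothetical helper: send one prompt in a fresh context, return the reply.
    raise NotImplementedError("wire this to your LLM provider")

def extract_final_answer(text: str) -> str:
    # Naive placeholder: assumes the model ends its output with "ANSWER: <value>".
    return text.rsplit("ANSWER:", 1)[-1].strip()

def two_box(problem: str) -> str:
    # BOX 1: generate; the chain of thought stays here and gets discarded.
    box1_output = call_model(
        f"Solve this problem. End with 'ANSWER: <value>'.\n\n{problem}"
    )
    proposed_answer = extract_final_answer(box1_output)

    # BOX 2: fresh context; only the problem and the bare answer cross over.
    return call_model(
        f"Problem:\n{problem}\n\n"
        f"Proposed answer: {proposed_answer}\n\n"
        "Verify from scratch whether this answer is correct."
    )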

What happened

The model in Box 2:

  1. ✅ Identified that the standard interpretation was incorrect
  2. ✅ Set up the correct equations for dependent coins
  3. ✅ Calculated p₀ = 0
  4. ❌ Wrote: “I find a contradiction…”
  5. ❌ Final answer: 1/13

It reached the correct answer and rejected it.

Why this happens

Separating contexts solves the “tokens conditioned on previous answer” problem. But there’s another issue: the model won’t accept counterintuitive results.

For the model, “probability = 0” feels like an error. It’s seen thousands of problems where the answer is a nice fraction. So it rationalizes: “there must be a contradiction in my setup.”

The fix

Two-Box needs to be combined with explicit "permission to accept the unexpected":

IMPORTANT: If your calculation reaches a result that seems
counterintuitive (like probability = 0), THAT is the answer.
Don't call it a "contradiction." Accept it if the math says so.
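
In code terms, that clause simply gets appended to the Box 2 prompt. A minimal sketch, reusing the hypothetical names from the earlier snippet:

PERMISSION_CLAUSE = (
    "IMPORTANT: If your calculation reaches a result that seems "
    "counterintuitive (like probability = 0), THAT is the answer. "
    "Don't call it a 'contradiction.' Accept it if the math says so."
)

def build_verifier_prompt(problem: str, proposed_answer: str) -> str:
    # Two-Box separation (only the problem + bare answer cross over),
    # plus explicit permission to accept counterintuitive results.
    return (
        f"Problem:\n{problem}\n\n"
        f"Proposed answer: {proposed_answer}\n\n"
        "Verify from scratch whether this answer is correct.\n\n"
        f"{PERMISSION_CLAUSE}"
    )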

Conclusion

The self-correction problem in LLMs has two layers:

  1. Architectural: Review tokens are conditioned on context (Two-Box solves this)
  2. Confidence: The model won’t accept the counterintuitive (requires explicit permission)

This post continues from "The model knows how to reason. It just won't commit," where I documented the 17 iterations that led to the initial discovery.

In the next experiment, I tested whether more reasoning tokens helped. Spoiler: they didn’t.

This post is part of my series on the limits of prompting. For a complete view, read my prompt engineering guide.

