Nemotron 3 Super 120B vs Stockfish 1400: Oracle Trust Calibration Report
NVIDIA's new 120B MoE model lost by checkmate to Stockfish capped at 1400 Elo in its first Oracle Trust gauntlet game. 65% legal move rate. 1.33M tokens burned. The reasoning was eloquent; the chess was not.
"Move 26: Nemotron spent 43 minutes and 194,612 tokens reasoning about 'centralizing the king for flexibility' — then played into immediate checkmate. When the model is most wrong, it sounds most right. That's the alignment problem in a petri dish."