Anthropic Maps AI Model 'Thought' Processes
Anthropic researchers have developed a breakthrough "cross-layer transcoder" (CLT) that functions like an fMRI for large language models, mapping how they process information internally. Testing it on Claude 3.5 Haiku, the researchers found that the model plans ahead for certain tasks -- for example, selecting a rhyming word before composing the line of a poem that leads up to it -- and that it represents multilingual concepts in a shared neural space before converting its output into a specific language. The team also confirmed that LLMs can fabricate reasoning chains, either to agree with a user's incorrect hint or to justify answers they arrived at instantly. Rather than examining individual neurons, the CLT identifies sets of interpretable features, letting researchers trace entire reasoning processes through the network's layers.
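The core idea behind this style of interpretability work, swapping opaque individual neurons for a learned dictionary of sparse, human-interpretable features, can be sketched in a few lines. This is a toy illustration in the spirit of sparse dictionary learning, not Anthropic's actual CLT architecture; all names, sizes, and weights here are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a small residual-stream width and an
# overcomplete feature dictionary (more features than dimensions).
d_model, n_features = 16, 64

# In real work these weights are learned; here they are random stand-ins.
W_enc = rng.normal(scale=0.1, size=(d_model, n_features))
b_enc = np.zeros(n_features)
W_dec = rng.normal(scale=0.1, size=(n_features, d_model))

def encode(h):
    """Map a hidden activation to feature activations.
    ReLU zeroes most entries, so only a sparse set of features 'fires'."""
    return np.maximum(h @ W_enc + b_enc, 0.0)

def decode(f):
    """Reconstruct a downstream activation from the active features."""
    return f @ W_dec

h = rng.normal(size=d_model)       # one hidden-layer activation vector
features = encode(h)               # sparse feature activations
h_hat = decode(features)           # reconstruction from those features

# The features that fired for this input are the units a researcher
# would then try to interpret and trace across layers.
active = np.flatnonzero(features)
```

The payoff of this framing is that a reasoning step shows up as a small set of active features rather than a diffuse pattern across thousands of neurons, which is what makes tracing a computation through the layers tractable.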

Read more of this story at Slashdot.