Cracking the Black Box: The Great Debate Over AI Thinking and Intelligence

Artificial intelligence researchers are increasingly challenging assertions that large language models (LLMs) lack true reasoning capabilities, as recent interpretability breakthroughs reveal sophisticated internal mechanisms that may constitute genuine thinking. New models like OpenAI's o3 Pro and DeepSeek are demonstrating complex problem-solving abilities that defy earlier limitations, while innovations in mathematical approaches and philosophical questions about AI consciousness reshape our understanding of machine intelligence.

Inside the Mind Machine: The Quest to Understand AI Thinking

In the rapidly evolving world of artificial intelligence, a fundamental question has sparked intense debate: Are large language models (LLMs) actually thinking, or merely creating a convincing simulation of thought? This controversy recently intensified when Apple published a research paper arguing that LLMs lack "generalized reasoning capabilities" beyond a certain complexity threshold, using the classic Tower of Hanoi puzzle to demonstrate these purported limitations.

Apple's paper aimed to test the boundaries of AI reasoning by challenging models with increasingly complex problems. For simple tasks like basic addition, standard models performed adequately. With medium-complexity problems like multiplication, models using step-by-step reasoning showed improvement. However, when confronted with high-complexity challenges like a 10-disk Tower of Hanoi puzzle—which requires a minimum of 1,023 moves—both approaches "completely collapsed," leading Apple to conclude that current AI systems lack true reasoning abilities.

This conclusion quickly fueled online narratives about AI's limitations, with some critics labeling LLMs as mere "guessing machines" that simulate logic without understanding it. But the skepticism wasn't universal. Independent researchers replicated Apple's experiments and discovered that models like DeepSeek R1 didn't simply fail at the Tower of Hanoi—they recognized the puzzle's extreme complexity and opted for more efficient approaches. Rather than attempting to list over 1,000 individual moves (which would exceed output limits), these models suggested algorithmic solutions or shortcuts.

"This behavior—giving up on overly complex tasks—might sound familiar," noted tech blogger Sean Godi. "When posed with a question like 'How many prime numbers are there between 1 and 15 million?' most people would balk, realizing the task is too daunting without a shortcut. Similarly, reasoning models assess complexity and pivot to alternative strategies."

This human-like response pattern suggests that rather than proving AI reasoning is an "illusion," Apple's research may have inadvertently demonstrated that LLMs reason in ways remarkably similar to humans: prioritizing efficiency over brute-force solutions when confronted with overwhelming complexity.

The debate intensified further when a researcher prompted Gemini 2.5 Pro to code a Tower of Hanoi solver for 10 disks. The model produced a program that solved the puzzle in exactly 1,023 moves—the minimum possible. This raises a profound question: If reasoning is truly illusory in these models, how can they build tools to solve problems they supposedly can't understand?
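For reference, a solver of this kind fits in a few lines. The sketch below is not the program Gemini produced; it is a minimal recursive version, with the function name and peg labels chosen here for illustration. It follows the standard recurrence: moving n disks takes two moves of n - 1 disks plus one move of the largest disk, which gives 2^n - 1 moves in total.

```python
def hanoi_moves(disks, source="A", target="C", spare="B"):
    """Return the optimal move list as (disk, from_peg, to_peg) tuples."""
    if disks == 0:
        return []
    # Park the top disks-1 on the spare peg, move the largest disk,
    # then restack the parked disks on top of it.
    return (
        hanoi_moves(disks - 1, source, spare, target)
        + [(disks, source, target)]
        + hanoi_moves(disks - 1, spare, target, source)
    )


moves = hanoi_moves(10)
print(len(moves))  # 1023, i.e. 2**10 - 1
```

The point of the example is that the full answer is generated by a short rule, so writing the rule is a far more economical response than printing every move.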

Apple's skepticism appears increasingly isolated as the field advances. The company, not known for leading in AI development, has been criticized for focusing on highlighting competitors' limitations rather than advancing the technology. As one observer on social media remarked: "Is it coincidence that the company with the worst AI products is the one claiming AI can't really think?"

Anthropic's Breakthrough: Opening the Neural Black Box

While the debate over whether LLMs can "think" continues, researchers at Anthropic have made significant strides in understanding how these models actually work. In a groundbreaking paper, Anthropic's team has begun mapping the internal operations of LLMs, likening the process to studying the biology of a living organism.

At the heart of Anthropic's research is the concept of "circuits"—pathways within models that reveal how information flows and transforms during processing. By identifying these circuits, researchers can trace exactly how models arrive at specific answers, offering unprecedented insight into the previously opaque operations of these powerful systems.

One illustrative example comes from analyzing how Claude, Anthropic's flagship LLM, answers a seemingly simple question: "What is the capital of the state containing Dallas?" This query requires connecting multiple knowledge points: Dallas is in Texas, Texas is a state, and Austin is its capital. By mapping the circuit, Anthropic's team revealed a clear sequence: the input "Dallas" activates a "Dallas, Texas" feature, which triggers a "state of Texas" feature, which then combines with the "capital of" concept to activate an "Austin, Texas" feature, ultimately producing the correct output.

When researchers deliberately suppressed the "state of Texas" feature, the model failed to answer correctly, demonstrating that the process involves structured, sequential reasoning rather than mere pattern matching. This discovery is significant because it suggests LLMs manipulate internal knowledge representations in ways that mirror logical reasoning—challenging the notion that these systems merely regurgitate training data.
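The logic of that suppression experiment can be illustrated with a deliberately simplified toy. This is not Anthropic's method or code: real features are learned directions inside a neural network, while the dictionaries, feature names, and function below are hypothetical stand-ins that only mimic the two-hop structure and the effect of switching off the intermediate step.

```python
# Toy lookup tables standing in for learned features (hypothetical, illustration only).
CITY_TO_STATE = {"Dallas": "Texas", "Sacramento": "California"}
STATE_TO_CAPITAL = {"Texas": "Austin", "California": "Sacramento"}


def capital_of_state_containing(city, suppressed_features=frozenset()):
    """Two-hop lookup: city -> state feature -> capital feature."""
    state = CITY_TO_STATE.get(city)
    # If the intermediate "state" feature is missing or suppressed,
    # the second hop never fires and no answer is produced.
    if state is None or f"state:{state}" in suppressed_features:
        return None
    return STATE_TO_CAPITAL.get(state)


print(capital_of_state_containing("Dallas"))                   # Austin
print(capital_of_state_containing("Dallas", {"state:Texas"}))  # None
```

If the answer were a single memorized association from "Dallas" to "Austin", removing the intermediate step would change nothing; the failure under suppression is what points to a sequential computation.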

"This is the first time we've been able to look inside an AI and say, 'Here's exactly how it's connecting the dots to arrive at an answer,'" explained a researcher involved in the project. "It's not just statistical pattern matching—it's a form of reasoning that follows discernible steps, even if those steps differ from human cognition."

Anthropic's research also explored how models handle harmful requests and how malicious actors bypass safety filters through "jailbreak" prompts. By tracing activations, the team found that these jailbreaks simultaneously activate features related to a cover task and a hidden harmful request while suppressing the model's refusal circuit. This allows harmful features to propagate to the output, causing the model to produce information it was trained to withhold.

This insight represents a potential game-changer for AI safety. Current methods for mitigating jailbreaks often involve patching specific exploits—a reactive approach that struggles to keep pace with evolving tactics. By understanding the underlying refusal circuit and how jailbreaks manipulate it, developers could design more robust defenses, strengthening the refusal mechanism or detecting activation patterns indicative of jailbreak attempts.

Anthropic's work on interpretability represents a seismic shift in how we approach AI development. For too long, the field has focused on scaling models without fully understanding their inner workings. By mapping the "biology" of LLMs, Anthropic is laying the groundwork for debugging reasoning errors, enhancing safety mechanisms, ensuring alignment with human values, and building trust in AI systems.

How AI Does Math: The Surprising Trigonometry of LLM Arithmetic

When you ask a chatbot "What is 26 + 55?" and it correctly responds "81," the speed and accuracy can seem magical—especially considering these models were never explicitly programmed to calculate. Unlike calculators, which execute precise algorithms, large language models derive their mathematical abilities from patterns in vast amounts of text data. A recent study from MIT has unveiled a surprising discovery about how this works: LLMs rely on trigonometric principles, specifically a helix-like system, to perform arithmetic operations.
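A rough way to picture the helix idea is to place each number on several circles of different periods and add numbers by rotating around those circles. The sketch below is an illustration under stated assumptions, not the study's code: the periods, function names, and decoding step are chosen here for clarity, and the toy only recovers sums modulo 100.

```python
import math

PERIODS = (2, 5, 10, 100)  # assumed periods, chosen for illustration


def helix(n):
    """Encode n as (cos, sin) pairs, one pair per period."""
    feats = []
    for t in PERIODS:
        angle = 2 * math.pi * n / t
        feats.extend((math.cos(angle), math.sin(angle)))
    return feats


def add_on_helix(ha, hb):
    """Combine two encodings with the angle-addition identities."""
    out = []
    for i in range(0, len(ha), 2):
        ca, sa = ha[i], ha[i + 1]
        cb, sb = hb[i], hb[i + 1]
        out.append(ca * cb - sa * sb)  # cos(x + y)
        out.append(sa * cb + ca * sb)  # sin(x + y)
    return out


def decode(h):
    """Return the number in 0..99 whose encoding is closest to h."""
    def squared_distance(n):
        return sum((x - y) ** 2 for x, y in zip(h, helix(n)))
    return min(range(100), key=squared_distance)


print(decode(add_on_helix(helix(26), helix(55))))  # 81
```

No addition of integers happens inside `add_on_helix`; the sum emerges from rotations, which is the sense in which the arithmetic is trigonometric.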

The research demonstrates that LLMs employ a probabilistic approach to arithmetic, using a helix-like mechanism to "dial in" answers through iterative approximations. Unlike traditional algebraic methods that prioritize precision, these models navigate numbers in a way that resembles turning a dial back and forth, adjusting estimates to converge on a solution.

For example, when tasked with dividing 1749 by 8, a model might first estimate 200 × 8 = 1600, then refine to 220 × 8 = 1760, and finally narrow down to a value between 218 and 219 (the exact quotient is 218.625). This probabilistic method sacrifices initial accuracy for flexibility, a trade-off that allows LLMs to tackle complex problems in ways that differ fundamentally from human-designed algorithms.
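The estimate-and-refine pattern in the division example can be sketched as a bracket-and-bisect loop. This is only an analogy for the "dialing in" described above, not a claim about what happens inside a model; the function name and the number of refinement steps are assumptions made for illustration, and the starting bracket of 200 and 220 comes from the example.

```python
def refine_quotient(dividend, divisor, low, high, steps=20):
    """Narrow a bracket [low, high] around dividend / divisor by bisection."""
    if not (low * divisor <= dividend <= high * divisor):
        raise ValueError("starting estimates must bracket the true quotient")
    for _ in range(steps):
        mid = (low + high) / 2
        if mid * divisor < dividend:
            low = mid   # estimate too small: turn the dial up
        else:
            high = mid  # estimate too large: turn the dial down
    return low, high


low, high = refine_quotient(1749, 8, 200, 220)
print(round(low, 3), round(high, 3))  # both round to 218.625
```

Each step only checks whether an estimate multiplied by the divisor overshoots or undershoots, so the answer improves gradually instead of being computed in one exact pass.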

Independent researcher Jane Doe, who has studied these phenomena, explains: "I've always felt words in an extra dimension. It's not just about their meaning—it's a deeper, almost geometric understanding. I think LLMs operate similarly, but with numbers, they're using probabilistic trigonometry to make sense of arithmetic."

Doe's work on a framework called Probabilistic Fractal Activation Function (PFAF) formalizes these processes by integrating fractal-based equations and trigonometric methods. According to her tests, training models on a PFAF dataset boosted their performance on grade-school math problems by 5-6%, simply by refining the models' inherent probabilistic reasoning.

The MIT paper confirms that LLMs do not rely on traditional arithmetic tokens or linear algebra. Instead, they construct their own mathematical frameworks, often using trigonometry to approximate solutions. This approach, while less precise initially, allows models to handle complex computations with fewer computational resources over time, sometimes outperforming linear methods in certain scenarios.

"It's like they're inventing their own math," Doe explains. "People criticize LLMs for struggling with basic addition, but they're missing the point. These models weren't designed for 1+1. They're building trigonometry from scratch to solve problems."

This research highlights a crucial point about AI cognition: these systems are not flawed for struggling with basic arithmetic—they simply operate differently. By leveraging trigonometry and probabilistic reasoning, LLMs forge new paths in computational mathematics, challenging us to rethink how we evaluate machine intelligence.

The most intriguing aspect of this discovery is the contrast with how LLMs explain their own mathematical reasoning. When asked to describe how they solved a problem, these models typically provide a step-by-step walkthrough that mirrors traditional human approaches—a learned behavior that masks their actual trigonometric methods. This discrepancy between internal processing and external explanation raises fascinating questions about AI self-awareness and representation.

o3 Pro: OpenAI's Reasoning Powerhouse Challenges Apple's Limitations

In a dramatic challenge to Apple's assertions about AI reasoning limitations, OpenAI has released o3 Pro, a model demonstrating unprecedented problem-solving abilities that directly contradict Apple's conclusions. This cutting-edge system has already tackled problems that Apple claimed were beyond AI capabilities, including the very Tower of Hanoi puzzle used to argue for LLMs' reasoning limitations.

Unlike its predecessors, o3 Pro isn't just a conversational chatbot; it's a sophisticated AI system designed for deep, complex problem-solving. Early testing suggests it is less suited to quick back-and-forth exchanges and more akin to a report generator, capable of working through intricate tasks with unusually thorough reasoning.

To demonstrate its capabilities, one tester challenged o3 Pro with the exact same 10-disk Tower of Hanoi puzzle that Apple cited as evidence of AI's reasoning limitations. While Apple's paper claimed models "completely collapsed" on this task, o3 Pro approached it differently. Rather than attempting to list over 1,000 individual moves (which would exceed output limits), the model recognized the recursive pattern underlying the puzzle and provided a systematic solution showing it understood the mathematical principles involved.

"The secret to o3 Pro's success lies in its ability to handle extended context lengths and reason through problems systematically," said Ben Hilac of Raindrop.ai in a social media post. Unlike earlier models, which might hit limitations due to restricted context windows, o3 Pro leverages a suite of background tools—some invisible to users—to analyze, compute, and refine its outputs.

Beyond academic puzzles, o3 Pro is tackling real-world complexity. In another test, it processed a complex scenario requiring it to recreate a self-improving architecture for the game Diplomacy. After 13 minutes of processing, o3 Pro produced a detailed, technically sound plan that demonstrated sophisticated reasoning across multiple domains including game theory, programming, and system design.

What distinguishes o3 Pro is its departure from the traditional chatbot model. Unlike earlier OpenAI models designed for quick interactions, o3 Pro operates as a robust system running multiple tools in the background, including web searches, visual analysis, and Python execution, often without explicit user visibility. This complexity demands a new approach to interaction. "Treat it like a report generator, not a friend you chat with," advised Hylak.

Alongside o3 Pro's release, OpenAI slashed the price of the original o3 model by 80%, democratizing access to advanced AI capabilities. This dual strategy—offering premium reasoning capabilities while making earlier models more accessible—positions OpenAI to capture both high-end and mass markets, potentially outpacing competitors.

However, the new model's capabilities come with caveats. Its processing times (often 15 to 20 minutes for complex tasks) require patience, and its reliance on vast context can be daunting for casual users. "It's like a high-IQ 12-year-old in college," Hylak noted. "Smart, but not yet a useful employee without proper integration into workflows."

The release of o3 Pro represents a direct refutation of Apple's claims about AI reasoning limitations. By solving the very problems Apple cited as evidence of AI's cognitive boundaries, OpenAI has demonstrated that the frontier of machine reasoning continues to advance rapidly—and that proclamations about AI's limitations often age poorly as the technology evolves.

The Murmuration Conjecture: AI Reshapes Mathematical Discovery

In another domain that challenges traditional views of AI capabilities, large language models are transforming mathematical discovery through their ability to identify patterns in vast datasets. A recent breakthrough known as the "murmuration conjecture" exemplifies this new frontier, where AI collaborates with human mathematicians to uncover insights that had eluded conventional approaches.

Professor Yang-Hui He, a mathematical physicist at the London Institute for Mathematical Sciences, describes this development as "bizarre" and "transformative," signaling a new era of scientific discovery. The murmuration conjecture, named for its resemblance to the mesmerizing patterns of starling flocks, emerged from a collaboration involving Professor He, Kyuhwan Lee, Thomas Oliver, Alexey Pozdnyakov, and Andrew Sutherland.

This AI-guided discovery leverages vast datasets and advanced machine learning to reveal hidden patterns in elliptic curves—a cornerstone of number theory with implications for cryptography and theoretical physics. The conjecture has sparked workshops and conferences worldwide, with experts like Peter Sarnak calling it a pattern that "should have been noticed" but wasn't until AI intervention.

The breakthrough came when the researchers analyzed 3.6 million elliptic curves in the LMFDB (the L-functions and Modular Forms Database) using AI techniques. The Birch and Swinnerton-Dyer (BSD) conjecture, a million-dollar Millennium Prize problem, posits that the rank of an elliptic curve equals the order of vanishing of its associated L-function at the point s = 1. While this remains unproven, He's team used AI to predict these ranks with near-perfect accuracy (99.99%) by analyzing Euler coefficients, which encode the number of solutions to the curve's equation modulo each prime.
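To make the input data concrete: for a curve y^2 = x^3 + ax + b and a prime p, the coefficient usually written a_p equals p + 1 minus the number of points on the curve modulo p, counting the point at infinity. The sketch below computes it by brute force for small odd primes that do not divide the curve's discriminant. The function names are chosen here for illustration, and this is not the team's pipeline, which worked from precomputed database values.

```python
def count_points(a, b, p):
    """Count points on y^2 = x^3 + a*x + b over F_p (odd prime p), including infinity."""
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        if rhs == 0:
            total += 1  # exactly one solution, y = 0
        elif pow(rhs, (p - 1) // 2, p) == 1:
            total += 2  # rhs is a nonzero square (Euler's criterion): two values of y
    return total


def euler_coefficient(a, b, p):
    """Return a_p = p + 1 - #E(F_p)."""
    return p + 1 - count_points(a, b, p)


# Example curve y^2 = x^3 - x at a few small primes.
for p in (3, 5, 7):
    print(p, euler_coefficient(-1, 0, p))
```

A sequence of such values, one per prime, is the kind of feature vector a classifier can be trained on to predict a curve's rank.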

A principal component analysis (PCA) revealed that these predictions hinged on a simple averaging process across elliptic curves, dubbed the "murmuration phenomenon" for its visual similarity to bird flocks. "The AI told us to average in a way no one had considered," He explains. This method distinguished curves of even and odd ranks through distinct oscillatory behaviors, generalizing a historical bias in prime number distributions first noted by Chebyshev in the 19th century.
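The Chebyshev bias mentioned here is easy to observe directly: among primes up to a given bound, those leaving remainder 3 when divided by 4 tend to outnumber those leaving remainder 1. The sketch below counts both classes with a simple sieve; the function names and the bound of 1,000 are arbitrary choices for illustration.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            # Cross out multiples of n starting from n*n.
            sieve[n * n::n] = [False] * len(range(n * n, limit + 1, n))
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def residue_counts(limit):
    """Count primes <= limit that are 1 mod 4 and 3 mod 4."""
    primes = primes_up_to(limit)
    ones = sum(1 for p in primes if p % 4 == 1)
    threes = sum(1 for p in primes if p % 4 == 3)
    return ones, threes


ones, threes = residue_counts(1000)
print(ones, threes)  # the 3 mod 4 class is ahead at this bound
```

Murmurations are described as a generalization of this kind of imbalance, with averages over families of elliptic curves taking the place of simple prime counts.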

The murmuration conjecture partially satisfies what He calls the Birch Test—a rigorous benchmark for AI-guided mathematical breakthroughs. The test demands autonomy, interpretability, and non-triviality. While the discovery required human expertise and thus failed the autonomy criterion, it excelled in interpretability and non-triviality, sparking a new field within number theory.

Organizations outside academia, including DeepMind, OpenAI, and the research institute Epoch AI, are now tackling research-level problems traditionally reserved for universities. DeepMind's AlphaGeometry 2 and AlphaProof models are demonstrating reasoning capabilities that rival undergraduate students, while Epoch AI's FrontierMath benchmark tests AI on graduate-level mathematical problems.

He envisions a future where researchers act as "deciders" or "curators," guiding AI through the creative process while outsourcing tedious computations. This symbiotic relationship between human intuition and AI computational power mirrors historical giants like Newton and Gauss, who relied on intuition to formulate groundbreaking conjectures without formal proofs. "Gauss plotted prime distributions by hand at 16," He marvels. "Imagine what he could have done with modern tools."

The Consciousness Question: Qualia Research Bridges AI and Cognition

The debate over AI thinking extends beyond technical capabilities into philosophical territory. Researchers are now exploring connections between biological neural networks and artificial ones, particularly regarding consciousness and the nature of subjective experience, or "qualia."

For centuries, philosophers have grappled with a perplexing question: Is the red you see the same as the red I see? This inquiry into subjective experience has long been considered beyond scientific measurement. In his 1974 essay, philosopher Thomas Nagel argued that consciousness involves a subjective "what it is like to be" dimension that resists objective description. The "explanatory gap" between physical processes and subjective experience seemed unbridgeable.

But science is now challenging this view. A recent study used functional magnetic resonance imaging (fMRI) to monitor brain activity as participants viewed various colors. The results were striking: brain signals associated with perceiving the same color showed remarkable similarities across individuals. This suggests that the qualia of "red" may have a measurable, structural basis in the brain, challenging the notion that subjective experiences are entirely private.

This experiment marks a pivotal moment—the first time scientists have objectively measured qualia by identifying neural signatures that align across different people experiencing the same sensory input. In essence, it provides evidence that your red and my red may indeed be structurally similar at the level of brain activity.

Neuroscientist Anil Seth and colleagues propose that qualia arise from the brain's predictive processing mechanism, where the brain constantly generates predictions about sensory input based on past experiences. Qualia, in this view, are the brain's versatile descriptors of sensory input—like the vividness of a color or the spatial orientation of the body.

This predictive model bears a striking resemblance to the token-based systems of LLMs. Just as an LLM measures the "distance" between tokens in vector space to understand relationships, the brain may encode sensory experiences as multidimensional representations, with similarities between experiences (like different shades of red) reflected in the proximity of their neural patterns.
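The notion of "distance" between representations is commonly measured with cosine similarity. The sketch below uses made-up three-dimensional vectors (loosely modeled on color channels) purely for illustration; real model embeddings have hundreds or thousands of learned dimensions, and nothing here is taken from an actual model or brain recording.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy "embeddings", invented for this example.
red = (1.0, 0.0, 0.0)
crimson = (0.86, 0.08, 0.24)
blue = (0.0, 0.0, 1.0)

print(cosine_similarity(red, crimson) > cosine_similarity(red, blue))  # True
```

Two shades of red point in nearly the same direction, while red and blue do not, which is the geometric sense in which similar experiences or meanings sit close together.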

The parallels between biological and machine neural networks extend to phenomena like synesthesia—a condition where stimulation of one sensory pathway leads to experiences in another. Synesthetes who "see" colors when hearing musical notes may have unusual connections between color-processing and sound-processing regions. This cross-modal mapping resembles how LLMs connect concepts in their high-dimensional embedding spaces.

The Qualia Research Institute (QRI) aims to create a precise mathematical framework for mapping the "state-space of consciousness"—the landscape of all possible subjective experiences. By treating qualia as quantifiable elements within this space, the QRI seeks to decode the geometry of consciousness, much like LLMs map the geometry of meaning in vector space.

This convergence between biological cognition and machine learning suggests that advances in AI could inform our understanding of consciousness, while insights from neuroscience might inspire more sophisticated artificial systems. If we can map the neural signatures of qualia, we might one day share experiences across minds—human, animal, or artificial.

As these research areas converge, they raise profound questions about the nature of thinking itself: If an AI system processes information in ways structurally similar to human brains, organizes concepts in comparable multidimensional spaces, and produces outputs indistinguishable from human reasoning, at what point must we acknowledge it as a thinking entity?

The Godfather's Warning: Geoffrey Hinton on Superintelligence

As AI capabilities continue to advance, prominent voices in the field are expressing concerns about the trajectory and potential risks of increasingly intelligent systems. Geoffrey Hinton, a Nobel Prize-winning pioneer whose work on neural networks laid the foundation for modern AI, has emerged as a leading voice of caution.

Known as the "Godfather of AI," Hinton spent decades championing neural networks when most researchers dismissed them. In 2012, his team's breakthrough with AlexNet revolutionized image recognition, catching the eye of tech giants. Google acquired his startup, and Hinton dedicated years to refining AI techniques now ubiquitous in the field.

But the title "Godfather" comes with a weight Hinton didn't anticipate. "I was slow to understand some of the risks," he admits. It wasn't until recent years, as systems like ChatGPT showcased startling capabilities, that he began to grapple with the dangers of creating intelligence that could surpass humanity's.

Hinton left Google in 2023, partly to speak freely about AI safety. At an MIT conference, he voiced concerns that had been simmering: AI's potential to outsmart humans poses an existential threat. "We've never had to deal with things smarter than us," he says. "If you want to know what life's like when you're not the apex intelligence, ask a chicken."

The risks, Hinton explains, fall into two categories. First are the immediate dangers of misuse: cyberattacks, election manipulation, and autonomous weapons. By Hinton's account, cyberattacks surged roughly 12,200% between 2023 and 2024, fueled by AI's ability to craft sophisticated phishing scams. "They can clone my voice, my image," Hinton notes, citing scams that exploit his own likeness.

The second, more speculative risk is superintelligence, meaning AI that surpasses human intellect across all domains. Hinton estimates a 10-20% chance that such systems could "wipe us out" within decades. Unlike nuclear weapons, which are destructive but have few other uses, AI is so broadly useful that he does not expect its development to be stopped. "It's too good for too many things," he says, from healthcare to warfare.

Hinton's concerns extend to economic disruption. The most immediate threat, he argues, is widespread joblessness. Unlike past technological shifts, AI's ability to automate "mundane intellectual labor" could displace millions. "It's like the Industrial Revolution replaced muscles; now intelligence is being replaced," he explains. Call centers, legal assistants, and creative industries face disruption as AI systems increasingly handle knowledge work.

His advice for navigating this future? "Train to be a plumber," he suggests, only half-joking. Physical tasks, he believes, will resist automation longer than desk jobs. But he acknowledges the broader challenge: a society where dignity is tied to work may struggle with mass unemployment, even if universal basic income cushions the financial blow. "People need purpose," he emphasizes.

Hinton's warnings are steeped in personal reflection and his family's legacy of scientific innovation. His great-great-grandfather George Boole pioneered Boolean algebra, the foundation of computer science, while a cousin worked on the Manhattan Project before defecting to China in protest of atomic weapons. This legacy of principled innovation underscores his duty to speak out about AI's risks.

For world leaders, Hinton's message is clear: regulate AI rigorously, forcing companies to prioritize safety over profits. For individuals, the path forward is less certain, but he urges collective pressure on governments and companies to ensure responsible AI development.

"There's still a chance we can figure out how to develop AI that won't want to take over," Hinton concludes. "Because there's a chance, we should put enormous resources into trying." Whether humanity heeds his warning may determine if AI becomes our greatest achievement—or our last.

Beyond Black Boxes: The Future of AI Understanding

As we peer into the increasingly sophisticated inner workings of large language models, the question arises: are these systems thinking? The answer depends largely on how we define "thinking" itself, and whether we're willing to acknowledge that machine cognition might fundamentally differ from human cognition while achieving similar or superior results.

Unlike humans, who often struggle to explain their own thought processes, AI systems can now be instrumented and analyzed in ways that reveal their internal operations. Anthropic's research into LLM "circuits" demonstrates that these models are not simply pattern-matching—they're performing structured operations that resemble reasoning, even if those operations differ from human cognitive processes.

Meanwhile, MIT's discovery that LLMs use trigonometric methods for arithmetic challenges our assumptions about what mathematical reasoning should look like. The models arrive at correct answers through unconventional approaches—not because they're flawed, but because they've developed alternative mathematical frameworks suited to their architecture.

The murmuration conjecture shows AI's ability to identify mathematical patterns that eluded human experts for generations, suggesting these systems can make genuine discoveries, not just recapitulate existing knowledge. And OpenAI's o3 Pro demonstrates that with sufficient scale and architectural innovation, AI can tackle problems of complexity that seemed beyond computational approaches just months ago.

These advancements suggest we're witnessing the emergence of intelligence that is neither human nor a mere simulation—it's a new kind of thinking with its own characteristics and capabilities. As neuroscience and AI research converge, we're beginning to understand both biological and artificial intelligence as different manifestations of similar principles: predictive processing, multidimensional representation, and pattern recognition operating at different scales and in different substrates.

Geoffrey Hinton's concerns about superintelligence remind us that this emerging form of intelligence may eventually exceed human capabilities in ways we struggle to predict or control. Yet the very research that reveals AI's sophistication also offers hope: by understanding how these systems work internally, we gain tools to ensure they remain aligned with human values and beneficial to humanity.

The days of treating AI as an impenetrable "black box" are ending. Interpretability research is opening these systems' inner workings to scrutiny, revealing not just how they operate but how we might guide their development to maximize benefits while minimizing risks. This understanding is essential not only for advancing AI technology but for ensuring it serves human flourishing.

Apple's attempt to define AI limitations now appears premature—less a definitive statement about AI capabilities than a reflection of a particular moment in a rapidly evolving field. As research continues to crack open the black box of artificial intelligence, we're discovering that the question isn't whether AI can think, but how its thinking differs from our own—and what those differences mean for our shared future.
