Saturday, February 28, 2026

Foundation for self-designing artificial intelligence

The Recursive Paradigm: 2023–2026

Abstract
Until 2023, large language models (LLMs) were primarily imitative systems, constrained by the limits of human-generated training data. This paper reviews the paradigm shift initiated in late 2023, wherein LLMs were integrated into Evolutionary Algorithms (EAs) to act as semantic mutation engines. By replacing the blind, random mutations of traditional genetic algorithms with intelligent, logic-driven code mutations, AI systems crossed the threshold from imitating human knowledge to generating novel synthetic knowledge. We examine the foundational breakthroughs of DeepMind’s FunSearch and NVIDIA’s Eureka, the mechanics of LLM-generated reward functions, and the current 2026 frontier of Auto-AI (e.g., AlphaEvolve), outlining how this evolutionary loop serves as the primary mechanism for Recursive Self-Improvement and the pathway to Artificial General Intelligence (AGI).


1. Introduction: The "Data Wall" and the 2023 Paradigm Shift

Historically, AI progress was driven by scaling: building larger neural networks and feeding them more human data. By 2023, researchers recognized a looming limitation known as the "Data Wall." LLMs had consumed nearly all high-quality human text available on the internet. To achieve superintelligence, AI needed a mechanism to discover mathematical and algorithmic truths that humans did not yet possess.

The solution was found by marrying the generative creativity of LLMs with the ruthless, objective verification of Genetic Algorithms. Instead of asking an LLM for an "answer," researchers began asking LLMs to write programs that search for answers, testing those programs in secure sandboxes, and allowing the AI to iteratively mutate its own code based on the results.

2. Overcoming the Flaw of Traditional Genetic Algorithms

A Genetic Algorithm (GA) is a search heuristic inspired by Darwinian evolution. Traditionally, it operates by generating a population of solutions, evaluating their "fitness," and combining/mutating the best performers to create a new generation.

The Flaw: Historically, the mutation step was blind. A traditional GA mutates code by randomly altering characters (e.g., swapping a + for a -). Because source code is syntactically brittle, the overwhelming majority of random mutations produce fatal syntax errors, making evolution computationally expensive and painfully slow.
The LLM Solution: In the modern paradigm, the LLM acts as the mutator. Because the LLM understands programming semantics, it does not make blind typographical errors. It makes logical hypotheses (e.g., "Replacing this linear function with a sine wave might stabilize the output"). This transforms evolution from a random walk into a highly directed, intelligent search, accelerating the discovery of successful algorithms by orders of magnitude.
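The contrast can be made concrete in a few lines of Python. The snippet below is a self-contained toy (the source string and the choice of mutation operators are illustrative, not drawn from any cited system): it blindly replaces single characters and counts how many mutants still compile, then applies operator-level "semantic" edits of the kind an LLM mutator would propose.

```python
# Toy demonstration: blind character mutation vs. semantics-aware mutation.
# The "semantic" mutator here is a hand-written stand-in for an LLM.

SRC = "def combine(a, b):\n    return a + b\n"

def compiles(src: str) -> bool:
    """Return True if the source string is syntactically valid Python."""
    try:
        compile(src, "<mutant>", "exec")
        return True
    except SyntaxError:
        return False

# Blind mutation: replace each character position with an arbitrary symbol.
blind_mutants = [SRC[:i] + "}" + SRC[i + 1:] for i in range(len(SRC))]
blind_ok = sum(compiles(m) for m in blind_mutants)

# "Semantic" mutation: swap the operator for another valid operator,
# the kind of logic-level change an LLM mutator proposes.
semantic_mutants = [SRC.replace("a + b", f"a {op} b") for op in ("-", "*", "//", "%")]
semantic_ok = sum(compiles(m) for m in semantic_mutants)

print(f"blind mutants compiling:    {blind_ok}/{len(blind_mutants)}")
print(f"semantic mutants compiling: {semantic_ok}/{len(semantic_mutants)}")
```

Character-level mutants almost never compile, while every operator-level mutant is a valid, testable program; that gap is exactly what the LLM mutator closes.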

3. Case Study 1: FunSearch and the Discovery of New Mathematics (Dec 2023)

DeepMind’s FunSearch (short for "searching in the function space") demonstrated the first major victory of this architecture. Researchers tasked the system with the "Cap Set Problem," a famously difficult open problem in extremal combinatorics.

Instead of generating a mathematical proof directly, the LLM generated Python code to search for the solution. When the code failed, an automated evaluator fed the error logs back to the LLM, which semantically mutated the code and tried again. Ultimately, FunSearch discovered a novel algorithm that generated larger Cap Sets than human mathematicians had ever found. This marked the moment AI began generating verifiable synthetic knowledge.
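The loop described above can be sketched in miniature. In the sketch below, both the LLM and the cap set evaluator are replaced by stand-ins: the evolved function is named `priority`, following the program skeletons in the FunSearch paper, but the toy scoring objective and the hard-coded edit list are assumptions made purely for illustration.

```python
# A FunSearch-style loop in miniature: candidate programs are plain strings,
# an evaluator runs them in a restricted sandbox and scores them, and a
# mutator proposes edits. In the real system the mutator is an LLM and the
# evaluator checks cap sets; both are replaced by simple stand-ins here.

def evaluate(program: str) -> float:
    """Execute the candidate in a bare namespace and score its `priority` fn."""
    env: dict = {}
    try:
        exec(program, {"__builtins__": {}}, env)
        fn = env["priority"]
        # Toy objective: rank even numbers above odd ones.
        return sum(fn(n) for n in range(0, 20, 2)) - sum(fn(n) for n in range(1, 20, 2))
    except Exception:
        return float("-inf")  # broken programs get the worst possible score

def mutate(program: str, step: int) -> str:
    """Stand-in for the LLM mutator: apply one of a few logic-level edits."""
    edits = [
        ("n % 2", "1 - n % 2"),
        ("return 0.0", "return 1.0 if n % 2 == 0 else 0.0"),
        ("n % 2", "(n + 1) % 2"),
    ]
    old, new = edits[step % len(edits)]
    return program.replace(old, new)

best = "def priority(n):\n    return 0.0\n"
best_score = evaluate(best)
for step in range(6):
    child = mutate(best, step)
    child_score = evaluate(child)
    if child_score > best_score:  # selection: keep only strict improvements
        best, best_score = child, child_score
```

Failed or broken candidates score negative infinity and are discarded, while any semantically useful edit survives into the next round; in FunSearch, the error logs themselves are also fed back to the mutator.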

4. Case Study 2: Eureka and the Evolution of Reward Functions (Oct 2023)

In Reinforcement Learning (RL), teaching a physical robot a complex task (like spinning a pen in its hand) requires a Reward Function—a mathematical formula that scores the robot's behavior. Humans are notoriously bad at writing these formulas. If a human programs a robot to "move forward," the robot might exploit the math by falling over and thrashing its legs—a failure known as Reward Hacking.

NVIDIA’s Eureka solved this by placing the reward function inside an LLM evolutionary loop:

  1. Teacher/Student Dynamic: The LLM (Teacher) writes 10 different mathematical reward functions.

  2. The Sandbox: Virtual robot hands (Students) attempt to spin a pen using those 10 formulas.

  3. Fitness Evaluation: Most fail, but one makes slight progress. The LLM analyzes the physics data from the successful attempt, mutates the underlying mathematical code, and writes an improved generation of reward functions.
    By iterating this loop, the LLM discovers complex, non-intuitive reward formulas that guide the robot effectively while resisting reward hacking.
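The three-stage loop above can be sketched as a toy. Everything in this sketch is an illustrative assumption (Eureka uses GPU-simulated robot hands, not a 1-D point agent, and its candidate rewards are written by an LLM rather than hard-coded): a "speed only" reward gets hacked by aimless motion, while a distance-based reward actually solves the task.

```python
# Eureka-style evaluation in miniature: candidate reward functions are tried
# in a toy 1-D environment, then ranked by the ground-truth task metric.

GOAL = 10.0

def rollout(reward_fn, steps: int = 20) -> float:
    """Greedy agent: at each step, take the action the reward fn prefers."""
    pos = 0.0
    for _ in range(steps):
        # Candidate actions: move left, stay, move right.
        best_a = max((-1.0, 0.0, 1.0), key=lambda a: reward_fn(pos + a, abs(a)))
        pos += best_a
    return pos

def task_fitness(reward_fn) -> float:
    """Ground truth the designer actually cares about: end near the goal."""
    return -abs(rollout(reward_fn) - GOAL)

# Generation 0: candidate rewards (in Eureka, an LLM writes these as code).
candidates = {
    "speed_only":   lambda p, s: s,               # hackable: rewards raw motion
    "neg_distance": lambda p, s: -abs(p - GOAL),  # aligned with the task
}
scores = {name: task_fitness(fn) for name, fn in candidates.items()}
```

In the real loop, the LLM would read these fitness scores together with rollout statistics and write the next, improved generation of reward code.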

5. The Current Frontier: AlphaEvolve and Auto-AI (Feb 2026)

Building upon the foundations of 2023, the current frontier of research (exemplified by the February 2026 AlphaEvolve framework) applies this evolutionary loop directly to the fundamental algorithms of AI itself.

In this framework, the LLM treats the source code of an AI training algorithm as a genome. It proposes semantically meaningful code changes and auto-evaluates fitness on real benchmark tasks without human trial-and-error.
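A sketch of the "training code as genome" idea, under stated assumptions: the benchmark here is a toy (gradient descent on f(x) = x²) rather than a real training run, and the hard-coded variants stand in for LLM-proposed edits.

```python
# Sketch of "training code as genome": each genome is the source code of an
# update rule, fitness is measured on a benchmark (a toy one here), and a
# mutator (an LLM in AlphaEvolve-style systems, hard-coded variants here)
# proposes semantically meaningful edits.

BENCHMARK_STEPS = 50

def fitness(genome: str) -> float:
    """Compile the update rule, run it on minimizing f(x) = x**2."""
    env: dict = {}
    exec(genome, {}, env)
    update = env["update"]
    x = 5.0
    for t in range(BENCHMARK_STEPS):
        x = update(x, grad=2 * x, t=t)
    return -x * x  # higher is better (lower final loss)

seed = "def update(x, grad, t):\n    return x - 0.01 * grad\n"

# Edits an LLM-style mutator might propose: change the step size,
# or introduce a decay schedule.
variants = [
    seed,
    seed.replace("0.01", "0.1"),
    seed.replace("0.01 * grad", "0.1 * grad / (1 + 0.1 * t)"),
]
scores = [fitness(g) for g in variants]
best = variants[max(range(len(variants)), key=lambda i: scores[i])]
```

On this particular benchmark the constant larger step size wins; the point is that the fitness test, not human intuition, makes that call.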

  • Game Theory Advancements: AI has autonomously evolved new meta-solvers for Multi-Agent Reinforcement Learning (MARL). For example, AI-generated algorithms like VAD-CFR (a variant of Counterfactual Regret Minimization) and SHOR-PSRO have been shown to outperform human-designed state-of-the-art solvers like Nash, AlphaRank, and PRD.

  • Alien Intuition: Because the LLM mutator does not possess human cognitive bias, it discovers highly non-intuitive mechanics. In the AlphaEvolve trials, the system autonomously discovered a "warm-start threshold" exactly at iteration 500 out of a 1000-iteration horizon—an optimization human researchers would not have manually coded, but which naturally survived the evolutionary fitness test.

6. The Pathway to Artificial General Intelligence (AGI)

The ultimate importance of this architecture is that it establishes the mechanical framework for Recursive Self-Improvement—an exponential loop often referred to as the "intelligence explosion."

  1. Step 1: An LLM acts as a mutation engine to write a highly optimized, superior machine learning algorithm.

  2. Step 2: Human researchers use this AI-invented algorithm to train the next generation of LLMs.

  3. Step 3: Because the new LLM was trained with a superior algorithm, it is significantly more intelligent than its predecessor. It is then tasked with mutating and improving its own training code once again.
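The three steps can be written down as a purely schematic loop. Everything below is a placeholder: the 10% per-generation improvement factor is an arbitrary illustrative assumption, not a measured quantity, and no claim is made about real capability curves.

```python
# Purely schematic sketch of the three-step recursive loop described above.

def evolve_training_algorithm(model_skill: float) -> float:
    """Step 1 (stub): a more capable mutator finds a better algorithm."""
    return model_skill * 1.1  # assumed improvement factor, illustrative only

def train_next_model(algorithm_quality: float) -> float:
    """Step 2 (stub): a better algorithm yields a more capable model."""
    return algorithm_quality

skill = 1.0
history = [skill]
for generation in range(5):  # Step 3: close the loop and repeat
    algorithm = evolve_training_algorithm(skill)
    skill = train_next_model(algorithm)
    history.append(skill)
```

Any compounding factor greater than 1.0 in such a loop produces the exponential trajectory the "intelligence explosion" argument rests on.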

7. Conclusion

Since 2023, the integration of Large Language Models with Genetic Algorithms has solved the historic inefficiencies of evolutionary computation. By enabling AI to autonomously write, test, and mutate code—whether a reward function for a robotic hand, a mathematical heuristic, or the meta-solvers of its own learning algorithms—we have moved beyond imitative AI. The system is now successfully generating synthetic knowledge, setting the foundation for self-designing artificial intelligence.


References

  1. Romera-Paredes, B., et al. (2023). "Mathematical discoveries from program search with large language models." Nature. (DeepMind's FunSearch, detailing LLM-guided evolutionary search for the Cap Set problem).

  2. Ma, Y. J., et al. (2023). "Eureka: Human-Level Reward Design via Coding Large Language Models." NVIDIA Research. (Detailing the Teacher-Student evolutionary loop for overcoming reward hacking in robotic simulations).

  3. Li, Z., Schultz, J., et al. (February 2026). "Discovering Multiagent Learning Algorithms with Large Language Models." arXiv:2602.16928. (The "AlphaEvolve" paper, demonstrating the automated generation of VAD-CFR and SHOR-PSRO solvers, the discovery of the 500-iteration threshold, and the transition of algorithmic design from humans to AI).

Monday, February 23, 2026

Artificial Intelligence: Beyond Rejuvenation.

Why we shouldn't want to be 18 again, and how AI will redefine aging.

Recently, at the 2026 World Governments Summit in Dubai, Dr. David Sinclair made a headline-grabbing announcement: scientists have successfully reversed biological aging markers in animal tissues by up to 75% within weeks, and human trials are on the horizon.

The media immediately seized on the narrative of the "Fountain of Youth." But if we look closely at the intersection of biophysics, evolutionary biology, and immunology, a massive philosophical and scientific flaw emerges in the current longevity narrative.

The truth is, we shouldn't actually want a "perfect copy" of youth. Here is why true longevity won't be achieved by simply running a biological "factory reset"—and why artificial intelligence is the only tool that can save us.



The Hardware, The Software, and The Yamanaka Factors

To understand aging, we have to look at the Information Theory of Aging. Think of your DNA as the "hardware" of a computer. It stays largely intact your whole life. However, your epigenome—the chemical markers that tell your cells how to read that DNA—is the "software."

In 2006, Shinya Yamanaka (later a Nobel laureate) discovered four specific proteins (the Yamanaka factors) that can wipe a cell’s epigenetic software clean, turning an old skin cell back into a young embryonic-like stem cell. Sinclair’s lab later showed that by using only three of the four factors (OSK, omitting the oncogene c-Myc), we can run a "System Restore" on our cells, removing the aging markers without making the cell forget its identity.

But why does this software get corrupted in the first place?

It comes down to physics and thermodynamics. Every day, your DNA suffers millions of tiny breaks from UV light and metabolism. Epigenetic proteins leave their posts to fix the damage, but over time, they make mistakes and get lost. This creates biological entropy—or "epigenetic noise." Furthermore, because evolutionary selection pressure drops to zero after our reproductive years, nature has no incentive to keep our software perfectly maintained.

The "Perfect Youth" Paradox

This brings us to a massive, often overlooked problem in longevity research: If we execute a perfect "factory reset" to return our bodies to the exact state they were in at age 18, we will lose decades of acquired biological wisdom.

Your epigenetic software doesn't just accumulate damage as you age; it accumulates data. Take your immune system: when your T-cells and B-cells fight off a virus, they undergo physical, epigenetic changes to "remember" that pathogen. That epigenetic priming is the fundamental basis of acquired immunity. Similarly, in the brain, synaptic pathways and learned memories are stabilized by localized epigenetic states.

If we use Yamanaka factors to blindly wipe the epigenetic slate back to "youth," we erase that memory. We would have the energetic bodies of teenagers, but the naive, highly vulnerable immune systems of newborns. A common cold could become lethal.

We don't actually want to be young. Youth is biologically fragile. What we want is an organism that is Strong (possessing the robust metabolic and DNA-repair capacity of a 20-year-old) but also Smart (retaining the immunological resilience and neurological complexities acquired by a 50-year-old).

The Computational Bottleneck: Signal vs. Noise

In physics and information theory, a system contains both Signal (useful information, like immune memory) and Noise (entropy and cellular damage). The Yamanaka factors are a biological sledgehammer—they erase both the noise and the signal.

To achieve the "Strong and Smart" body, we need to selectively edit the epigenome. We need a system capable of reading billions of chemical markers and saying: "Keep the methylation marks on Gene A (because that is immune memory), but erase the methylation marks on Gene B (because that is aging noise)."
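As a purely conceptual toy (the locus names and the clean signal/noise labels are invented for illustration; real methylation data spans millions of sites with no such labels), the difference between the two interventions looks like this:

```python
# Conceptual toy only, not a biological model: marks are labeled "signal"
# (e.g., immune memory) or "noise" (age-related drift). A full Yamanaka-style
# reset wipes both; a selective editor, the job proposed here for AI, removes
# only the noise. All site names are invented.

epigenome = {
    "T_cell_memory_locus": "signal",
    "synaptic_plasticity_locus": "signal",
    "drift_site_1": "noise",
    "drift_site_2": "noise",
}

def full_reset(marks: dict) -> dict:
    """The Yamanaka-factor sledgehammer: erase every mark, signal included."""
    return {}

def selective_edit(marks: dict) -> dict:
    """Idealized AI editor: keep the signal, erase only the noise."""
    return {site: kind for site, kind in marks.items() if kind == "signal"}

after_reset = full_reset(epigenome)
after_edit = selective_edit(epigenome)
```

The hard problem, of course, is the classification step that this toy takes for granted: deciding, site by site, which marks are memory and which are entropy.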

Human cognition and traditional laboratory trial-and-error are fundamentally incapable of solving a combinatorial search problem of this magnitude.

The Hassabis Future: Why AI is the Engine of Longevity

This is where biology and physics hand the baton to computer science.

To resolve the paradox of the "backup copy," we must rely on advanced artificial intelligence. Deep learning models like AlphaFold 3, developed by Google DeepMind, have already revolutionized our ability to predict how proteins interact with DNA and RNA at an atomic level.

The next frontier—and the ultimate solution to aging—will likely come from AI-first biotech companies like Isomorphic Labs (founded by Demis Hassabis). Instead of injecting blunt-force gene therapies into humans, AI will simulate billions of small-molecule compounds to design drugs that perfectly mimic a selective Yamanaka effect.

These AI-designed molecules will act as a precision "software update," binding to the genome to clear the thermodynamic noise while strictly protecting the signal of our biological intelligence.

By combining the physical laws of molecular biology with the computational supremacy of artificial intelligence, we are moving away from the blind deterioration of evolution. The future isn't about rejuvenation. It is about AI-driven biological optimization. And that is a future far better than just being young again.


References & Further Reading

  1. Takahashi, K., & Yamanaka, S. (2006). Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors. Cell, 126(4), 663-676. (The Nobel Prize-winning discovery of the Yamanaka factors).

  2. Lu, Y., ... & Sinclair, D. A. (2020). Reprogramming to recover youthful epigenetic information and restore vision. Nature, 588, 124–129. (The foundational paper on using OSK for partial reprogramming and age reversal).

  3. Bevington S.L., Cockerill P.N., et al. (2021). Stable Epigenetic Programming of Effector and Central Memory T Cells. Cell Reports. (Demonstrating that immunological memory is physically stored as epigenetic modifications).

  4. Shannon, C. E. (1948). A Mathematical Theory of Communication. The Bell System Technical Journal. (The foundational text of information theory, applied here to biological entropy and epigenetic noise).

  5. Ocampo A., Izpisua Belmonte J.C., et al. (2016). In vivo amelioration of age-associated hallmarks by partial reprogramming. Cell. (Proving the necessity of 'partial' rather than 'full' reprogramming to maintain cellular identity).

  6. Abramson, J., ... & Jumper, J. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. (Contextualizing the AI breakthrough capable of modeling protein-DNA and protein-RNA interactions).

  7. Hassabis, D. (2025/2026). Vision statements on Artificial Intelligence in target discovery and Isomorphic Labs. (The predictive trajectory of using AI to simulate and design intelligent molecular therapies).