OpenAI has revealed the results of a five-month internal experiment in which its engineering team built a full-scale software product using entirely AI-generated code, marking a significant milestone in the evolution of AI-assisted development.
The project, led by OpenAI engineer Ryan Lopopolo, began in late August 2025 as an effort to test how far autonomous coding agents could be pushed in a real-world environment. Instead of writing code manually, engineers relied exclusively on Codex powered by GPT-5 to generate every component of the system.
Over the course of the experiment, the repository expanded to approximately one million lines of code, produced through roughly 1,500 pull requests. A team of three to seven engineers oversaw the process, averaging about 3.5 pull requests per engineer per day. Notably, all application logic, test coverage, continuous integration (CI) configuration, documentation, observability tools and supporting infrastructure were generated by AI systems.
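As a rough consistency check on those figures, the arithmetic can be sketched as follows. The working-day count is an assumption (about 21 working days per month, which the article does not state), and the function name is purely illustrative.

```python
def throughput_check(total_prs=1500, prs_per_engineer_per_day=3.5,
                     months=5, working_days_per_month=21):
    """Relate the article's quoted figures to one another.

    working_days_per_month is an assumption, not a number from the article.
    """
    working_days = months * working_days_per_month
    # Engineer-days needed to produce total_prs at the quoted rate.
    engineer_days = total_prs / prs_per_engineer_per_day
    return {
        "engineer_days": engineer_days,
        # Average team size implied by the quoted rate and the assumed calendar.
        "implied_avg_team": engineer_days / working_days,
        # Net repository lines per pull request (ignores deletions and rewrites).
        "net_lines_per_pr": 1_000_000 / total_prs,
    }

result = throughput_check()
print(round(result["implied_avg_team"], 1))   # about 4.1 engineers
print(round(result["net_lines_per_pr"]))      # about 667 lines
```

Under that assumption, the implied average team size of roughly four engineers falls inside the reported range of three to seven, so the quoted numbers are at least mutually consistent.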
Rethinking the Role of Engineers
Rather than acting as traditional programmers, engineers focused on defining intent, designing development environments and creating feedback loops for AI agents. The project began with an empty Git repository that was scaffolded by Codex, which then incrementally built out the product’s architecture.
Engineers emphasized making system outputs legible and debuggable. They improved visibility into AI-generated changes using local observability stacks, logs, metrics dashboards and Chrome DevTools. Documentation and repository knowledge were treated as the system of record, with a central AGENTS.md file serving as a structured guide for agent coordination.
Structured Architecture and Autonomous Agents
To maintain stability at scale, the team enforced strict layered domain boundaries and applied mechanical linters to prevent architectural drift. Autonomous agents were tasked with end-to-end responsibilities, including reproducing bugs, implementing fixes, submitting pull requests and merging approved changes.
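The article does not describe how the team's linters work. As a minimal sketch of what a mechanical layering check could look like, the example below flags imports that reach from a lower layer into a higher one. The layer names, their ordering and the function names are all hypothetical, not taken from OpenAI's repository.

```python
import ast

# Hypothetical layer order, lowest first. A module may import only from
# its own layer or from layers below it.
LAYERS = ["models", "storage", "domain", "service", "ui"]

def layer_of(module_name):
    """Return the layer index for a dotted module name, or None if unlayered."""
    top = module_name.split(".")[0]
    return LAYERS.index(top) if top in LAYERS else None

def find_violations(module_name, source):
    """Return sorted (line, imported_module) pairs that import from a higher layer."""
    own = layer_of(module_name)
    if own is None:
        return []
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]
        else:
            continue  # not an absolute import statement
        for target in targets:
            other = layer_of(target)
            if other is not None and other > own:
                violations.append((node.lineno, target))
    return sorted(violations)

sample = "import ui.widgets\nfrom storage.db import load\nimport os\n"
print(find_violations("domain.orders", sample))  # [(1, 'ui.widgets')]
```

A check of this kind could run in CI on every agent-submitted pull request, so that a boundary violation fails mechanically rather than depending on a human reviewer to notice it.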
Cleanup agents were deployed periodically to reduce entropy in the growing codebase, applying what the team described as “golden principles” to maintain clarity and consistency across modules.
From Experiment to Product
The resulting system is now used daily by internal OpenAI staff and a group of external alpha testers. Although the product is still in beta, the experiment demonstrates how AI coding agents can move beyond code suggestions and into coordinated, large-scale software production.
The initiative also highlights a broader shift in software engineering workflows. Instead of writing code line by line, developers may increasingly act as orchestrators—designing guardrails, defining objectives and supervising AI agents capable of executing complex technical tasks.
As AI models continue to improve, OpenAI’s experiment suggests that autonomous code generation could play a central role in how future software systems are built and maintained.