OpenAI's New Codex Models: GPT-5.5, Codex-Spark, and Codex-Cyber Explained

OpenAI has released a wave of new models in Codex, headlined by GPT-5.5, alongside specialised variants GPT-5.3-Codex-Spark and GPT-5.2-Codex-Cyber, and a set of experimental frontier models identified only by codename.

For businesses using Codex for software engineering, automated code review, or agentic development workflows, this is the largest expansion of the Codex model set since the GPT-5.4 launch in March 2026. Choosing the right model at the right moment now has measurable implications for latency, unit economics, and the quality of output on security-sensitive or high-stakes tasks.

Some of these models will be available in the Owlpen platform once released and integrated, subject to their move out of research preview and into general availability. More on that below.

Important: none of these models are confirmed

OpenAI has not confirmed that any of the models named in this article (GPT-5.5, GPT-5.3-Codex-Spark, GPT-5.2-Codex-Cyber, arcanine, glacier-alpha, heisenberg, GPT-Rosalind, oai-2.1) will actually be launched, or that they will ship under these names. They were visible in the Codex app for only a very short period on 22 April 2026 before being withdrawn. Some, all, or none may reach general availability. Names, capabilities, and release timing are likely to change.

Purposes and positioning are industry best guesses

The purposes, use cases, and routing recommendations described for each model below are not official. They are our best guesses, drawn from naming conventions, the brief in-app appearance, and how OpenAI has positioned comparable models in the past. OpenAI has not published official descriptions for most of these models. Do not treat any of the "what it is good for" guidance in this article as authoritative.

The flagship GPT-5.x family

The core release is the GPT-5.x line, now spanning at least four distinct capability tiers in Codex. These are the models most teams will use for day-to-day work.

GPT-5.5 (new)

The headline release. Positioned as the highest-capability model in Codex, suited to the hardest agentic coding, deep repository work, complex debugging, and long-running tasks where quality matters more than speed or cost. This is the model to reach for when you need the strongest reasoning Codex offers and are willing to trade latency for it.

GPT-5.4

The everyday "serious work" default. Launched in March 2026 as the first mainline model to incorporate the frontier coding capabilities of GPT-5.3-Codex, it remains the sensible baseline where 5.5 is slower or more expensive. It also carries native computer-use capabilities and an experimental 1M context window in Codex.

GPT-5.4 Pro

The deep-reasoning, high-precision option. Use for architectural decisions, gnarly bugs, high-stakes reviews, and tasks where the cost of a wrong answer is materially higher than the cost of a slower turnaround. Slower and more expensive than baseline 5.4, but stronger on the hardest reasoning.

GPT-5.4 Mini

The fast, efficient variant for responsive coding tasks and subagents. Suitable for quick edits, short explanations, formatting, and simple review passes. Not intended for deep architecture or subtle security work.

GPT-5.2

The legacy general-purpose model. Still available in Codex for teams with established workflows tuned to its behaviour. Public documentation positioned it as a flagship for complex reasoning, coding, and agentic tasks until 5.4 superseded it.

Sensible default for most teams

For most production Codex usage, GPT-5.4 remains the correct starting point. Escalate to GPT-5.5 or 5.4 Pro for specific high-stakes tasks and fall back to 5.4 Mini for latency-sensitive subagents. The new releases do not change that recommendation; they simply make the edges of the envelope easier to reach.

The Codex-specific variants

Three of the models are Codex-specific, tuned or routed for particular categories of work rather than positioned as general-purpose flagships.

Reminder: the specialisations below (Spark for fast iteration, Cyber for security work) are best-guess interpretations based on the model suffixes. OpenAI has not published official descriptions confirming what these variants are actually tuned for.

GPT-5.3-Codex

The prior-generation Codex-optimised model. Solid for agentic coding but largely superseded by 5.4 and 5.5 on quality. Still useful where teams have prompts, evaluation harnesses, and CI integrations that were calibrated to 5.3-Codex behaviour.

GPT-5.3-Codex-Spark

A text-only research preview optimised for near-instant, real-time coding iteration, available only to ChatGPT Pro users at present. The use case is quick patches, small refactors, lint fixes, and focused implementation loops where the model needs to keep pace with a developer rather than produce a considered result. Not the right tool for architecture, subtle security review, or deep debugging.

GPT-5.2-Codex-Cyber

A Codex variant that appears to be tuned or routed for security analysis. Best deployed on vulnerability hunting, authentication bugs, dependency and supply-chain review, threat modelling, and investigating suspicious CI or security failures. For teams running periodic red-teaming of their own code, this would be the first Codex variant whose name points explicitly at the job.

The experimental codenames

Alongside the flagships, a handful of experimental models have surfaced in Codex, referred to only by codename. These are not positioned as production options and their public documentation is minimal. We do not recommend them for launch-critical work unless the team is intentionally comparing models.

These codenames may never ship, may be renamed, or may already have been replaced. Any inferred purpose below (security routing, life-sciences research, etc.) is industry speculation from naming convention alone, not a confirmed specification.

arcanine

An experimental frontier model identified only by its codename. Public positioning is unclear. Treat as a test target, not a production dependency.

glacier-alpha and glacier-alpha-block-cy3 / cy4

A family of experimental frontier models. The base glacier-alpha appears alongside block-cy3 and block-cy4 variants, whose naming suggests security or cyber-oriented routing, though this is an inference from the naming convention rather than a documented specification. Again, experimental.

heisenberg

A codename associated with life-sciences research work. Not relevant to general-purpose coding or business agentic workflows, and only expected to appear for clients with relevant research access.

GPT-Rosalind

A frontier reasoning model purpose-built for biology, drug discovery, and translational medicine, named after the chemist Rosalind Franklin. Available via ChatGPT Enterprise and Codex through a research preview. Out of scope for typical business coding or document-analysis work, but notable as evidence that OpenAI is now shipping domain-specific reasoning foundation models alongside general-purpose ones.

oai-2.1

Listed in Codex as a frontier agentic option. The naming suggests an internal or experimental family distinct from the GPT-5.x line. Test before trusting it for any production workflow.
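
The "test before trusting" advice that runs through this section can be made concrete with a simple comparison gate. The Python sketch below is illustrative only, not a real evaluation harness. It assumes a hypothetical `run_task` callable that stands in for however your team calls a model and checks the result, and the model labels passed in are whatever identifiers your tooling uses. The `margin` threshold is likewise an assumption to tune.

```python
from typing import Callable, Sequence

# Hypothetical signature: (model_id, task) -> True if the output passed the
# task's checks. How a task is run and scored is left to the caller.
RunTask = Callable[[str, str], bool]


def pass_rate(model_id: str, tasks: Sequence[str], run_task: RunTask) -> float:
    """Return the fraction of tasks the model passes (0.0 for no tasks)."""
    if not tasks:
        return 0.0
    passed = sum(1 for task in tasks if run_task(model_id, task))
    return passed / len(tasks)


def candidate_is_acceptable(
    candidate: str,
    baseline: str,
    tasks: Sequence[str],
    run_task: RunTask,
    margin: float = 0.0,
) -> bool:
    """True only if the candidate's pass rate is at least the baseline's plus margin."""
    candidate_rate = pass_rate(candidate, tasks, run_task)
    baseline_rate = pass_rate(baseline, tasks, run_task)
    return candidate_rate >= baseline_rate + margin
```

The point of the design is that an experimental model is only ever judged against the production model it would replace, on the same representative tasks, rather than on its own.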

How to choose between them

The routing recommendations that follow are conditional on the unconfirmed positioning described above. If OpenAI's eventual release materials differ from what briefly appeared in-app, this guidance will need to be rewritten. Do not migrate production workloads on the basis of this section alone.

A wider model set is only useful if the team has a clear heuristic for selection. For most Coaley Peak clients we would suggest the following default routing, adjusted after live benchmarking on representative tasks.

Use GPT-5.4 as the default for serious work. Escalate to GPT-5.5 for the small share of tasks where the model will run long and the cost of a wrong answer is material (complex refactors, multi-file migrations, large-scale debugging). Escalate to GPT-5.4 Pro for architectural reviews and high-stakes decisions. Drop to GPT-5.4 Mini for quick, bounded edits where latency matters.

Reach for 5.3-Codex-Spark when you need an interactive fast-feedback loop during development, and for 5.2-Codex-Cyber when the task is explicitly security-adjacent. Treat the experimental codenames as test candidates only and evaluate them against whichever production model they would replace before trusting them with anything that ships.
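
The routing described above can be sketched as a small selection function. This is a minimal Python illustration under stated assumptions. The `Task` fields are hypothetical descriptors of our own, the strings are the model names used in this article rather than confirmed API identifiers, and the order in which the checks are applied is our assumption, since the guidance above does not rank them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task descriptor; the field names are illustrative only."""
    security_sensitive: bool = False   # explicitly security-adjacent work
    architectural: bool = False        # architectural review or high-stakes decision
    long_running: bool = False         # complex refactor, multi-file migration, large-scale debugging
    interactive: bool = False          # developer waiting in a fast-feedback loop
    quick_bounded_edit: bool = False   # small edit where latency matters


def choose_model(task: Task) -> str:
    """Return the article's model label for a task.

    The labels are names as used in this article, not confirmed API
    identifiers. The precedence is an assumption: security and high-stakes
    work are checked before speed-oriented options.
    """
    if task.security_sensitive:
        return "GPT-5.2-Codex-Cyber"
    if task.architectural:
        return "GPT-5.4 Pro"
    if task.long_running:
        return "GPT-5.5"
    if task.interactive:
        return "GPT-5.3-Codex-Spark"
    if task.quick_bounded_edit:
        return "GPT-5.4 Mini"
    return "GPT-5.4"  # default for serious work
```

For example, `choose_model(Task(interactive=True, security_sensitive=True))` returns the Cyber label rather than Spark, which reflects the earlier caution that Spark is not the right tool for subtle security review.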

Governance and cost considerations

With this many models available, governance becomes more important than it was when "the default" was the only sensible choice. Two practical recommendations for teams running Codex at scale:

First, pin the model explicitly in automation rather than relying on default routing. Silent upgrades to a higher-cost model can change unit economics overnight, and silent downgrades can quietly degrade output quality. Any CI pipeline, scheduled job, or production integration should specify the model by API identifier and change it deliberately.

Second, treat research-preview models as non-production by default. Research previews can be throttled, re-priced, or withdrawn with limited notice. For anything that a customer, regulator, or auditor might depend on, stick to generally-available models and schedule a periodic review of whether the preview has graduated.
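
Both recommendations can be enforced with a small check that runs before any automated job starts. The Python sketch below is a minimal illustration, not a definitive implementation. The `"model"` config key, the `ModelPolicyError` class, and the allowlist entries are all hypothetical placeholders, and the allowlist should be populated from the provider's published list of generally-available models.

```python
from typing import AbstractSet, Mapping

# Hypothetical allowlist of generally-available model identifiers. These
# entries are placeholders, not real identifiers.
GA_MODELS = frozenset({"example-ga-model", "example-ga-model-mini"})


class ModelPolicyError(ValueError):
    """Raised when an automation config violates the model policy."""


def resolve_pinned_model(
    config: Mapping[str, object],
    ga_models: AbstractSet[str] = GA_MODELS,
) -> str:
    """Return the pinned model identifier, or raise if the policy is violated.

    Rule 1: the config must name a model explicitly (no default routing).
    Rule 2: the named model must be on the generally-available allowlist.
    """
    model = config.get("model")
    if not isinstance(model, str) or not model.strip():
        raise ModelPolicyError(
            "no model pinned: set 'model' explicitly instead of relying on default routing"
        )
    model = model.strip()
    if model not in ga_models:
        raise ModelPolicyError(
            f"{model!r} is not on the generally-available allowlist"
        )
    return model
```

Failing loudly at start-up is the deliberate choice here: a job that refuses to run is easier to diagnose than one that silently ran on a different model. Promoting a preview model that has graduated then becomes a reviewed, one-line change to the allowlist.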

Owlpen and the expanded OpenAI lineup

Some of these models will be available in the Owlpen platform once they are released and integrated. Our current position, subject to review as OpenAI clarifies pricing, availability regions, and data-handling commitments, is as follows.

GPT-5.4, GPT-5.4 Pro, and GPT-5.4 Mini are the strongest near-term candidates for Owlpen integration, given that they are generally available, documented, and priced. GPT-5.5 is a probable addition once its documentation, pricing, and data-handling commitments are published, subject to independent evaluation on the document-heavy, cost-reduction workloads that Owlpen is built for.

We do not currently plan to expose research-preview or experimental models (GPT-5.3-Codex-Spark, GPT-5.2-Codex-Cyber, the arcanine and glacier-alpha family, heisenberg, oai-2.1) inside Owlpen. Research-preview access is generally incompatible with the production-grade reliability and governance commitments Owlpen makes to clients. Domain-specialist models like GPT-Rosalind are similarly out of scope unless a specific client use case justifies a bespoke integration.

Owlpen availability (indicative)

Production-tier OpenAI models (GPT-5.4, 5.4 Pro, 5.4 Mini, and GPT-5.5 once generally available) are the candidates most likely to be integrated into the Owlpen platform, subject to evaluation and data-handling review. Research-preview and experimental models (Spark, Cyber, arcanine, glacier-alpha, heisenberg, oai-2.1) are not currently planned for Owlpen. Domain-specialist models such as GPT-Rosalind are out of scope unless a specific use case calls for a bespoke integration.

If you would like to discuss which OpenAI models are the right fit for your workflows, or how the Owlpen platform could support your business, contact us at enquiries@coaleypeak.co.uk or read more about the Owlpen platform.

Disclaimer. This article is published by Coaley Peak Ltd for general informational purposes only. The views expressed are those of the author, Stephen Grindley, and do not constitute legal, regulatory, financial, or technical advice. Nothing in this article should be relied upon when making procurement, investment, compliance, or technology decisions. References to third-party products, platforms, and companies are for informational purposes only and do not constitute endorsement. The article is explicitly speculative. The models and names described here appeared in the Codex app for a very short period on 22 April 2026 before being withdrawn from view, and have not been confirmed as a formal OpenAI release. Descriptions, positioning, and availability are inferred from that brief in-product appearance and have not been independently verified by Coaley Peak. Experimental codenames (including arcanine, glacier-alpha, heisenberg, and oai-2.1) may never reach general availability, may be renamed, or may not exist as described. Readers should seek independent professional advice appropriate to their specific circumstances and should not make procurement or migration decisions on the basis of this article. Information was accurate to the best of the author's knowledge at the date of publication. Coaley Peak Ltd and Stephen Grindley accept no liability for any loss or damage arising from reliance on the contents of this article.