
Codex Can Now Generate Images. That Matters More Than It Sounds.


OpenAI’s latest Codex update includes a feature that sounds almost secondary at first: image generation.

But this is not just a small add-on.

According to OpenAI’s official announcement, Codex can now use gpt-image-1.5 to generate and iterate on images inside the same workflow where it already uses your browser, desktop apps, terminals, plugins, memory, and scheduled automations. On paper, that could sound like feature creep. In practice, it looks more like a signal that OpenAI no longer wants Codex to be understood as only a coding assistant.

It wants Codex to look like a broader work agent.

What changed in Codex

OpenAI’s own framing is revealing.

In its new product post, the company says Codex can now:

  • operate your computer alongside you,
  • browse inside an in-app browser,
  • use more than 90 plugins,
  • remember preferences and past context,
  • schedule work for later,
  • wake up automatically to continue long-running tasks,
  • and generate images with gpt-image-1.5.

That list matters because image generation is not arriving in isolation. It is arriving as one more capability inside a larger agent surface that now spans coding, browsing, desktop control, plugins, memory, and recurring automation.

So the real story is not “Codex got image generation.”

The real story is that OpenAI is expanding Codex from a coding-focused agent into something closer to a work environment for agents.

Why image generation is strategically interesting

Image generation matters here, but not because developers suddenly needed Codex to become an art tool.

It matters because visuals are part of real software and product work.

If you are using an agent to help with frontend design, product concepts, game assets, placeholder graphics, slides, mockups, or UI experimentation, then image generation is not outside the workflow. It is inside it.

That is the important shift.

Once a coding agent can also generate visuals, inspect screenshots, use the browser, comment on pages, work across desktop apps, and retain memory, the product stops looking like a narrow coding tool. It starts looking like a general-purpose execution surface for digital work.

OpenAI’s wording pushes in exactly that direction. The post is titled “Codex for (almost) everything.” That is not subtle branding.

Are other companies doing something similar?

Yes, but not in exactly the same way.

Anthropic

Anthropic has also been moving Claude beyond text-only interaction, but in a different pattern.

Its work on computer use showed a model that can interact with everyday software environments rather than waiting for specially designed tools. Claude Code expanded the coding-agent story toward tool use, long-running workflows, and more autonomous task execution. And across Anthropic’s broader product surface, you can also see adjacent expansion through things like Claude in Chrome and Claude Desktop Extensions.

But the emphasis has still been different.

Anthropic’s product and research framing tends to focus on:

  • tool use,
  • long-running agent structure,
  • harness design,
  • permissions and safety,
  • and making agents more autonomous without losing control.

That is clearly a broader agent direction. But it is still different from OpenAI’s current Codex move, which is trying to present coding, browser work, desktop control, memory, automations, plugins, and image generation as parts of one increasingly unified agent surface.

Cursor and similar coding products

Other coding tools have also been broadening their scope. Cursor, for example, has pushed harder on multi-file workflows, agent behavior, and IDE-centered execution. In the market more broadly, coding tools are expanding toward browser access, longer task loops, and more autonomous execution.

But OpenAI’s Codex move still stands out because of the combination.

The novelty is not that any single feature exists somewhere else. The novelty is that OpenAI is assembling all of these capabilities into one agent product story at once.

What makes Codex different right now

The clearest difference is the shape of the product.

Many coding-agent products still feel like specialized tools that are gradually adding more reach. Codex is starting to look like a unified workspace where multiple kinds of work can happen through the same agent surface.

That includes:

  • writing code,
  • reviewing GitHub comments,
  • using terminals,
  • connecting to remote devboxes over SSH,
  • operating desktop apps,
  • browsing the web,
  • remembering prior context,
  • scheduling future work,
  • and now generating images.

That combination matters more than any one feature.

A product with code + browser + computer use + memory + automations + image generation is not just trying to win the coding assistant category. It is trying to become the operating surface through which work gets delegated.

Why this matters for the market

This is one of the clearest signs that the coding-agent category is widening.

The old mental model was simple: a coding assistant helps write code faster.

The newer model is more ambitious: an agent can move across the tools that product and engineering work already depends on, hold context over time, and complete multi-step work without being reset after every interaction.

Image generation fits naturally inside that story.

It also hints at a larger product direction. Once an agent can write code, generate visuals, take browser actions, and run ongoing automations inside one loop, the boundary between coding assistant and work agent starts to blur.

That may be the real point of this release.

Our take

The most important part of the Codex update is not that OpenAI added one more feature.

It is that the feature set now looks deliberately broad enough to redefine what Codex is supposed to be.

If Anthropic has been emphasizing harness design, safety layers, and agent autonomy in structured workflows, OpenAI now seems to be emphasizing something slightly different: turning Codex into a wider agent surface that can touch more of the digital workspace directly.

Those are different bets.

Anthropic’s direction looks more like controlled capability expansion around agent behavior. OpenAI’s direction, at least in this release, looks more like product-surface expansion around agent reach.

That is why image generation matters here. Not because it is the flashiest feature, but because it reveals where Codex is going.
