OpenAI’s Operator: A Step Forward or Just Hype?

Paul Grieselhuber

Feb 4, 2025

OpenAI has launched a research preview of Operator, a web automation tool powered by a new AI model called Computer-Using Agent (CUA). Designed to interact with web browsers through a visual interface, Operator performs multi-step tasks by analyzing and interacting with on-screen elements, mimicking how a human would navigate the web.

Operator is currently available at operator.chatgpt.com to subscribers of the $200-per-month ChatGPT Pro plan. OpenAI plans to expand access to Plus, Team, and Enterprise users, and eventually to integrate CUA into ChatGPT and release an API for developers.

How Operator Works

Operator processes web pages by capturing screenshots, analyzing them with GPT-4o's vision capabilities, and executing actions such as clicking, typing, and scrolling. It repeats this cycle and adjusts its approach as it goes, which lets it recover from errors and handle multi-step tasks across various web applications.
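The capture, analyze, act cycle described above can be sketched as a simple loop. This is a toy illustration under stated assumptions, not OpenAI's implementation: FakeBrowser, toy_policy, and run_agent are hypothetical names, the "screenshot" is a small dictionary rather than pixels, and a hard-coded rule stands in for the vision model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "type", "click", "scroll", or "done"
    target: str = ""   # hypothetical element name
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for the virtual browser (assumption)."""
    query: str = ""
    submitted: bool = False
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive pixels; a dict keeps the sketch runnable.
        return {"query": self.query, "submitted": self.submitted}

    def execute(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "type":
            self.query = action.text
        elif action.kind == "click" and action.target == "search_button":
            self.submitted = True


def toy_policy(goal: str, screen: dict) -> Action:
    """Stand-in for the vision model: choose the next action from the screen."""
    if screen["submitted"]:
        return Action("done")
    if screen["query"] != goal:
        return Action("type", target="search_box", text=goal)
    return Action("click", target="search_button")


def run_agent(goal: str, browser: FakeBrowser,
              policy: Callable[[str, dict], Action],
              max_steps: int = 10) -> bool:
    """Observe, decide, act, then observe again until done or out of steps."""
    for _ in range(max_steps):
        screen = browser.screenshot()      # capture the current state
        action = policy(goal, screen)      # analyze it and pick an action
        if action.kind == "done":
            return True
        browser.execute(action)            # act, then loop to re-observe
    return False


if __name__ == "__main__":
    browser = FakeBrowser()
    print(run_agent("oat milk", browser, toy_policy), browser.log)
```

Because the policy looks at a fresh observation on every pass, a wrong step shows up in the next screenshot and can be corrected. The max_steps cap is a choice made for this sketch so that a confused policy cannot loop forever.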

Operator can automate repetitive tasks such as creating shopping lists and navigating e-commerce sites, but it struggles with more complex interfaces like calendars and tables. OpenAI’s internal testing showed:

  • 87% success rate on the WebVoyager benchmark, which tests navigation on live websites like Amazon and Google Maps.
  • 58.1% success rate on WebArena, a benchmark that uses offline test sites.
  • 38.1% success rate on the OSWorld benchmark, which evaluates AI performance on computer operating system tasks. This surpasses prior models but remains well behind human performance at 72.4%.

Despite its limitations, OpenAI hopes that user feedback will help improve Operator’s reliability across a broader range of tasks.

OpenAI Operator Demo

A Growing Market for AI Agents

The release of Operator aligns with a broader trend in AI development, where companies are pushing toward agentic AI systems capable of automating user tasks. Competitors have already introduced similar tools:

  • Google’s Project Mariner, announced in December 2024, enables automated browser tasks through Chrome.
  • Anthropic’s "Computer Use" tool, released in October 2024, allows developers to automate web-based interactions using AI.

AI researcher Simon Willison noted similarities between Operator and Anthropic's Claude Computer Use demo, stating on his blog, "The Operator interface looks very similar to Anthropic's Claude Computer Use demo from October, even down to the interface with a chat panel on the left and a visible interface being interacted with on the right."

Security and Privacy Concerns

OpenAI emphasizes that all browsing activities within Operator occur in a virtual environment, with multiple safety measures in place:

  • User Confirmation: Sensitive actions like sending emails or making purchases require explicit user approval.
  • Browsing Restrictions: Operator is blocked from accessing certain website categories, including gambling and adult content.
  • Real-Time Moderation: OpenAI has implemented detection systems to prevent prompt injection attacks and malicious jailbreak attempts.
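The first two safeguards above amount to a gate that every proposed action passes through before it runs. A minimal sketch follows, assuming hypothetical names (review_action, BLOCKED_CATEGORIES, SENSITIVE_ACTIONS) and made-up category labels. It is not OpenAI's code, and it leaves out the real-time moderation layer entirely.

```python
from typing import Callable

# Illustrative values only; the real lists are not public in this article.
BLOCKED_CATEGORIES = {"gambling", "adult"}
SENSITIVE_ACTIONS = {"send_email", "purchase"}


def review_action(action: str, site_category: str,
                  confirm: Callable[[str], bool]) -> str:
    """Return "blocked", "declined", or "allowed" for a proposed action.

    confirm is a callback that asks the user and returns True on approval.
    """
    if site_category in BLOCKED_CATEGORIES:
        return "blocked"                      # browsing restriction
    if action in SENSITIVE_ACTIONS:
        # User confirmation: nothing sensitive runs without explicit approval.
        return "allowed" if confirm(action) else "declined"
    return "allowed"


if __name__ == "__main__":
    always_no = lambda action: False
    print(review_action("click", "shopping", always_no))
    print(review_action("purchase", "shopping", always_no))
    print(review_action("click", "gambling", always_no))
```

Checking the site category first means a blocked site is refused before the user is ever asked to confirm anything on it.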

However, Willison remains skeptical about the system’s security, writing, "Color me skeptical. I imagine we'll see all kinds of novel successful prompt injection style attacks against this model once the rest of the world starts to explore it."

OpenAI acknowledges these concerns in its System Card documentation, stating, "Despite proactive testing and mitigation efforts, certain challenges and risks remain due to the difficulty of modeling the complexity of real-world scenarios and the dynamic nature of adversarial threats."

To enhance user privacy, OpenAI has introduced additional safeguards:

  • Users can opt out of having their data used for model training.
  • Browsing data can be deleted with a single click in Operator’s settings.
  • A "takeover mode" ensures that sensitive details such as passwords and payment information are not captured.

Even with these measures, Willison advises users to be cautious, suggesting, "Start a fresh session for each task you outsource to Operator to ensure it doesn't have access to your credentials for any sites that you have used via the tool in the past. If you're having it spend money on your behalf, let it get to the checkout, then provide it with your payment details and wipe the session straight afterwards."

Looking Ahead

OpenAI's launch of Operator marks a step forward in AI-driven web automation, but its accuracy, security, and ethical implications remain key concerns. As agentic AI systems become more widely adopted, competition between OpenAI, Google, and Anthropic is set to intensify, with user feedback and security challenges shaping the future of web automation tools.

References

  • Benj Edwards (2025). OpenAI launches Operator, an AI agent that can do tasks on the web. Ars Technica. Available online. Accessed 2 February 2025.
  • Simon Willison (2025). Introducing Operator. Simon Willison’s Blog. Available online. Accessed 2 February 2025.

Paul Grieselhuber

Founder, President

Paul has an extensive background in software development and product design. He currently runs rendr.
