A couple of weeks ago, OpenAI launched Operator, an AI agent, currently in research preview, designed to perform tasks on behalf of users. Since I recently wrote about Google DeepMind’s AI agent, Project Mariner, I was eager to dive into OpenAI’s latest entry into the agentic AI space and share my thoughts.
At first glance, Operator looks very impressive. It features a slick ChatGPT-style chat interface and can handle complex online tasks. The demos showcase end-to-end booking of tickets for a basketball game, ordering ingredients for a recipe it found, and making a restaurant reservation, all with minimal user input.
How Operator Works
Operator works as a personal AI assistant to which you can delegate time-consuming online chores. At key steps, such as payments or choices that need your judgement (e.g. selecting seats at a venue), it asks you to step in, which keeps important decisions and sensitive actions under your control.
Similar to Project Mariner, Operator interacts with digital interfaces much like a human would: rather than simply analysing code, it looks at the screen and controls it to complete tasks. Unlike integrations that rely on site-specific APIs, Operator can in principle work with any website, because it interacts with the front end.
Powering Operator is Computer-Using Agent (CUA), a model that combines GPT-4o's vision capabilities with advanced reasoning. On the OSWorld benchmark, CUA achieves a 38.1% success rate in completing tasks—while human performance stands at 72.4%, indicating there is some room for improvement!
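To make the idea concrete, here is a minimal sketch of the look, decide, act loop described above, including the hand-over for sensitive steps. It is an illustration only, not OpenAI's implementation: the `Action` type, the `SENSITIVE_KINDS` set, and the four callbacks are hypothetical names invented for this example, and the "model" is just a fixed script.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed set of action kinds that are handed back to the user.
SENSITIVE_KINDS = {"login", "captcha", "payment"}


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "payment", "done"
    target: str  # description of the on-screen element involved


def run_agent(
    take_screenshot: Callable[[], str],
    propose_action: Callable[[str], Action],
    execute: Callable[[Action], None],
    ask_user: Callable[[Action], bool],
    max_steps: int = 20,
) -> List[str]:
    """Loop: look at the screen, choose an action, then act or hand over."""
    log: List[str] = []
    for _ in range(max_steps):
        screen = take_screenshot()        # perception: what is on screen now
        action = propose_action(screen)   # reasoning: pick the next step
        if action.kind == "done":
            log.append("done")
            break
        if action.kind in SENSITIVE_KINDS:
            # The user performs sensitive steps; stop if they decline.
            if not ask_user(action):
                log.append(f"stopped:{action.kind}")
                break
            log.append(f"user:{action.kind}")
        else:
            execute(action)               # the agent clicks or types itself
            log.append(f"agent:{action.kind}")
    return log


def scripted_run(user_approves: bool) -> List[str]:
    """Drive the loop with a fixed script instead of a real model or browser."""
    script = iter([
        Action("click", "Add to cart"),
        Action("payment", "Checkout form"),
        Action("done", ""),
    ])
    return run_agent(
        take_screenshot=lambda: "<screenshot>",
        propose_action=lambda screen: next(script),
        execute=lambda action: None,
        ask_user=lambda action: user_approves,
    )


print(scripted_run(user_approves=True))
# prints ['agent:click', 'user:payment', 'done']
```

The point of the sketch is the control flow: ordinary clicks are carried out by the agent, while anything in the sensitive set is routed to the user, and a refusal ends the run.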
OpenAI is collaborating directly with brands like OpenTable, Uber, and eBay to ensure Operator works well on these popular platforms. If no specific platform is mentioned in the prompt, Operator will aim to find the best available website to use.
Real-World Examples
One of the demo examples showcased Operator assisting with a linguine clam pasta recipe. The user asked Operator to find the necessary ingredients and purchase them via Instacart. The prompt specified which ingredients were already available in the kitchen—such as butter, vegetable oil, and pepper. Operator then proceeded to find a suitable recipe and add the required ingredients to the shopping cart.
What makes this revolutionary is how Operator navigates the web like a human. Instead of parsing website code, it interacts directly with platforms like AllRecipes.com. Users can observe its decision-making process through a text-based chain of thought. It also asks clarifying questions such as, “I found a recipe. Which store would you prefer to use?”
It keeps things safe by prompting users to take control when logging in, completing CAPTCHAs, or making purchases, while it handles the tedious work of browsing recipes and compiling shopping lists.
Another demo showed how easy it was to book a restaurant. Once a request is submitted, Operator navigates the site, clicks options, and checks availability. It comes back to the user with any questions or actions that are best handled by a person. For example, in the demo, Operator could not find a table at the preferred time, so it offered the closest match, 15 minutes later than requested. It then checked with the user that the new time was acceptable before finalising the booking.
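The "closest match, then confirm" behaviour in that demo can be sketched in a few lines. This is a guess at the logic, not a description of how Operator actually chooses a slot; the function names and the example date and times are made up for illustration.

```python
from datetime import datetime
from typing import List, Optional


def closest_slot(requested: datetime, available: List[datetime]) -> Optional[datetime]:
    """Return the available slot nearest to the requested time, or None."""
    if not available:
        return None
    # abs() on a timedelta gives the distance regardless of direction.
    return min(available, key=lambda slot: abs(slot - requested))


def needs_confirmation(requested: datetime, offered: Optional[datetime]) -> bool:
    """Anything other than an exact match goes back to the user."""
    return offered != requested


# Hypothetical request: 19:00, with no table free at exactly that time.
requested = datetime(2025, 2, 14, 19, 0)
available = [
    datetime(2025, 2, 14, 18, 30),
    datetime(2025, 2, 14, 19, 15),
    datetime(2025, 2, 14, 20, 0),
]
offer = closest_slot(requested, available)
print(offer, needs_confirmation(requested, offer))
# prints 2025-02-14 19:15:00 True
```

Here the 19:15 slot wins because it is only 15 minutes away, and because it differs from the request, the sketch flags it for confirmation rather than booking it outright.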
Custom Instructions
You can add custom instructions for a website to personalise your experience. For example, you could instruct Operator to always look for fully refundable rates and free breakfast when searching for hotels on the Priceline website. Once these preferences are set, the agent will keep them in mind when processing prompts. For instance, if you ask, “Find me a hotel in NYC for October 1st to October 7th,” Operator will consider your specific preferences while browsing Priceline. It will return with a list of matches and confirm with you before proceeding to checkout, ensuring the experience aligns with your needs.
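One way to picture per-site instructions is as a lookup table keyed by domain, whose entries get attached to the prompt whenever the agent works on that site. The sketch below assumes exactly that; the `CUSTOM_INSTRUCTIONS` dictionary and the `build_task` function are hypothetical and say nothing about how Operator stores or applies preferences internally.

```python
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical per-site preferences, keyed by domain.
CUSTOM_INSTRUCTIONS: Dict[str, List[str]] = {
    "priceline.com": [
        "Prefer fully refundable rates.",
        "Prefer hotels with free breakfast.",
    ],
}


def build_task(prompt: str, url: str) -> str:
    """Attach any saved preferences for the site to the user's prompt."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    preferences = CUSTOM_INSTRUCTIONS.get(host, [])
    if not preferences:
        return prompt  # no saved preferences for this site
    lines = [prompt, "Site preferences:"]
    lines += [f"- {preference}" for preference in preferences]
    return "\n".join(lines)


print(build_task(
    "Find me a hotel in NYC for October 1st to October 7th",
    "https://www.priceline.com/",
))
```

On a site with no saved preferences, the prompt is passed through unchanged, which matches the idea that custom instructions are opt-in per website.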
Saved Prompts
Operator allows users to save tasks for future use. In the top right corner of the interface, there is a “Save task” button that lets users store common requests, such as “Book a Friday dinner reservation.” Based on preset preferences, such as location and number of guests, Operator will automatically look for a reservation. In the demo, Operator asked which cuisine the user preferred before using OpenTable to complete the reservation.
A Leap Forward in Digital Accessibility
One of the most exciting aspects of AI agents like Operator is their potential impact on accessibility. Imagine the benefits for people who struggle to use a keyboard or mouse, whether because of permanent disabilities or temporary injuries, or who have visual impairments, such as colour blindness, that make website navigation difficult. Pair Operator with voice commands, and you have a powerful personal shopping and research assistant that removes digital barriers and enhances user independence.
Final Thoughts
Operator represents a major shift in how AI interacts with the web—one of the first agentic AI systems capable of directly taking action in the browser. Instead of just providing information, it actively completes tasks on behalf of users, making online interactions more efficient and accessible.
While the potential is exciting, there are also concerns. Safety remains a critical challenge. OpenAI says it has built on the safety measures developed for GPT-4o and requires user intervention for key actions like payments. However, broader testing will be needed to ensure security at scale.
Right now, Operator is still in research preview and only available to ChatGPT Pro users in the U.S. I wasn’t able to test it firsthand, but OpenAI has confirmed plans to expand availability and introduce more AI agents over time.
Despite these early limitations, Operator has the potential to redefine how we interact with the internet. As AI continues to evolve, systems like this could remove digital accessibility barriers, automate tedious tasks, and bring us closer to a future where technology truly works for us.