OpenAI launches Operator, an AI agent capable of operating a computer

New automated assistant promises to perform online tasks independently and may revolutionize interaction with digital interfaces

Key points:

  • OpenAI launches Operator, an AI capable of executing tasks in a browser.
  • The system uses the Computer-Using Agent (CUA) model, based on GPT-4o.
  • Operates through a remote interface and is exclusive to ChatGPT Pro subscribers in the U.S.
  • OpenAI claims it outperforms competitors like Anthropic and Google DeepMind.
  • The tool is seen as an advancement in the automation of everyday tasks.

OpenAI has officially unveiled Operator, its first artificial intelligence agent designed to perform tasks in a browser, such as booking event tickets or placing online grocery orders. The tool, powered by the Computer-Using Agent (CUA) model, represents a breakthrough in the ability of language models to interact directly with digital interfaces.

Operator is initially available only to ChatGPT Pro subscribers in the United States, a plan that costs $200 per month, and the company plans to expand access in the future. According to OpenAI, Operator outperforms direct competitors such as Anthropic's Computer Use and Google DeepMind's Mariner on benchmarks that assess an agent's ability to complete digital tasks.

Evolution of AI agents

Experts highlight that Operator represents an important step in the evolution of AI models, which are moving beyond just generating text and images to actually performing actions. Ali Farhadi, CEO of the Allen Institute for AI (AI2), emphasizes that interacting with digital interfaces is a promising approach, as it balances technological feasibility with practical impact for users.

Performance and operation

Operator works by interpreting screenshots of the user interface, analyzing graphical elements to decide what actions to take, similar to how a person interacts with a browser. This eliminates the need for specific APIs for each service, expanding its usability.
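The screenshot-driven approach described above can be pictured as a simple observe, decide, act loop. The following Python sketch is purely illustrative and is not OpenAI's implementation: the names Action, run_agent, take_screenshot, choose_action and perform are hypothetical, and the "model" is replaced by a scripted stand-in so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class Action:
    """A hypothetical UI action the agent might emit."""
    kind: str       # "click", "type", or "done"
    x: int = 0      # screen coordinates, used by "click"
    y: int = 0
    text: str = ""  # text to enter, used by "type"


def run_agent(
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: look at the screen, pick an action, carry it out."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()               # observe the interface
        action = choose_action(screenshot, history)  # decide (a model would go here)
        if action.kind == "done":
            break
        perform(action)                              # act on the interface
        history.append(action)
    return history


def demo():
    """Run the loop against a fake screen and a scripted policy."""
    scripted: Iterator[Action] = iter([
        Action("click", x=120, y=40),
        Action("type", text="two tickets"),
        Action("done"),
    ])
    performed: List[Action] = []
    history = run_agent(
        take_screenshot=lambda: b"fake-png-bytes",
        choose_action=lambda shot, hist: next(scripted),
        perform=performed.append,
    )
    return history, performed
```

Because the loop only needs pixels in and clicks or keystrokes out, nothing in it depends on a service-specific API, which is the point the article makes about usability.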

In OpenAI's internal tests, CUA scored 38.1% on the OSWorld benchmark, surpassing Computer Use, which scored 22%, though still far from the 72.4% achieved by humans. On another benchmark, WebVoyager, the model scored 87%, outperforming Mariner at 83.5% and Computer Use at 56%.
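To put those figures side by side, the short Python sketch below computes the gap, in percentage points, between CUA and each other entry. The scores are the ones reported above; the scores table and the gaps_vs_cua helper are names made up for this illustration.

```python
# Scores (in percent) as reported in the article.
scores = {
    "OSWorld": {"CUA": 38.1, "Computer Use": 22.0, "Human": 72.4},
    "WebVoyager": {"CUA": 87.0, "Mariner": 83.5, "Computer Use": 56.0},
}


def gaps_vs_cua(benchmark: str) -> dict:
    """Return CUA's lead (positive) or deficit (negative) in percentage points."""
    cua = scores[benchmark]["CUA"]
    return {
        name: round(cua - score, 1)
        for name, score in scores[benchmark].items()
        if name != "CUA"
    }


print(gaps_vs_cua("OSWorld"))     # {'Computer Use': 16.1, 'Human': -34.3}
print(gaps_vs_cua("WebVoyager"))  # {'Mariner': 3.5, 'Computer Use': 31.0}
```

On these numbers, CUA's lead over Mariner on WebVoyager is 3.5 points, while its deficit to human performance on OSWorld is 34.3 points.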

Security and collaborations

OpenAI says it has implemented security measures to prevent misuse, including tests by specialized teams to detect vulnerabilities and training the system to request confirmation before executing certain tasks.
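The confirmation step can be illustrated with a small gate placed in front of sensitive actions. This Python sketch is an assumption-laden illustration, not a description of how Operator is built: the SENSITIVE_KINDS set and the execute_with_confirmation function are hypothetical, and OpenAI has not published which tasks trigger a prompt.

```python
from typing import Callable

# Hypothetical categories that would require the user's approval.
SENSITIVE_KINDS = {"purchase", "send_message", "submit_form"}


def execute_with_confirmation(
    kind: str,
    description: str,
    confirm: Callable[[str], bool],
    execute: Callable[[str], None],
) -> bool:
    """Run the action, asking the user first if its kind is sensitive.

    Returns True if the action ran and False if the user declined.
    """
    if kind in SENSITIVE_KINDS and not confirm(description):
        return False  # user declined, so nothing is executed
    execute(description)
    return True
```

In this sketch a call such as execute_with_confirmation("purchase", "Buy two tickets", ask_user, run), where ask_user and run are supplied by the caller, would only proceed if ask_user returns True, while a non-sensitive kind such as "scroll" would run without a prompt.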

Operator also has strategic partnerships with services like OpenTable, StubHub, Instacart, DoorDash, and Uber. Although details of these collaborations have not been disclosed, the AI automatically suggests websites for certain tasks, such as restaurant reservations and online shopping.

Future of digital automation

OpenAI plans to expand Operator's reach by allowing developers to integrate CUA into other applications via an API, further increasing automation possibilities.

The impact of this technology could be significant, reducing the time spent on repetitive tasks. Yash Kumar, an OpenAI researcher, shared his personal experience with the tool, highlighting how it simplifies grocery list organization and restaurant reservations. According to him, Operator streamlines daily processes by enabling tasks to be performed with a simple command, saving users time and effort.

With major AI companies increasingly investing in this type of technology, the competition for dominance in the sector is now shifting to autonomous agents capable of operating our computers. The future of digital automation may be closer to everyday reality than ever before.