Google AI Introduces Gemini 2.5 ‘Computer Use’ (Preview): A Browser-Control Model to Power AI Agents to Interact with User Interfaces

By Ricardo, October 8, 2025

Which of your browser workflows would you delegate today if an agent could plan and execute predefined UI actions? Google AI has introduced Gemini 2.5 Computer Use, a specialized variant of Gemini 2.5 that plans and executes real UI actions in a live browser through a constrained action API. It is available in public preview via Google AI Studio and Vertex AI. The model targets web automation and UI testing, with documented, human-judged gains on standard web/mobile control benchmarks and a safety layer that can require human confirmation for risky steps.

What does the model actually ship?

Developers call a new computer_use tool that returns function calls such as click_at, type_text_at, or drag_and_drop. Client code executes the action (e.g., with Playwright or Browserbase), captures a fresh screenshot and URL, and loops until the task ends or a safety rule blocks it. The supported action space consists of 13 predefined UI actions (open_web_browser, wait_5_seconds, go_back, go_forward, search, navigate, click_at, hover_at, type_text_at, key_combination, scroll_document, scroll_at, drag_and_drop) and can be extended with custom functions (e.g., open_app, long_press_at, go_home) for non-browser surfaces.
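The request, execute, observe loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Gemini SDK: FunctionCall, Observation, run_agent, scripted_model, and fake_execute are hypothetical names, the model is replaced by a scripted stand-in, and the executor only echoes a URL where real client code would drive a browser. Only the 13 action names come from the article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional

# The 13 predefined UI actions listed in the article.
PREDEFINED_ACTIONS = (
    "open_web_browser", "wait_5_seconds", "go_back", "go_forward", "search",
    "navigate", "click_at", "hover_at", "type_text_at", "key_combination",
    "scroll_document", "scroll_at", "drag_and_drop",
)


@dataclass
class FunctionCall:
    """One action proposed by the model (hypothetical shape)."""
    name: str
    args: Dict[str, object] = field(default_factory=dict)


@dataclass
class Observation:
    """What the client sends back after executing an action."""
    screenshot: bytes
    url: str


def run_agent(
    goal: str,
    model_step: Callable[[str, Observation], Optional[FunctionCall]],
    execute: Callable[[FunctionCall], Observation],
    observation: Observation,
    custom_actions: Iterable[str] = (),
    max_steps: int = 20,
) -> List[FunctionCall]:
    """Ask for an action, execute it, observe, and repeat until done."""
    allowed = set(PREDEFINED_ACTIONS) | set(custom_actions)
    history: List[FunctionCall] = []
    for _ in range(max_steps):
        call = model_step(goal, observation)
        if call is None:  # assumption: None means the task has ended
            break
        if call.name not in allowed:
            raise ValueError(f"unsupported action: {call.name}")
        observation = execute(call)  # fresh screenshot + URL for the next turn
        history.append(call)
    return history


def scripted_model(script: List[FunctionCall]):
    """Stand-in for the model: replays a fixed list of calls, then stops."""
    remaining = iter(script)
    return lambda goal, observation: next(remaining, None)


def fake_execute(call: FunctionCall) -> Observation:
    """Stand-in executor; real code would drive a browser here."""
    return Observation(screenshot=b"", url=str(call.args.get("url", "about:blank")))


if __name__ == "__main__":
    script = [
        FunctionCall("navigate", {"url": "https://example.com"}),
        FunctionCall("click_at", {"x": 120, "y": 340}),
    ]
    start = Observation(screenshot=b"", url="about:blank")
    done = run_agent("open the example page", scripted_model(script), fake_execute, start)
    print([call.name for call in done])
```

Passing custom_actions=["long_press_at"] widens the allowed set in the same way the article describes custom functions extending the predefined actions for non-browser surfaces.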

https://blog.google/technology/google-deepmind/gemini-computer-use-model/

What are the scope and constraints?

The model is optimized for web browsers. Google states it is not yet optimized for desktop OS-level control; mobile scenarios work by swapping in custom actions under the same loop. A built-in safety monitor can block prohibited actions or require user confirmation before “high-stakes” operations (payments, sending messages, accessing sensitive data).
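On the client side, the block-or-confirm behavior might be handled with a small gate in front of the executor. The sketch below assumes each proposed action arrives with one of three safety outcomes; SafetyDecision, gate_action, and the ask_user callback are hypothetical names for illustration and do not reflect the actual response format of the API.

```python
from enum import Enum
from typing import Callable


class SafetyDecision(Enum):
    """Hypothetical outcomes a safety monitor could attach to an action."""
    ALLOW = "allow"
    REQUIRE_CONFIRMATION = "require_confirmation"
    BLOCK = "block"


def gate_action(
    decision: SafetyDecision,
    action_name: str,
    ask_user: Callable[[str], bool],
) -> bool:
    """Return True only if the action may be executed."""
    if decision is SafetyDecision.BLOCK:
        return False  # prohibited actions are never executed
    if decision is SafetyDecision.REQUIRE_CONFIRMATION:
        # High-stakes step: a human must approve before execution.
        return bool(ask_user(f"Allow the agent to run '{action_name}'?"))
    return True


if __name__ == "__main__":
    always_decline = lambda prompt: False
    print(gate_action(SafetyDecision.ALLOW, "click_at", always_decline))
    print(gate_action(SafetyDecision.REQUIRE_CONFIRMATION, "type_text_at", always_decline))
```

Keeping the confirmation prompt behind a callback lets the same gate serve an interactive session and an automated test run that declines by default.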

Measured performance

Online-Mind2Web (official): 69.0% pass@1 (majority-vote human judgments), validated by the benchmark organizers.
Browserbase matched harness: Leads competing computer-use APIs on both accuracy and latency across Online-Mind2Web and WebVoyager under identical time/step/environment constraints. Google’s model card lists 65.7% (OM2W) and 79.9% (WebVoyager) in Browserbase runs.
Latency/quality trade-off (Google figure): ~70%+ accuracy at ~225 s median latency on the Browserbase OM2W harness. Treat this as Google-reported, with human evaluation.
AndroidWorld (mobile generalization): 69.7% as measured by Google; achieved via the same API loop with custom mobile actions and browser actions excluded.
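To make "pass@1 with majority-vote human judgments" concrete, here is one way such a score could be computed. It assumes each task gets a single attempt that several human judges mark as success or failure, with a strict majority required to pass; the benchmark's exact protocol is not described in the article, and the sample votes below are invented purely for illustration.

```python
from typing import Dict, List


def majority_success(votes: List[bool]) -> bool:
    """A task passes when strictly more than half of the judges say so."""
    return sum(votes) * 2 > len(votes)


def pass_at_1(judgments: Dict[str, List[bool]]) -> float:
    """Fraction of tasks whose single attempt passes by majority vote."""
    if not judgments:
        raise ValueError("no tasks to score")
    passed = sum(majority_success(votes) for votes in judgments.values())
    return passed / len(judgments)


if __name__ == "__main__":
    # Illustrative votes only; these are not benchmark results.
    sample = {
        "task_a": [True, True, False],
        "task_b": [False, False, True],
        "task_c": [True, True, True],
        "task_d": [True, False, False],
    }
    print(f"pass@1 = {pass_at_1(sample):.1%}")
```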

Early production signals

Automated UI test repair: Google’s payments platform team reports that the model rehabilitates more than 60% of previously failing automated UI test executions. This figure is attributed to public reporting rather than the core blog post and should be cited accordingly.
Operational speed: Poke.com (an early external tester) reports that workflows are often ~50% faster than with their next-best alternative.

Editorial Comments

Gemini 2.5 Computer Use is in public preview via Google AI Studio and Vertex AI; it exposes a constrained API with 13 documented UI actions and requires a client-side executor. Google’s materials and the model card report state-of-the-art results on web/mobile control benchmarks, and Browserbase’s matched harness shows ~65.7% pass@1 on Online-Mind2Web with leading latency under identical constraints. The scope is browser-first with per-step safety/confirmation. These data points justify measured evaluation in UI testing and web ops.


The post Google AI Introduces Gemini 2.5 ‘Computer Use’ (Preview): A Browser-Control Model to Power AI Agents to Interact with User Interfaces appeared first on MarkTechPost.

