“More intelligence is available, for everyone, everywhere. And the world is responding, adopting AI faster than ever before…What all this progress means is that we’re in a new phase of the AI platform shift. Where decades of research are now becoming a reality for people, businesses and communities all over the world,” said Sundar Pichai, chief executive officer (CEO) of Google and Alphabet.
Pichai cited Project Starline, a 3D video streaming technology unveiled a few years ago, as the foundation for the new and precise Google Beam AI video communications platform, which rolls out later this year on HP’s computing devices. Among its claimed party pieces is head-movement tracking accurate to the millimetre.
AI agents proved to be a continuing theme, one that OpenAI, IBM, Anthropic and Microsoft have also recently made a case for.
“Our recent updates to Gemini are critical steps towards unlocking our vision for a universal AI assistant, one that's helpful in your everyday life, that's intelligent and understands the context you're in, and that can plan and take actions on your behalf across any device. This is our ultimate goal for the Gemini app, an AI that's personal, proactive and powerful,” noted Demis Hassabis, CEO of Google DeepMind, in a session of which HT was a part.
For Google, AI agents will be the result of a multi-pronged approach: the Gemini 2.5 models gain enhanced reasoning, the Gemini app adds video understanding alongside Canvas for creative coding and podcast creation, and the new video generation model Veo 3 and image generator Imagen 4 become available within the app, all of which is meant to eventually lead to a universal AI.
This builds on Project Astra, which gives AI situational context through capabilities such as video understanding, screen sharing and memory.
Google said Gemini, including its apps for Android and iOS, has crossed 400 million monthly active users, and 7 million developers worldwide are building apps with its models. The agent vision also draws on Project Mariner, which, as Hassabis explained, “explores the future of human-agent interaction, starting with browsers”.
This now includes a system of agents that can complete up to ten different tasks at a time. Hassabis said these tasks, which run in parallel, can include looking up information, making bookings, buying things and researching a topic.
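Google has not published Project Mariner’s internals, so the following is only a minimal illustrative sketch of the general idea of dispatching up to ten independent agent tasks concurrently; the function and task names are hypothetical.

```python
# Hypothetical sketch: illustrates running several agent tasks in
# parallel, as the article describes. Not Google's implementation.
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_PARALLEL_TASKS = 10  # the article cites "up to ten" tasks at a time

def run_agent_task(task: str) -> str:
    # Placeholder for one agent task (looking up information, making
    # a booking, buying something, researching a topic). A real system
    # would drive a browser and call a model; here we just echo it.
    return f"completed: {task}"

tasks = [
    "look up train timings to Pune",
    "book a table for two on Friday",
    "research mirrorless cameras under $1,000",
]

with ThreadPoolExecutor(max_workers=MAX_PARALLEL_TASKS) as pool:
    futures = [pool.submit(run_agent_task, t) for t in tasks]
    for future in as_completed(futures):
        print(future.result())
```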
Alongside this, Gemini Live, with camera and screen sharing, is now available for all users on the free tier, on Android devices as well as the Apple iPhone. “In the coming weeks, Gemini Live will integrate more deeply into your daily life. Planning a night out with friends? Discuss the details in Gemini Live, and it instantly creates an event in your Google Calendar,” explained Hassabis, detailing integration plans for Google Maps, Tasks and Keep too.
Google estimated earlier that its rival OpenAI’s ChatGPT had roughly 600 million monthly users in March. Meta’s Mark Zuckerberg claimed in September that Meta AI was then nearing 500 million monthly users.
Incoming improvements for Gemini 2.5 Pro add new reasoning capabilities with a Deep Think mode. Its specific focus on complex math and coding tasks will be relevant to Gemini’s march towards an ‘agentic AI’ vision. This focus on sophisticated reasoning aligns with a wider industry trend towards AI that can not only generate content but also perform complex problem-solving; OpenAI’s o1, Anthropic’s Claude and xAI’s Grok 3 are examples.
“Since incorporating LearnLM, our family of models built with educational experts, 2.5 Pro is also now the leading model for learning. In head-to-head comparisons evaluating its pedagogy and effectiveness, educators and experts preferred Gemini 2.5 Pro over other models across a diverse range of scenarios,” said Koray Kavukcuoglu, chief technology officer (CTO) of Google DeepMind.
The lighter Gemini 2.5 Flash receives improved reasoning, multimodality, code and long context. For now, the updated 2.5 Flash is available as ‘experimental’ in Google AI Studio for developers, in Vertex AI for enterprises, and the Gemini app for everyone — its final release is pegged for early June.
Playing a crucial part in Google’s universal AI assistant development is the company’s Search platform. An AI Mode is being added to Search, starting with users in the US, utilising Gemini’s frontier capabilities for advanced reasoning and multimodality.
Liz Reid, vice president and head of Google Search, said AI Mode will use a query fan-out technique to break down a user’s question into subtopics. “This enables Search to dive deeper into the web than a traditional search on Google, helping you discover even more of what the web has to offer and find incredible, hyper-relevant content that matches your question,” said Reid. Joining visual search pursuits alongside Google Lens is Search Live, which will allow a user to point the phone’s camera at anything around them to begin a search and carry it on conversationally.
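Google has not detailed how AI Mode implements query fan-out; as a rough sketch of the concept under that caveat, the premise is to split a broad question into subtopics, search each in parallel, and merge the results. The decompose and search functions below are stand-ins, not real APIs.

```python
# Hypothetical sketch of the "query fan-out" idea described above:
# decompose a question into subtopics, search each, merge results.
from concurrent.futures import ThreadPoolExecutor

def decompose(question: str) -> list[str]:
    # A real system would use a language model to generate subtopics;
    # this placeholder returns fixed facets for illustration.
    return [
        f"{question} - pricing",
        f"{question} - reviews",
        f"{question} - alternatives",
    ]

def search(subquery: str) -> list[str]:
    # Placeholder for one search; a real system would query the
    # index and rank results for this subtopic.
    return [f"result for '{subquery}'"]

def answer(question: str) -> list[str]:
    subqueries = decompose(question)
    # Fan out: issue the subqueries in parallel.
    with ThreadPoolExecutor() as pool:
        result_lists = pool.map(search, subqueries)
    # Merge the per-subquery results into one list.
    return [r for results in result_lists for r in results]

print(answer("best mirrorless camera for travel"))
```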