Virtual assistants such as Siri, Alexa, and Google Assistant have become household names, and they gave many of us a first glimpse of artificial intelligence (AI) and voice recognition technology. In light of recent advances in generative AI, however, these legacy assistants look far less impressive than they once did. Integrating AI-powered chatbots like ChatGPT and Google Bard into virtual assistants holds the promise of a more capable future.
To get a taste of what lies ahead, I tested an experimental AI voice helper called vimGPT. When I asked it to “subscribe to WIRED,” it found the correct web page and opened the online subscription form. The task is simple for a human, but an online transaction involves several steps that are beyond mainstream virtual assistants: deciphering the user’s request, navigating the web, and interacting with the relevant pages and forms. That my helper reached WIRED’s subscription page at all shows the potential of future AI integration.
The vimGPT voice assistant runs on Chromium, Google’s open-source browser, and does not store user information. It excelled at finding funny cat videos and cheap flights, but it stumbled at the final hurdle of my subscription request because it had no credit card details to enter. vimGPT was built by a lone developer, Ishan Shah, and is currently an experimental open-source program, but technology giants such as Apple and Google are likely conducting similar experiments to improve their own virtual assistants.
vimGPT is built on GPT-4V, the multimodal version of OpenAI’s GPT-4 language model. Because the model can take in what a web page looks like as well as the user’s request, the assistant can determine what to click or type, where text-only software struggles to decipher complex website structures. Advances like these could significantly change how people use computers. Shah predicts that within a year, browsing the web will involve less clicking and more conversing, with virtual assistants playing an integral role.
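The click-or-type behavior described above can be pictured as a loop in which the assistant looks at the current page, picks one action, and repeats. The Python sketch below is only an illustration of that general pattern, not vimGPT's actual code: every name in it (Action, Page, stub_model, toy_browser, run_agent) is hypothetical, the "model" is a hard-coded rule standing in for a multimodal model call, and the "browser" is a toy with two pages.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str         # "click", "type", or "done"
    target: str = ""  # label of the page element to act on
    text: str = ""    # text to enter, used only by "type" actions


@dataclass
class Page:
    """Toy stand-in for a rendered web page (hypothetical, not a real browser)."""
    name: str
    elements: List[str]


def stub_model(request: str, page: Page) -> Action:
    """Placeholder for a multimodal model call.

    A real assistant would send the request and a view of the page to a
    model; this stub uses a fixed rule so the example runs offline.
    """
    if page.name == "home" and "Subscribe" in page.elements:
        return Action("click", "Subscribe")
    # On any other page, including the form, the stub stops.
    return Action("done")


def toy_browser(page: Page, action: Action) -> Page:
    """Apply an action to the toy page and return the page that results."""
    if action.kind == "click" and action.target == "Subscribe":
        return Page("subscribe_form", ["Email", "Card number"])
    return page


def run_agent(
    request: str,
    page: Page,
    model: Callable[[str, Page], Action],
    browser: Callable[[Page, Action], Page],
    max_steps: int = 5,
) -> List[Action]:
    """Observe the page, ask the model for one action, apply it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = model(request, page)
        history.append(action)
        if action.kind == "done":
            break
        page = browser(page, action)
    return history


if __name__ == "__main__":
    start = Page("home", ["Subscribe", "Search"])
    for step in run_agent("subscribe to WIRED", start, stub_model, toy_browser):
        print(step.kind, step.target)
```

In this toy run the agent clicks through to the subscription form and then stops, much as the test above ended at the form for want of payment details. The max_steps cap is a design choice for the sketch, so that a confused model cannot loop forever.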
Shah is not alone in seeing this as the next logical step for virtual assistants. Ruslan Salakhutdinov, a professor at Carnegie Mellon University and former director of AI research at Apple, believes that Siri and other assistants are ripe for a powerful AI upgrade. According to Salakhutdinov, the future lies in agents capable of completing useful tasks. Connecting Siri to AI similar to that powering ChatGPT would help, but the real impact would come from empowering virtual assistants to proactively solve problems on users’ behalf.
Salakhutdinov and his students have developed simulated environments, collectively known as VisualWebArena, to test and refine the capabilities of AI helpers. These simulated environments include a dummy ecommerce website, a Reddit-like message board, and a classified ads website. By subjecting AI agents to these virtual challenges, researchers aim to enhance their ability to accomplish tasks. This research indicates a shift towards agents that can navigate the web and complete complex actions independently.
The future of virtual assistants holds great promise. As generative AI and multimodal language models continue to advance, assistants are likely to change significantly, and the limitations of today’s Siri and Alexa may come to look dated. If agents like these mature, users could hand off tasks that now require clicking through pages and forms themselves. The transformative potential of AI-powered virtual assistants is only beginning to unfold.