
Matt Wolfe: OpenAI has made its deep research feature available on the free plan, and new AI models and features are being developed across various platforms.


Matt Wolfe - Every Major AI Update This Week in One Video

OpenAI has introduced a lightweight version of its deep research feature to free users, allowing five uses per month. This feature performs in-depth internet searches and is also available in a limited capacity to Plus, Team, and Pro users. Additionally, OpenAI plans to release a new model in June that can be run locally, potentially outperforming models from Meta and DeepSeek. Meanwhile, Microsoft has updated its 365 Copilot with new features like AI-powered search and an agent store, and Perplexity has launched a Perplexity Assistant in its iOS app, offering Siri-like functionality. In other developments, the Washington Post has partnered with OpenAI for content search, and YouTube is testing an AI overview feature to highlight video clips. Anthropic has released essays on AI harms and interpretability, emphasizing the need for understanding AI's impacts beyond doomsday scenarios. New APIs from OpenAI and Grok are available, and Adobe has updated its Firefly app with new image models. Companies like Krea AI and Tencent are advancing in AI-generated 3D environments and models, while Character AI and Argil are enhancing AI avatars for interactive experiences. Descript is testing AI video editing features, and the Oscars have stated that AI tools neither help nor harm film nominations.

Key Points:

OpenAI's deep research feature is now available on the free plan with limited uses, enhancing internet search capabilities.
Microsoft's 365 Copilot has new AI-powered features for better data management and search.
Perplexity Assistant offers enhanced Siri-like functionalities in its iOS app.
YouTube is testing AI-generated video clip highlights to improve search results.
Anthropic emphasizes understanding AI's broader impacts and improving model interpretability.

Details:

1. 🔍 OpenAI's Deep Research Update

1.1. Deep Research Feature on Free Plan

1.2. Lightweight Version for Plus, Team, and Pro Users

2. 🆕 Upcoming OpenAI Model Rumors

The new OpenAI model is expected to be released around June or early summer and will be available for free download, allowing it to run on local machines without needing API access.
The model is anticipated to outperform open models from Meta and DeepSeek, potentially matching the 10 million token context window of Meta's Llama 4 model.
A unique feature may let the model call other OpenAI models through the API for complex queries, handing off work that exceeds what the local model can do on its own.
Uncertainty remains about the model's ability to perform web searches or image generation independently.
Running the model locally could mitigate privacy concerns, offering users the ability to disconnect from the internet and avoid cloud-based processing.
Local operation could enhance computational efficiency and user autonomy, addressing privacy and data security issues.

3. 📰 Washington Post Partners with OpenAI

The Washington Post and OpenAI partnership allows OpenAI's search features to access the Post's content, enhancing AI-driven content accessibility.
This collaboration is strategically aimed at reducing potential legal issues related to unauthorized content use by OpenAI.
OpenAI is proactively forming partnerships with news outlets to legally access content, preventing litigation and fostering cooperative relationships.
Such partnerships could redefine the media landscape by integrating AI capabilities more deeply into content distribution and accessibility, offering mutual benefits to both news organizations and AI developers.

4. 📱 Perplexity Assistant Launch

4.1. Key Features

4.2. Performance

5. 💼 Microsoft 365 Copilot Enhancements

AI-powered search capabilities have been introduced, allowing users to find information faster and more efficiently, potentially reducing time spent on manual searches by a significant margin.
The 'Create Experience' feature, including Copilot Notebooks, helps users maintain context and organize information effectively during interactions with AI, aiming to streamline workflow and improve productivity.
The new Agent Store provides access to a variety of agents, including connections to popular APIs with tools like monday.com, Dropbox, and Trello, expanding the integration possibilities for users.
Task-specific agents such as 'Researcher' and 'Analyst' are optimized for conducting deep research and analyzing data from Excel and Word documents, enhancing the analytical capabilities within Microsoft 365.
Demonstrated capabilities include creating data visualizations, such as bar charts, directly from Microsoft account data, showcasing the practical application of these AI enhancements.

6. 🖥️ Microsoft's Recall Feature

Microsoft's Recall feature is being rolled out. It works like a computer-wide browser history, letting users revisit previous actions across multiple apps such as DaVinci Resolve or Microsoft Word.
Security concerns, such as password visibility, have been addressed with an opt-in feature that gives users control over participation and data saved.
Data is processed locally on the device, not shared with Microsoft or stored in the cloud, ensuring privacy.
Recall enhances productivity by allowing users to resume work seamlessly, without relying on memory, by retrieving digital history quickly.
AI running on neural processing units improves search, so users can find past content even without knowing specific file names.
The 'click to do' feature enables summarizing, rewriting, or copying and pasting text and images, enhancing user productivity.

7. 📸 Grok Chatbot Gains Vision

The Grok chatbot now includes vision functionality similar to Gemini and OpenAI models, enhancing its capability to interpret visual data.
This vision feature is integrated into the Grok mobile app, allowing users to interact with their environment through the app.
Users can activate the camera feature by tapping the camera icon in the chat window, enabling Grok to analyze and describe the visual surroundings.
In a practical example, Grok accurately identified a user's workstation setup, including a monitor displaying a menu, a camera, and other tech gear.
Grok was also able to describe a scenic picture with specific details, such as a coastal landscape, beach, waves, and greenery, demonstrating its ability to understand and articulate visual content.
Grok could not identify the exact location of the picture, although it made an educated guess based on visual cues.
The Grok 3 model has been praised for its impressive capabilities, though it lacked an API for broader tool integration, which may have limited its recognition and application.

8. 🎥 LTX Studio's New Video Generation

8.1. Pricing and Cost Efficiency

8.2. Features and Flexibility

9. 🕶️ Ray-Ban Meta's Live Translation

Ray-Ban Meta has introduced live translation features in their glasses, allowing real-time translation through the integrated headphones.
Users can hear translations in their preferred language as someone speaks to them in another language, such as hearing English while being spoken to in Spanish.
The feature is available even offline if users download a language pack in advance, addressing connectivity issues.
The live translation feature was positively received during testing at Meta Connect, showcasing its potential for real-world application.

10. 🎬 YouTube's AI Overview Test

YouTube is testing an AI overview feature that highlights video clips relevant to users' search queries directly in the search results.
Unlike Google's AI text summaries, YouTube's AI will highlight and play short clips from videos, which might reduce the need for users to click into videos.
The feature is currently being tested with a small group of YouTube Premium users in English.
The AI will not summarize the entire video but will pull specific clips that address the search query, potentially answering user questions directly from the search results.
This feature could disincentivize creators as it may reduce video clicks, impacting viewer engagement and potentially affecting creators' revenue.
The AI identifies relevant clips using advanced algorithms to ensure accuracy and relevance to the search query.
Feedback from the testing phase indicates that users find this feature helpful for quick information retrieval, while creators express concerns about reduced video engagement.
This initiative aligns with YouTube's strategy to enhance user experience by providing instant, relevant content in search results, differentiating it from traditional text-based summaries.

11. 📑 Anthropic's AI Safety Essays

Anthropic advocates for addressing a broad spectrum of AI harms, including physical, psychological, economic, societal, and impacts on individual autonomy, beyond just catastrophic scenarios.
Adjustments to Claude 3.7 have resulted in a 45% reduction in refusals of harmless prompts, while still effectively filtering out dangerous ones.
An article by Anthropic emphasizes the importance of detecting and countering malicious AI uses, such as political bot farms and malware creation, highlighting the necessity for consumer vigilance against AI-generated misinformation.
In his essay 'The Urgency of Interpretability,' CEO Dario Amodei calls for enhanced understanding of AI models to better manage risks, suggesting the development of tools to map AI reasoning processes.
Anthropic prioritizes understanding AI mechanisms over rapid deployment, which may explain their slower pace compared to competitors like OpenAI.

12. 🖼️ OpenAI's Image Generation API

OpenAI has launched an image generation model in their API, enabling developers to incorporate advanced image creation features into their applications seamlessly.
The API includes sophisticated technology previously utilized for creating Studio Ghibli-style images and generating YouTube thumbnails within ChatGPT, showcasing its versatile application potential.
The newly introduced Grok 3 Mini API shows significant improvements, outperforming models such as Gemini 2.5 Flash, o4-mini-high, DeepSeek R1, and the Claude 3.7 Sonnet Thinking model in specific benchmarks.
This enhancement suggests that the Grok 3 Mini API is particularly effective in scenarios demanding high-quality image generation and creative visual tasks, providing developers with a competitive edge in these areas.
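For the OpenAI image generation API mentioned at the start of this section, a request might look roughly like the sketch below. It assumes the model in question is `gpt-image-1` served from the public `/v1/images/generations` endpoint, which the summary does not state explicitly. The helper names `build_image_request` and `decode_first_image` are hypothetical and exist only for illustration. The sketch builds the JSON body and decodes a base64 image from a response; it does not send anything over the network.

```python
import base64
import json

# Public Images endpoint (assumed to be the one the summary refers to).
API_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Build the JSON body for one image generation request (hypothetical helper)."""
    # "gpt-image-1" is an assumption about which model the episode describes.
    return {"model": "gpt-image-1", "prompt": prompt, "size": size}


def decode_first_image(response_body: str) -> bytes:
    """Decode the first base64-encoded image in a JSON response (hypothetical helper)."""
    payload = json.loads(response_body)
    return base64.b64decode(payload["data"][0]["b64_json"])


if __name__ == "__main__":
    body = build_image_request("a YouTube thumbnail of a robot reading the news")
    # This body would be POSTed to API_URL with an Authorization: Bearer header.
    print(json.dumps(body, indent=2))
```

In a real application the body would be sent with an API key and the decoded bytes written to an image file; both steps are left out here to keep the sketch self-contained.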

13. 🖌️ Adobe Firefly Update

13.1. Model Options and Features

13.2. User Feedback and Use Cases

14. 🎭 Krea AI's 3D Stage Feature

Krea AI's new feature allows users to edit images seamlessly within chat using the GPT image model, offering creative transformations such as 'Ghiblified' and 'frogified' versions, enhancing user engagement and creative output.
The 'Stage' feature empowers users to generate and modify 3D environments from images or text prompts, such as creating a cowboy movie scene with various manipulatable assets, significantly enhancing the creative process and user interactivity by allowing personalization and dynamic scene adjustments.

15. 🆕 Tencent's Hunyuan 3D Model

Tencent released the Hunyuan 3D 2.5 model, expanding from 1 billion to 10 billion parameters, indicating a significant increase in complexity and capability.
The model features high-quality textures and an animation boost, enhancing its visual and functional performance.
The demo video highlights its impressive capabilities, suggesting potential applications such as integration into Krea's tools via API.
Compared to its predecessor, Hunyuan 3D 2.5 offers substantial improvements in both texture detail and animation fluidity, making it a powerful tool for developers.

16. 💡 Character AI's Video Generation

16.1. Character AI's Video Generation Feature

16.2. Character AI's Avatar Effects

17. 🛒 Argil's Product Holding Avatars

17.1. Introduction of AI Product Holding Avatars

17.2. Examples and Current Status

17.3. Use Cases for Branding

17.4. Creating Mascots with AI

17.5. Market Implications and Comparisons

18. 👄 Tavus Lip-Syncing Model

The Tavus lip-syncing model is currently regarded as the best available model for lip-syncing.
Despite its high standing, the AI-generated lip-syncing still presents a sense of uncanniness, which some users find unnatural.
Extensive testing over 'a thousand prompts' reveals that the alignment of lip movements still appears slightly off, contributing to this uncanniness.
A specific demo using Donald Trump's voice highlighted that while improvements have been made, the synchronization of lip movements is not fully perfected, affecting the overall naturalness.
Comparatively, other models may offer better alignment or naturalness, suggesting areas for further development and benchmarks for improvement.

19. ✂️ Descript's AI Video Editing

Descript is developing AI features to enhance video editing, aiming to create an intuitive 'cursor' for AI-driven edits.
The AI can autonomously generate a script draft from user prompts, such as creating a concise one-minute video with tips for appearing natural on camera.
In practice, the AI executed 13 edits and successfully condensed a video by approximately 2 minutes when prompted with 'Can you edit this down?'
Specific AI capabilities include adding chapter title cards, stock overlays, screen recordings, and masking jump cuts with zooms to enhance video quality.
While the technology is not publicly available yet, Descript offers applications for testing these cutting-edge AI features.

20. 🏆 Oscars and AI in Films

The Academy focuses on human creative authorship, which means generative AI and digital tools neither help nor harm a film's chances for Oscar nominations.
AI-generated scenes can be part of films, but fully AI-generated movies are unlikely to win awards, indicating a clear boundary in the Academy's current acceptance of AI in creative processes.
The Academy's guidelines suggest that while AI can assist in production, the primary creative input must come from humans, ensuring that awards recognize traditional creative efforts.
The evolving role of AI in film production is acknowledged, yet the emphasis remains on maintaining human authorship and creativity as the core criterion for awards.

21. 🌐 OpenAI's Interest in Chrome

21.1. Motivations for Acquiring Chrome

21.2. Implications of an AI-First Browser

22. 🤖 DeepMind CEO on AI Consciousness

22.1. CEO's Insights on AI Self-Awareness

22.2. Implications of AI Developing Self-Awareness

23. 📈 AI News Recap and Future Plans

23.1. AI News Recap

23.2. Future Plans