Fireship: GPT-4.5's release is underwhelming, offering no significant advancements at a very high price.
Fireship - GPT-4.5 shocks the world with its lack of intelligence...
GPT-4.5, released by OpenAI, is the most expensive AI model to date, costing $150 per million output tokens. Despite the price, it does not top existing benchmarks or introduce new capabilities. Its main selling point is that it chats in a more human-like manner, but that quality is subjective and not universally appreciated. Critics point to the high cost and the limited improvement over previous models. The model has a lower hallucination rate but still makes errors. OpenAI plans to keep scaling its models with significant financial backing, yet the current results are widely seen as disappointing. An AI plateau is good news for computer science students, because AI coding tools remain most useful in the hands of skilled programmers.
Key Points:
- GPT-4.5 is the most expensive AI model, costing $150 per million output tokens.
- The model offers no significant advancements or new capabilities; its pitch is 'vibes', meaning more natural conversation.
- Critics highlight its high cost and limited improvements over previous models.
- OpenAI plans to scale models with substantial financial backing, but current progress is seen as disappointing.
- An AI plateau benefits computer science students, because AI tools are most valuable to skilled programmers.
Details:
1. 🚂 The AI Hype Train Derailed: GPT-4.5's Underwhelming Release
- OpenAI's GPT-4.5 is the most expensive AI model released so far, yet it does not surpass existing benchmarks, win awards, or introduce novel capabilities.
- Its primary feature is the ability to chat in a more natural, human-like manner, which is marketed as 'vibes'.
- Despite the high cost, GPT-4.5 fails to outperform previous models on key metrics such as language understanding benchmarks, which raises questions about its value.
- Leading with 'vibes' signals a shift toward qualitative improvements rather than measurable leaps in capability.
- The release suggests a saturation point in current AI development, where newer models bring incremental gains rather than breakthroughs.
2. 🙅‍♂️ Sam Altman's No-Show: Prioritizing Family Over Launch
- Despite the anticipation, Sam Altman skipped the product launch to stay with his newborn.
- Interns were sent to handle the product demo in his place.
- The launch was for the model code-named Orion (GPT-4.5), which had been expected to be a significant event in the tech industry.
3. 📉 AI Progress Stagnation: A Disappointing Technological Plateau
- In 2023, tech leaders signed an open letter calling for a pause on training the largest AI models, reflecting serious concern within the industry about where the technology was heading.
- Sam Altman also appealed to the government to regulate AI, underscoring how seriously the risks were taken at the time.
- The release of GPT-4.5 was met with disappointment: expectations for a jump in capability were not met, which points to a possible plateau in AI progress.
- There is speculation that pre-training of generative transformers is reaching its limits, which would mean further progress requires new methods.
4. 💸 Steep Costs of GPT-4.5: A Pricey Benchmark
- GPT-4.5 costs $75 per million input tokens and $150 per million output tokens, far above Claude's $15 per million tokens.
- Access to GPT-4.5 is limited to Pro subscribers, who pay $200 per month, which positions it as a premium product.
- OpenAI justifies the high cost with the Vibes Benchmark, which aims to measure creative thinking. It is an attempt to quantify creativity, but the results are subjective, and whether they justify the price remains to be seen.
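To make the per-token prices concrete, here is a minimal sketch of the arithmetic, assuming the $75 and $150 per-million figures quoted above. The function name `request_cost_usd` and the example token counts are hypothetical and are not taken from the video or from any OpenAI tooling.

```python
# Prices in USD per million tokens, taken from the figures quoted above.
GPT_45_INPUT_USD_PER_M = 75.0
GPT_45_OUTPUT_USD_PER_M = 150.0


def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_usd_per_m: float = GPT_45_INPUT_USD_PER_M,
                     output_usd_per_m: float = GPT_45_OUTPUT_USD_PER_M) -> float:
    """Return the cost in USD of one request under per-million-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each side is billed separately: tokens * (price per million) / 1,000,000.
    return (input_tokens * input_usd_per_m
            + output_tokens * output_usd_per_m) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
    print(f"${request_cost_usd(2_000, 500):.3f}")
```

Under these assumptions, a single request with 2,000 input tokens and 500 output tokens comes to about $0.225, so a thousand such requests would cost roughly $225.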
5. 🤖 GPT-4.5's Mixed Capabilities: Natural Vibes with Flaws
- GPT-4.5 has a significantly lower hallucination rate than earlier versions, a real improvement in accuracy.
- It still makes occasional silly mistakes, so there is room for further refinement.
- The model does not know its own version: it cannot identify itself as GPT-4.5.
- Its training cutoff is October 2023, so it has no knowledge of later events.
- It correctly counts the number of R's in the word 'strawberry', a test that earlier models were known to fail.
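For reference, the expected answer to the 'strawberry' test is easy to verify deterministically. The sketch below uses a hypothetical helper, `count_letter`, written only for illustration; it says nothing about how the model itself arrives at its answer.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    print(count_letter("Strawberry", "r"))  # prints 3
```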
6. 🔧 Programming Challenges: GPT-4.5's Performance vs. Cost
- GPT-4.5 is less effective at programming and science tasks than deep-thinking models like o3, which suggests it was not designed with these areas in mind.
- It performs poorly on the Aider polyglot coding benchmark, where it is worse at programming than DeepSeek.
- GPT-4.5 is hundreds of times more expensive than alternatives despite the poorer performance, which makes its cost-effectiveness for programming work questionable.
- Deep-thinking models outperform it on complex problem-solving and coding tasks, making them better suited to technical work.
- In short, the price of GPT-4.5 does not track its programming ability: it scores lower on benchmarks than cheaper, more specialized models.
7. 🔮 OpenAI's Future and Market Perception: Declining Odds
- OpenAI had been the favorite in betting markets to have the best AI model by the end of 2025, but its odds are declining, which points to growing competition and market skepticism.
- xAI's Grok has overtaken OpenAI's models in the betting markets, suggesting a shift in perception about who leads in AI.
- OpenAI needs to raise billions for its transition to a for-profit model, which requires it to maintain a high valuation amid increasing competition.
- Its strategy is to scale models significantly, relying on large investments from backers such as SoftBank and Saudi investors to stay competitive.
- There is growing concern about whether GPT-5 can improve meaningfully even with more parameters and computing power, which could affect OpenAI's strategic position.
- GPT-4.5 remains OpenAI's largest model to date, and GPT-5 is expected to function more as a routing system, which some in the industry find underwhelming.
- The declining odds may influence OpenAI's future fundraising and strategic partnerships, impacting its overall market trajectory.
8. 🎓 Embracing AI Education: Learning with Brilliant
- AI coding tools are most useful to human programmers who have a foundational understanding of programming.
- Brilliant provides a platform with interactive, hands-on lessons that simplify deep learning concepts.
- Users can understand the math and computer science behind AI technology with minimal daily effort.
- The platform offers a 30-day free trial at brilliant.org/fireship.
- The recommendation is to start with Python and then take the course on how large language models work for a deeper understanding of AI technologies like ChatGPT.