Digestly

Jan 30, 2025

3D Models in a Snap & AI Privacy Concerns 🚀🔒

AI Application
Two Minute Papers: NVIDIA's InstantSplat AI can create 3D models from just three photos, revolutionizing 3D modeling with speed and quality.
Fireship: OpenAI accuses DeepSeek of intellectual property theft, alleging that it used OpenAI's outputs to fine-tune its models, amid a competitive AI landscape with emerging Chinese models.
Skill Leap AI: The video discusses privacy concerns with the DeepSeek website and offers alternatives for safer usage.

Two Minute Papers - NVIDIA Unveils AI For 150x Faster 3D Modeling!

NVIDIA has introduced a groundbreaking AI technique called InstantSplat, which can generate a fully immersive 3D model from just three photos. This represents a significant advancement over previous methods, which struggled to produce clear images due to insufficient data. InstantSplat synthesizes the missing information, creating high-quality 3D models in seconds. This technique is not only faster but also more accurate, capturing complex visual effects like specular reflections that require an understanding of material properties. Users can try this technology by uploading their images, and it even works with unposed photos, making it highly accessible. The method is based on Gaussian Splatting, which efficiently represents surfaces with minimal data storage. While it excels at modeling surfaces, it is less effective for volumetric objects like smoke. However, complementary research is addressing these limitations with advanced light transport simulations. The technology is available for free, offering immense potential for applications in gaming and virtual reality.

Key Points:

  • NVIDIA's InstantSplat AI creates 3D models from three photos, synthesizing missing data for high-quality results.
  • The technique is 150 times faster than previous methods, producing models in seconds instead of minutes.
  • InstantSplat captures complex visual effects, such as specular reflections, enhancing realism.
  • The technology is accessible, allowing users to upload unposed photos and still achieve accurate models.
  • InstantSplat is based on Gaussian Splatting, efficiently storing surface data, and is available for free.

Details:

1. 📸 From 2D Photos to 3D Models: The Challenge

  • With earlier techniques, converting 2D photos to 3D models from only three images produces poor-quality results because there is too little data. This is akin to trying to bake a cake with minimal ingredients.
  • The primary limitation is the lack of depth information and detail that three images provide, which are crucial for accurate 3D reconstruction. More comprehensive data or advanced algorithms are needed to improve model fidelity.
  • Examples of current methods often result in models with missing parts or incorrect proportions, demonstrating the need for more sophisticated solutions or additional input data.
  • Exploration of alternative approaches, such as using more images or integrating AI-driven techniques, could potentially address these shortcomings and enhance 3D model accuracy.

2. 🚀 NVIDIA's InstantSplat: A Game Changer

2.1. Introduction to InstantSplat

2.2. Performance Comparison with NoPe-NeRF

3. ⚡ Paradigm Shift in 3D Modeling

  • A 3D model was created in just 9 seconds, indicating a significant improvement in speed over traditional methods.
  • The model was generated using only two photos taken on Mars, showcasing the potential of new AI techniques to simplify complex processes.
  • NASA's previous approach involved deploying billion-dollar rovers, whereas this AI method requires minimal resources, highlighting cost-effectiveness.
  • The generated models, despite having some visual artifacts, exhibit advanced features like specular reflections, demonstrating sophisticated AI capabilities in understanding material properties.
  • This advancement suggests potential applications in fields requiring rapid and resource-efficient modeling, such as space exploration and archaeology.
  • Challenges include improving visual artifacts and ensuring the accuracy of material representation in diverse environments.

4. 🌐 Creating Detailed Virtual Worlds

4.1. Virtual World Creation: Technical Improvements

4.2. Practical Implications and Industry Impact

5. 🛠️ Revolutionary Techniques and Their Impact

  • The technique leverages Gaussian Splatting to construct high-quality scenes efficiently, focusing on surfaces for better data storage management, as opposed to traditional volume-based methods.
  • A new method for cloud modeling using light transport simulation and ray tracing introduces Gaussian lumps as a data structure, effectively reducing noise and enhancing clarity over time.
  • An innovative approach to photo reconstruction eliminates the traditional Structure from Motion technique, presenting a significant advancement in the field.
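To make the surface-focused representation in the first bullet more concrete, here is a minimal 2D sketch of the general idea behind Gaussian Splatting: a scene is stored as a list of Gaussian blobs, and a pixel's color is obtained by blending the blobs that cover it from front to back. The `Splat` class, its fields, and the single-radius (isotropic) simplification are assumptions made for illustration; none of this is taken from InstantSplat or any NVIDIA code.

```python
import math
from dataclasses import dataclass


@dataclass
class Splat:
    """One hypothetical 2D Gaussian blob (illustrative fields only)."""
    x: float          # center, horizontal
    y: float          # center, vertical
    depth: float      # distance from the camera, used for ordering
    sigma: float      # radius of the blob (isotropic simplification)
    opacity: float    # peak opacity in [0, 1]
    color: tuple      # (r, g, b), each in [0, 1]


def render_pixel(splats, px, py):
    """Blend all splats at pixel (px, py), nearest first."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for s in sorted(splats, key=lambda s: s.depth):
        dist_sq = (px - s.x) ** 2 + (py - s.y) ** 2
        # Opacity falls off with a Gaussian profile around the center.
        alpha = s.opacity * math.exp(-dist_sq / (2 * s.sigma ** 2))
        for c in range(3):
            color[c] += transmittance * alpha * s.color[c]
        transmittance *= 1.0 - alpha
    return tuple(color)


if __name__ == "__main__":
    scene = [
        Splat(0, 0, depth=1.0, sigma=1.0, opacity=1.0, color=(1.0, 0.0, 0.0)),
        Splat(0, 0, depth=2.0, sigma=1.0, opacity=1.0, color=(0.0, 0.0, 1.0)),
    ]
    print(render_pixel(scene, 0, 0))
```

Because each blob carries only a handful of numbers, a scene made of surfaces can be stored compactly, which is the storage advantage the summary refers to. Real implementations use 3D anisotropic Gaussians projected onto the image plane.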

6. 🌟 Future of 3D Modeling: Unlimited Possibilities

  • The Nintendo Switch 2 is set to be released soon, indicating an advancement in gaming technology.
  • It's now possible to create a virtual world by taking just three photos, highlighting a significant leap in 3D modeling capabilities.
  • The ability to model complex phenomena like explosions, smoke, and haze is now accessible, potentially revolutionizing visual effects and game design.
  • All these advancements are accessible for free, driven by research paper innovations, making them widely available.
  • The technique runs 150 times faster than previous methods, demonstrating rapid technological progress.
  • These advancements allow for the creation of video games set in real-world places using minimal input, opening new possibilities for game design.

Fireship - DeepSeek stole our tech... says OpenAI

OpenAI is accusing DeepSeek of intellectual property theft, claiming it used OpenAI's outputs to fine-tune its models through a process called distillation, which is against OpenAI's terms of service. This accusation comes as DeepSeek, an AI lab backed by a Chinese hedge fund, reportedly surpassed OpenAI's capabilities with significantly less investment. The situation is further complicated by the emergence of other competitive Chinese AI models, such as Alibaba's Qwen 2.5 Max and Kimi 1.5, which are challenging OpenAI's dominance. Despite the accusations, no concrete evidence has been provided, though Microsoft has reported suspicious data extraction activities linked to DeepSeek. The video also highlights the growing trend of open-source AI models, which are becoming increasingly efficient and accessible, encouraging developers to leverage these tools for innovation.

Key Points:

  • OpenAI accuses DeepSeek of using its outputs for model fine-tuning, violating its terms of service.
  • DeepSeek reportedly developed a superior AI model with minimal investment, challenging OpenAI.
  • Emerging Chinese AI models are intensifying competition, potentially surpassing OpenAI.
  • Microsoft observed suspicious data extraction activities possibly linked to DeepSeek.
  • Open-source AI models are gaining traction, offering developers new opportunities for innovation.

Details:

1. 🌐 OpenAI vs DeepSeek: The IP Battle

1.1. OpenAI's Accusation of IP Theft

1.2. Impact on Business Relations

2. 🤖 Chinese AI Models Disrupting the Market

  • A Chinese hedge fund developed a state-of-the-art reasoning model that reportedly surpassed OpenAI's capabilities.
  • The development cost of the Chinese model was $5.5 million, significantly lower than typical industry costs, demonstrating a cost-effective approach to AI development.
  • The model was offered to the public at a "100% discount" (that is, for free), challenging the business models of major tech companies, including OpenAI, and altering market dynamics.
  • OpenAI and other tech giants have been promoting the narrative that AI development is expensive, requiring investments like the $500 billion Stargate data centers, a narrative the Chinese model's cost efficiency contradicts.
  • Chinese companies are employing competitive strategies in the AI market that include offering superior technology at lower costs, thus posing a significant threat to established players.

3. 🕵️‍♀️ Allegations of IP Theft and Irony

  • David Sacks, part of the PayPal Mafia, accuses DeepSeek of using OpenAI's outputs to fine-tune its models, contravening OpenAI's terms of service.
  • The method DeepSeek allegedly used, known as distillation, is explicitly prohibited by OpenAI's terms when it is used to build competing models.
  • OpenAI has faced its own criticisms for using internet data, including copyrighted material, without explicit permissions, adding an ironic dimension to these allegations.
  • Understanding distillation: This technique involves compressing a larger model's knowledge into a smaller one, which in this case, allegedly involved unauthorized use of OpenAI's data.
  • The broader implications: This case underscores ongoing tensions in AI about data usage rights and ethical AI development practices.

4. 💼 Tech Industry's Shady Practices and Copyright Battles

  • Tech companies often engage in questionable practices, opting to ask for forgiveness rather than permission. This strategy is exemplified by companies like Uber and Airbnb, which have disrupted traditional industries by initially ignoring regulations.
  • OpenAI has largely succeeded in its copyright infringement battles, demonstrating that tech companies can prevail in legal disputes despite engaging in controversial practices. This success may inspire other tech firms to adopt similar tactics.
  • A conspiracy theory suggests OpenAI used DeepSeek as a marketing strategy, illustrating the complex and sometimes opaque strategies employed by tech companies to gain public attention and market dominance.
  • Tech leaders, such as Sam Altman of OpenAI, are perceived as persuasive and potentially deceptive. This reflects a broader industry culture where strategic manipulation is common to maintain a competitive edge.
  • For instance, Uber's initial growth relied heavily on operating in legal grey areas, while Airbnb often clashed with local housing laws, both highlighting a willingness to prioritize growth over compliance.

5. 📊 DeepSeek's Distillation Controversy

  • OpenAI and Microsoft accuse DeepSeek of using distillation, that is, transferring knowledge from larger models like GPT-3 to smaller models.
  • No conclusive evidence has been presented, but screenshots show DeepSeek's responses closely resembling those of ChatGPT, which is read as a sign of unauthorized use.
  • Microsoft detected substantial data extraction from OpenAI's API by accounts linked to DeepSeek, suggesting potential misuse.
  • While distillation is common and not inherently controversial, it becomes problematic when used to create a competing model directly from an API, which is the focus of OpenAI's complaint.
  • This controversy highlights the ethical and legal challenges in AI development, particularly around fair use and intellectual property.
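As background for the bullets above, here is a minimal sketch of distillation in its common textbook form: the student model is trained to match the teacher's softened output probabilities, and the mismatch is measured with a KL divergence. The function names and the temperature value are assumptions for illustration. This is not a description of what DeepSeek did; a variant working through a public API would typically fine-tune on the teacher's generated text rather than on its probabilities.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # hypothetical teacher scores for 3 tokens
    student = [2.0, 2.0, 1.0]   # hypothetical student scores
    print(distillation_loss(teacher, student))
    print(distillation_loss(teacher, teacher))  # identical models: zero loss
```

Training minimizes this loss over many prompts, so the smaller model gradually reproduces the larger model's behavior.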

6. 🚀 AI Race: China vs China and Global Implications

  • Alibaba's Qwen 2.5 Max, an open model, reportedly outperforms DeepSeek, Claude, and GPT-4o on benchmarks, highlighting significant advancements in AI capabilities.
  • The new Chinese model Kimi 1.5 reportedly surpasses OpenAI's earlier models, indicating China's rapid progress in AI technology.
  • The AI competition within China is intensifying, suggesting a shift where the U.S. might be falling behind, while Europe focuses on different technological innovations.
  • DeepSeek faces criticism for its high censorship levels, although the censorship can be bypassed by skilled prompt engineers, which raises concerns about content control.
  • DeepSeek has launched the Janus series of models for image generation, which are open for commercial use, marking a step forward in accessible AI applications.

7. 🔍 DeepSeek's Technical Prowess and Privacy Concerns

  • DeepSeek reportedly achieved 10x better efficiency than other models by bypassing NVIDIA's CUDA and programming directly in NVIDIA's PTX (Parallel Thread Execution), akin to building a website with assembly code.
  • A major criticism of DeepSeek is that using it on the web sends all prompts, data, and keystrokes to China, raising privacy concerns.
  • Open source is gaining traction, and developers are encouraged to build products with open-source tools like PostHog.
  • PostHog is an open-source, self-hostable tool with a free plan, offering features like product analytics, session replay, and A/B testing, with easy implementation through web, mobile, and server-side SDKs.

Skill Leap AI - Watch This Before Using DeepSeek

The speaker highlights significant privacy issues with the DeepSeek website, emphasizing the lack of transparency in data collection and retention policies. The privacy policy indicates extensive data collection, including sensitive biometric data, without clear retention timelines or anonymization processes. Data is stored in China, raising compliance concerns with international privacy laws like GDPR. Users have limited control over their data, with no option to opt out of data sharing with third parties. The speaker suggests alternatives, such as using a local version of DeepSeek R1 or accessing it through US-based services like Perplexity, to mitigate privacy risks.

Key Points:

  • DeepSeek collects extensive user data, including biometric data, without clear retention policies.
  • Data is stored in China, raising compliance issues with international privacy laws.
  • Users cannot opt out of data sharing with third parties, impacting privacy.
  • Local installation of DeepSeek R1 or using US-based services can reduce privacy risks.
  • Consider alternative AI providers for sensitive or critical data usage.

Details:

1. 🔍 Understanding DeepSeek's Privacy Policies

  • Users are advised to thoroughly read DeepSeek's privacy policy and terms of use prior to using the website.
  • There are three significant concerns within DeepSeek's privacy documents: data collection practices, user consent clarity, and third-party data sharing.
  • A focused video provides a separate, in-depth examination of these privacy issues, distinct from regular content reviews.

2. 🚨 Key Privacy Concerns Unveiled

  • The policy outlines various types of data collected, including profile data, user input, automatically collected data, and third-party data.
  • Extensive user input collection involves text, audio, uploaded files, chat history, and feedback, but the policy does not clarify data retention duration or anonymization before analysis.
  • There is no explicit mention of whether user data is used to train AI models or for model improvements.
  • The policy mentions collecting keystroke patterns and rhythm, raising potential privacy concerns.
  • Data retention and anonymization processes are not clearly defined, potentially impacting user trust.
  • The policy lacks clarity on the use of user data for AI model development, which could affect transparency.

3. 🇨🇳 Data Handling and Storage in China

3.1. Data Handling and Storage Practices

3.2. Legal and International Implications

4. 🔒 Issues with Data Retention and Security

  • China's data laws, such as the Cybersecurity Law and the Data Security Law, require companies to share user data with Chinese authorities upon request, potentially conflicting with international privacy standards.
  • Companies with data centers in China, like DeepSeek, face risks as the government can request access to stored information at any time, challenging their ability to protect user privacy.
  • For companies with users in the EU and the US, compliance with GDPR and other international privacy laws is a significant concern due to potential conflicts with China's data requirements.
  • The lack of clarity on data processing compliance for EU users under GDPR highlights the need for transparent data management practices.
  • Vague data retention policies, stating data is kept 'as long as necessary,' often fail to meet compliance standards, risking penalties and loss of trust.
  • Companies like Facebook and Google have faced significant fines for GDPR non-compliance, emphasizing the importance of adhering to clear data retention timelines.
  • Implementing more precise data retention guidelines and ensuring alignment with international laws can mitigate risks and enhance user trust.
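To illustrate the difference between "as long as necessary" and a precise retention guideline, here is a minimal sketch of a retention table with a fixed period per data category and a check for records that are past their limit. The category names echo the data types the summary lists, but the periods and the function name are entirely hypothetical; they are not taken from any real policy or legal requirement.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods, chosen only for illustration.
RETENTION = {
    "chat_history": timedelta(days=30),
    "uploaded_files": timedelta(days=7),
    "feedback": timedelta(days=365),
}


def is_expired(category, collected_at, now):
    """Return True when a record has outlived its category's retention period."""
    return now - collected_at > RETENTION[category]


if __name__ == "__main__":
    now = datetime(2025, 1, 30, tzinfo=timezone.utc)
    collected = datetime(2025, 1, 1, tzinfo=timezone.utc)
    for category in RETENTION:
        print(category, is_expired(category, collected, now))
```

With explicit periods like these, a scheduled job can delete expired records, and the policy text can state the same numbers users see enforced.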

5. 👶 Concerns Over Age Verification

5.1. Data Security Concerns

5.2. Age Verification Challenges

6. 📜 Legal Challenges in Terms of Use

  • User data collection and sharing practices are a significant concern, particularly due to data storage in China and a lack of international transfer safeguards.
  • The updated terms of use from January 20, 2025, continue to raise issues regarding data collection and usage, offering no explicit opt-out option for users.
  • There is no clear indication of compliance with global privacy laws such as GDPR, and users are held fully responsible for AI-generated output.
  • Dispute resolution is restricted to China, limiting users' legal recourse options, and it is unclear if users can request complete data deletion.
  • Transparency regarding moderation decisions and data retention periods is lacking, suggesting data may be retained indefinitely.
  • It is advisable to exercise caution when using the service for sensitive legal or business-critical purposes.

7. 🔄 Exploring Alternatives to DeepSeek

  • Consider alternative AI providers that offer stronger privacy protection, legal compliance, and user rights safeguards than DeepSeek.
  • One practical solution is to install a local version of DeepSeek R1, which is compatible with Mac, PC, and Linux and available on platforms like ollama.com.
  • While the online version of DeepSeek uses a 671-billion-parameter model, this is impractical for local use due to its massive 400 GB size and high resource demands.
  • Users can opt for a smaller, feasible option like the 70-billion-parameter model, which can be run locally and has been tested successfully.
  • Using DeepSeek locally allows users to bypass online terms and conditions, maintain privacy, and control their data by operating offline.
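The size figures above follow from simple arithmetic: the storage for a model's weights is roughly the parameter count multiplied by the bits used per parameter. The sketch below makes that estimate; the function name and the bit widths are assumptions for illustration, and real model files add some overhead, so the roughly 400 GB quoted for the 671-billion-parameter model would correspond to somewhere between 4 and 5 bits per parameter.

```python
def weight_size_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for name, params in [("671B", 671e9), ("70B", 70e9), ("7B", 7e9)]:
        for bits in (16, 4):  # assumed precisions: 16-bit and 4-bit weights
            size = weight_size_gb(params, bits)
            print(f"{name} at {bits}-bit: about {size:.1f} GB")
```

The same estimate shows why the 70-billion-parameter model is feasible on a well-equipped workstation while the full model is not.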

8. 💻 Local Installation Options for Privacy

  • Local installation ensures full privacy by operating offline on localhost:3000, without needing an internet connection.
  • Users can download and switch between models like 7B, 32B, and 671B locally, although running the full 671B model may cause errors due to its size.
  • For those unable to run the 671B model locally, Perplexity offers a Pro subscription with US-hosted model integration for $20, enhancing search with reasoning capabilities.
  • Perplexity's R1 reasoning leverages information from multiple websites to deliver comprehensive answers, adding value to standard search functions.
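For the local route, a locally running model can also be queried from a script instead of a browser interface. The sketch below assumes the model is served by Ollama on its documented default API address (port 11434, endpoint `/api/generate`) and that a model tagged `deepseek-r1:7b` has already been downloaded; both are assumptions, and the port 3000 mentioned above is presumably a separate web front end. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="deepseek-r1:7b"):
    """Build the HTTP request without sending it (the model tag is an assumption)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="deepseek-r1:7b"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running local server; nothing leaves the machine.
    print(ask_local_model("Summarize Gaussian Splatting in one sentence."))
```

Because the request goes to localhost, the prompt and the response stay on the user's own machine.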

9. 🛡️ Recommendations for Safe Usage

  • Host the software locally in the US to address privacy concerns, ensuring better control and management of sensitive information.
  • Avoid using the DeepSeek website for interactions involving sensitive data; opt for local alternatives or US-based partners to enhance data privacy.
  • A planned comparison between ChatGPT's o1 model and DeepSeek's R1 model will assess their reasoning capabilities, aiding in selecting the most secure and effective tool.