AI in Devices and Robots

Humane AI Pin is officially dead; Rabbit shows off Android AI Agent; OpenAI cofounder Ilya Sutskever’s new AI startup’s valuation jumps from $5 billion to $30 billion; Can LLMs earn $1 Million from software engineering? Figure’s Vision-Language-Action Model for Humanoid Robots

What we have for you today

  • Humane AI Pin is officially dead

  • Rabbit shows off Android AI Agent

  • OpenAI cofounder Ilya Sutskever’s new AI startup’s valuation jumps from $5 billion to $30 billion

  • Can LLMs earn $1 Million from software engineering?

  • Figure’s Vision-Language-Action Model for Humanoid Robots

Humane AI Pin is officially dead

Humane has been acquired by HP for $116 million, leading to the immediate discontinuation of its AI Pin, a wearable AI assistant. Sales have stopped, and existing AI Pins will cease functioning on February 28, 2025, as they will no longer connect to Humane’s servers. Customers are advised to back up their data, and only those who purchased the device within the last 90 days are eligible for a refund.

Founded by ex-Apple employees, Humane had raised over $230 million and launched the AI Pin in April 2024, positioning it as a smartphone replacement. However, the device faced widespread criticism, poor sales, and a fire risk with its charging case. By mid-2024, return rates exceeded sales, and the company put itself up for sale, initially asking between $750 million and $1 billion.

HP’s acquisition includes most of Humane’s employees, its CosmOS AI operating system, and over 300 patents, but not the AI Pin business itself. Humane’s founders will lead a new AI division within HP, focusing on integrating AI into HP’s PCs, printers, and other devices.

Rabbit shows off Android AI Agent

Rabbit is developing a cross-platform general agent system that lets smart, autonomous agents act on users’ behalf. The company previously introduced LAM Playground, a web-based agent, and has now showcased an Android agent capable of controlling apps and system settings.

The demo featured tasks like adjusting app notifications, sending AI-generated content via WhatsApp, searching YouTube, managing grocery lists, generating business plans in Google Docs, and even downloading and playing a game. While the agent completed these tasks, it still shows noticeable limitations in speed and intelligence, sometimes executing actions inefficiently.
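
Rabbit hasn’t shared implementation details, but agents like this typically run a perceive-decide-act loop: read the current screen state, ask a model for the next UI action, execute it, and repeat. Here is a minimal Python sketch of that loop; every helper (ui_tree, choose_action, perform) and the Action shape are hypothetical stand-ins, not Rabbit APIs:

```python
# Hypothetical sketch of a loop for driving an Android UI with a model.
# None of these helpers are Rabbit APIs; they stand in for whatever
# screen-capture, model, and input-injection layers a real system uses.

from dataclasses import dataclass

@dataclass
class Action:
    kind: str         # "tap", "type", "scroll", or "done"
    target: str = ""  # UI element id, e.g. "com.whatsapp:id/entry"
    text: str = ""    # text to type, if kind == "type"

def ui_tree() -> str:
    # Hypothetical: a real system would serialize the accessibility tree.
    return "<screen: home, apps: [WhatsApp, YouTube, Docs]>"

def choose_action(goal: str, observation: str) -> Action:
    # Hypothetical: a real system would query the model here.
    return Action(kind="done")

def perform(action: Action) -> None:
    # Hypothetical: inject the tap/typing event into the device.
    pass

def run_agent(goal: str, max_steps: int = 30) -> bool:
    """Observe the screen, pick an action, execute, repeat."""
    for _ in range(max_steps):
        action = choose_action(goal, ui_tree())
        if action.kind == "done":
            return True
        perform(action)
    return False  # gave up; mirrors the speed/efficiency limits noted above

run_agent("mute notifications for YouTube")
```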

Rabbit’s AI remains a work in progress, and expectations are tempered by the failure of the Humane AI Pin and the Rabbit R1’s own rocky launch. Still, the company continues refining its system and promises more updates on its cross-platform multi-agent capabilities in the coming weeks.

OpenAI cofounder Ilya Sutskever’s new AI startup’s valuation jumps from $5 billion to $30 billion

Ilya Sutskever, a cofounder of OpenAI, has launched a new AI company, Safe Superintelligence (SSI), which is rapidly gaining traction in the AI industry. SSI is raising over $1 billion at a valuation surpassing $30 billion in a round led by Greenoaks Capital Partners, a significant jump from its $5 billion valuation in September 2024.

Founded in June 2024 by Sutskever, Daniel Gross, and Daniel Levy, SSI remains secretive, with its only stated goal being the development of a “safe superintelligence.” The company has no product on the market and emphasizes that it will not be pressured by commercial demands.

Sutskever, a respected AI researcher, played a key role in OpenAI’s early success but clashed with CEO Sam Altman over AI commercialization. He took part in Altman’s brief ousting in late 2023, then reversed his stance, and ultimately left OpenAI himself.

Despite limited public details about SSI’s approach, Sutskever’s track record and reputation have fueled strong investor interest. He has spoken about AI’s potential to solve major global issues while also warning of risks like fake news, cyberattacks, and AI-driven authoritarianism.

Can LLMs earn $1 Million from software engineering?

The paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?" introduces SWE-Lancer, a benchmark that evaluates large language models (LLMs) on 1,488 real freelance software engineering tasks from Upwork, collectively valued at $1 million USD in payouts. The benchmark assesses LLMs in two categories:

  1. Individual Contributor (IC) SWE Tasks: Models generate code patches to fix real-world software engineering problems. Performance is evaluated using end-to-end tests validated by professional engineers.

  2. SWE Manager Tasks: Models act as software engineering managers by selecting the best implementation proposal for a given issue, compared against real-world hiring decisions (see the payout sketch after this list).
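
To make the payout framing concrete, here is a minimal sketch of how earnings could be tallied under these rules. The all-or-nothing grading follows the setup above, but the field names and harness are illustrative, not the paper’s actual code:

```python
# Illustrative tally of benchmark "earnings" under SWE-Lancer's rules:
# an IC task pays its full price only if every end-to-end test passes;
# a manager task pays only if the model's chosen proposal matches the
# real-world hiring decision. Field names are assumptions for the sketch.

from dataclasses import dataclass

@dataclass
class ICTask:
    payout: float
    e2e_tests_passed: list[bool]  # results of the validated end-to-end suite

@dataclass
class ManagerTask:
    payout: float
    model_choice: int   # index of the proposal the model selected
    hired_choice: int   # proposal chosen in the real hiring decision

def earnings(ic: list[ICTask], mgr: list[ManagerTask]) -> float:
    total = 0.0
    for t in ic:
        if t.e2e_tests_passed and all(t.e2e_tests_passed):  # all-or-nothing
            total += t.payout
    for t in mgr:
        if t.model_choice == t.hired_choice:
            total += t.payout
    return total

# Example: one solved IC task, one missed manager pick -> $500 earned.
print(earnings(
    [ICTask(500.0, [True, True])],
    [ManagerTask(1000.0, model_choice=0, hired_choice=2)],
))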

Key Findings:

  • The best-performing model, Claude 3.5 Sonnet, earned $208,050 on the benchmark's open evaluation set but still failed the majority of tasks.

  • Models performed better at SWE Manager tasks (decision-making) than IC SWE tasks (coding).

  • End-to-end tests provided a more realistic and robust evaluation than unit tests.

  • Higher test-time compute and multiple attempts improved model performance but did not close the gap to human freelancers (see the pass@k sketch below).
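
The "multiple attempts" finding is the pass@k idea: sample several candidate patches and count a task as solved if any attempt passes. A standard unbiased estimator for it, shown here as a general illustration rather than SWE-Lancer’s exact metric:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n attempts of which c were correct, succeeds.
    pass@k = 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # too few failures left to fill k slots: guaranteed hit
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example: 3 correct patches out of 10 attempts
# -> pass@1 = 0.30, pass@5 ~= 0.92.
print(pass_at_k(10, 3, 1), pass_at_k(10, 3, 5))
```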

Contributions:

  • Real-world economic evaluation: Maps LLM capabilities to actual monetary value.

  • Comprehensive testing: Uses end-to-end tests instead of unit tests to prevent AI "hacks."

  • Open-source benchmark: Facilitates research on AI's economic impact in software engineering.

Future Directions:

  • Further analysis of AI's economic impact on freelancing and software development.

  • Multimodal support (e.g., handling screenshots and videos).

  • Expansion to diverse repositories and engineering domains beyond Expensify and Upwork.

The study highlights that while LLMs have significant capabilities, they cannot yet replace human software engineers in real-world freelancing due to inconsistent correctness and reasoning gaps.

Figure’s Vision-Language-Action Model for Humanoid Robots

Helix is a Vision-Language-Action (VLA) model that integrates perception, language understanding, and control for advanced robotic manipulation. Key innovations include:

  • Full Upper-Body Control: The first VLA to control a humanoid robot’s wrists, fingers, head, and torso in real time.

  • Multi-Robot Collaboration: Enables two robots to work together on unseen objects using natural language commands.

  • Zero-Shot Generalization: Can pick up thousands of novel household items without prior training.

  • Unified Neural Network: Learns all behaviors without task-specific fine-tuning.

  • Commercial-Ready: Runs efficiently on embedded GPUs for real-world deployment.

Breakthrough Architecture

Helix uses a dual-system approach:

  • System 2 (S2): A vision-language model for scene understanding and planning.

  • System 1 (S1): A fast visuomotor policy for real-time execution.

This enables intelligent decision-making and dexterous, real-time control.
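
Figure hasn’t released Helix’s code, but the S2/S1 split maps naturally onto two loops running at different rates: a slow loop where the vision-language model distills the scene and command into a latent plan, and a fast loop where the visuomotor policy consumes the latest plan and streams motor commands. A minimal sketch, with the rates, latent size, and all helpers as assumptions rather than Figure’s published implementation:

```python
# Illustrative dual-rate control loop in the spirit of Helix's S2/S1
# split. Rates, latent size, and every helper are assumptions.

import threading
import time

LATENT_DIM = 512                     # assumed size of S2's latent "plan"
latest_plan = [0.0] * LATENT_DIM
plan_lock = threading.Lock()

def camera_frame():
    return None                      # hypothetical camera interface

def send_joint_targets(targets):
    pass                             # hypothetical robot interface

def s2_vlm_step(image, command):
    # Hypothetical: a VLM turns scene + instruction into a latent plan.
    return [0.0] * LATENT_DIM

def s1_policy_step(image, plan):
    # Hypothetical: a fast policy maps (image, plan) to joint targets
    # for wrists, fingers, head, and torso.
    return [0.0] * 35

def s2_loop(command, hz=8.0):
    """Slow loop: re-plan a few times per second (rate assumed)."""
    global latest_plan
    while True:
        plan = s2_vlm_step(camera_frame(), command)
        with plan_lock:
            latest_plan = plan       # S1 always consumes the newest plan
        time.sleep(1.0 / hz)

def s1_loop(hz=200.0):
    """Fast loop: stream motor commands at a high, fixed rate (assumed)."""
    while True:
        with plan_lock:
            plan = list(latest_plan)
        send_joint_targets(s1_policy_step(camera_frame(), plan))
        time.sleep(1.0 / hz)

# Usage sketch: run both loops concurrently.
# threading.Thread(target=s2_loop, args=("tidy the table",), daemon=True).start()
# threading.Thread(target=s1_loop, daemon=True).start()
```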

Impact

Helix lets robots take on complex, collaborative tasks on the fly from natural language commands, marking a major leap toward scalable, household-capable robotics.