AI agents with PhD-level intelligence are coming

Welcome back, AI lovers! Here’s what you need to know about AI today:

👉 Chinese DeepSeek-R1 beats OpenAI’s o1

👉 AI agents with PhD-level intelligence are coming

👉 Humanoid robots are going to assemble iPhones in China

and many more!

📧 Did someone forward you this email? Subscribe here for free to get the latest AI news every day!

Read time: 4.9 minutes

DEEPSEEK

Open-source Chinese AI models continue surprising the world

Source: DeepSeek | The blue hatched bars represent R1 and the bars to the right of them are related to o1

What’s going on: DeepSeek, a Chinese AI company, has open-sourced its reasoning-focused AI model, DeepSeek-R1, and claims that it matches or outperforms OpenAI's o1 on several benchmarks. These benchmarks include AIME (American Invitational Mathematics Examination), MATH-500, and SWE-bench Verified, which assess competition mathematics, math problem-solving, and real-world software engineering tasks, respectively.

What does it mean: Open-source AI models are getting good enough to overtake their closed-source, commercial competitors. Companies like OpenAI should be cautious: their top models might lose financial relevance in the near future. As for DeepSeek-R1’s strengths, it has a self-fact-checking mechanism, which enhances the reliability of its reasoning in complex domains like physics, science, and mathematics.

More details:

  • R1 contains 671 billion parameters, but DeepSeek has also introduced distilled versions of R1, which range from 1.5 to 70 billion parameters. Even the smallest of these models can run on a laptop, making advanced AI reasoning more accessible.

  • DeepSeek offers API access to R1 at a cost significantly lower than OpenAI's o1. (For instance, o1 costs $15 per 1M input tokens, while R1 costs only $0.55!)

  • To date, three Chinese AI labs, DeepSeek, Alibaba, and Moonshot AI (the maker of Kimi), have launched models that they claim are direct competitors to o1.
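
To put the pricing and model-size figures above in perspective, here is a rough back-of-the-envelope sketch. The prices and parameter counts are the ones quoted in this issue. The helper names and the 2-bytes-per-parameter figure (16-bit weights) are our own assumptions for illustration, not anything published by DeepSeek or OpenAI, and the memory estimate covers weights only.

```python
# Back-of-the-envelope arithmetic using the figures quoted in this issue.
# Helper names are hypothetical; 2 bytes per parameter is an assumption
# (16-bit weights) and ignores activations, KV cache, and quantization.

O1_USD_PER_M_INPUT = 15.00   # o1 price per 1M input tokens (as quoted)
R1_USD_PER_M_INPUT = 0.55    # R1 price per 1M input tokens (as quoted)


def input_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at the given rate."""
    return tokens / 1_000_000 * usd_per_million


def weights_gb(params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


# How many times more expensive o1 input tokens are than R1 input tokens.
price_ratio = O1_USD_PER_M_INPUT / R1_USD_PER_M_INPUT

if __name__ == "__main__":
    print(f"o1 input is about {price_ratio:.1f}x the price of R1 input")
    print(f"10M input tokens on o1: ${input_cost_usd(10_000_000, O1_USD_PER_M_INPUT):.2f}")
    print(f"10M input tokens on R1: ${input_cost_usd(10_000_000, R1_USD_PER_M_INPUT):.2f}")
    for name, params in [("1.5B distilled", 1.5e9), ("70B distilled", 70e9), ("671B R1", 671e9)]:
        print(f"{name}: roughly {weights_gb(params):.0f} GB of weights at 16-bit")
```

Under these assumptions, the smallest distilled model needs only a few gigabytes for its weights, which is why it can fit on a laptop, while the full R1 is far beyond consumer hardware.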

OPENAI

OpenAI is working on AI agents with PhD-level intelligence

Source: X

What’s going on: OpenAI appears to be on the verge of releasing a new agentic AI tool named "Operator," according to insights from software engineer Tibor Blaho, who is known for accurately leaking upcoming AI products. Operator is designed as an AI agent capable of autonomously executing tasks on a user's computer and is marketed as an agent with PhD-level abilities. The tool aims to perform complex activities such as coding or booking travel by understanding natural-language instructions and interacting with software interfaces on its own.

What does it mean: Agentic AI is a big deal. AI agents will probably shape a new category of employees in companies and might drive the next wave of layoffs in many sectors from 2025 onward. Many companies, such as Nvidia (with its NIM microservices platform and Eureka), Microsoft (with Microsoft 365 Copilot), Google (with Project Mariner), and Anthropic (with Claude’s “computer use” feature), are working on AI agents. While OpenAI is the biggest player in today’s AI landscape (with more than 200 million active users), it has not yet released a serious product in this category.

More details:

  • Blaho's investigation into the code repositories of projects like OSWorld and WebVoyager, where Operator's capabilities were benchmarked, suggests that OpenAI might be preparing for a release.

  • Benchmarks have shown that while Operator can surpass human performance in some tasks, it struggles in others, particularly those requiring intricate navigation or decision-making in a real-world context.

  • The registration of a domain named "operator.ai" and OpenAI’s references to it indicate active development and potential marketing efforts for Operator's launch.

  • The AI agent market is expected to grow significantly, with projections estimating a market size of $47.1 billion by 2030.

⚖ President Trump has just revoked a 2023 executive order signed by former President Biden aimed at reducing the potential risks posed by AI.

🤖 Foxconn has partnered with UBTech to deploy humanoid robots for assembling iPhones in China, aiming to increase efficiency and address labor shortages.

📉 Sam Altman has urged AI fans to lower their expectations, clarifying that rumors of OpenAI being on the brink of superintelligence are just hype. He emphasized the need for patience and realistic views on the progress of AI development. He also said “we are not gonna deploy AGI next month, nor have we built it” in a recent post on X/Twitter.

⚪ Friend, an AI startup that wants to give you a friend, has delayed the shipment of its $99 AI-powered personal companion necklace to late 2025 due to necessary design refinements.

🖥 Microsoft countered Salesforce's claim that it had beaten Microsoft to market on agents, stating that over 100,000 organizations had used Copilot Studio to create AI agents by October 2024.

💰 The CEO of Abu Dhabi's $330 billion sovereign wealth fund, Mubadala, warned that the level of disruption AI will cause is not fully appreciated, potentially impacting every sector including employment and human life.

AI and financial advice

Can you give me a rundown on the different types of financial investments and what to consider when choosing them? For each option, please provide accredited sources to learn more about that particular investment approach.

Gemini’s answer

Thank you for staying with us, as always! If you are not subscribed, subscribe here for free to get more of these emails in your inbox! Cheers!