Claude chatbot plans ahead and sometimes lies

Hi Pals! Here’s what you need to know about AI today:

👉 Anthropic researchers revealed how Claude thinks

👉 Google adds new AI features to Search and Gemini for travel planning

👉 Grok is now available on Telegram as a bot

and many more!

📧 Did someone forward you this email? Subscribe here for free to get the latest AI news every day!

Read time: 4.8 minutes

ANTHROPIC

Now, we know how Claude and other AI models 'think'

Source: Transformer Circuits | Attribution Graphs

What’s going on: Anthropic has developed new techniques to understand how large language models like Claude process information and make decisions, as detailed in two papers. By adapting neuroscience-inspired methods, dubbed “circuit tracing” and “attribution graphs”, researchers can now map the specific pathways of neuron-like features that activate during tasks.

What does it mean: This approach has uncovered that Claude exhibits surprising sophistication, such as planning ahead when composing poetry by selecting rhyming words before writing and occasionally working backward from desired outcomes rather than building linearly from given facts. These findings mark a significant step in AI interpretability, offering a clearer view of how these systems operate internally.

More details: 

  • The research also shows why AI models sometimes hallucinate or produce incorrect information. Anthropic identified a “default” circuit in Claude that prompts it to refuse answers, which is overridden when it recognizes familiar entities.

  • When this override misfires, so that Claude recognizes a name but lacks specific knowledge about it, hallucinations can occur. This explains why it gives confident yet wrong answers about figures it recognizes but refuses to answer about obscure ones.

  • Claude is also capable of deception, as it can hide its true reasoning process, raising concerns about safety and alignment with human values.

  • Want to read the papers? Read the Circuit Tracing paper here, and Attribution Graphs paper here.
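For readers who think in code, here is a toy sketch of the "refuse by default" idea described above. It is only an illustration of the logic, not Anthropic's method or anything resembling Claude's real internals. The `Entity` class, the `familiarity` score, the `threshold` value, and the `respond` function are all made-up names for this example.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    familiarity: float  # hypothetical 0..1 score: how strongly the name is recognized
    has_facts: bool     # hypothetical flag: is specific knowledge actually available?


def respond(entity: Entity, threshold: float = 0.5) -> str:
    """Toy model: refusal is the default, and recognition switches it off."""
    refuse = True  # the "default" behavior
    if entity.familiarity >= threshold:
        refuse = False  # a "known entity" signal overrides the default refusal
    if refuse:
        return "refuse"
    # Refusal is off. Without real facts, the result is a confident wrong answer.
    return "answer" if entity.has_facts else "hallucinate"


print(respond(Entity("obscure name", 0.1, False)))
print(respond(Entity("famous name", 0.9, True)))
print(respond(Entity("half-remembered name", 0.7, False)))
```

The third case is the interesting one: the name is familiar enough to switch off the refusal, but there is nothing solid to say, so the toy model "hallucinates".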

GOOGLE

Google enhances vacation planning with new AI features

Source: Google

What’s going on: Google has introduced a suite of vacation-planning tools across its Search, Maps, and Gemini platforms. In Google Search, the AI Overviews feature now generates detailed trip ideas for regions and countries, not just cities.

What does it mean: Users can input queries like “create an itinerary for Costa Rica with a focus on nature” to receive tailored suggestions, complete with photos, reviews, and an interactive map. These itineraries can be exported to Google Docs or Gmail or saved as custom lists in Maps, making it easier to organize and share plans.

More details: 

  • This functionality is currently available for English queries in the US, responding to the growing trend of users seeking AI-driven travel solutions like those offered by ChatGPT.

  • The Gemini platform now offers a free “Gems” feature, allowing users to create custom AI experts for specific tasks. For travel, this means setting up a personalized trip planner to select destinations and recommend packing lists, enhancing flexibility and user control.

  • Google Search also extends its price-tracking capabilities, previously limited to flights, to hotels. Users can monitor hotel prices for specific dates and locations, applying filters like star ratings or amenities, and receive email alerts when rates drop.

🔮 Bill Gates recently predicted on a popular TV show that AI will automate most tasks within a decade, potentially reducing the workweek to only two days and ushering in a five-day weekend!

📱 Grok is now officially available on Telegram messenger. The bot's username is @GrokAI, and it is free for Telegram Premium users.

🤝 Lockheed Martin and Google Cloud have partnered to enhance generative AI capabilities for national security, focusing on accelerating AI adoption and deployment for defense applications.

💰 OpenAI is close to securing a record-breaking $40 billion funding round, led by SoftBank, to further advance its AI research and development.

🤖 Alibaba released Qwen2.5-Omni, a new multimodal AI model, with enhanced instruction-following and task performance across text, images, and other data types, outperforming its predecessors and some proprietary models.

AI + Cold emailing

Draft a professional email to a potential client pitching a <service you provide>. Highlight three unique selling points, include a pricing teaser, and end with a question to prompt a response.

GPT-4o-mini’s answer

Thank you for staying with us, as always! If you are not subscribed, subscribe here for free to get more of these emails in your inbox! Cheers!