Amazon is working on its own reasoning AI model

Hi everyone! Here’s what you need to know about AI today:

👉 Amazon is working on its own reasoning AI model

👉 Super Mario is the new benchmark used to evaluate AI models

👉 Judge rejects Elon Musk’s bid to stop OpenAI’s for-profit transition

and many more!

📧 Did someone forward you this email? Subscribe here for free to get the latest AI news every day!

Read time: 5 minutes

AMAZON

Amazon is working on its own hybrid AI model

Source: The Tennessean

What’s going on: Amazon is working on its own reasoning AI model under the ‘Nova’ brand, expected to be released in June 2025. The initiative aims to position Amazon as a serious competitor to industry offerings such as OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude, and Meta AI. The model is designed to tackle advanced reasoning tasks: breaking down complex problems systematically, evaluating multiple possibilities, and delivering logical conclusions. Unlike its predecessors, the new model promises a “hybrid reasoning” approach that blends quick responses with deeper, extended problem-solving, a feature that could set it apart in a crowded market that already includes models like OpenAI’s o3-mini and DeepSeek-R1.

What does it mean: Amazon is putting a lot of money into AI, and by “a lot” we mean a hundred billion dollars. Its key focus is to reduce costs for end users: it aims to undercut competitors’ pricing while targeting top-tier performance on benchmarks like SWE, Berkeley Function Calling Leaderboard, and AIME, which test coding and math skills. It is going after both expensive models like GPT-4 and budget-friendly models like DeepSeek-R1.

More details: 

  • The hybrid architecture is potentially inspired by Anthropic’s Claude 3.7 Sonnet, which was recently released.

  • This development follows Amazon’s recent rollout of Alexa+, an AI-enhanced version of its voice assistant. With over 600 million Alexa-enabled devices already in homes, Amazon has a massive platform and a ton of data to leverage, and the Nova reasoning model could further strengthen its position in the AI race.

  • Interested in learning more about Amazon’s Nova Foundation Models? Check out this page.

SUPER MARIO

Super Mario is the new way to benchmark AI models

Source: Hao Lab

What’s going on: Researchers at Hao AI Lab, based at the University of California San Diego, have turned to an unexpected tool to test AI models’ capabilities: Super Mario Bros. In a recent experiment, they pitted several advanced AI models against the classic video game, running it in an emulator integrated with their custom-built framework, GamingAgent. This setup let each model control Mario by interpreting basic instructions and in-game screenshots, then generating real-time inputs as Python code. Performance varied widely: Anthropic’s Claude 3.7 led the pack, followed by Claude 3.5, while Google’s Gemini 1.5 Pro and OpenAI’s GPT-4o fell behind.

What does it mean: The experiment highlights the challenge of timing in real-time gaming, where even a second’s delay in decision-making can spell disaster for Mario. That delay is a key limitation of current reasoning models. Super Mario Bros was chosen as a benchmark because it demands quick decision-making and complex strategizing, pushing AI beyond static problem-solving into dynamic, unpredictable environments.

More details:

  • GamingAgent provided the AI with simple directives, like dodging obstacles or enemies, yet the models had to independently “learn” how to navigate the game’s fast-paced challenges.

  • OpenAI co-founder Andrej Karpathy has pointed to an “evaluation crisis” in AI, arguing that flashy gaming feats might not translate to real-world utility, given games’ controlled simplicity and infinite training data compared to the messiness of reality.

  • Interestingly, reasoning models like OpenAI’s o1, which “think” through problems step by step to arrive at solutions, performed worse than “non-reasoning” models.

☁ LlamaIndex, a platform for building AI agents that work with unstructured data, has launched LlamaCloud, an enterprise cloud service, after raising $19 million.

👁 Cohere has released Aya Vision, a multimodal AI model they claim is best-in-class for tasks like image captioning and translation, available in two versions (32B and 8B parameters) and freely accessible for non-commercial use. Check it out here.

🤖 OpenAI has launched NextGenAI, a consortium of 15 universities including Harvard, MIT, and Oxford, providing $50 million in research grants, compute funding, and API access to support AI-assisted research.

🙎‍♂️ Meta is expanding its test of anti-fraud facial recognition tools to the UK after receiving regulatory approval. These tools aim to combat scams using likenesses of famous people and help users regain access to compromised accounts.

⚖ A federal judge rejected Elon Musk's request to stop OpenAI's transition to a for-profit company, stating that Musk didn't provide enough evidence. The court is, however, prepared to hold an expedited trial on the claim that OpenAI's conversion is unlawful.

📚 Alec Radford, a key former OpenAI researcher and lead author of the GPT research paper, has been ordered to testify in a copyright case against OpenAI, brought by authors who allege that their work was used without permission to train AI models like ChatGPT.

💰 Quantexa, a London-based startup whose enterprise platform uses AI and data analytics for anti-money laundering and fraud detection, has raised $175 million at a $2.6 billion valuation to expand its services.

AI + Web traffic

I'm seeking expert guidance to devise highly effective strategies for increasing website traffic, tailored to my website's niche/industry and specific goals/targets.

Provide a comprehensive list of potential strategies to significantly enhance visitor engagement and drive more traffic to my website.

Niche/Industry: [Insert here]

Source: Prompts Daily

Grok 3’s answer

Wispr Flow - ML Engineer

Thank you for staying with us, as always! If you are not subscribed, subscribe here for free to get more of these emails in your inbox! Cheers!