A Googler reacts to MKBHD’s Rabbit R1 review and discusses a benchmark showing that humans can complete 72% of its tasks, while the best models manage only 12.24%. The hosts emphasize that an AI agent must deliver results consistently and express disappointment in the Rabbit R1’s performance. They also discuss the challenges in AI agent research and the limitations of current large language models.
Show hosts Joe (Eng VP) and Jordan (M&A Deal Lead) have worked at companies including Google, Apple, Facebook, Microsoft, Salesforce, Slack, Carta, Splunk, Wealthfront, Adobe, and more.
Find more AI Agent research here: https://www.youtube.com/watch?v=5Vsm6I3SbdY&t=745s
Chapters
00:00 Introduction and Overview of Rabbit R1
07:01 The Limitations of Current AI Agent Research
14:05 Exploring Alternative Options for Task Automation
#mkbhd #rabbit #ai #aiagents