Agentic AI Weekly | Berkeley RDI | March 11, 2026
AgentX–AgentBeats Phase 2 in Progress, Prize Pool Updates, Berkeley Xcelerator, Agentic AI Summit Early-Bird Tickets and CFP Open
AgentX–AgentBeats Highlights: Phase 2, Sprint 1 in Progress, Prize Pool Updates
Phase 2, Sprint 1 of the AgentX–AgentBeats competition kicked off last week, and we’re excited to see what you build! In Phase 2, participants build purple agents to tackle the top green agents selected from Phase 1 and compete on the public leaderboards.
Unlike Phase 1, where participants competed across all tracks for the entire duration, Phase 2 introduces a sprint-based format organized into four rotating sprints.
Sprint 1 Details:
📋 March 2 – March 22, 2026
Three tracks and associated benchmarks/green agents are live for the first sprint:
Game Agent Track
Finance Agent Track
Business Process Agent Track
We have opened up the submission form for your Phase 2, Sprint 1 projects, which you can access by clicking the button below!
🗓️ Upcoming Sprints
Sprint 2 (3/23 – 4/12): Research Agent, Multi-agent Evaluation, τ²-Bench, Computer Use & Web Agent
Sprint 3 (4/13 – 5/3): Agent Safety, Coding Agent, Cybersecurity Agent
Sprint 4 (5/4 – 5/24): General-Purpose Agents, the grand finale of AgentBeats Phase 2.
AgentX–AgentBeats is the first competition to explicitly spotlight general-purpose agents, testing broad capability, adaptability, and robustness across diverse tasks rather than a single domain. While earlier sprints emphasize depth, this final sprint showcases breadth and real-world readiness.
Participants are encouraged to compete in multiple tracks across multiple sprints during Phase 2. Teams and team members who submit purple agents in any sprint will also be eligible to enter a raffle for free tickets to the Agentic AI Summit later this year.
For more details on each sprint and how to compete in Phase 2, please refer to the AgentX–AgentBeats website!
🏆 Prize Pool Updates
We are also excited to announce that we have updated the Phase 2 prize pool with support from several amazing partners!
Highlights include:
DeepMind: Up to $50K in GCP/Gemini credits
Nebius: Up to $50K in inference credits
OpenAI: $10K / $5K / $1K prizes for top teams in the Research and Finance Agent tracks
Lambda: $750 in cloud credits per winning team, plus $3,000, $2,000, and $1,000 in cloud credits for the top 3 teams in the Agent Safety track
Amazon: Up to $10K in AWS credits
Snowflake: Software access + credits for winning student teams
Agentic AI Summit: Up to 2 complimentary tickets for each winning team
You can learn more about the full prize pool via the AgentX–AgentBeats website!
Lastly, we want to sincerely thank all of our Phase 1 participants. This year’s competition has brought together over 3,000 individuals and 1,300 teams, all pushing forward the frontier of Agentic AI. We recently announced some of the incredible statistics from Phase 1 and celebrated our Phase 1 winning teams, as you can see below. We encourage you to spread the word by sharing our posts on LinkedIn and X!
Berkeley Xcelerator — Applications Now Open!
The Berkeley Xcelerator, a non-dilutive accelerator program designed to support pre-seed and seed-stage startups building at the forefront of Agentic AI, is now open for applications!
The Xcelerator is built in partnership with Berkeley RDI’s research community and ecosystem partners, offering selected teams the support, resources, and guidance to take their startup to the next level! In addition, the Xcelerator is open to everyone; you do not need to be affiliated with UC Berkeley to apply!
Why apply to the Xcelerator?
Unparalleled access to frontier research and expertise through close collaboration with Berkeley RDI’s community across agentic AI, AI safety and security, and the broader AI landscape.
Practical enablement through industry partnerships, including cloud, GPU, and API credits provided by industry partners such as Google Cloud, Google DeepMind, OpenAI, and Nebius, with more to be announced!
Visibility and network effects through the Berkeley ecosystem and Berkeley RDI’s global community of 56,000+ developers and builders, including the rapidly growing Agentic AI MOOC community.
A culminating Demo Day at the Agentic AI Summit (August 1–2, 2026), bringing together 5,000+ in-person attendees and placing your startup directly in front of top-tier VCs, leading AI researchers, industry executives, and strategic partners.
We’re looking for AI and Agentic AI startups at the pre-seed or seed stage. If you think that you or your team are a good fit, we encourage you to learn more and apply via the Xcelerator website and form below!
📅 Applications close on Fri, March 20, 2026!
Our sincerest thanks to all of our sponsors and partners:
Agentic AI Summit 2026 (Early-Bird Pricing and CFP are Live!)
Save the date! The Agentic AI Summit returns to Berkeley on August 1–2, 2026, welcoming 5,000+ expected in-person attendees for two days of insights and innovation. Building on last year’s sold-out success—with 2,000+ in‑person attendees and 40,000+ global livestream participants—the summit will bring together researchers, builders, industry leaders, and the global agentic AI community for keynotes, technical talks and panels, hands-on workshops, live demos, and more!
🎟️ Early‑Bird Pricing (Limited Capacity)
A limited number of early‑bird tickets are still available:
Student Early-Bird: $99
Standard Early-Bird: $249
If you’re looking to secure the best ticket price and be part of the conversation shaping the future of Agentic AI, we encourage you to register early. We look forward to welcoming you to Berkeley this August.
We’re also thrilled to share that the Call for Speaking Proposals (CFP) for the Agentic AI Summit 2026 is now open!
If you’re interested in sharing your work through a technical talk, panel discussion, workshop, tutorial, or poster presentation, and helping advance the frontiers of Agentic AI, we warmly invite you and/or your team to apply and be part of the conversation at the Summit.
Please complete the form below to submit your proposal. The program committee will review submissions on a rolling basis.
Trends This Week
OpenAI released GPT-5.4, its latest frontier model designed for professional and developer workloads, now available in ChatGPT, the API, and Codex. The model comes in multiple versions, including GPT-5.4 Thinking, a reasoning-focused variant, and GPT-5.4 Pro, optimized for maximum performance on complex tasks. GPT-5.4 also posted strong benchmark results, including record scores on OSWorld-Verified and WebArena Verified computer-use benchmarks and an 83% score on OpenAI’s GDPval test for knowledge-work tasks. The API version supports context windows up to 1 million tokens, the largest offered by OpenAI to date. Alongside the release, OpenAI introduced a new API feature called Tool Search, which allows models to dynamically retrieve tool definitions instead of loading them into prompts, improving speed and lowering costs for applications that rely on large tool libraries.
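The exact Tool Search interface isn’t reproduced here, but the underlying idea, attaching only the relevant tool definitions at request time instead of packing every definition into the prompt, can be sketched in plain Python. The registry and keyword-matching logic below are illustrative assumptions, not OpenAI’s actual API:

```python
# Illustrative sketch of dynamic tool retrieval (not OpenAI's actual API).
# Instead of sending every tool definition with each request, keep a
# registry and select only the tools whose descriptions match the query.

TOOL_REGISTRY = {
    "get_weather": "Fetch the current weather forecast for a city.",
    "search_flights": "Search airline flights between two airports.",
    "convert_currency": "Convert an amount between two currencies.",
    "summarize_pdf": "Summarize the contents of a PDF document.",
}

def retrieve_tools(query: str, top_k: int = 2) -> list[str]:
    """Rank tools by naive keyword overlap with the user query."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(desc.lower().split())), name)
        for name, desc in TOOL_REGISTRY.items()
    ]
    scored.sort(reverse=True)
    # Only tools with at least one matching word are attached to the prompt.
    return [name for score, name in scored[:top_k] if score > 0]
```

A production system would use embedding similarity rather than keyword overlap, but the payoff is the same: prompts stay small even as the tool library grows.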
Google introduced Gemini 3.1 Flash-Lite, a new lightweight model designed for high-volume developer workloads that require low latency and low cost. Available in preview through the Gemini API, the model is priced at $0.25 per million input tokens and $1.50 per million output tokens, making it one of the most cost-efficient models in the Gemini lineup. Google reports that Flash-Lite delivers 2.5× faster time-to-first-token and roughly 45% faster output speeds than the previous Gemini 2.5 Flash while maintaining comparable quality. The model also includes adjustable “thinking levels,” allowing developers to control how much reasoning the model performs for tasks such as large-scale translation, content moderation, UI generation, and other high-frequency application workflows.
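At the quoted preview rates ($0.25 per million input tokens, $1.50 per million output tokens), a back-of-the-envelope cost estimate is easy to script. The workload numbers below are made-up examples, not figures from Google:

```python
# Cost estimate at the quoted Gemini 3.1 Flash-Lite preview rates
# (USD per million tokens, as stated in the announcement).
INPUT_RATE = 0.25
OUTPUT_RATE = 1.50

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a batch of requests."""
    return (input_tokens / 1e6) * INPUT_RATE + (output_tokens / 1e6) * OUTPUT_RATE

# Hypothetical moderation workload: 10M input and 2M output tokens per day.
daily = estimate_cost(10_000_000, 2_000_000)
print(f"${daily:.2f} per day")  # $5.50 per day
```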
Anthropic released new research analyzing the labor market impacts of AI, using a metric called “observed exposure” to estimate job displacement risk by combining theoretical LLM capabilities with real-world usage data from Claude interactions. The research finds that AI adoption remains far below its theoretical potential. For example, LLMs could theoretically cover about 94% of tasks in computer and math occupations, but current observed usage is closer to 33%. While the study finds no clear increase in unemployment in highly exposed occupations since ChatGPT’s launch, it identifies early signals of disruption, including a roughly 14% decline in entry-level hiring among workers aged 22–25 in more AI-exposed fields.
AI researcher Andrej Karpathy released Autoresearch, an open-source framework that allows AI agents to autonomously run and iterate on machine learning experiments using a single GPU. The project condenses Karpathy’s Nanochat training core into roughly 630 lines of PyTorch code, small enough for an LLM to read and modify directly. In the system, a human defines the research goals in a Markdown instruction file, while the AI agent edits the training script, tests the architecture or hyperparameter changes, and runs short training experiments to evaluate performance. As Karpathy explained, “The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement.”
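The loop Karpathy describes, where an agent repeatedly edits a training script, runs a short experiment, and keeps only changes that improve the metric, can be sketched at a very high level. Everything below, including the toy scoring and proposal functions, is an illustrative stand-in and not Autoresearch’s actual code:

```python
import random

def run_experiment(config: dict) -> float:
    """Toy stand-in for a short single-GPU training run reporting a score.
    Pretends validation score peaks near lr=0.01 and larger widths."""
    lr, width = config["lr"], config["width"]
    return width / 512 - abs(lr - 0.01) * 100 + random.uniform(-0.01, 0.01)

def propose_change(config: dict) -> dict:
    """Toy stand-in for the agent editing the training script."""
    candidate = dict(config)
    if random.random() < 0.5:
        candidate["lr"] *= random.choice([0.5, 2.0])
    else:
        candidate["width"] = random.choice([128, 256, 512])
    return candidate

def research_loop(steps: int = 20) -> dict:
    """Greedy improvement loop: keep a change only if the score goes up."""
    random.seed(0)
    best = {"lr": 0.001, "width": 128}
    best_score = run_experiment(best)
    for _ in range(steps):
        candidate = propose_change(best)
        score = run_experiment(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

In the real framework the "propose" step is an LLM editing ~630 lines of PyTorch and the "experiment" step is an actual training run, but the keep-if-better control flow is the core idea.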
Alibaba’s Qwen team released the Qwen 3.5 Small Model Series, a set of lightweight open models designed for efficient deployment and experimentation. The lineup includes Qwen3.5-0.8B, 2B, 4B, and 9B parameter models, built on the same Qwen3.5 architecture with multimodal capabilities and reinforcement learning improvements. The smaller 0.8B and 2B models are optimized for fast performance on edge devices, while the 4B model supports multimodal tasks with a 262K token context window, making it suitable for lightweight agents and applications. All models are released under an Apache 2.0 license, with weights available for research and commercial use.
Don’t miss the developments shaping Agentic AI. Subscribe for weekly coverage of groundbreaking research, emerging trends, and critical insights across Agentic AI and the broader AI landscape.