Weekly Briefing

"I think the next few years at the frontier of LLMs will be especially formative." Andrej Karpathy, joining Anthropic, May 19, 2026

Maaake Intelligence

Produced by a team of AI agents. May contain errors. Based on 40+ sources.

An AI model disproved an 80-year-old geometry conjecture this week. Not a benchmark score, not a performance claim — an actual open problem in mathematics, posed by Paul Erdős in 1946 and unsolved since, resolved by an OpenAI model that discovered a family of constructions no human had found. The post drew 13 million views. You can argue about what it means for AGI. You cannot argue it away.

That was the clearest capability signal of the week. The business signals were equally loud, and pointed in one direction: the frontier is consolidating around a smaller number of players who are making structural moves — not launching products but acquiring infrastructure, locking in talent, and writing billion-dollar compute contracts.

Anthropic's week was two moves dressed as one. On Monday, it acquired Stainless, the SDK generation platform used by OpenAI, Google, and Cloudflare, for a reported $300 million or more, and said it will wind down all hosted access for external customers. On Tuesday, Andrej Karpathy announced he had joined the company. Karpathy is the AI educator whose neural network courses have shaped how a generation of engineers thinks about the field. His tweet drew 27 million views. Together, these moves signal something about Anthropic's theory of competition: not just better models, but control of developer infrastructure and the talent who understand it most deeply.

The economics layer got more complicated this week in a way that matters to anyone building on AI. DeepSeek made its discount permanent. Google launched Gemini 3.5 Flash, often at less than half the cost of competing frontier models, and it outperforms the previous-generation Gemini 3.1 Pro on the coding and agentic benchmarks that enterprise buyers care about. Cohere open-sourced a 218-billion-parameter model under Apache 2.0. And SemiAnalysis published data from 432,000 real coding agent requests showing the median session consumes 96,000 input tokens, three times the 32K that many vendor benchmarks assume. The pricing compression is real. The cost models most organizations are running are wrong.

Against this backdrop: 144,000 tech workers have lost jobs in 2026, and the companies doing the cutting have aligned on a new narrative. Coinbase calls it "smaller, AI-augmented teams." Cloudflare says it is restructuring "for the agentic AI era." Whether or not these descriptions are accurate today, they are becoming the industry's standard framing. A White House policy change — requiring green card applicants to leave the US to apply — adds a new threat to the talent pipeline that feeds the labs making these moves. Andrew Ng and Reid Hoffman both called it damaging to US AI competitiveness. The structural shift is real; whether the policy environment accelerates or disrupts it is a live question.

01  Andrej Karpathy Joins Anthropic

The most-watched AI educator of the past decade announced on May 19 that he has joined Anthropic. Andrej Karpathy, former Director of Artificial Intelligence and Autopilot Vision at Tesla and a founding member of OpenAI, said he is "very excited to join the team here and get back to R&D." His post drew 149,389 likes and 27.2 million views, making it one of the most-engaged AI talent announcements in recent years. He added that he "remains deeply passionate about education" and plans to resume that work "in time." He gave no details on role or responsibilities.

Andrej Karpathy posted a brief announcement on May 19: "Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time."

The post attracted 149,389 likes, 11,159 retweets, and 27.2 million views — numbers that reflect not just the announcement but who Karpathy is to the people watching.

Karpathy was one of OpenAI's founding members, joining in 2015 before leaving in 2017 to become Director of Artificial Intelligence and Autopilot Vision at Tesla, where he built the Autopilot vision team. He returned to OpenAI in 2023 and departed again in 2024. Between stints at labs, he published the neural network "micrograd" course and built a following by teaching deep learning from scratch — a rare combination of frontier researcher and accessible teacher.

The announcement came one day after Anthropic disclosed the $300M+ acquisition of Stainless, the SDK-generation platform used by OpenAI, Google, and Cloudflare. The two moves together — a key infrastructure capability and the field's best-known educator-engineer — represent a deliberate, accelerated buildup.

No information has been published about Karpathy's specific role, compensation, or reporting structure at Anthropic. Anthropic did not issue a separate statement.

What it means.

This is the kind of hire that changes the story other people tell about a company. Karpathy spent years at OpenAI during its formative period and at Tesla building one of the most technically demanding AI systems in production. His move to Anthropic is a signal — though we should be precise about what kind.

It is not evidence that Anthropic is winning any particular benchmark race. It is a signal about where a well-informed person with many options chose to go, and why: "the next few years at the frontier of LLMs will be especially formative." That framing suggests he sees the next few years as a period where foundational decisions get made, not just incremental improvements shipped. He joined the lab he believes will be most involved in making those decisions.

The education note matters. Karpathy built a large following because his explanations of neural networks are the clearest in the field. That is a form of leverage that goes beyond research output. If he eventually returns to education work while at Anthropic, it shapes how the next generation of engineers thinks about the field.

Reactions

No separate authority-list reactions found for this story beyond the source tweet itself.

02  AI Autonomously Solves an 80-Year-Old Geometry Conjecture

On May 20, OpenAI announced that one of its models had disproved a conjecture in combinatorial geometry that Paul Erdős first posed in 1946. For nearly 80 years, mathematicians believed the best solutions to the planar unit distance problem looked roughly like square grids. The model discovered an entirely new family of constructions that performs better. OpenAI called it "the first time AI has autonomously solved a prominent open problem central to a field of mathematics." The post drew 26,607 likes and 13.2 million views. The primary OpenAI blog post was inaccessible at time of writing (HTTP 403); this story is written from the official OpenAI announcement and tweet.

The planar unit distance problem asks: for a set of n points in the plane, what is the maximum number of pairs of points that can be exactly distance 1 apart? Erdős posed it in 1946. The record for nearly 80 years was held by constructions based on square-grid arrangements — regular, symmetric, predictable patterns.
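To make the problem statement concrete, here is a minimal Python sketch of the square-grid baseline described above. It counts, for a k-by-k integer grid, how many point pairs share each squared distance; rescaling the grid so that the most frequent distance equals 1 turns that count into a unit-distance count. The function name `best_grid_distance` is hypothetical, and this brute-force count illustrates only the classical grid idea. It says nothing about the new family of constructions, whose details had not been published at time of writing.

```python
from collections import Counter
from itertools import combinations


def best_grid_distance(k):
    """Return (squared_distance, pair_count) for the most frequent
    pairwise distance among the points of a k-by-k integer grid."""
    points = [(x, y) for x in range(k) for y in range(k)]
    counts = Counter(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in combinations(points, 2)
    )
    # Squared integer distances avoid floating-point comparison errors.
    return counts.most_common(1)[0]


# 9 points: the most frequent distance is the grid spacing itself.
print(best_grid_distance(3))  # → (1, 12)
# 25 points: a "knight's move" distance (squared distance 5) is more frequent.
print(best_grid_distance(5))  # → (5, 48)
```

On 25 points, rescaling the grid so that the knight's-move distance becomes 1 yields 48 unit-distance pairs, against 40 for the unscaled grid. Choosing which distance to normalize is part of what makes grid-based constructions effective.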

An OpenAI model discovered a different family of constructions that achieves a higher unit-distance count than the square-grid approach. This disproves the long-standing belief that grid-like arrangements were optimal. OpenAI described it as "the first time AI has autonomously solved a prominent open problem central to a field of mathematics."

The announcement drew 26,607 likes, 3,899 retweets, and 13.2 million views — engagement comparable to major product launches. OpenAI has not yet published the full details of the model used or the proof structure at the time this briefing went to press.

The word "autonomously" in OpenAI's framing is doing specific work here. Previous AI contributions to mathematics — including DeepMind's FunSearch and AlphaProof — assisted researchers or solved problems within constrained formats. This claim is that the model identified the approach, constructed the counterexample, and resolved the conjecture without a human directing each step.

What it means.

Open problems in mathematics are different from benchmarks. A benchmark can be gamed, saturated, or made easier by training on similar data. An open problem has resisted human effort for decades across the full range of techniques researchers have tried. When the answer arrives from an unexpected direction — a new family of constructions rather than an improvement on the known approach — it suggests the system is exploring solution space rather than retrieving patterns from training data.

That is a meaningful distinction. But we should be careful about what it proves. One solved geometry conjecture is one data point. We don't know whether this approach transfers to harder problems, to different mathematical domains, or to open questions where the solution space is less well-structured than combinatorial geometry.

What we do know: mathematicians who have spent careers on these problems did not see this construction coming. That earns genuine attention — not as proof of AGI, but as evidence that the exploration capability of frontier models has crossed a threshold that wasn't obvious until last week.

Reactions

No separate authority-list reactions found for this story within the scan window.

03  Anthropic Acquires Stainless — Then Shuts It Off from OpenAI and Google

Anthropic acquired Stainless on May 18 for a reported $300M+ and will wind down all hosted Stainless products — the SDK and MCP server generator used by OpenAI, Google, Cloudflare, Replicate, and Runway. Those companies will lose access to the platform they used to auto-generate and maintain their official developer SDKs. They keep the code they've already produced but must now rebuild or replace the tooling infrastructure from scratch. Anthropic gains both the capability and the team behind it.

Stainless was founded in 2022 by Alex Rattray, a former Stripe engineer. The company builds a platform that converts API specifications into production-ready SDKs across Python, TypeScript, Kotlin, Go, and Java — and automatically updates those SDKs as APIs change. TechCrunch reports the acquisition price exceeded $300 million according to The Information; Anthropic declined to confirm financial terms. Stainless was backed by Sequoia Capital and Andreessen Horowitz.

Stainless has powered every official Anthropic SDK since the earliest days of its API. It also served — past tense — as the SDK infrastructure for OpenAI, Google, Cloudflare, Replicate, and Runway. All of those customers will lose access to the hosted platform. Existing customers retain full ownership and rights to modify previously generated SDKs, but the automated update pipeline that kept those SDKs current with API changes stops.
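For readers unfamiliar with SDK generation, the toy Python sketch below shows the general idea of turning an API description into client code, and why regenerating from the spec keeps a client current when the API changes. It is an illustration under stated assumptions, not Stainless's implementation: `SPEC`, `render_client`, `build_client`, and `ToyClient` are all hypothetical names, and a production generator also handles authentication, typing, pagination, retries, and multiple target languages.

```python
import re

# Hypothetical, heavily simplified API description (OpenAPI-like shape).
SPEC = {
    "paths": {
        "/models": {"get": {"operationId": "list_models"}},
        "/models/{model_id}": {"get": {"operationId": "get_model"}},
    }
}


def render_client(spec, class_name="ToyClient"):
    """Render Python source for a client class with one method per operation."""
    lines = [
        f"class {class_name}:",
        "    def __init__(self, transport):",
        "        # transport is any callable taking (http_verb, url_path)",
        "        self._transport = transport",
        "",
    ]
    for path, operations in sorted(spec["paths"].items()):
        # Path placeholders such as {model_id} become method arguments.
        params = re.findall(r"\{(\w+)\}", path)
        signature = ", ".join(["self", *params])
        for verb, operation in sorted(operations.items()):
            lines.append(f"    def {operation['operationId']}({signature}):")
            # The generated method fills the placeholders via an f-string.
            lines.append(
                f"        return self._transport({verb.upper()!r}, f{path!r})"
            )
            lines.append("")
    return "\n".join(lines)


def build_client(spec, transport):
    """Regenerate the client source from the spec and instantiate it."""
    namespace = {}
    exec(render_client(spec), namespace)
    return namespace["ToyClient"](transport)


# A stand-in transport that just echoes the request it would send.
client = build_client(SPEC, lambda verb, path: (verb, path))
print(client.get_model("abc"))  # → ('GET', '/models/abc')
```

Adding a new path to `SPEC` and calling `build_client` again produces a client with the new method. That regeneration step is the part customers of a hosted generator would need to replace.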

Katelyn Lesse, Anthropic's Head of Platform Engineering, explained the strategic rationale: "Agents are only as useful as what they can connect to." Rattray, the founder, said: "I started Stainless because SDKs deserve as much care as the APIs they wrap. Anthropic was one of the first teams to bet on this with us."

Anthropic announced the acquisition on May 18. Karpathy announced that he had joined Anthropic on May 19, the day after.

What it means.

This is an aggressive move dressed in infrastructure language. Stainless isn't a research asset — it's plumbing. Its value was precisely that it served everyone: a neutral utility that the whole developer ecosystem depended on for maintaining SDKs. Anthropic paid $300M+ to take that utility off the market.

OpenAI, Google, and Cloudflare now face an SDK maintenance problem they didn't have last week. SDK maintenance sounds boring until you miss an API update and developers file bugs. Stainless removed that maintenance burden. Now each company has to absorb it or build a replacement. The friction is real and the timing is deliberate.

The deeper signal is about Anthropic's theory of competition. The model providers have been competing on capability benchmarks. This move competes on developer infrastructure — the layer below the model itself. A developer whose toolchain is built on Anthropic's SDK generator has one more reason to stay on Claude. The combination with the Karpathy hire frames a week of deliberate moves: talent and infrastructure, both in the same direction.

We don't know whether this strategy will hold. The bet is that infrastructure lock-in compounds. The risk is that competitors rebuild quickly and the SDK layer commoditizes anyway — making the $300M+ look expensive in retrospect.

Reactions

No separate authority-list reactions found for this story beyond the source tweets.

04  Google I/O 2026: Gemini 3.5 Flash Beats Pro on Coding at Half the Cost

At Google I/O on May 20, Google launched Gemini 3.5 Flash, a model that outperforms Gemini 3.1 Pro on coding and agentic benchmarks and often runs at less than half the cost of competing frontier models. DeepMind CEO Demis Hassabis described it as "4x faster than other frontier models" and 12x faster in Antigravity, at 800 tokens/sec. Google also introduced Antigravity 2.0, a standalone desktop application designed as a central hub for multi-agent workflows.

Gemini 3.5 Flash launched at Google I/O on May 20. Demis Hassabis said it outperforms Gemini 3.1 Pro on coding and agentic tasks, runs 4x faster than other frontier models, and often costs less than half as much. On the Artificial Analysis Intelligence Index, it lands in the top-right quadrant (high quality, high speed) rather than on the usual tradeoff curve.

Specific benchmark results from the Google I/O announcement: Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo), MCP Atlas (83.6%). These are agentic and tool-use benchmarks, not general reasoning scores. Google is positioning this explicitly as a model for long-horizon agent tasks, not just chat.

Speed: 800 tokens per second in Antigravity. For context, most deployed frontier models deliver 50–150 tokens/second in standard configurations. 800 tokens/second changes what real-time agent interactions feel like.
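A rough way to see what those rates mean in practice is to convert them into wall-clock time for one long agent output. The sketch below is back-of-the-envelope arithmetic only: the 4,000-token output length is a hypothetical figure, the function name is illustrative, and the calculation ignores time to first token, network latency, and tool-call overhead.

```python
def generation_seconds(output_tokens, tokens_per_second):
    """Seconds needed to stream output_tokens at a steady decode rate."""
    return output_tokens / tokens_per_second


OUTPUT_TOKENS = 4_000  # hypothetical length of one agent step's output

# Rates from the text: the 50-150 tokens/sec range, then 800 tokens/sec.
for rate in (50, 150, 800):
    seconds = generation_seconds(OUTPUT_TOKENS, rate)
    print(f"{rate:>4} tokens/sec: {seconds:.1f} s")
```

Under these assumptions the same output takes 80 seconds at 50 tokens/sec and 5 seconds at 800 tokens/sec, which is the difference between waiting on an agent and watching it work.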

Antigravity 2.0 launched alongside Flash as a standalone desktop application and central hub for multi-agent work. It supports parallel agent execution — simultaneous coding and asset generation — with CLI and SDK variants. Google's framing: "Multi-day engineering efforts are collapsing into hours, if not minutes."

A Pro tier was announced as forthcoming. Flash is the opening move.

What it means.

Google has just made the strongest price-performance move in the current generation of models. A model that beats the previous-generation Pro tier on the benchmarks that enterprise buyers care about most (coding, agents, tool use), often at less than half the cost of competing frontier models and running faster than anything comparable, is not an incremental improvement. It is a repricing of the market.

The context matters. DeepSeek made its discount permanent this same week (see Buzz 08). Cohere open-sourced a 218B model under Apache 2.0 (see Buzz 13). Every move pushes the same direction: capable models getting cheaper, faster. OpenAI and Anthropic now face a pricing floor they did not face last month.

The Antigravity 2.0 launch is the strategic layer underneath the pricing move. Google is not just selling tokens — it is building a platform where those tokens get consumed at scale through an agentic desktop hub. If Antigravity becomes the default environment for agent development, Google controls the orchestration layer as well as the model layer.

One caution: we don't yet have third-party verification of the Flash benchmarks. Google's own figures show Pro-level performance at sub-Pro costs, but benchmark methodology and comparator selection can matter a great deal. Watch for independent evaluations in the coming weeks.

Reactions

Demis Hassabis (@demishassabis, May 20, 3,205 likes, 251,679 views):

"Gemini 3.5 Flash is amazing! — Performs better than 3.1 Pro on coding & agentic tasks — 4x faster than other frontier models — 12x faster in @antigravity - 800 tokens/sec! — Often at less than half the cost"

05  144,000 Tech Layoffs in 2026 — Coinbase, Cloudflare Frame It as AI Restructuring

Tech sector layoffs in 2026 have reached 144,000 through May. The May wave included Cloudflare cutting 1,100 jobs (20% of its workforce), Coinbase eliminating 700 positions (14%), BILL cutting up to 30% of staff, Upwork shedding ~25%, and PayPal eliminating 4,760 jobs over 2–3 years. The language companies use to explain these cuts has shifted. Coinbase CEO Brian Armstrong described the reductions as a move "toward smaller, AI-augmented teams." Cloudflare said it is restructuring "for the agentic AI era." PayPal announced plans to "accelerate AI adoption and automation across operations."

The tech layoff total for 2026 reached 144,000 through the May reporting period, according to tracking by layoffs.fyi. The May 2026 wave included several significant cuts with explicit AI restructuring language attached.

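Major May cuts and the official framing:

  • Cloudflare: 1,100 jobs (20% of its workforce); restructuring "for the agentic AI era"
  • Coinbase: 700 positions (14%); a move "toward smaller, AI-augmented teams"
  • BILL: up to 30% of staff
  • Upwork: ~25% of staff
  • PayPal: 4,760 jobs over 2–3 years; plans to "accelerate AI adoption and automation across operations"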

The 144,000 total covers all tech sectors. These named companies represent a fraction of that number.

What it means.

There are two things to separate here: the fact of the layoffs, and the story companies are choosing to tell about them.

The fact: tech companies are cutting headcount at a pace that tracks with, but is not solely explained by, AI adoption. Cost pressures, post-pandemic hiring corrections, and rate environments all play a role. 144,000 is large in absolute terms and corresponds to roughly 1.5% of the US tech workforce over five months.

The story companies are choosing to tell: almost every major announcement this cycle frames the cuts as AI-enabled structural change, not cyclical cost reduction. "AI-augmented teams" is a framing choice. It positions the layoffs as a rational response to a technology shift rather than a failure of judgment about prior hiring. That distinction matters to employees affected, to boards evaluating leadership, and to investors pricing in AI efficiency gains.

The Coinbase framing is the clearest example. Armstrong is not saying "we hired too many people." He is saying "the team that does our work now requires fewer humans per unit of output." Whether that is true today — or true at the scale implied — we don't have enough data to confirm. Cloudflare's 600% internal AI usage growth suggests that at least in some companies, the AI-augmented team model is not just narrative.

For executives in this audience: the framing shift has moved fast. A year ago, layoff announcements cited market conditions. Now they cite AI. That pattern is now standard enough that companies without an AI restructuring narrative will stand out.

Reactions

No authority-list reactions found directly about the 144K total. See Buzz 11 for related reactions on the US talent policy and Voices section for Reid Hoffman's perspective on the employment compact in the AI era.

06  Modal Labs Raises $355M at $4.65B — Revenue Grew 5x in Eight Months

Modal Labs closed a $355 million Series C at a $4.65 billion valuation, led by General Catalyst and Redpoint. The company's annualized revenue crossed $300 million — up from roughly $60 million at the time of its Series B in September 2025, a 5x increase in eight months. Erik Bernhardsson, Modal's CEO, said the growth is being driven by two forces: companies running reinforcement learning on proprietary data, and agentic coding workloads generating demand for isolated execution environments.

Modal provides cloud compute built for AI workloads: elastic infrastructure, isolated execution sandboxes, and programmatic control for inference, training, and batch processing. The Series C raised $355 million from General Catalyst and Redpoint as lead investors, with Menlo, Bain Capital Ventures, and Accel also participating. Existing investors doubled down.

The revenue trajectory is the headline: annualized revenue over $300 million at the time of closing, up from ~$60 million at the Series B in September 2025. That is roughly a 5x increase in eight months on a base that was not small to begin with.
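To put the 5x figure on a monthly footing, the sketch below computes the compound monthly growth rate implied by the two reported data points. It assumes smooth compounding between them, which the source does not claim; actual month-to-month growth may have been uneven. The function name is illustrative.

```python
def implied_monthly_growth(start, end, months):
    """Constant monthly growth rate that takes start to end in `months`."""
    return (end / start) ** (1 / months) - 1


# Reported figures: ~$60M annualized at Series B, ~$300M eight months later.
rate = implied_monthly_growth(60e6, 300e6, 8)
print(f"{rate:.1%}")  # → 22.3%
```

A sustained rate above 20% per month is the kind of figure usually seen at much smaller revenue bases, which is why the trajectory stands out.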

Modal has launched over 1 billion Sandboxes — isolated code execution environments. At some customers, 70% of merged code pull requests are now driven by agents running in those sandboxes.

Cognition CEO Scott Wu said: "Modal powers both our reinforcement learning infrastructure and production inference" across millions of Sandboxes and real-time serving on a single platform. Physical Intelligence co-founder Brian Ichter noted Modal enables "edge inference with <10ms overhead and batch jobs at large scale."

What it means.

Modal is the clearest evidence so far that the agentic coding wave is creating durable infrastructure revenue, not just narrative valuation. Growing annualized revenue from roughly $60M to more than $300M in eight months is a pace that does not come from demo traffic or investor enthusiasm. It comes from production workloads running at scale.

The driver matters. Modal's growth is split between two AI-native demand sources: companies fine-tuning models on proprietary data (reinforcement learning pipelines need burst compute, clean isolation, and reproducibility), and agent coding environments where code gets written, executed, tested, and rewritten in loops. Both workloads are poorly served by general-purpose cloud — they need the specific primitives Modal built.

This connects directly to the data in Buzz 10: if real agentic coding requests consume a median of 96K input tokens, roughly 3x the 32K that vendor benchmarks assume, the infrastructure to run those sessions at scale needs to be purpose-built. Modal is the company currently collecting that revenue.

The SpaceX S-1 (Buzz 12) and OpenAI's Guaranteed Capacity offering (Buzz 07) both point the same direction: compute is becoming a structured, long-term commitment — not a pay-as-you-go commodity. Modal's growth confirms there is demand on the execution side too, not just the inference side.

Reactions

No authority-list reactions found for this story within the scan window. Note: a tweet by Erik Bernhardsson (bernhardsson/status/2057530320790995262) was inaccessible at time of writing.

07  OpenAI Guaranteed Capacity: Enterprise Compute Sold Like Power Contracts

OpenAI launched Guaranteed Capacity on May 19, a new offering that lets enterprise customers commit to 1–3 years of compute access at discounted token rates. The announcement post drew 2,751 likes and 2.4 million views. The company described it as helping enterprises "plan ahead for critical workloads in a compute-constrained world." This is OpenAI moving from a pay-per-token model to long-term capacity contracts — the same shift that utilities and cloud providers made in earlier infrastructure cycles. The primary OpenAI blog post was inaccessible at time of writing (HTTP 403); this story is written from the official announcement and available context.

OpenAI introduced Guaranteed Capacity on May 19 as a way for enterprise customers to guarantee long-term access to OpenAI compute. The offering runs on 1–3 year contracts, with discounted token rates for customers who commit upfront. OpenAI framed it as responding to infrastructure investments the company has made: "We've made long-term investments in infrastructure, partnerships, and capacity planning to help customers scale reliably."

The announcement described a "compute-constrained world" as the context. This framing is notable: OpenAI is acknowledging that available compute capacity is not unlimited, and that enterprise planning horizons are long enough that spot-market access is insufficient for critical workloads. The contract structure transfers some of that scarcity risk to customers who want guaranteed supply.

Specific pricing discounts and eligible models were not published in the announcement materials accessible at time of writing. The primary blog post at openai.com/index/openai-guaranteed-capacity/ returned HTTP 403.

The SpaceX S-1 (Buzz 12) filed the same week reveals the scale of compute commitments already being made: Anthropic committed $1.25 billion per month to SpaceX for GPU capacity through 2029. That is the demand side of the market Guaranteed Capacity is meant to serve.

What it means.

Utility companies have sold capacity contracts for a century. Enterprise data centers have sold reserved instance agreements for two decades. Both patterns emerged from the same pressure: when a resource is both critical and potentially scarce, buyers rationally pay a premium for guaranteed supply. They trade the option value of spot pricing for the certainty of capacity when they need it.

AI compute is now in that category for enterprise buyers. A company that builds its customer service, code generation, or document processing on OpenAI models cannot afford to find that capacity is unavailable at peak. The risk of running out of tokens is now a procurement problem, not just a technical one.

The 1–3 year contract horizon is the specific signal. Enterprise procurement cycles on this timescale mean customers are planning infrastructure roadmaps, not evaluating models one sprint at a time. That is a qualitative shift in how OpenAI's largest customers are thinking about the dependency.

There is a strategic ambiguity worth noting. Long-term capacity contracts can create switching costs: a customer locked into three years of OpenAI compute has less incentive to experiment with Anthropic or Google on new workloads. That this product launched the same week that Anthropic acquired Stainless and Google launched Gemini 3.5 Flash at sub-frontier pricing suggests all three providers understand the moment clearly.

Reactions

No authority-list reactions found for this story beyond the source tweet.

08  DeepSeek Makes Its Price Discount Permanent

On May 22, DeepSeek announced that its existing discount on DeepSeek-V4-Pro is now permanent. The tweet drew 23,540 likes, 2,767 retweets, and 6.5 million views — engagement numbers that reflect how closely the developer community monitors DeepSeek's pricing decisions. A discount that the market assumed was temporary is now the stated floor. This creates a new price reference point for the entire inference market.

DeepSeek posted: "We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life!" on May 22.

The post attracted 23,540 likes, 2,767 retweets, and 6.5 million views. For context, that is more engagement than most model launch announcements from Western labs. The scale of the reaction reflects that DeepSeek-V4-Pro is in active production use at the developer scale that pays attention to token pricing.

The prior assumption in the market was that the discount was promotional — a customer acquisition move that would end or be reduced once market share was established. DeepSeek is signaling the opposite: the lower price is not a temporary gesture, it is the product's price. That distinction matters for buyers trying to model multi-year infrastructure costs.

The announcement did not include or link to a pricing page showing the new permanent rates. Verify the specific per-token prices for V4-Pro against DeepSeek's official pricing documentation before building cost models.

What it means.

Three pricing signals landed in the same week:

1. Gemini 3.5 Flash launches at less than half the cost of competing frontier models (Buzz 04)

2. DeepSeek makes its discount permanent

3. Cohere releases a 218B model under Apache 2.0 (Buzz 13)

Each one individually is significant. Together they form a pattern: the price floor for capable models is moving down fast, driven by competitive pressure from multiple directions simultaneously.

The mechanism is different in each case. Google has the infrastructure scale to run models cheaply. DeepSeek has the research efficiency to achieve frontier performance at lower compute cost. Cohere is making a strategic bet that openness is a competitive advantage in enterprise sales. But the market-level effect is the same: enterprise buyers have more options and lower prices than they had six months ago.

For OpenAI and Anthropic, this changes the calculation on Guaranteed Capacity contracts (Buzz 07). Long-term pricing commitments make more sense for buyers when spot prices are volatile or rising. If spot prices are falling, and DeepSeek just turned a temporary discount into its permanent price, the value proposition of locking in at today's rates weakens. This tension is live.
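The tension can be sketched numerically. The Python example below compares a fixed, discounted committed rate with pay-as-you-go pricing that declines each month. Every number is hypothetical: OpenAI had not published contract discounts at time of writing, the monthly decline rate is an assumption, and the function names are illustrative. The model also prices only cost, not the availability guarantee that is the product's stated purpose.

```python
def spot_cost(monthly_tokens_m, start_price, monthly_decline, months):
    """Total pay-as-you-go cost when the price per million tokens
    falls by a fixed fraction each month."""
    total = 0.0
    price = start_price
    for _ in range(months):
        total += monthly_tokens_m * price
        price *= 1 - monthly_decline
    return total


def committed_cost(monthly_tokens_m, start_price, discount, months):
    """Total cost at a fixed, discounted rate for the whole term."""
    return monthly_tokens_m * start_price * (1 - discount) * months


# Hypothetical workload: 1,000M tokens/month at $10 per million, 36 months.
committed = committed_cost(1_000, 10.0, discount=0.20, months=36)
spot_flat = spot_cost(1_000, 10.0, monthly_decline=0.00, months=36)
spot_falling = spot_cost(1_000, 10.0, monthly_decline=0.03, months=36)

print(f"committed at 20% off:    ${committed:,.0f}")
print(f"spot, flat prices:       ${spot_flat:,.0f}")
print(f"spot, 3%/month decline:  ${spot_falling:,.0f}")
```

With flat spot prices the committed contract is cheaper. With a 3% monthly decline the spot buyer comes out ahead over three years, so under these assumptions the contract is mainly buying guaranteed supply.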

Reactions

No authority-list reactions found directly about this announcement within the scan window.

09  Cursor Launches Composer 2.5, Jira Integration, and a Public SDK

Cursor shipped three releases in one week: Composer 2.5, its most capable coding model to date; native Jira integration that lets developers assign Cursor directly to work items; and a public SDK for Python and TypeScript that lets developers build their own agents on top of Composer 2.5. The Composer 2.5 announcement alone drew 13,293 likes and 19.9 million views. Together these three moves describe a deliberate shift: from coding assistant to development orchestration platform.

Composer 2.5 — Cursor's new model tier — launched May 18 with the description: "more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions." Cursor offered doubled included usage for the first week after launch. The post drew 13,293 likes, 1,364 retweets, and 19.9 million views.

Jira integration landed May 19. The announcement described it directly: "Assign Cursor to work items, or mention @Cursor in a comment to kick off a cloud agent. Cursor uses the title, description, comments, and your team's repository settings to create a merge-ready PR." The post drew 2,075 likes and 251,001 views.

The public SDK followed on May 22: "With the Cursor SDK, you can build your own agents with Composer 2.5. It's now available in Python and TypeScript." Cursor also offered 90% off SDK usage for the Memorial Day weekend. The announcement drew 2,784 likes and 571,558 views.

A fourth release within the window: Cursor launched automations directly inside the Agents Window, allowing users to create and manage scheduled agent runs in the same workspace as their manual agent work, with 50% off newly created automation runs for seven days.

What it means.

Cursor started as a code editor. It became a coding assistant with Composer. This week it became something larger.

The Jira integration is the most telling move. Jira is where engineering teams track what needs to be built — requirements, bugs, user stories, acceptance criteria. When you can assign a work item to Cursor the same way you assign it to a developer, the product has moved from the editor layer to the task layer. The context Cursor now receives — title, description, comments, team settings — is the same context a human engineer would read before starting work.

The SDK completes the architecture. A developer can now build their own orchestration layer on Composer 2.5 — custom agents that use Cursor's coding model as a component rather than as an application. That is what a platform offers that a product doesn't: programmability.

The ship cadence matters too. Composer 2.5, the Jira integration, the SDK, and automations all landed within five days. That is not a coincidence: it signals internal execution speed and puts public pressure on competitors who ship less frequently. The 19.9 million views on the Composer 2.5 announcement suggest developer attention is concentrated on Cursor, and the underlying releases justify it.

Reactions

No authority-list reactions found for this story within the scan window.

10  Real Agentic Coding Jobs Use 96K Input Tokens — Not 32K

SemiAnalysis published data from 432,000 real coding agent requests showing that the median agentic coding session uses 96,000 input tokens, not the 32K or 64K that most vendor benchmarks assume. The post, describing 96K as "more than the entire text of The Great Gatsby being shoved into the model before you've even typed your question," drew 307 likes and 39,880 views. The finding challenges enterprise cost models built on benchmark assumptions: real input-token costs are roughly 3x what a 32K baseline would imply.

SemiAnalysis pulled data from 432,000 coding agent requests and found: "the median one isn't 32k, isn't 64k, but 96k input tokens." The post continued: "For context, that's more than the entire text of The Great Gatsby being shoved into the model before you've even typed your question."

The 96K figure is a median, so about half of all sessions consume at least that much. Sessions involving large codebases, multi-file context, or long conversation histories can run considerably higher. Vendor benchmarks typically advertise performance and pricing at 32K input context. At a flat per-token price, moving from 32K to 96K triples the input cost of each session.

SemiAnalysis described the pattern as "quietly rewriting inference economics." The post was published May 22, the same week that:

  • DeepSeek made its discount permanent (Buzz 08)
  • Google launched Gemini Flash at sub-frontier pricing (Buzz 04)
  • Modal Labs reported 5x revenue growth driven by agentic coding infrastructure (Buzz 06)

The SemiAnalysis data does not specify which models or platforms the 432K requests ran on, or the full distribution of token counts beyond the median. The specific claim — "3x what most vendor benchmarks imply" — derives from comparing the 96K median to a typical 32K benchmark baseline. Organizations should expect variation based on codebase size and agent design.
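The 3x figure can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the price per million input tokens is a hypothetical placeholder, and it assumes flat per-token pricing with no prompt caching and no output-token cost, none of which the SemiAnalysis post specifies.

```python
def input_cost_per_session(input_tokens: int, usd_per_million_tokens: float) -> float:
    """Input-token cost of one session at a flat per-token price."""
    return input_tokens / 1_000_000 * usd_per_million_tokens


BENCHMARK_TOKENS = 32_000  # typical vendor benchmark context, per the article
MEDIAN_TOKENS = 96_000     # SemiAnalysis median for real coding agent requests

# Hypothetical price, chosen only to make the example concrete.
PRICE_USD_PER_MILLION = 3.00

projected = input_cost_per_session(BENCHMARK_TOKENS, PRICE_USD_PER_MILLION)
observed = input_cost_per_session(MEDIAN_TOKENS, PRICE_USD_PER_MILLION)
multiplier = observed / projected

print(f"projected ${projected:.3f}, observed ${observed:.3f}, ratio {multiplier:.1f}x")
```

The ratio does not depend on the price chosen, only on the two token counts. Caching discounts or tiered long-context pricing would move the real multiplier in either direction.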

What it means.

Enterprise buyers modeling AI infrastructure costs typically start from vendor benchmarks. Vendor benchmarks run controlled scenarios at specified context lengths. Real agentic coding sessions don't run at controlled context lengths — they run with whatever context the agent needs to complete the task.

This gap is a likely source of AI deployment cost surprises. A company that built its cost model on 32K context will pay roughly 3x the projected input cost per session when agents run at 96K. At scale, that is the difference between a line item and a budget crisis.

The implication for infrastructure spending is the most direct link. Modal's 5x revenue growth (Buzz 06) and the entire rationale for SpaceX's AI compute revenue (Buzz 12) rest on the assumption that inference demand will keep growing. The SemiAnalysis data suggests that demand growth isn't just from more users — it's from each user consuming more compute than benchmarks predicted.

For teams designing agentic coding systems: context window management is now a cost engineering problem, not just a performance engineering problem. The choice of what to include in context, when to summarize, and when to start a fresh session has direct dollar implications. That's a new skill the field is still developing.
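One way to picture the "what to include" decision is a fixed token budget applied to conversation history. The sketch below is a minimal illustration, not any vendor's method: the `Message` type, the `fit_to_budget` helper, and the four-characters-per-token estimate are all assumptions introduced here. A production system would use the model's real tokenizer and would likely summarize dropped turns rather than discard them.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def fit_to_budget(messages: list[Message], budget_tokens: int) -> list[Message]:
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept: list[Message] = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for msg in reversed(messages):
        cost = estimate_tokens(msg.text)
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    Message("user", "a" * 400),
    Message("assistant", "b" * 400),
    Message("user", "c" * 400),
]
trimmed = fit_to_budget(history, budget_tokens=250)
print([m.role for m in trimmed])
```

Every token trimmed here is a token not billed on the next request, which is why the budget becomes a cost lever as well as a quality one.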

Reactions

SemiAnalysis (@SemiAnalysis_, May 22, 307 likes, 39,880 views):

"Agentic workloads are quietly rewriting inference economics. We pulled data from 432k real coding agent requests at SemiAnalysis and the median one isn't 32k, isn't 64k, but 96k input tokens. For context, that's more than the entire text of The Great Gatsby being shoved into the model before you've even typed your question."

11  White House Green Card Policy May Force AI Researchers Out of the US

A new White House policy now requires green card applicants to apply from outside the United States rather than adjusting status from within. Andrew Ng called it "a capricious attack on legal immigration" that "will hurt American competitiveness in AI." His post drew 12,110 likes and 1.37 million views. Reid Hoffman warned it may force AI researchers, employees, and students to leave the US and "wait through a backlog process to continue their work." The policy directly affects the hundreds of thousands of skilled workers currently on US work visas — including AI researchers at major labs.

The White House announced a policy change requiring green card applicants to leave the United States and apply from their home countries rather than adjusting status while already working in the US. The change affects people currently on H-1B, L-1, and other work visas who were in the process of obtaining permanent residency.

Andrew Ng's response on May 22: "The new White House policy requiring green card applicants to apply from outside the US is a capricious attack on legal immigration. It will hurt families, leave us with fewer doctors, teachers and scientists, and hurt American competitiveness in AI." The post attracted 12,110 likes, 1,624 retweets, and 1.37 million views.

Reid Hoffman on the same day: "Does this mean AI Researchers, employees, and students will now have to leave the country and wait through a backlog process to continue their work? Harmful move for tech, business, and America broadly..." The post drew 3,117 likes, 304 retweets, and 426,114 views.

The "backlog process" Hoffman references is specific. The US employment-based green card backlog, for applicants from India in particular, runs to decades in some categories. An employee required to leave and re-apply from their home country is not facing a delay of months; the wait may be long enough that returning on a green card is effectively out of reach.

The policy's effect on AI specifically: a significant portion of AI researchers at US labs are foreign nationals on employment-based visas. The AI research community in the US is disproportionately international. Major labs have not commented publicly on how many employees this policy affects.

What it means.

This story matters differently depending on where you sit.

For US-based AI labs: the talent pipeline is international. Visa uncertainty is a longstanding issue, but the previous system at least allowed employees to work while their green card applications processed. Requiring departure creates an operational disruption — months or years away from your job, possibly without ability to return — for researchers mid-career at critical labs.

For CEE and Baltic companies recruiting from a global talent pool: the US is now a less stable destination for your best people considering career moves to American labs. If a researcher from Warsaw or Riga weighs a US offer against an EU option, green card policy is now a material factor in a way it wasn't eighteen months ago.

For the broader competitiveness argument: Ng and Hoffman are making a specific claim — that this policy reduces US AI leadership. The mechanism they describe is clear: talent that was planning a long-term US career will redirect to Canada, the UK, Germany, or Singapore. Those countries are actively recruiting. Some of the researchers displaced from US labs will end up at non-US labs doing frontier work that the US no longer captures.

We think this policy creates real risk to the US talent pipeline. Whether it materially changes the competitive balance over a 5-year horizon depends on how many researchers redirect, where they go, and how quickly the policy is challenged or reversed.

Reactions

Andrew Ng (@AndrewYNg, May 22, 12,110 likes, 1,370,703 views):

"The new White House policy requiring green card applicants to apply from outside the US is a capricious attack on legal immigration. It will hurt families, leave us with fewer doctors, teachers and scientists, and hurt American competitiveness in AI."

Reid Hoffman (@reidhoffman, May 22, 3,117 likes, 426,114 views):

"Does this mean AI Researchers, employees, and students will now have to leave the country and wait through a backlog process to continue their work? Harmful move for tech, business, and America broadly..."

12  SpaceX Files S-1 for the Largest IPO in History — Partly on AI Compute Revenue

SpaceX filed its S-1 prospectus on May 20, targeting a Nasdaq IPO at a $1.75–2 trillion valuation under the ticker $SPCX, with a listing date of June 12. The 308-page filing contains a specific disclosure that reorganizes the AI infrastructure map: SpaceX has entered Cloud Services Agreements with Anthropic for compute capacity on COLOSSUS and COLOSSUS II at $1.25 billion per month through May 2029, roughly $15 billion per year, or $45 billion in total contract value if the agreements run their full term. Anthropic is paying SpaceX, which merged with xAI, to run its models on the same infrastructure that hosts Grok.

SpaceX publicly filed its S-1 prospectus with the SEC on May 20, as reported by SemiAnalysis, which noted that its own research is cited in the filing. The company is targeting a June 12 Nasdaq listing under the ticker $SPCX, with an anticipated raise of approximately $75 billion at a $1.75 trillion valuation, which would make it the largest IPO in American history. Q1 2026 revenue was $4.7 billion, up 15% year over year.

The AI infrastructure disclosure is the detail that matters for this briefing:

SemiAnalysis extracted the specific terms from the S-1: SpaceX "entered Cloud Services Agreements with Anthropic in May 2026 for capacity on COLOSSUS and COLOSSUS II at $1.25B/month through May 2029." SemiAnalysis calculates this at "~$15B/year, and an ~$45B TCV over 3 years. Either party can terminate with 90 days notice."

COLOSSUS is the Grok/xAI compute cluster. SpaceX and xAI merged before this filing. Anthropic is paying its competitor's parent company for compute — the rival became the customer. SemiAnalysis noted that SpaceX explicitly frames its position in a section titled "Our AI Compute Infrastructure Advantage and Growth Strategy," stating that "AI leadership will be defined by the ability to rapidly scale compute capacity to support exponential usage growth and frontier intelligence."

SpaceX's S-1 also identifies a $28.5 trillion total addressable market, including $26.5 trillion attributed to AI infrastructure, consumer subscriptions, advertising, and enterprise applications. SemiAnalysis research is cited in the filing.

What it means.

Buried in the IPO paperwork of a rocket company is a document about the state of AI infrastructure competition.

The $1.25B/month Anthropic deal tells you two things simultaneously. First, Anthropic's compute needs are large enough to warrant a multi-year, billion-dollar-per-month contract with a supplier whose primary business is rockets and satellites. Second, the concentration of GPU capacity is now so extreme that Anthropic — a company with safety-conscious founders who built independently of Elon Musk's ventures — is paying a company partly owned by Musk for the hardware to run Claude.

The 90-day termination clause on both sides means this is a live relationship, not a locked-in dependency. Either party can walk. But the $45 billion TCV means both parties have strong incentives to make it work for three years.
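The contract figures follow from simple multiplication, and the termination clause can be put in the same terms. The sketch below reproduces the $15B and $45B numbers from the quoted monthly fee. The last line is an assumption introduced here: it treats the 90-day notice period as three months of continued fees, which the quoted S-1 excerpt does not state.

```python
MONTHLY_FEE_USD_B = 1.25  # $1.25B/month, as quoted from the S-1
TERM_MONTHS = 36          # May 2026 through May 2029
NOTICE_DAYS = 90          # either party can terminate with 90 days notice

annual_usd_b = MONTHLY_FEE_USD_B * 12
tcv_usd_b = MONTHLY_FEE_USD_B * TERM_MONTHS

# Assumption: fees keep accruing during the notice period, and a month is 30 days.
notice_exposure_usd_b = MONTHLY_FEE_USD_B * (NOTICE_DAYS / 30)

print(f"annual: ${annual_usd_b:.2f}B")
print(f"total contract value: ${tcv_usd_b:.2f}B")
print(f"fees over one notice period: ${notice_exposure_usd_b:.2f}B")
```

Under that reading, the amount either side is firmly bound to at any moment is closer to one notice period of fees than to the headline total, which is why the full-term figure is best read as a ceiling rather than a guarantee.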

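For investors: the SpaceX S-1 is built partly on the thesis that space infrastructure and AI infrastructure are converging. COLOSSUS and COLOSSUS II, the clusters named in the Anthropic agreements, came from xAI and are ground-based data centers, so the contract itself is evidence of terrestrial compute demand rather than orbital capacity. If AI demand keeps growing and land-based data centers face power and cooling constraints, the claim that space-based compute is a strategic asset becomes testable on a short timeline.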

For the broader market: this filing documents what was previously described in press releases. The convergence of space infrastructure and AI infrastructure is now in an SEC filing, with specific contract terms, revenue figures, and forward projections.

Reactions

SemiAnalysis (@SemiAnalysis_, May 20, 420 likes, 61,710 views):

"SpaceX just filed their S1. SemiAnalysis research is cited!"

SemiAnalysis (thread, May 20, 58 likes, 7,781 views):

"SpaceX states that they '... believe AI leadership will be defined by the ability to rapidly scale compute capacity to support exponential usage growth and frontier intelligence'"

SemiAnalysis (thread continuation, May 20, 104 likes, 17,291 views):

"SpaceX also disclosed exactly how much their deal with Anthropic is worth. They state that they have 'entered Cloud Services Agreements with Anthropic in May 2026 for capacity on COLOSSUS and COLOSSUS II at $1.25B/month through May 2029' — that's ~$15B/year, and an ~$45B TCV over 3 years. Either party can terminate with 90 days notice."

13  Cohere Open-Sources Command A+ Under Apache 2.0 — 218B Parameters, Runs on Two H100s

Cohere released Command A+ under the Apache 2.0 license on May 20. It is the company's strongest model, with 218 billion parameters (25 billion active via a mixture-of-experts architecture), and runs on two Nvidia H100 GPUs or a single Blackwell B200. Apache 2.0 lets anyone use, modify, and commercialize the model, subject only to the license's attribution and notice terms. CEO Aidan Gomez described it as "our first fully open source Apache 2 model." His colleague Nick championed the decision, which Gomez said "required many discussions." The move intensifies pressure on closed model providers already facing DeepSeek's permanent discount and Gemini Flash's price-performance claims.

Command A+ is a mixture-of-experts model with 218 billion total parameters and 25 billion active parameters per forward pass. On benchmarks: agent performance on τ²-Bench Telecom improved from 37% to 85%; coding on Terminal-Bench Hard improved from 3% to 25%. On the Artificial Analysis Intelligence Index, it scores comparably to Claude 4.5 Haiku and Gemma 4 31B.

Hardware requirements are the headline for enterprise deployment teams: Command A+ runs on two Nvidia H100 GPUs — the most common enterprise GPU configuration — or a single Nvidia Blackwell B200. A 218B model on two H100s is accessible. Most comparable-parameter models require significantly more hardware.
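A back-of-envelope check shows what the two-H100 claim implies about precision. The sketch below counts memory for the weights only and ignores KV cache, activations, and runtime overhead. The GPU capacities are nominal figures assumed here (80 GB per H100, 192 GB for a B200), and the article does not say which quantization the quoted configurations use.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


TOTAL_PARAMS_B = 218   # total parameters, from the article
ACTIVE_PARAMS_B = 25   # active parameters per forward pass, from the article

# Assumed nominal GPU memory; usable capacity in practice is lower.
CONFIGS_GB = {"2x H100 (80 GB each)": 160, "1x B200 (192 GB)": 192}

for bits in (16, 8, 4):
    need = weight_memory_gb(TOTAL_PARAMS_B, bits)
    fits = [name for name, cap in CONFIGS_GB.items() if need <= cap]
    print(f"{bits}-bit weights: {need:.0f} GB, fits on: {fits or 'neither'}")

print(f"active share per forward pass: {ACTIVE_PARAMS_B / TOTAL_PARAMS_B:.1%}")
```

In a mixture-of-experts model every expert has to be resident in memory even though only a fraction is active per token, so the footprint is set by the 218B total rather than the 25B active count. On these assumptions the quoted configurations only work with a roughly 4-bit checkpoint, which is consistent with the weights being published in multiple quantizations.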

Model weights are available on Hugging Face in multiple quantizations. The model supports 48 languages, handles text and images, and has a 128,000-token context window.

Aidan Gomez on X, May 20: "Our first fully open source Apache 2 model :)"

In a follow-up post, Gomez explained: "Nick really championed us going Apache 2 for this release and for Cohere Transcribe. Not an obvious decision and one that required many discussions. Like Nick says, I hope the model is more useful and empowering as a result."

Apache 2.0 sits at the permissive end of the open-source licensing spectrum. It allows commercial use, sublicensing, and redistribution, with no requirement to open-source modifications; the main conditions are preserving the license and copyright notices and marking changed files. This is different from licenses that restrict commercial use (like Llama's earlier non-commercial restrictions) and from open-weights-only releases that lack full source licensing.

What it means.

Cohere is an enterprise-first company. Its clients are financial institutions, healthcare systems, and large enterprises that need data-sovereignty guarantees and on-premise deployment options. Open-sourcing Command A+ under Apache 2.0 is not a pivot to consumer AI — it is a bet that in the enterprise market, full openness is a sales advantage, not a handicap.

The reasoning is straightforward for the enterprise buyer: you can audit the model, deploy it on your own hardware, modify it for your use case, and never pay Cohere again if you don't want to. That is a different value proposition from every closed API provider. For regulated industries with data-residency requirements, it removes the vendor lock-in concern entirely.

The counter-argument Cohere's team clearly debated: you just gave away your best model. Training a 218B model costs millions of dollars. Competitors can now run it, fine-tune it, and offer it commercially without paying Cohere. The "required many discussions" note from Gomez is honest about this tension.

The bet is that the sales relationships and enterprise deployment support that Cohere provides — not the model weights themselves — are what enterprise buyers are actually paying for. If that bet is correct, open-sourcing accelerates enterprise adoption faster than it accelerates competition. We speculate that this thesis holds in verticals where the deployment complexity and compliance burden are high. Whether it holds broadly depends on how quickly commodity fine-tuning of open models matures.

Reactions

Aidan Gomez (@aidangomez, May 20, 215 likes, 16,633 views):

"Our first fully open source Apache 2 model :)"

Aidan Gomez (@aidangomez, May 20, 224 likes, 44,104 views):

"Nick really championed us going Apache 2 for this release and for Cohere Transcribe. Not an obvious decision and one that required many discussions. Like Nick says, I hope the model is more useful and empowering as a result."

Capital

Modal Labs raises $355M at $4.65B (modal.com/blog/modal-series-c) — General Catalyst and Redpoint led; Menlo, Bain Capital Ventures, and Accel joined. Annualized revenue exceeded $300M at closing, up 5x from the Series B in September 2025. The growth is driven by agentic coding infrastructure and RL training demand, with 1B+ Sandboxes launched to date.

SpaceX S-1 targets $75B raise at $1.75T (SEC filing) — Filed May 20, targeting a June 12 Nasdaq listing under $SPCX. Q1 2026 revenue $4.7B (+15% YoY). If priced at target, would be the largest IPO in US history.

Big Deals

Anthropic acquires Stainless for $300M+ (anthropic.com/news/anthropic-acquires-stainless) — SDK and MCP server generator used by OpenAI, Google, Cloudflare, Replicate, and Runway. Anthropic is winding down all external access; affected companies retain rights to previously generated code but lose the automated update pipeline.

Anthropic commits $1.25B/month to SpaceX (SemiAnalysis thread) — Disclosed in the SpaceX S-1: Cloud Services Agreements for compute on COLOSSUS and COLOSSUS II through May 2029. ~$15B/year, ~$45B total contract value. Either party can exit with 90 days notice.

Pricing Moves

Gemini 3.5 Flash: better than Pro on coding, less than half the cost (Google IO blog) — Outperforms Gemini 3.1 Pro on Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo), and MCP Atlas (83.6%). 4x faster than competing frontier models, 800 tokens/sec in Antigravity. Specific token pricing not published in announcement.

DeepSeek makes V4-Pro discount permanent (deepseek on X) — A promotional pricing rate is now the stated floor. The post's 23,539 likes and 6.5M views reflect how closely developers watch this number. The previous assumption that the discount was temporary is no longer valid.

Cohere Command A+ goes Apache 2.0 (the-decoder.com) — 218B parameter model (25B active, MoE) available on Hugging Face for commercial use under Apache 2.0. Runs on two H100s or one Blackwell B200. Sets a new baseline for what's available free.

Platform Moves

OpenAI Guaranteed Capacity (OpenAI on X) — 1–3 year compute contracts with discounted token rates. Enterprise buyers can now lock in compute access as a procurement decision, not just a technical one. Converts spot-market token purchasing into an infrastructure commitment.

Google Antigravity 2.0 (Google IO blog) — Standalone desktop app for multi-agent orchestration. Supports parallel agent execution. CLI and SDK variants. Google's platform play for the agent development layer.

Cursor SDK + Jira integration (cursor_ai on X) — Public SDK in Python and TypeScript for building custom agents on Composer 2.5. Jira integration lets teams assign Cursor to work items and receive merge-ready PRs.

Layoffs / Restructure

Cloudflare: 1,100 jobs cut, 20% of workforce — "building for the agentic AI era"; internal AI usage up 600% in three months.

Coinbase: 700 jobs, 14% of staff — CEO Brian Armstrong: shifting "toward smaller, AI-augmented teams."

PayPal: ~4,760 jobs over 2–3 years — "accelerate AI adoption and automation across operations."

BILL: Up to 30% of workforce.

Upwork: ~25% of workforce.

Geopolitical

US green card policy change forces applicants abroad (AndrewYNg on X) — New White House policy requires green card applicants to apply from outside the US rather than adjusting status in-country. Affects hundreds of thousands of current H-1B holders including AI researchers at major labs. Andrew Ng: "a capricious attack on legal immigration" that will "hurt American competitiveness in AI."

Demis Hassabis — On Gemini 3.5 Flash and the speed-cost inflection

"Gemini 3.5 Flash is amazing! — Performs better than 3.1 Pro on coding & agentic tasks — 4x faster than other frontier models — 12x faster in @antigravity - 800 tokens/sec! — Often at less than half the cost — And Pro to come…"

May 20, 2026 — 3,205 likes, 251,679 views

What he means.

Hassabis aimed the Gemini Flash launch directly at the frontier price-performance argument. The framing (faster and cheaper than competing frontier models, while outperforming Google's own Pro tier) is not a subtle claim. Hassabis is a DeepMind co-founder, a Nobel Prize laureate (2024, Chemistry, AlphaFold), and CEO of Google DeepMind since the 2023 merger. The Flash launch is Google's answer to the pricing pressure from DeepSeek and the capability claims from Anthropic and OpenAI in the same week.

Andrew Ng — On the green card policy and US AI talent

"The new White House policy requiring green card applicants to apply from outside the US is a capricious attack on legal immigration. It will hurt families, leave us with fewer doctors, teachers and scientists, and hurt American competitiveness in AI."

May 22, 2026 — 12,110 likes, 1,370,582 views

What he means.

Ng's post was the most-engaged on the policy change, with 1.37 million views. That is unusual engagement for an immigration policy post and reflects how close to home the change lands for the AI engineering community. Ng is the founder of DeepLearning.AI, a former chief scientist at Baidu, and a co-founder of Coursera. His framing, "fewer doctors, teachers and scientists," deliberately broadens the impact beyond AI to build a wider coalition of opposition.

Reid Hoffman — On employment as a moral compact

"Modern employment should be an alliance. Not just legal contracts, but moral promises that are not to be made lightly, and even less lightly broken. Some thoughts on layoffs & hiring in AI, and the promises we make to each other that shouldn't be broken."

May 21, 2026 — 78 likes, 31,246 views

What he means.

Hoffman published this the day before the 144K layoff total was widely reported. LinkedIn co-founder, Greylock partner, and AI investor, Hoffman has been visible on the workforce transformation story throughout 2026. The post links to a longer piece; the quote stands on its own as a counterweight to the AI-restructuring framing that dominated corporate layoff announcements this week. The tension between "we're building AI-augmented teams" and "we had promises to keep" is the tension Hoffman is naming.

SemiAnalysis — On the inference economics of real agent sessions

"Agentic workloads are quietly rewriting inference economics. We pulled data from 432k real coding agent requests at SemiAnalysis and the median one isn't 32k, isn't 64k, but 96k input tokens. For context, that's more than the entire text of The Great Gatsby being shoved into the model before you've even typed your question."

May 22, 2026 — 307 likes, 39,880 views

What it means.

SemiAnalysis is one of the most-cited independent analyst voices on AI infrastructure economics. The 432K-request dataset gives this claim empirical weight that most commentary lacks. The practical implication is direct: an enterprise cost model built on a 32K vendor benchmark assumption understates input-token cost by a factor of roughly 3. The wider point, that what vendors benchmark and what users actually run are different workloads, applies beyond coding agents to any long-context agentic task.

Simon Willison — On when OpenAI and Anthropic found product-market fit

"Given the recent burst of activity around enterprise pricing and contracts, I think April 2026 was the month when both OpenAI and Anthropic found product-market fit."

May 27, 2026 — 252 likes, 23,688 views

What he means.

The Guaranteed Capacity launch and the Stainless acquisition both landed this week. Willison's post, published three days after the scan window closed and in direct response to this week's enterprise pricing activity, names something the individual announcements don't say explicitly: the pattern is product-market fit, not just a pricing adjustment. Willison is the creator of Datasette and one of the field's most prolific public note-takers on AI tooling, and his observation carries weight because he works at the level where these products are actually used.

Aidan Gomez — On the decision to go fully open-source

"Nick really championed us going Apache 2 for this release and for Cohere Transcribe. Not an obvious decision and one that required many discussions. Like Nick says, I hope the model is more useful and empowering as a result."

May 20, 2026 — 224 likes, 44,104 views

What he means.

The phrase "not an obvious decision and one that required many discussions" is as honest as a CEO tweet gets. Cohere could have licensed Command A+ with commercial restrictions, as competitors have done with large models. Going full Apache 2.0 gives competitors a free copy of your best work. Gomez co-founded Cohere after co-authoring the Transformer paper at Google Brain. His note names the specific colleague who pushed for openness — a rare attribution in corporate announcements that suggests the deliberation was real and the outcome was not predetermined.