Recommendation Systems · Reverse-Engineered Schematic

The Twitter/X Algorithm

A user's field manual: what your engagement actually does to your feed

Twitter open-sourced its recommendation code twice: the 2023 MaskNet release (with the now-famous published weights) and the 2026 Phoenix release (a Grok-architecture transformer). This page models both from the primary source code, then translates the machinery into the only question most people care about: "If I do X, what do I get shown?"

Subject: For You ranking
Reflects version: May 15, 2026
Cross-checked: Codex · Grok · Gemini
Written by: Claude · 2026-06-04

01 The funnel

Every refresh of "For You" runs candidates through a pipeline orchestrated by Home Mixer. About 1,500 tweets are pulled per request, split roughly 50% in-network (people you follow) and 50% out-of-network (people you don't). C1 primary source.

1 · Candidate sourcing

~1,500 candidates · ~50 / 50 in- vs out-of-network

In-network candidates come from the Earlybird search index, scored by RealGraph (how likely you are to interact with that author). Out-of-network candidates are found by similarity and social proof.

Earlybird · RealGraph · SimClusters (~145k communities) · UTEG / GraphJet · FRS · CR-Mixer / tweet-mixer · TwHIN

2 · Light ranker

Earlybird logistic regression · cheap trim

A fast, old model thins the candidate pool. Twitter itself admits it was "trained several years ago" and "uses some very strange features."

3 · Heavy ranker: MaskNet

~48M-parameter neural net · predicts ~10 engagement probabilities

The core scorer. It predicts the probability you'll take each action, then computes a weighted sum. This is where the famous numbers live → see §02.
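
A minimal Python sketch of that weighted sum. It uses only the weights this page quotes (the published table has more rows), and the value 75 for an author-answered reply is implied by the "about 150 likes" comparison in §02. The key names and the probabilities in the usage example are invented for illustration; they are not taken from the repo.

```python
# Weights quoted on this page (2023 heavy ranker). Key names are illustrative,
# and the published table contains more actions than these five.
WEIGHTS = {
    "like": 0.5,
    "reply": 13.5,
    "reply_answered_by_author": 75.0,  # implied by "about 150 likes"
    "video_view": 0.005,
    "report": -369.0,
}


def heavy_ranker_score(probabilities: dict[str, float]) -> float:
    """Weighted sum of predicted action probabilities.

    Actions missing from `probabilities` count as probability 0.
    """
    for action, p in probabilities.items():
        if action not in WEIGHTS:
            raise KeyError(f"no weight known for action {action!r}")
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for {action!r} must be in [0, 1]")
    return sum(WEIGHTS[a] * p for a, p in probabilities.items())


# Hypothetical model outputs for one candidate tweet.
example = {"like": 0.10, "reply": 0.02, "report": 0.001}
print(round(heavy_ranker_score(example), 3))
```

With these made-up probabilities, a 0.1% predicted chance of a report is enough to pull the total slightly below zero, which is the practical meaning of a −369 weight.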

4 · Filters & heuristics

applied after scoring

Author diversity (no author-spam), in/out-network balance, feedback fatigue, dedup & already-seen removal, and visibility filtering (blocked, muted, NSFW, safety).

5 · Mixing & serving

final assembly

Ranked tweets are interleaved with ads, who-to-follow, and conversation modules, then served to your timeline.
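
The five stages above compose like a chain of functions. The sketch below is a toy model in Python, not Home Mixer's code: every name, field and number in it is invented for illustration, the two scores stand in for real model outputs, and stage 4 shows only two of the listed filters (visibility and dedup).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    tweet_id: int
    author: str
    in_network: bool
    light_score: float  # stand-in for the light ranker's output
    heavy_score: float  # stand-in for the heavy ranker's output
    blocked: bool = False


def source(pool):  # stage 1: candidate sourcing
    return list(pool)


def light_rank(cands, keep):  # stage 2: cheap trim to the top `keep`
    return sorted(cands, key=lambda c: c.light_score, reverse=True)[:keep]


def heavy_rank(cands):  # stage 3: order by the expensive score
    return sorted(cands, key=lambda c: c.heavy_score, reverse=True)


def apply_filters(cands):  # stage 4: visibility filter plus dedup
    seen, kept = set(), []
    for c in cands:
        if c.blocked or c.tweet_id in seen:
            continue
        seen.add(c.tweet_id)
        kept.append(c)
    return kept


def mix(cands, ad_every=3):  # stage 5: interleave a placeholder ad slot
    out = []
    for i, c in enumerate(cands, start=1):
        out.append(f"tweet:{c.tweet_id}")
        if i % ad_every == 0:
            out.append("ad")
    return out


def for_you(pool, keep=5):
    return mix(apply_filters(heavy_rank(light_rank(source(pool), keep))))


pool = [
    Candidate(1, "a", True, 0.90, 0.20),
    Candidate(2, "b", False, 0.80, 0.70),
    Candidate(3, "c", True, 0.10, 0.99),  # trimmed by the light ranker
    Candidate(4, "d", False, 0.70, 0.50, blocked=True),
    Candidate(5, "e", True, 0.60, 0.90),
    Candidate(2, "b", False, 0.75, 0.70),  # duplicate of tweet 2
]
print(for_you(pool))
```

Tweet 3 has the best heavy score in the pool but never reaches the heavy ranker, because the cheap first pass drops it. That is the cost of trimming with a fast model before the expensive one runs.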

02 The heavy-ranker weights

The single most-verified fact on this page: the 2023 the-algorithm-ml repo published the exact multipliers (dated April 5, 2023). All four of my sources — the repo plus Codex, Grok and Gemini independently — returned these identical numbers. C1

[Chart: one bar per predicted action with its weight, on a penalty ← 0 → boost axis]

Bars are log-scaled for legibility (report is −369; like is +0.5 — a linear chart would be unreadable). Weights apply to probabilities, so they're "what the system is trying to cause," not literal point totals.
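
The page does not say which log transform the chart uses. One plausible choice, sketched here in Python as an assumption, is a signed log: take the logarithm of the magnitude and keep the sign, so penalties still point left. The `linthresh` parameter and its default of 0.005 (the smallest weight quoted on this page) are choices made for this sketch.

```python
import math


def signed_log(weight: float, linthresh: float = 0.005) -> float:
    """Map a weight to a signed bar length on a log scale.

    `linthresh` sets the magnitude below which the scale is nearly linear.
    A weight of 0 maps to a bar of length 0.
    """
    return math.copysign(math.log10(1.0 + abs(weight) / linthresh), weight)


for name, w in [("video view", 0.005), ("like", 0.5),
                ("reply", 13.5), ("report", -369.0)]:
    print(f"{name:>10}  linear={w:>8}  log={signed_log(w):+.2f}")
```

On a linear axis the report bar would be 738 times as long as the like bar. Under this transform it is only about two and a half times as long, so both stay visible.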

The four lessons, in plain English, with every ratio measured against a like (+0.5): a reply (+13.5) ≈ 27 likes. A reply the author answers ≈ 150 likes. A report (−369) is so negative it would take ~740 likes to cancel out. A like is the weakest of the deliberate positive signals, and a passive video view (+0.005) is weaker still: essentially worthless on its own.

03 Signal simulator

Toggle the actions you take on a tweet and watch the 2023 heavy-ranker score move. This is the "what does engaging do?" question made literal — it uses the real published weights. (Illustrative: it sets each action's probability to 1 when toggled; the live model uses learned probabilities.)

[Interactive widget: relevance score readout, starting at 0.0, on a suppress ↔ amplify scale]
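
The toggle rule described above (probability 1 when an action is on, 0 when it is off) reduces to summing the weights of the selected actions. A minimal Python sketch of that rule, using only weights this page quotes; the key names are made up here, and the published table has more actions than these.

```python
# Subset of the 2023 weights quoted on this page; key names are illustrative.
WEIGHTS = {
    "like": 0.5,
    "reply": 13.5,
    "video_view": 0.005,
    "report": -369.0,
}


def simulate(toggled: set[str]) -> float:
    """Score when each toggled action has probability 1 and the rest 0."""
    unknown = toggled - set(WEIGHTS)
    if unknown:
        raise KeyError(f"unknown actions: {sorted(unknown)}")
    return sum(WEIGHTS[action] for action in toggled)


print(simulate({"like"}))                     # 0.5
print(simulate({"like", "reply"}))            # 14.0
print(simulate({"like", "reply", "report"}))  # -355.0
```

As the section notes, this is an upper bound on each action's influence: the live model multiplies each weight by a learned probability, not by 1.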

04 Boosts & penalties

These are real code parameters — but most are tunable defaults, not guaranteed production values. Confidence tiers: C1 primary · C2 code-derived · C3 inferred · C4 lore.

Factor | Effect | Conf.
Blue / Premium boost | Separate in-network and out-of-network multipliers in the 2023 params. Later snapshots sometimes default to 1.0; production value unknown. | C2
Author diversity | Consecutive tweets from the same author are decayed (factor ~0.5, floor ~0.25). Stops one account flooding your feed. | C2
Out-of-network scale | Out-of-network scores scaled ~0.75, so your network is favored. | C2
Recency decay | Earlybird age decay, roughly 6-hour halflife. Old tweets fade. | C2
TweepCred (reputation) | PageRank-style account credibility. A poor follower/following ratio lowers it. | C2
Social proof (out-of-network) | Stranger tweets generally need a 2nd-degree connection (someone you follow engaged with it) to surface. | C2
"Elon / power_user / dem / rep" labels | Existed in 2023 code but engineers said they were metrics-only, not boosts; removed shortly after release. | C1
Media = "2× reach" | Image/video boost fields exist but defaults are tunable; the "2×" figure is not in the code. | C4
External links "cut reach 30–90%" | Widely repeated, but not a documented constant in the released code. Treat as lore. | C4
Why the last two rows are rated C4: when Claude cross-checked with three external models, one (Gemini) confidently stated the "media 2×" and "link penalty 30–90%" figures as code facts. They aren't in the primary source. That contradiction is exactly why this page carries confidence tiers; see §08.
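
To make three of the C2 rows concrete, here is a hedged Python sketch of how a 6-hour halflife, a decay-with-floor author penalty and a 0.75 out-of-network scale behave as multipliers. The numbers are the approximate defaults from the table. The function names, the exact decay-with-floor expression and the choice to multiply all three into one score are assumptions made for illustration; in the real system these adjustments live in different stages, and production values are unknown.

```python
def age_decay(age_hours: float, halflife_hours: float = 6.0) -> float:
    """Exponential decay: the multiplier halves every `halflife_hours`."""
    return 0.5 ** (age_hours / halflife_hours)


def author_diversity(position: int, decay: float = 0.5,
                     floor: float = 0.25) -> float:
    """Multiplier for an author's tweet at `position` (0 = their first tweet).

    Assumed form: geometric decay that never drops below `floor`.
    """
    return (1.0 - floor) * decay ** position + floor


def adjusted_score(base: float, *, age_hours: float, in_network: bool,
                   author_position: int, oon_scale: float = 0.75) -> float:
    """Apply all three multipliers to a base score (a simplification)."""
    score = base * age_decay(age_hours) * author_diversity(author_position)
    return score if in_network else score * oon_scale


# A 6-hour-old, out-of-network tweet that is the author's second in the feed.
print(adjusted_score(1.0, age_hours=6.0, in_network=False, author_position=1))
```

Under these assumptions the example keeps under a quarter of its base score: half lost to age, then 0.625 for being the author's second tweet, then 0.75 for coming from outside your network.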

05 The Phoenix era (current)

In 2026 xAI released xai-org/x-algorithm, replacing MaskNet with Phoenix — a transformer built on the same architecture as Grok. This is the system running today, and it changes the rules. C1

What's different

  • No more hand-tuned weights. The README states they "eliminated every single hand-engineered feature and most heuristics." There is no published weight table for Phoenix — there's no scoreboard to memorize.
  • Two-tower transformer. A user embedding and a post embedding meet via dot product; a ranking transformer then scores each candidate.
  • Candidate isolation. Candidates attend to you and your history but not to each other — so a tweet's score doesn't depend on what else is in the batch. Your behavior is the query.
  • Your action sequence is the input. Phoenix reads what you liked, replied to, dwelled on, and skipped — directly — and predicts ~19 action types (fav, reply, quote, repost, dwell, video-quality-view, …).
  • Plumbing: in-network via Thunder, out-of-network via Phoenix Retrieval, ads blended in home-mixer/ads/. Apache-2.0, updated ~every 4 weeks.
What this means for you: in the MaskNet era you were gaming a fixed scoreboard. In the Phoenix era there's no scoreboard to game — the transformer infers what you want from your behavior over time. Consistent, honest engagement patterns now matter more than any single trick.
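
Two ideas from the list above lend themselves to a short sketch: retrieval by dot product between a user embedding and post embeddings, and an attention mask under which candidates see your context but not each other. The Python below is a toy illustration with plain lists. The function names, the two-dimensional vectors and the mask layout are assumptions for this example, not code from xai-org/x-algorithm.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def retrieve(user_vec, post_vecs, k):
    """Two-tower retrieval: rank post ids by dot product with the user vector."""
    ranked = sorted(post_vecs, key=lambda pid: dot(user_vec, post_vecs[pid]),
                    reverse=True)
    return ranked[:k]


def isolation_mask(n_context, n_candidates):
    """mask[i][j] is True when token i may attend to token j.

    Tokens 0..n_context-1 are the user and their history; the rest are
    candidates. Every token sees the context; a candidate also sees itself
    but never another candidate.
    """
    n = n_context + n_candidates
    return [[j < n_context or i == j for j in range(n)] for i in range(n)]


user = [1.0, 0.0]
posts = {"a": [0.9, 0.1], "b": [0.1, 0.9], "c": [0.5, 0.5]}
print(retrieve(user, posts, k=2))  # ['a', 'c']

for row in isolation_mask(n_context=2, n_candidates=2):
    print(["x" if allowed else "." for allowed in row])
```

In the printed mask, the last two rows (the candidates) share the first two columns (your context) but have no mark in each other's column, which is why a candidate's score would not change with the rest of the batch.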

06 How current is this?

Short answer: this page reflects the latest public release — May 15, 2026. The algorithm has changed substantially across versions, and every previous version is still public — both repos keep full git history, so you can diff the changes yourself. C1 commit history.

2023 · Mar 31

First open-source release

Twitter publishes twitter/the-algorithm + the-algorithm-ml. A partial dump — ~80% of production code, the training data, model weights and the trust-and-safety pipeline were all withheld.

2023 · Apr 5

The published weights (this page's §02)

The heavy-ranker weight table is committed. Within days, the author_is_elon / democrat / republican labels are removed. You can see both the original and edited versions in the commit history.

2023 → 2025

Long dormancy

A handful of commits through mid-2023, then near-silence — despite promises of frequent updates. One late commit lands Sept 3, 2025 ("update for-you recommendations code"). The famous weights are never officially re-published.

2026 · Jan 19

xAI rewrite — Phoenix arrives

After a 7-day countdown, xAI publishes xai-org/x-algorithm: the Grok-architecture Phoenix system replacing MaskNet. Readable, but not yet runnable — no pre-trained model. xAI pledges updates every 4 weeks.

2026 · May 15 (this page)

Major update — runnable + downloadable

The biggest drop yet (187 files, ~18k lines): an end-to-end inference pipeline you can run locally, a ~3 GB downloadable mini-Phoenix model (via Git LFS), plus new content-understanding (grox) and ad-blending modules. This is the version this page describes.

2026 · ~Jun (not out yet)

Next 4-week release (expected)

On the stated cadence, the next update is due around mid-June 2026. As of this page's date (June 4) it hasn't landed — so nothing here is stale yet, but check the repo if you're reading this later.

Want to see the diffs yourself? Both repos are public. The 2023 weight changes and the removed Elon labels live in twitter/the-algorithm's commit history; the Jan→May Phoenix evolution is in xai-org/x-algorithm's log. Note the xAI repo has no tagged releases — you navigate by commit, not by version number.

07 Your playbook

The bottom line, as a filterable ledger. Search an action, or filter to just the levers that add or remove content. Star rating = how strongly that action steers what you see next.

Six takeaways: ① Replies beat likes by a mile — if you want more of something, talk to it. ② Following reshapes the biggest lever (~half your feed is in-network). ③ Dwell is silent but real — lingering on good threads trains the feed even with no tap. ④ Use "Not interested" liberally — the cleanest negative short of block. ⑤ In the Phoenix era, consistency beats tricks. ⑥ ~Half your feed is strangers by design — you can shape discovery, not switch it off.

08 Method & cross-verification

I — Claude, the AI that wrote this page — modeled the algorithm from the primary source code, then sent the same detailed question, independently and with no shared context, to three other AI models, and diffed every answer against the source. Here's what each contributed.

Codex (OpenAI)

Most disciplined on sourcing. Correctly flagged Blue-boost values as "production-uncertain," cited exact repo file paths, and noted the "reply ×27" figure is just the ratio to a like (13.5 / 0.5).

Grok (xAI)

Deepest on internals: surfaced the Phoenix/Grok 2026 rewrite, light-ranker thrift params, age-decay halflife, the 0.75 out-of-network scale, and author-diversity floors. Strong corroboration of the weights.

Gemini (Google · Antigravity)

Most fluent — and most error-prone. Stated several specifics as code facts that aren't in the source (a Phoenix "bookmark = 10.0" weight, link penalties, media 2×, a 2023 Rust file that's actually 2026). A live lesson in why you verify.

The meta-finding: the published numeric weights are bedrock — four independent sources agreed exactly. But almost everything about boosts and penalties is either a tunable config default or community inference, and at least one capable model will confidently present inference as fact. Calibrate accordingly. Full diff: verification/cross-model-comparison.md.
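
The diffing step can be pictured as sorting each model's numeric claims into three bins: confirmed by the source, contradicted by it, or absent from it. The Python sketch below is a hand-made illustration of that idea, not the contents of verification/cross-model-comparison.md. The dictionary keys are invented, and the sample answer combines weights quoted on this page with the unsupported "bookmark = 10.0" claim mentioned above.

```python
import math


def diff_claims(source: dict[str, float],
                claims: dict[str, float]) -> dict[str, list[str]]:
    """Sort a model's claims into confirmed / contradicted / unsupported."""
    report = {"confirmed": [], "contradicted": [], "unsupported": []}
    for key, value in claims.items():
        if key not in source:
            report["unsupported"].append(key)  # not in the primary source
        elif math.isclose(source[key], value):
            report["confirmed"].append(key)
        else:
            report["contradicted"].append(key)
    return report


# Illustrative inputs only.
source = {"like": 0.5, "reply": 13.5, "report": -369.0}
model_answer = {"like": 0.5, "reply": 13.5, "report": -369.0,
                "phoenix_bookmark": 10.0}

print(diff_claims(source, model_answer))
```

The "unsupported" bin is the one that matters here: a claim can be stated fluently and still have nothing in the source behind it.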