Six rounds of adversarial stress-testing, March 2026
The starting thought was something like: maybe the Repugnant Conclusion is basically right but with a higher threshold. Everyone should be above some bar of "very good," and beyond that people don't need to sacrifice more. This felt right but also vague, so successive rounds of red-team / blue-team argumentation were run to find out where it breaks and what survives.
Starting position: Leans person-affecting. Prefers fewer extraordinary lives over many mediocre ones. Rejects the Repugnant Conclusion. Death is much worse than non-existence (brute axiom). A new life on the margin adds positive value, but creating new lives is not morally equivalent to improving existing ones. Keeping the same people alive indefinitely is better than cycling through new ones.
The update: Maybe this is just a critical-level view with a high threshold? Everyone above "very good," and beyond that no one needs to sacrifice more.
Pick a critical level. Call it "very good." A billion people at just below that bar — lives full of meaning, love, and accomplishment, missing it by an epsilon — have zero moral weight on this view. A universe with a trillion such lives is no better than an empty universe. That's far more repugnant than the Repugnant Conclusion.
And the line is arbitrary. "Idk" is not a specification. Every critical-level theorist in the literature has this problem and none have solved it.
Worse: if the level is set high enough to avoid the Repugnant Conclusion, there exist cases where adding people with negative welfare is preferred to adding people with positive-but-below-threshold welfare. This is a proven formal result. You've traded one repugnant conclusion for an actually sadistic one.
If death is bad because it deprives someone of future goods, the same mechanism makes non-existence bad for potential people. You can't coherently say deprivation is catastrophic for existing people but neutral for potential ones. That's not a principled distinction — it's special pleading.
And if a new life "adds positive value on the margin," value to whom? If to the new person, you've admitted potential people have interests before they exist. If to the aggregate, you've abandoned the person-affecting view entirely.
You prefer 1 billion immortals over a rolling population. But the "you" at age 900 million shares no memories, personality, values, or relationships with the "you" at 25. By any psychological continuity account, you died millions of times. You've privileged biological continuity — the same atoms being replaced over and over — as the morally decisive feature of the universe.
"Intuitions about cases but no underlying theory" means the position produces contradictions the moment someone constructs the right thought experiment. At what point does adding one more good-but-not-extraordinary life stop being positive? You need a function. You don't have one. This isn't an underdeveloped position — it's a non-existent one.
The asymmetry isn't mysterious. A person who exists has preferences, projects, relationships, and a trajectory. Ending that trajectory destroys something with structure and momentum. A person who never exists has none of these. It's the difference between demolishing a building and never constructing one. You don't need metaphysics — just the observation that existing persons are embedded in webs of value.
Critical-Level Person-Affecting Consequentialism: existing persons enter the social welfare function at full weight, potential persons at a discount. The critical level c is set at genuinely flourishing — not bare contentment, but real engagement, autonomy, meaningful relationships. This blocks the Repugnant Conclusion because barely-worth-living lives fall below c.
Longevity preference: The existing-person premium means sustaining an excellent life is worth more than replacing it with an equivalent new one. Replacement destroys real value and creates merely equivalent value.
Marginal new lives: If expected welfare of a new life exceeds c, creating it is good — just less good than equivalently improving an existing life.
X-risk: Handled through two channels: the death of 8 billion existing people is catastrophic (full weight), and foreclosing trillions of potential above-c lives has real if discounted negative value.
Setting c can't be derived from anything deeper. But total utilitarianism smuggles in c = 0, and average utilitarianism smuggles in c = current average. At least this framework makes the axiom explicit.
The discount for potential persons must be high enough to make x-risk catastrophic (trillions of future lives generating massive moral urgency) and low enough that creating billions of above-threshold-but-mediocre lives doesn't swamp existing welfare. These pull in opposite directions. There is no value that delivers everything this framework promises.
The Hermit Problem: A person with no relationships, no projects, no trajectory — on this grounding, their death is less bad than a socially embedded person's. The framework says killing lonely people is less wrong than killing popular ones.
The Newborn Inversion: A three-day-old infant has almost no web of value. The framework entails that infant death is far less bad than adult death — exactly backwards from most moral intuitions.
Most humans who have ever lived — and a substantial portion alive today — live below "genuinely flourishing." The framework treats their creation as morally equivalent to creating no one at all. That is not blocking the Repugnant Conclusion. That is moral blindness wearing a technical hat.
The death channel and the future-person channel describe the same event. Presenting them as "two independent reasons" is two leaky buckets, not robustness.
The critical level c, the discount rate, the web-of-value grounding, and the aggregation structure — each doing enormous normative work, none derivable from anything deeper. Enough degrees of freedom to match any set of intuitions. That's not a moral theory. It's motivated engineering.
Lives below c don't drop to zero — they receive diminishing but positive weight. The critical level marks the inflection point, not a cliff. A universe of trillions of meaningful-but-not-flourishing lives is better than an empty one — it's just not as good as the same number genuinely flourishing. This is not ad hoc; it reflects the actual structure of the intuition.
Every theory of population ethics bites a bullet somewhere. Total utilitarianism gets the Repugnant Conclusion. Average utilitarianism says adding one ecstatic person to ten billion is wrong if they're below average. Person-affecting views get the Non-Identity Problem. This position bites a softened Sadistic Conclusion in exotic cases. It refuses the Repugnant Conclusion in everyday cases. That's a defensible trade.
A person who dies has a subjective perspective interrupted. A potential person has no subject to bear the deprivation. The arrow of reference runs one direction: after death, we point backward to the subject who lost their future; before creation, there is no one to point to. This is the Epicurean insight correctly applied: it works for pre-natal non-existence but fails for death, because death has a victim.
The person at year N is continuous with year N-1. Death severs the chain at whatever link you currently occupy. The 25-year-old who dies loses year 26 — that's the harm. Whether the chain extends to year 900 million is irrelevant. The claim is only that involuntary death is a distinctive harm because it severs an ongoing chain.
V = Σ_{i ∈ existing} (w_i − c) + Σ_{j ∈ potential} σ(w_j) · (w_j − c), where σ is a sigmoid discount. This is fully formal. It aggregates. It produces specific answers under continuous variation.
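As a sanity check that the formula does aggregate and produce specific answers, it can be sketched directly in code. The welfare scale, the sigmoid steepness k, and the numbers below are illustrative assumptions, not commitments of the framework:

```python
import math

def sigma(w, c, k=1.0):
    # Sigmoid discount for a potential person: near 0 well below the
    # critical level c, near 1 well above it. The steepness k is a free
    # parameter assumed here; the framework does not fix it.
    return 1.0 / (1.0 + math.exp(-k * (w - c)))

def social_value(existing, potential, c, k=1.0):
    # V = sum over existing persons of (w_i - c)
    #   + sum over potential persons of sigma(w_j) * (w_j - c)
    return (sum(w - c for w in existing)
            + sum(sigma(w, c, k) * (w - c) for w in potential))

# Illustrative welfare values on an arbitrary scale, critical level c = 10.
# The potential person at w = 8 contributes a strictly negative term:
# sigma(8, 10) * (8 - 10) < 0, the "slope, not cliff" the red team exploits.
print(social_value(existing=[12, 9], potential=[11, 8], c=10))
```

An empty world scores zero, an existing person at exactly c contributes nothing, and any potential person below c subtracts value: the three properties the surrounding rounds argue over.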
The sigmoid eliminates the cliff-edge absurdity. The local continuity move on identity is defensible. Those are real patches. That's it.
Under the hard threshold, the Sadistic Conclusion was confined to a boundary region. Under the sigmoid, for any potential person with welfare below c, the term σ(w_j) · (w_j − c) is negative. Every potential person below flourishing actively counts against the value of a world. A world with ten billion decent-but-not-flourishing potential lives is evaluated as massively negative. The blue team didn't soften the Sadistic Conclusion — they generalised it from a cliff to a slope covering the entire sub-flourishing population.
The round 2 blue team restated the grounding without confronting what happens when it's thin. A hermit with no relationships, no projects — their death is less bad on this account. A newborn with minimal subjectivity — barely bad at all. The grounding was repeated, not defended.
Existing persons get raw (w_i − c). Potential persons get σ(w_j) · (w_j − c). The person-affecting motivation is about how bad death is. But the aggregation function uses the existing/potential distinction to determine how much welfare counts in evaluating a world. These are different questions smuggled into one operator.
Person-affecting intuition and aggregative framework are in permanent tension. Person-affecting says value is for someone. Aggregation says value is summed across someones. The sigmoid makes the contradiction differentiable. You can take the gradient of the inconsistency now. You still can't eliminate it.
X-risk urgency comes entirely from the existing-person channel: 8 billion real people face annihilation, at full moral weight. The sigmoid handles future persons separately and can be calibrated conservatively. The apparent contradiction dissolves once you stop asking one parameter to do two jobs.
Subject-disruption does not require rich social webs. It requires any ongoing subjective perspective with temporal continuity — a first-person experience of being-in-time that death interrupts. A newborn has this. A hermit has this. The "web of value" language was imprecise and is discarded. What grounds the asymmetry is the bare fact of there being something it is like to be this creature, and that something having a forward-looking trajectory.
Saying the creation of a life of unrelenting suffering was morally neutral or negative is what most reflective people already believe in extreme cases. The sigmoid means no cliff edge — lives modestly above subsistence receive some positive weight. The framework doesn't say those people didn't matter once they existed (full existing-person weight). It says the act of bringing them into existence contributed less moral value. These are different claims.
The two x-risk channels aren't independent. The existing-person channel dominates and is sufficient. The future-person channel provides supplementary, not additive, reason. In practice this changes nothing about policy conclusions.
Every ethical framework has free parameters — the utilitarian chooses a welfare function, the Kantian chooses how to formulate maxims. The question is whether they do principled work. c marks where existence stops being a benefit. σ reflects genuine moral uncertainty near the threshold. The existing/potential asymmetry reflects the metaphysical difference between actual and possible subjects. These are structural commitments with independent motivation, not arbitrary tuning knobs.
The sigmoid's exact shape is underdetermined. The framework gives less action-guidance than totalism for large potential populations. It requires judgment in hard cases rather than algorithmic outputs. But no theory of population ethics is clean. This one is honest about where it's messy, and it's messy in places that matter less than where the alternatives are messy.
Sarah is in a reversible coma. No ongoing first-person experience, no subjective continuity — just a body with a 95% chance of full recovery in six months. Under the framework, she has no "bare subjective continuity." She is, phenomenologically, exactly like a potential person.
Compare Sarah to a 30-week fetus, which does have rudimentary subjective continuity. The framework says the fetus counts at full existing-person weight. Sarah counts at… what? You need either to say Sarah is a potential person (monstrous) or introduce "identity persistence through unconsciousness" — which is no longer bare subjective continuity.
This isn't an edge case. Hundreds of thousands are under general anaesthesia at any moment. The ontological category flickers on and off every time someone goes under for surgery.
A dictator announces: "I will create 100 million new lives at welfare c−ε" (just below flourishing). Under the sigmoid, each life's term is net negative. The dictator says: "Unless you pay me $1 trillion, I'll do it." The framework says paying the ransom is correct — those decent-but-imperfect lives genuinely make the world worse. A policy apparatus that registers decent lives as damage to be prevented.
Team A sets c at "healthcare, education, meaningful work, social connection." Team B sets c at "deep purpose, creative fulfilment, strong community." Under Team A, creating a new person in Denmark is net positive. Under Team B, the same person might be net negative. The framework's action-guidance on whether to bring people into existence flips sign based on a parameter you cannot empirically determine.
A machine creates a perfect physical duplicate with continuous subjective experience from activation. Moral weight in the universe doubles instantly. The potential-person discount evaporates in one second. The same physical object evaluated totally differently based on whether a consciousness has flickered into being yet.
Climate policy 2026: spend $50 trillion on aggressive decarbonisation (reduces current welfare, ensures 10B future people live well) vs $10 trillion on moderate adaptation (preserves current welfare, future generations face hardship). How much should existing people sacrifice for future people? This is precisely the trade-off declared unrankable. Every serious question in longtermism, climate, pandemic preparedness, and infrastructure is an existing-vs-potential welfare trade-off. A framework that can't rank these is vacuous on its own domain.
The sigmoid is not a moral discount on persons. It is an epistemic confidence function about whether a moral relationship obtains. When σ(w_j) = 0.1, we are not saying "this person's suffering counts 10%." We are saying "we are 10% confident that the moral situation is one where this person's welfare generates obligations for us at all." This dissolves the slope problem: at low σ, we lack confidence that obligations exist to either count suffering or count happiness.
A generation ship will create 10,000 people; 100 will have genetic conditions causing net-negative lives. Total utilitarianism says launch (9,900 happy > 100 suffering). Average utilitarianism says launch if average is positive. This framework says: we cannot offset 100 suffering lives by pointing to 9,900 happy ones. Serious obligations to solve the genetic problem before launching, even at significant cost. This matches reflective judgment better than either competitor.
Climate: Produces a Stern-type discount curve that is morally grounded rather than based on arbitrary time preference. Existing people harmed get full weight; near-certain future people (next two generations) get near-full weight; the further future generates weaker but non-zero claims.
Reproductive ethics: σ for a merely hypothetical child is low. No strong obligation to create happy people. But once the decision crystallises, σ rises sharply and obligations to ensure welfare intensify — capturing the intuition that there's no duty to procreate but increasing duties of prenatal care.
Space colonisation: Cautiously supportive. Trillions of merely possible space-dwellers at very low σ generate some moral pull but cannot swamp existing people's claims. The strongest case comes from the existing-person x-risk channel.
Bare subjective continuity requires: (a) something it is like to be the being, (b) functional integration where states at t causally influence t+1, (c) temporal phenomenology — experience of duration. Most vertebrates qualify. Current LLMs almost certainly don't. A future AI with a continuous phenomenal stream would. Gradual upload preserves continuity; destructive scan-and-copy doesn't (the copy is a new person, the original's continuity is destroyed — that destruction is a death).
Save 1,000 existing people from a flood, or use the same resources to create a paradise colony of 1,000,000 supremely happy people? Total utilitarianism says create. This framework says save — existing persons at full weight with the death asymmetry generate claims that a million potential persons at low σ cannot override. Nearly every human confronted with this dilemma chooses save.
"We are 10% confident that this person's welfare generates obligations" — what kind of uncertainty? Not purely empirical (existence probability is already in the inputs). So it must be moral uncertainty about whether a being of this type generates obligations. But moral uncertainty about obligation-generation is a moral discount, redescribed in epistemic vocabulary while performing the identical mathematical operation.
Suppose σ = 0.1 for a future person. We then discover with certainty that this person will exist and suffer horribly. The epistemic reading demands we update σ toward 1. But the sigmoid is defined as a function of proximity, relational thickness, and bare subjective continuity — none of which changed. The framework's own inputs resist the update the epistemic reading demands.
Modify the Colony Ship: halfway through a 200-year journey, the community discovers it could engineer 10,000 future colonists, but doing so requires diverting medical resources such that 5 existing colonists with chronic conditions will die. The easy case (100 known genetic conditions) was handled well. On the hard case — 5 existing lives against the extinction of the future colony — the framework goes catastrophically silent.
Every night, during dreamless NREM sleep, condition (a) arguably fails and (c) certainly does. Moral status flickers with sleep cycles. A framework that makes moral status oscillate nightly is not tracking anything morally fundamental.
Five rounds of modifications, each prompted by red team objections. The sigmoid was added, then reframed as epistemic confidence, then recharacterised as a decision-theoretic tool. Bare subjective continuity was proposed then revised. What is the foundational principle from which all these features are derived, rather than added? If there isn't one, this is a collection of intuition-matching devices held together by a sigmoid.
This forces a genuine modification. Replace bare subjective continuity with dispositional subjectivity: an entity has full moral status if it (a) currently possesses first-person experience, or (b) possesses the intact physical substrate for experience such that, absent intervention, experience will resume with high probability. Sarah satisfies (b) decisively. Sleepers, surgical patients — all straightforwardly existing persons.
This tracks a genuine intuition: what matters is not the flickering presence of phenomenal states moment to moment, but whether there is a someone there — a locus of subjectivity that persists through temporary interruptions. Sleep and coma preserve the substrate; death does not.
The wrongness attaches to the dictator's coercive creation-decision, not the people created. Once those people exist, they're existing persons with full moral status. The correct response: refuse the ransom and commit to maximal welfare support for anyone thereby created. Standard hostage-negotiation ethics — don't incentivise hostage-taking even when individual hostages suffer.
A policy recommendation is trustworthy when it survives across a range of plausible c values. When it flips sign within the plausible range, the framework says "we do not have sufficient information to act confidently." This is not silence — it is decision-relevant output.
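This robustness test is mechanical enough to sketch. Under the same assumed sigmoid aggregation as above (welfare scale, steepness k, and the sampled c values are all illustrative), a recommendation counts as trustworthy only when its sign survives every plausible critical level:

```python
import math

def sigma(w, c, k=1.0):
    # Sigmoid discount for potential persons (steepness k assumed).
    return 1.0 / (1.0 + math.exp(-k * (w - c)))

def social_value(existing, potential, c, k=1.0):
    return (sum(w - c for w in existing)
            + sum(sigma(w, c, k) * (w - c) for w in potential))

def robust_sign(existing, potential, plausible_c, k=1.0):
    # Trustworthy output only if the sign of V is stable across every
    # plausible critical level; "flips" means the framework reports
    # insufficient information to act confidently.
    signs = {social_value(existing, potential, c, k) > 0 for c in plausible_c}
    if len(signs) > 1:
        return "flips"
    return "+" if signs.pop() else "-"

# An existing flourisher plus a near-threshold potential life:
# positive at every sampled c, so the verdict is robust.
print(robust_sign([12], [10.5], [9.0, 10.0, 11.0]))  # -> +
# A lone near-threshold potential life: the sign depends on where c sits.
print(robust_sign([], [10.0], [9.0, 11.0]))          # -> flips
```

On this reading the Team A / Team B disagreement about where to set c becomes an input range rather than a refutation: sweep both settings and act only where the verdict is stable.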
Yes, moral weight doubles instantly. Speed is morally irrelevant. Before activation there was no one to be harmed or benefited; after activation there is. If anything, this supports the framework: the asymmetry tracks something real (the presence or absence of a subject).
The framework is asymmetric, not silent. Strong obligations not to make existing and near-certain future persons worse off. Weaker but non-zero obligations regarding potential persons' welfare. No obligation to maximise sheer numbers. This generates clear guidance: aggressive climate mitigation, pandemic preparedness, infrastructure investment. What it blocks is the totalising longtermist claim that we should sacrifice present welfare to maximise expected future lives. That is a feature.
The sigmoid is not epistemic confidence, not ontological gradation, but warranted practical moral consideration — a decision-theoretic quantity integrating both uncertainty about moral status and best estimates of the morally relevant properties present in the entity. A decision function, not a metaphysical claim.
The core asymmetry is defensible. Harming existing persons is worse than failing to create new happy persons. No counterexample has fully dislodged this. The Repugnant Conclusion remains total consequentialism's open wound, and this framework avoids it.
The sigmoid is a useful engineering choice — stripped of ontological pretension, it's a reasonable decision-theoretic instrument for graduated uncertainty.
Dispositional subjectivity handles sleep, coma, and anaesthesia by appeal to a principled criterion rather than ad hoc stipulation.
The Colony Ship Inversion remains unanswered. Five colonists must be sacrificed to provide biological material or 10,000 future colony members will never exist. The future colonists are conditional on the sacrifice, not near-certain. The framework either says "the five outweigh the ten thousand" (monstrous) or quietly imports aggregative reasoning that contradicts the person-affecting constraint.
The Misdiagnosis Problem: A patient is declared brain-dead. Under dispositional subjectivity, they lose moral status — substrate judged non-functional. Organs harvested. Autopsy reveals: the substrate was intact. Experience would have resumed. If moral status is evidence-relative, it is not a property of the person but of our epistemic state about them. The blue team has quietly replaced moral realism with constructivism while speaking the language of realism.
The Framework Shopping charge was never answered. After six rounds the theory has: an asymmetry, a sigmoid, dispositional subjectivity, a decision-theoretic recharacterisation, robustness testing, and several more patches. These have the character of a policy platform rather than a theory derived from foundations.
A person's neurons are replaced one-by-one with silicon equivalents. Behaviour preserved at each step. At 50% replacement, does the person have full dispositional subjectivity? The physical substrate is "intact" functionally but not biologically. At 100%, is this the same existing person or a new entity? The theory must either specify what counts as substrate-intactness — importing a philosophy of mind it hasn't defended — or admit that dispositional subjectivity is indeterminate in precisely the cases where it's most needed.
APAC (Asymmetric Person-Affecting Consequentialism) is an excellent policy heuristic for reproductive ethics, population policy, and near-term resource allocation. As a comprehensive moral theory, it is incomplete. It cannot handle existential-risk trade-offs without smuggling in aggregative reasoning it officially rejects. Its metaphysics inherits unsolved problems in philosophy of mind. Its components are assembled in response to objections rather than derived from a unifying principle. A good tool mistaken for a foundation.
APAC rests on a single claim: moral status is not binary but graded, and the grade tracks the degree to which an entity instantiates conditions sufficient for morally relevant experience. Everything else follows from working out what that commitment requires when applied honestly. The sigmoid captures threshold-with-gradual-onset. The multi-dimensional inputs reflect that morally relevant experience has multiple necessary components. The modifications across six rounds were clarifications, not patches.
Five colonists must die to prevent extinction in generation 4. APAC assigns lower individual weight to each future person but does not cap the aggregate. Hundreds or thousands of future lives at σ of 0.3–0.5 each, aggregated, clearly favour preventing extinction against five deaths. This is straightforward application, not smuggled aggregation — the framework always summed weighted individual contributions.
The outputs of the confidence reading and the moral-discount reading are identical. But the justification differs, and justifications constrain future reasoning. On the epistemic reading, learning with certainty that a future person will suffer demands updating σ toward 1 — and the inputs do update, because learning about certain suffering is learning that the phenomenal conditions are met. An honest concession: in practice the two readings are harder to distinguish than one would like. Whether the recharacterisation does genuine philosophical work or merely psychological work is a question that cannot be fully resolved here.
A principled, non-arbitrary way to handle cases where moral status is genuinely uncertain or graded: early-stage embryos, advanced AI, non-human animals, future persons at varying probability. It structures disagreement — forcing you to say which conditions you think are met and to what degree — even when it cannot resolve it.
It does not eliminate moral disagreement. Two reasonable people can assess the inputs differently. It does not handle rights-based constraints. It remains vulnerable to the charge that sigmoid parameterisation is ultimately arbitrary. Calibration requires reflective equilibrium, and that process is inherently contestable.
APAC is a mid-level moral framework — more structured than raw intuition, less complete than a comprehensive ethical theory. Its foundational principle is genuine and independently motivated. Its mathematical apparatus is the simplest form adequate to the principle's requirements. The strongest remaining objection is that the epistemic-moral collapse may be a distinction without practical difference. This is the version someone should actually hold: as a useful, principled, honestly limited tool for navigating graded moral status — not as a complete ethical theory, and not as a substitute for judgment in hard cases.
The position that survives is Asymmetric Person-Affecting Consequentialism — a graded moral-status framework with a high critical level, sigmoid weighting, and dispositional subjectivity grounding the existing/potential distinction.
What held up: The core asymmetry (existing persons matter more than potential ones). The sigmoid as a decision-theoretic tool. Dispositional subjectivity as a principled criterion for moral status. The framework's ability to avoid the Repugnant Conclusion while preserving x-risk urgency.
What broke: The claim to be a complete moral theory. The framework is better understood as a mid-level heuristic — powerful in its domain (reproductive ethics, population policy, near-term resource allocation) but incomplete for existential trade-offs between existing and future welfare. The person-affecting intuition and the aggregative machinery remain in genuine philosophical tension that no amount of sigmoid-smoothing resolves.
Remaining honest gaps: The sigmoid's parameterisation can't be derived from first principles. Dispositional subjectivity inherits open questions from philosophy of mind (gradual upload, substrate-independence). The epistemic-moral distinction may not do as much work as claimed. These are real weaknesses — but every competing theory has equivalent or worse ones in different places.