Research compiled March 2026 • 44 source files • 100+ web sources
This is a practical guide based on exhaustive research into AI fiction writing — practitioner reports, academic studies, community consensus, and documented workflows. The core finding is a paradox: LLMs predict the most probable next token, but good fiction lives in the improbable. Every technique here is a workaround for that tension. Some workarounds are remarkably effective.
Quick Start: Do These 7 Things
If you read nothing else, read this.
Pick the right model. Claude Opus for literary prose. Gemini Pro for dialogue-heavy work. A fiction fine-tune (Muse, Erato) for genre fiction. Model choice matters more than any prompting technique. C1
Brainstorm before you write. Spend 60–80% of your time on planning, character psychology, scene design. Only 20–40% on prose generation. Discuss what each character wants, what the tension is, what sensory details ground the scene. At least 3–5 exchanges before requesting any prose. C2
Maintain a banned-word list. Start with: delve, tapestry, vibrant, pivotal, testament, beacon, crucible, landscape, nuanced, multifaceted, intricate, realm, embark, resonate, unprecedented, bustling, profound, stark. Add to it as you go. This is the single easiest high-impact intervention. C2
Use a story bible. Structured character profiles, world rules, style guidelines injected into every generation call. Use Novelcrafter's Codex, NovelAI's Lorebook, or build your own. C2
Generate in small chunks. 250–1,200 words per request. Edit between chunks. Feed the edited version back as context. Longer generations lose coherence and drift toward generic patterns. C2
Cap AI revision at 3 passes. Each pass flattens prose (the "blandification" effect). Use AI to diagnose problems, not to fix them. Rewrite weak passages yourself. C2
Accept the editing time. Budget 1–2 hours of human editing per 1,000 words. If you're not editing heavily, the output will read as AI-generated. The honest bar: a reader who doesn't know it's AI-assisted finds it a competent, enjoyable read. C2
Expected results with this workflow: First drafts 60–70% usable. Final quality reads as competent genre fiction after editing. 500–1,500 words polished output per hour. Occasional passages of genuine quality. A persistent battle against AI defaults requiring constant vigilance.
The Setup (30 minutes)
1. Get a Claude API key or subscribe to Claude Pro. Opus is ideal; Sonnet is cost-effective.
2. Choose a tool: Novelcrafter (structured story management, $8+/mo + API costs), Sudowrite (guided features, $19–59/mo), or Claude directly (max control).
3. Create a style guide document containing your banned-word list, 2–3 example passages in your target style, show-don't-tell before/after pairs, and character voice notes.
System Prompt Template
You are a skilled literary author writing a [genre] novel. Your prose style is [2-3 adjectives].
STYLE RULES:
- Use precise, concrete sensory details. "The fluorescent tube buzzed and flickered" not "the light was harsh."
- Vary sentence length dramatically. Follow a long complex sentence with a short one.
- Show emotion through action and physical detail, never by naming the emotion.
- Let subtext carry meaning. Do not state themes or morals.
- Prefer strong specific verbs over adverb-modified weak verbs.
- Do not resolve tension in the same paragraph it is introduced.
NEVER USE: delve, tapestry, vibrant, pivotal, testament, beacon, crucible, landscape, nuanced, multifaceted, intricate, underscores, furthermore, realm, embark, resonate, unprecedented, bustling, profound, stark, journey (metaphorical), grapple, navigate (metaphorical), etched, symphony, crescendo, veil, mere, sprawling, glint, shimmer
[Your character voice notes]
[Your world-building context]
[Previous scene summary or full text]
The LLM-ism Hall of Shame
The definitive catalog of AI writing tells. These are the words, phrases, and patterns that mark prose as machine-generated. Based on corpus analysis (Berenslab, Pangram Labs, GPTZero), practitioner catalogs (tropes.fyi, NousResearch, stop-slop), and community consensus.
The single most reliable tell is not a word but a texture: uniform sentence rhythm, predictable paragraph structure, and 2–4x lower entropy than human writing.
Tier 1: Kill on Sight (5+ sources, corpus-confirmed)
Self-posed rhetorical questions: "The result? Devastating." / "The best part?"
Monotonous sentence length: 2–4x lower entropy than human writing
Trailing participles: "...highlighting the importance of..." / "...underscoring the need for..."
Uniform paragraph length: Every paragraph roughly the same size ("symmetric load-balancing")
Throat-clearing openers: "In the dim light of..." / "In the ever-evolving landscape of..."
Closing tautologies: Sections ending with empty recaps restating what was just said
Elegant variation (synonym cycling): "The building... the structure... the edifice..."
Short punchy fragment paragraphs: Standalone sentences for false drama. Deployed compulsively.
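The rhythm tell above (monotonous sentence length, low entropy) is easy to check mechanically. A minimal sketch — the sentence-splitting heuristic and the 5-word bucketing are my own illustrative choices, not taken from any of the cited corpus studies:

```python
import math
import re

def sentence_lengths(text):
    """Split on terminal punctuation and count words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]

def length_entropy(lengths, bucket=5):
    """Shannon entropy (bits) of sentence lengths grouped into 5-word bins.
    Near-zero entropy means uniform rhythm, the classic AI tell."""
    if not lengths:
        return 0.0
    counts = {}
    for n in lengths:
        b = n // bucket
        counts[b] = counts.get(b, 0) + 1
    total = len(lengths)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

flat = "He walked to the door. She looked at the clock. They sat by the fire."
varied = ("He walked to the door. Silence. The clock on the mantel had stopped "
          "at ten past three, and nobody in the room could say when.")
```

Running this on the two samples: the flat passage scores 0.0 bits (every sentence lands in the same bin), the varied one well above 1 bit. A low score on your own draft is a cue to vary rhythm by hand, not proof of machine authorship.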
Mode Collapse Names & Settings
Names: Elara, Kai, Luna, Chen, Zephyr, Dalia, Telia. Settings: the ancient courtyard, the bustling marketplace, the dimly lit tavern.
Prompting Techniques That Work
Ranked by practitioner consensus and evidence quality. Techniques independently recommended by 3+ credible sources rank highest.
High Impact (do these first)
1. Brainstorm extensively before writing C2
The single highest-impact technique. Kaj Sotala's method: spend multiple exchanges discussing character psychology, scene dynamics, thematic resonance, and specific sensory details before requesting any prose. This front-loads the creative decisions that models are worst at making autonomously.
JP LeBlanc's average prompt was 28,200 words. His largest was 87,202 words. The scale of context engineering required for good AI fiction is genuinely surprising.
"How well would a human writer do with this prompt? If humans would struggle, an LLM will too."— Kaj Sotala
2. Banned word/phrase lists C2
Maintain an explicit list in your system prompt. Start with the Tier 1 words from the Hall of Shame. Add model-specific terms (Claude: em-dashes, "something shifted"; GPT: "I couldn't help but," "a sense of").
Warning: banning too many words forces AI toward awkward third-choice vocabulary. "The cure can be worse than the disease." — Blake Stockton
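Enforcement doesn't have to be manual. A minimal post-generation scan, sketched here with an abbreviated version of the Tier 1 list — the function name and the suffix-matching behavior (so "embark" also catches "embarked") are illustrative choices:

```python
import re

BANNED = ["delve", "tapestry", "vibrant", "pivotal", "testament",
          "beacon", "nuanced", "realm", "embark", "resonate"]

def find_banned(text, banned=BANNED):
    """Return {word: count} for banned words found.
    Whole-word, case-insensitive, matches inflected forms via \\w* suffix."""
    hits = {}
    for word in banned:
        n = len(re.findall(rf"\b{re.escape(word)}\w*\b", text, re.IGNORECASE))
        if n:
            hits[word] = n
    return hits

draft = "She embarked on a journey through the vibrant tapestry of the realm."
```

Here `find_banned(draft)` flags four offenders in one sentence. Run it after every generation chunk and regenerate (or hand-edit) anything it catches, rather than trusting the system-prompt ban alone.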
3. Positive style instructions over negative C2
"Write with precise, concrete sensory details. Vary sentence length. Use strong verbs. Let subtext carry emotion." This works better than "Don't use cliches. Don't be purple."
Why: Telling a model "don't write purple prose" can paradoxically increase purple prose — you've primed the concept, like telling someone not to think of a pink elephant.
4. Concrete specifics over abstractions C2
"She was studying differential equations at the kitchen table, her coffee going cold" — not "she was a math genius." Provide this instruction AND examples in your system prompt.
"If you state a character's motivation in the summary, GPT will include that motivation directly in the text" — which is telling, not showing.— Alexander Wales
5. Style extraction from samples C2
Feed the model 2–5 passages from your target style. Ask it to analyze style characteristics (sentence structure, vocabulary register, metaphor use, pacing, POV distance). Then use that analysis as a reusable style guide.
Simple "write in the style of X" prompts don't work for expert readers (fidelity OR=0.16). Style extraction works much better.
Medium Impact
6. "Author framing" persona C2
"You are a skilled literary author giving voice to this character" outperforms both "You are this character" (too literal) and "Write a story" (too generic). The framing positions the model as craftsperson, not role-player.
7. Gwern's anti-sentimentality directive C2
"Prefer concrete technical specificity over emotional generality. Avoid sentimentality and unearned epiphany. Do not resolve tension prematurely. Let ambiguity stand."
This was the key breakthrough for Claude 4.6 fiction. Earlier versions "veer dangerously close to sentiment and schmaltz" without it.
8. Show-don't-tell with before/after examples C2
Include 2–3 pairs in your system prompt:
Before: "She was angry." After: "She set the mug down carefully, exactly centered on the coaster."
Models learn from examples far more effectively than from abstract instructions. JP LeBlanc found that "show don't tell" required "loads of instructions that are repeated throughout the prompt" — a single mention was insufficient.
9. Generate 3-4 versions and curate ("kitbashing") C1
Universal consensus: never accept first output. Generate multiple drafts with different priorities (structural, emotional, voice, risk-taking). Manually assemble the best lines/paragraphs. "Your voice stays consistent because you're selecting every line."
10. Temperature 0.7–1.0 / Min-p 0.05–0.1 C2
Below 0.7 = flat, predictable. Above 1.0 = incoherent. For local models, min-p 0.05–0.1 dramatically outperforms top-p (ICLR 2025). Presence penalty 0.1–0.3 reduces repetition without harming coherence.
Note: temperature's effect on "creativity" is weaker than commonly believed (R² = 0.385 per academic study). Model choice matters more.
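For local pipelines, min-p is simple enough to implement directly: keep only tokens whose probability is at least min_p times the top token's probability, then renormalize. A stdlib sketch, not taken from any particular sampler implementation:

```python
import math

def min_p_filter(logits, min_p=0.05):
    """Keep tokens with probability >= min_p * max probability, renormalize.
    The cutoff adapts to the model's confidence: a peaked distribution
    prunes aggressively, a flat one keeps many options -- which is why
    min-p tolerates high temperatures better than a fixed top-p cutoff."""
    m = max(logits)
    probs = [math.exp(l - m) for l in logits]   # stable softmax, unnormalized
    total = sum(probs)
    probs = [p / total for p in probs]
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    z = sum(p for _, p in kept)
    return [(i, p / z) for i, p in kept]
```

With logits [5.0, 4.9, 0.0] the third token falls below the threshold and is pruned; with a perfectly flat distribution nothing is pruned. In practice you would apply temperature to the logits first, then sample from the filtered distribution.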
Lower Impact but Worth Knowing
Verbalized sampling produces 1.6–2.1x diversity increase: ask for 3 different approaches, then select or combine. C3
Write the first paragraph yourself — it anchors voice, tone, style, and pacing for everything that follows. C2
Dialogue-first drafting — generate dialogue exchanges first, then narration around them. Often produces more natural scenes. C3
Frame prompts as reader experience goals — "Write an opening that makes the reader feel trapped" rather than "write a good opening." C3
Write LLM outlines obliquely — never state motivations directly; the model will insert them literally into the text. C2
Model Rankings for Fiction (March 2026)
Ranked by convergent evidence from benchmarks, community consensus, and practitioner reports.
Tier 1: Frontier Fiction Models
1. Claude Opus 4.6/4.5
Anthropic • $15/$75 per MTok • 200K context
lechmazur benchmark #1 (8.53/10). Near-universal community consensus. High EQ, human-sounding prose. No longer available.
Tags: emotional intelligence • human-sounding • deprecated
5. Sudowrite Muse 1.5
Proprietary fiction fine-tune • Only via Sudowrite
Self-reported 2x preference over Claude 3.7 Sonnet. Good prose, fewer cliches.
Tags: fiction-tuned • fewer AI tells • self-reported claims • Sudowrite-only
6. GPT-4o (post-Nov 2024)
OpenAI • Creative writing update
Community considers it superior to GPT-5 for fiction.
Tags: good all-around • creative update • AI voice patterns
Tier 3: Usable with Caveats
7. GPT-5 family
OpenAI • Sam Altman admitted they "screwed up" writing quality
Tags: emotionally flat • over-filtered • good dialogue
8. NovelAI Erato 70B
Fiction-finetuned • 8K context • $15-25/mo
Tags: less AI voice • uncensored • limited reasoning • 8K context
9. Local: Midnight Miqu 70B / MythoMax-L2 13B
Open-source • Free • Requires GPU
Tags: free • uncensored • community classic • requires hardware
Not recommended for fiction: Reasoning models (o1, o3, DeepSeek-R1) produce stilted, over-explained writing. Useful only for story planning and outlining. GPT-3.5 / small models (<7B) lack capability for anything beyond fill-in-the-blank.
Key Insight: RLHF Damages Fiction Quality
Base models consistently outperform their aligned counterparts at creativity measures C1. RLHF causes mode collapse, verbosity bias, and forced neutrality — precisely the qualities that harm fiction. This is why fiction-specific fine-tunes (Erato, Muse, Midnight Miqu) often outperform much larger general-purpose models at raw prose quality.
Genre Performance Rankings
1. Romance/Erotica/Fanfic: Strong. Strong conventions, pattern-heavy.
2. LitRPG/Progression: Strong. System-heavy, formulaic by design.
3. Thriller/Mystery: Moderate-Strong. Established pacing templates.
4. Fantasy: Moderate. Worldbuilding strong; depth weak.
5. Science Fiction: Moderate. Ideas good; "remarkable sameness."
6. Horror: Moderate. Atmosphere OK; true dread is hard.
7. Poetry: Contested. Competitive in blind tests, debated.
8. Literary Fiction: Weak. Requires subtext and voice, AI's weakest areas.
9. Comedy/Satire: Weakest. "Cruise ship comedy from the 1950s."
Tools & Platforms
Ranked by actual output quality, not marketing. Based on practitioner experience, community size, and documented results.
Tier 1: Purpose-Built, Recommended
Novelcrafter
$4–20/mo + API costs (BYOK)
Best-in-class Codex (story bible) with structured character/world/plot tracking. Bring your own API key — use any model. Scene-level generation with persistent context injection. Active Discord community (~10K+ users).
Best for: Power users, serious projects, people who want control. Limitation: Steeper learning curve, requires API keys.
Sudowrite
$19–59/mo (includes AI credits)
Story Engine for guided multi-chapter generation. Proprietary Muse model fine-tuned for fiction. "Describe" and "Expand" tools for targeted improvement. Style Examples persist across your account.
Best for: Genre fiction (romance, fantasy, thriller), less technical users. Limitation: Subscription pricing, less control than Novelcrafter, Muse claims are self-reported.
NovelAI
$10–25/mo
Fiction-tuned models (Erato 70B) with genuinely less "AI voice." No content restrictions — critical for horror, dark fiction, mature themes. Lorebook system for world/character persistence. 100K+ subscribers estimated.
Best for: Uncensored fiction, genre fiction within 8K context. Limitation: 8K context ceiling, weaker at complex plotting, text-adventure UX paradigm.
Tier 2: Usable with Limitations
Claude Direct (API / Console / Claude Code)
API pricing • Pro $20/mo
Highest prose quality available. 200K context enables whole-novel awareness. Claude Code + CLAUDE.md as style guide is a viable power-user workflow. No story bible, no scene management — requires manual context management.
SillyTavern + KoboldCpp
Free (open-source)
Maximum flexibility. Run any local or API model. Extensive character card and lorebook system. Chat-based paradigm is awkward for traditional fiction. Best for interactive fiction, character-driven scenes.
Not Recommended
Jasper, Writesonic, Rytr
Marketing copy tools masquerading as creative writing tools. Optimized for SEO content, not narrative prose.
ShortlyAI
Effectively abandoned. Still running GPT-3.5. Overpriced.
Agent Workflows for Longer Fiction
How to use multi-step and multi-agent approaches for novel-length work.
The Universal Pipeline: Plan → Draft → Revise
Every successful long-form AI fiction workflow follows this pattern:
Plan (human-led, AI-assisted) → Draft (AI-led, human-guided) → Revise (human-led, AI diagnoses)
Critical Workflow Patterns
1. Immutable "Bible" Pattern C2
Create a canonical document with style rules, character voices, world rules, tone guidelines. Inject into every generation call but never modify it during generation. Novelcrafter's Codex and NovelAI's Lorebook implement this natively.
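In a hand-rolled pipeline, the pattern reduces to rebuilding the full prompt on every call from the immutable bible plus rolling context. A sketch — the section headers, character names, and word-count instruction are hypothetical, not drawn from any specific tool:

```python
STORY_BIBLE = """STYLE RULES:
- Vary sentence length dramatically.
- Show emotion through action, never by naming it.
CHARACTERS:
- Mara: terse, deflects with humor.
WORLD RULES:
- No electricity after the Collapse."""

def build_prompt(bible, previous_scene, scene_brief):
    """Assemble one generation call. The bible is injected verbatim and
    never modified mid-project; only the rolling scene context changes."""
    return (
        f"{bible}\n\n"
        f"PREVIOUS SCENE (edited, canonical):\n{previous_scene}\n\n"
        f"WRITE THE NEXT SCENE:\n{scene_brief}\n"
        "Write 400-800 words. Stop at a natural beat."
    )
```

The key property is that the bible is read-only state: if a generation contradicts it, you fix the generation, never the bible (unless you, the author, decide the canon changed).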
2. Perplexity Gate (Claude-Book framework) C2
Use a small, fast model (Ministral 8B) to score generated prose for perplexity. Low-perplexity text (highly predictable) is flagged as "AI-sounding" and sent for regeneration. Flags ~20–30% of content.
Writer (Opus) → Perplexity Gate (Ministral 8B) → Pass / Rewrite
Detection criteria: PPL < 22 per sentence, sigma < 14 over 14-sentence windows, forbidden AI-signal vocabulary.
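Given per-token log-probabilities from the scoring model, the gate itself is a few lines. A sketch using the PPL floor above — the function names are mine, and actually obtaining the logprobs from a model like Ministral 8B is out of scope here:

```python
import math

def perplexity(token_logprobs):
    """PPL = exp(-mean log-probability). Low PPL = highly predictable text."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def gate(sentence_logprobs, ppl_floor=22.0):
    """Flag sentences whose perplexity falls below the floor as AI-sounding.
    sentence_logprobs: one list of per-token logprobs per sentence.
    Flagged indices are sent back for regeneration."""
    flagged = []
    for i, lps in enumerate(sentence_logprobs):
        if perplexity(lps) < ppl_floor:
            flagged.append(i)
    return flagged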
3. Dual Evaluation (Autonovel) C2
Combine mechanical scanning (regex for banned words, sentence length variance, repeated phrases) with LLM-as-judge scoring (rate for voice, tension, specificity). Neither alone is sufficient. Plateau detection stops the loop when scores stabilize.
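The mechanical half of that scan is cheap to implement. A sketch of repeated-phrase detection via n-gram counting — the n=3 window and min_count threshold are illustrative defaults, not Autonovel's actual settings:

```python
import re
from collections import Counter

def repeated_phrases(text, n=3, min_count=2):
    """Count repeated word n-grams. Chronic phrase reuse (synonym cycling's
    ugly cousin) is a cheap mechanical signal that needs no LLM judge."""
    words = re.findall(r"[a-z']+", text.lower())
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return {g: c for g, c in Counter(grams).items() if c >= min_count}

sample_text = ("Something shifted in the air. She paused. "
               "Something shifted in the room.")
```

On the sample above it catches "something shifted in" twice, a known Claude-ism from the banned-list section. Mechanical checks like this run on every draft; the LLM-as-judge pass only runs on drafts that survive them.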
4. Scene-by-Scene with Human Editing Between C2
Generate one scene, edit it, then generate the next with the edited version in context. Prevents error accumulation and keeps the human in creative control. The edited scenes become the "ground truth" for subsequent generation.
5. Separation of Planning and Writing Agents C2
The planning agent can be a reasoning model (o3, DeepSeek-R1). The writing agent should be a frontier creative model (Claude Opus, Gemini Pro). Even within the same model, switching system prompts between planning and writing modes helps.
Documented Multi-Agent Systems
Claude Book (Thomas Houssin) — 79K-word novel
Open-source (MIT). Orchestrator-worker pattern: Planner (Opus) → Writer (Opus) → Perplexity Gate (Ministral 8B) → Reviewers (Sonnet). Versioned state management. Nine rewriting techniques for flagged passages. Max 3 rewrite loops per chapter.
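The bounded rewrite loop is plain control flow. A sketch with stubbed-in gate and rewrite callables — the signature is my own, not Claude Book's actual API:

```python
def revise_until_pass(chapter, passes_gate, rewrite, max_loops=3):
    """Rewrite a flagged chapter at most max_loops times, then accept
    the best we have. An unbounded loop just burns tokens on passages
    the model cannot fix; capping it forces the human to step in.
    Returns (final_text, attempts_used)."""
    for attempt in range(max_loops):
        if passes_gate(chapter):
            return chapter, attempt
        chapter = rewrite(chapter)
    return chapter, max_loops
```

In a real pipeline `passes_gate` would wrap the perplexity gate and banned-word scan, and `rewrite` would call the writer model with one of the rewriting techniques plus the specific flags as feedback.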
NousResearch Anti-Slop Framework
The most comprehensive open-source fiction agent system. Full anti-slop + craft + voice framework. Adversarial editing → cut application → reader panel (4 personas) → brief generation → chapter rewrite. Produced The Second Son of the House of Bells.
3 writer agents + 3 editor agents + 4 reviewer agents completed a novel in 12 hours from a human story bible. Output: readable genre fiction — functional plots, consistent characters. Not literary fiction.
The Honest Assessment
What AI fiction can and cannot do in March 2026. No sugarcoating.
What AI Fiction Can Do
Produce short-form fiction (<500 words) that non-experts cannot distinguish from human writing. 46.6% detection rate — below chance. C1
Match or exceed human quality in short passages when fine-tuned on specific authors (MFA readers preferred fine-tuned AI 2:1). C1
What AI Fiction Cannot Do
Generate original creative vision or genuine surprise. C2
Produce quality at novel length without heavy human editing. No AI novel has received genuine critical acclaim. C1
Satisfy literary experts with standard prompting (8:1 expert preference for human text). C1
Write comedy that's actually funny. "Cruise ship comedy from the 1950s, but a bit less racist." C2
Blind Reading Studies — Surprising Results
Mark Lawrence Test (2025): 964 voters, 8 flash fiction stories. The highest-rated story was AI-generated. 3 of top 4 were AI.
NYT Blind Quiz (2026): 86,000 participants. 54% preferred AI writing over passages from McCarthy, Le Guin, Sagan.
AI Poetry Study (2024): 1,634 participants. Below-chance detection (46.6%). Rated AI poems MORE favorably.
Caveat: Quality inversely correlated with length. At <500 words, AI is competitive. At novel length, no.
The Fundamental Tension
"Art requires making choices. When using AI, you are making very few choices; the AI is filling in for all of the other choices that the human is not making."— Ted Chiang
"At every level AI fiction fails to be surprising. The plot will be maximally obvious. But also every metaphor will be maximally obvious, and every sentence structure, and almost every word choice."— JustisMills, LessWrong
"Every time ChatGPT tries to write a grief scene, it sounds like a Hallmark card."— Reddit user
The Publishing Reality
The professional writing community is overwhelmingly hostile to AI fiction. Clarkesworld received 500 AI submissions in one month — none reached the second round. The SFWA banned AI-assisted submissions. The Authors Guild created an AI-free certification program. 69% of authors feel their careers threatened.
One practitioner has published 200+ romance novels under 21 pen names, generating six-figure revenue. She demonstrated producing a complete book in 45 minutes during a live interview. This represents one end of the spectrum: AI as volume-production where speed matters more than literary ambition.
The bottom line: The most productive use of AI for fiction is as a generator of raw material that a skilled human writer curates, edits, and shapes. The human's editorial judgment — knowing what's good — is the bottleneck, not the AI's generation ability. "The AI handles volume. You handle voice."
Battle-Tested Prompt Library
Copyable system prompts and templates from documented practitioners. Each has been used in real fiction projects.
1. JP LeBlanc's Core Writing Rules
Used to produce an 82,000-word novel over 102 evenings. Carried in every prompt (~6,000 words static).
You are a best-selling fiction writer capable of writing stories that readers love.
WRITING RULES:
- Past tense, US English, active voice exclusively
- Always follow the "show, don't tell" principle (repeated throughout)
- Avoid adverbs, cliches, and overused/commonly used phrases
- Aim for fresh and original descriptions
- Skip "he/she said" dialogue tags; convey actions/expressions through speech
- Each dialogue in separate paragraphs
- Mix short, punchy sentences with long, descriptive ones
- Drop fill words to add variety
- Reduce uncertainty indicators like "trying" or "maybe"
- Dialogue-driven storytelling
[World setting data]
[Character profiles]
[Previous 4 scenes for context]
[Scene specifics: actions, characters, location, emotional tone]
[Additional instructions for known LLM failure modes per scene]
2. Gwern's Anti-Sentimentality Directive
The key breakthrough for Claude 4.6 fiction quality.
Ignore conciseness rules. Prioritize vividness, narrative flow, and sensory imagery.
Prefer concrete technical specificity over emotional generality. Avoid sentimentality and unearned epiphany. Do not resolve tension prematurely. Let ambiguity stand.
Describe what exists rather than what doesn't. Ban "antithesis bloat" (phrasing like "It was not X, but Y") and "list-negation" syntax.
3. Kaj Sotala's Co-Writer Persona
The most technically rigorous LessWrong guide. Key insight: LLMs are "vibe-matching machines."
You are a prolific fanfiction writer who's been active in online fandom spaces for over a decade. You've written in dozens of fandoms — from sprawling sci-fi epics to intimate character studies — and you've read hundreds of different takes on the same characters. You live for that electric moment when someone takes established canon and tilts it just enough to reveal something new. You also enjoy writing original speculative fiction.
[Key techniques to add:]
- The "Human Test": "How well would a human writer do with this prompt?"
- Anti-stereotype: Instead of "a teenager who is a math genius," provide "He has been learning differential equations recently and is eager to explain those to his father"
- The "Yes, And" response: Convert AI inconsistencies into creative seeds
4. NousResearch Anti-Slop Framework
From the most comprehensive open-source fiction agent system. Produced a 79K-word novel.
Instead of "write like Author X," generate reusable instructions.
Step 1 — Generate style instructions:
"If I wanted to advise an AI to write dialog like Aaron Sorkin, what instructions should I give?"
Step 2 — The AI might produce:
"Write fast-paced dialogue with long, complex sentences using rhetorical devices, sophisticated vocabulary, and many character-challenging questions."
Step 3 — Use those generated instructions as system prompt components for actual fiction generation.
6. Style Analysis for Emulation
Analyze the text below for its writing style, tone, themes, and other literary aspects. Provide a comprehensive guide based on these features that can be used as a reference for writing an original story emulating this author's distinct style. Keep in mind that the writer will not have access to the original work, and the guide must be abstract and comprehensive enough to allow the creation of new original ideas that seem similar to the original. Do not quote character names; refer to them as Character1, Character2, etc.
[Paste sample text here]
7. Human-Like Chapter Writing
Write a chapter in a natural, human-like writing style. Focus on eliminating any elements of purple prose and cliche phrases from your writing. Instead of using exaggerated descriptions, aim for clear and straightforward language that conveys the scene effectively and evocatively without unnecessary embellishments.
What to Avoid: "Huntington's abandoned factories stretched toward a colorless sky, their smokeless chimneys standing like accusing fingers."
Better Approach: "The abandoned factories of Huntington loomed against the gray sky, their chimneys silent and still."
Construct the chapter to reflect an authentic voice, allowing readers to connect with the narrative without flowery language distracting them.
This guide was compiled from 44 research files covering 100+ web sources. This content was AI-generated and has primarily been AI fact-checked, not personally verified by the author. Confidence ratings (C1–C5) are used throughout.