NovelFlame

Built a token economy and a CCBill payment integration across 2 specs. Watched competitor reviews name counting-currency UX as the #1 complaint in the category. Ripped out both and shipped a flat subscription instead.

Why this stack

SvelteKit's server-first routing puts every auth and content-access gate at the server layer instead of scattered client checks, which mattered once a security audit found the auth guard excluded all /api routes from protection entirely. Supabase paired with Drizzle keeps the schema in version-controlled TypeScript instead of a GUI-managed database, which is important when a solo operator is the only reviewer of every migration. Cloudflare R2 skips egress fees for a product that generates images and video on every story. OpenTofu over Terraform was mostly a license call: same HCL and same providers, but MPL 2.0 rather than the Business Source License Terraform moved to in 2023. OpenTofu's native state encryption also matters when the state file holds database passwords and API keys. Railway's Docker-based deploy meant the Docker build itself, not a platform-specific buildpack, was the thing worth optimizing, which is why the multi-stage Dockerfile and GitHub Actions layer caching became their own spec.
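
To make the auth-guard point concrete, here is a minimal sketch of a default-deny route policy, the kind of check that could be called from SvelteKit's `handle` hook in `src/hooks.server.ts`. The path lists, function names, and status codes are illustrative assumptions, not the project's actual configuration. The idea is that a route is protected unless it is explicitly listed as public, so /api routes cannot fall outside the guard by omission.

```typescript
// Hypothetical allowlist: anything not named here requires a session.
const PUBLIC_EXACT = new Set(["/", "/login", "/pricing"]);
// Assumption: webhooks stay public because they are verified by signature instead.
const PUBLIC_PREFIXES = ["/api/webhooks/"];

function requiresAuth(pathname: string): boolean {
  if (PUBLIC_EXACT.has(pathname)) return false;
  return !PUBLIC_PREFIXES.some((prefix) => pathname.startsWith(prefix));
}

type GuardResult = { allow: true } | { allow: false; status: 401 | 303 };

function guard(pathname: string, hasSession: boolean): GuardResult {
  if (!requiresAuth(pathname) || hasSession) return { allow: true };
  // API callers get a status code; page requests get redirected to login.
  return { allow: false, status: pathname.startsWith("/api/") ? 401 : 303 };
}
```

With a default-deny policy, adding a new /api endpoint needs no guard change at all; only making a route public is an explicit, reviewable edit.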

AI-assist note

Built spec-driven with Claude Code across 189 specs (spec, plan, tasks per feature, with a CodeRabbit review pass on higher-risk PRs). I wrote every spec, reviewed every diff, and made the calls that don't show up in a diff: OpenTofu over Terraform, killing the token economy, moving token deduction before generation instead of after. Claude Code wrote most of the implementation and tests.

Stack

SvelteKit 2 / Svelte 5
TypeScript
Supabase (Postgres + Auth)
Drizzle ORM
Cloudflare R2 + Turnstile
Vercel AI SDK (xAI, OpenAI, Anthropic, Google providers)
Stripe + Apple IAP
OpenTofu (IaC)
Docker + GitHub Actions + GHCR
PostHog (product analytics)
Loops.so (lifecycle + newsletter email)
Railway

Domains

Full-Stack SaaS Engineering
Infrastructure as Code
Payments & Compliance
AI Content Safety

Live: 4 mo

Users: 200+ users, 450+ story sessions in soft launch (Supabase, Jul 2026); still early on paid conversion

Payments: Stripe subscriptions (web) + Apple IAP (iOS); CCBill fully retired Mar 2026

Infra: Cloudflare (DNS/CDN/R2/Turnstile), OpenTofu IaC (~30 resources), Docker multi-stage + GHCR, Railway

Auth: Supabase Auth via SSR cookies

Marketing: Paid Facebook ads via a marketing partner (Jul 2026)

Specs: 189

Why this exists

NovelFlame is an interactive fiction platform: readers pick a genre and premise, and the app generates a branching story with inline images and a completion video, choice by choice. I run it solo, end to end, from the OpenTofu config that provisions the Cloudflare zone to the CI pipeline that gates every merge. The interesting engineering problem isn’t the story generation itself; it’s everything around it: a payment system that has to be right the first time because it’s real money, a content filter that has to run on every output because the source is an LLM, and an infrastructure and release pipeline built for a team of one.

The product also changed shape under me. It shipped with a token economy and a CCBill integration, and by spec 122 the data said that model wasn’t working: 0 paying users, and competitor reviews naming counting-currency UX as the category’s top complaint. Rather than keep patching a model the market was telling me not to build, I killed it and shipped a flat subscription instead.

Architecture

The flow is 1 direction with 2 gates. A reader’s request passes Supabase’s SSR auth check and the subscription gate (free tier caps at 3 stories per 30 days; Plus is unlimited) before it ever reaches an LLM. The input safety scan uses the same 2-layer filter, the OpenAI Moderation API plus custom rules, that runs again on the way out, so content has to clear the filter twice: the prompt before generation, the output before display. Generation itself is provider-routed rather than locked to 1 vendor: xAI’s Grok is the default story and video model, OpenAI covers the utility tier and character-consistency images, and Anthropic’s Claude runs a separate editorial quality pass over demo content. Media lands in Cloudflare R2 and streams back over SSE, segment by segment, so a reader sees text and images appear progressively rather than waiting on the full generation. Underneath all of it, a payments gate (Stripe on web, Apple IAP on iOS) controls access, and the whole app is provisioned through OpenTofu and deployed through a GitHub Actions, Docker, and GHCR pipeline onto Railway.
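
The subscription gate described above can be sketched as a small pure function. This is a sketch under stated assumptions: it treats "3 stories per 30 days" as a rolling window over story start times, and the names `Plan`, `canStartStory`, and the timestamp input are hypothetical rather than taken from the codebase.

```typescript
type Plan = "free" | "plus";

const FREE_STORY_LIMIT = 3;
const WINDOW_MS = 30 * 24 * 60 * 60 * 1000; // 30 days in milliseconds

// storyStartTimes: epoch-millisecond timestamps of the reader's past story starts.
function canStartStory(plan: Plan, storyStartTimes: number[], now: number): boolean {
  if (plan === "plus") return true; // Plus is unlimited
  const windowStart = now - WINDOW_MS;
  const recent = storyStartTimes.filter((t) => t > windowStart && t <= now).length;
  return recent < FREE_STORY_LIMIT;
}
```

Running this check on the server before any provider call is what keeps a capped reader from ever reaching an LLM, which is the ordering the paragraph above describes.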

What shipped

189 specs took NovelFlame from a forked prototype to a live, solo-operated product. Some of the most valuable work wasn’t a new feature. It was a security audit that found 4 real financial bugs already in production, including a payment webhook silently failing to credit real customers, and fixed each with its own targeted mechanism rather than 1 patch that half-covered all 4. It was infrastructure-as-code that turned roughly 30 hand-configured cloud resources into version-controlled state, importable and idempotent. And it was a monetization reset that killed a fully-built token economy and CCBill integration once the data, 0 paying users and a UX complaint pattern from competitor reviews, said that model wasn’t the right one, replacing it with a flat Stripe and Apple IAP subscription instead.

The throughline across all of it is the same: a solo operator’s infrastructure has to be legible enough that 1 person can trust it, and honest enough to change course when a shipped feature turns out to be the wrong bet.

Skill stories

Each skill story below covers the decision, what broke, how it got measured, and how it got fixed.

Infrastructure & DevOps: Infrastructure as Code (OpenTofu over Terraform)
Build & Release Engineering: CI/CD Pipeline + Docker Layer Caching
Backend Reliability & Payments: Token & Payment Integrity
Product & Payments Architecture: Monetization Reset: Killing the Token Economy
AI Safety & Trust: 2-Layer Content-Safety Filter

Infrastructure & DevOps

Infrastructure as Code (OpenTofu over Terraform)

Decision: Cloudflare DNS, R2 buckets, Turnstile widgets, Supabase project settings, and GitHub branch protection all lived in dashboards, changeable by anyone with access and undocumented anywhere in the repo. I picked OpenTofu over Terraform for the codify-everything pass: same HCL, same providers, but MPL 2.0 instead of Terraform's 2023 move to the Business Source License, plus OpenTofu's native state encryption, which matters because the state file holds database passwords and API keys.
What broke: State lived only on my laptop. A disk failure would have meant losing the only record of how roughly 30 cloud resources were actually configured, and there was no way to detect drift if I, or an AI agent, changed something by hand in a dashboard.
How I measured it: Ran the import command against every existing resource first instead of applying from scratch, then confirmed a plan against the imported state showed 0 changes, proving the codified config matched what was actually running in production.
How I fixed it: State now lives encrypted in a Cloudflare R2 backend. Secrets moved out of plaintext .env/.tfvars into SOPS with age-encrypted keys. Branch protection and CI secrets are codified, so repo governance survives even a full repo recreation.
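
One way to turn the 0-changes check into a repeatable drift check is to run `tofu plan -detailed-exitcode` and branch on the exit code: 0 means the plan succeeded with an empty diff, 1 means an error, and 2 means the plan succeeded with changes pending. The sketch below only classifies the exit code; the `DriftStatus` type and function names are hypothetical, and how the command gets invoked (CI job, cron, manual run) is left open.

```typescript
type DriftStatus = "in-sync" | "drift" | "error";

// Maps the exit code of `tofu plan -detailed-exitcode` to a drift status.
function classifyPlanExit(exitCode: number): DriftStatus {
  switch (exitCode) {
    case 0:
      return "in-sync"; // codified config matches what is running
    case 2:
      return "drift"; // something differs, e.g. a hand edit in a dashboard
    default:
      return "error"; // plan itself failed; treat as unknown, not as clean
  }
}

// Assumption: a scheduled check should fail loudly on anything but "in-sync".
function shouldAlert(status: DriftStatus): boolean {
  return status !== "in-sync";
}
```

Treating an errored plan as an alert rather than as clean avoids the failure mode where a broken check silently reports no drift.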

Build & Release Engineering

CI/CD Pipeline + Docker Layer Caching

Decision: The existing GitHub Actions workflow only ran tests. I built a required-status-check gate (lint, format, type-check, tests, a production build, and infra validation that only runs when infra/ files change) and rebuilt the Dockerfile into cache-friendly stages so a code-only push doesn't force a full dependency reinstall.
What broke: Every push rebuilt the Docker image from scratch, dependencies included, even for a 1-line source change. There was also no branch protection tying merge eligibility to a green CI run, so a red PR could still land on main.
How I measured it: Benchmarked cache-hit builds against cold builds, with a 50%+ speedup target and a sub-3-minute goal for code-only changes, then verified that PR builds never publish an image by checking the GHCR package API for a 404 before the first real merge.
How I fixed it: Dependency installation and application build are now separate Docker stages, so unchanged dependencies stay cached across pushes. Images publish to GHCR only on merge to main, tagged by commit SHA plus a latest tag, gated behind the same CI status check that blocks merges.
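
The publish rule and the benchmark targets above reduce to two small checks, sketched here in TypeScript. The image name, the `BuildContext` shape, and the reading of "speedup" as the fraction of build time saved are assumptions for illustration; the real pipeline expresses these rules in the GitHub Actions workflow itself.

```typescript
interface BuildContext {
  event: "push" | "pull_request";
  ref: string; // e.g. "refs/heads/main"
  sha: string; // full commit SHA
  ciPassed: boolean; // result of the required status check
}

// Hypothetical image path; the real owner and repo name are not in the write-up.
const IMAGE = "ghcr.io/example-owner/example-app";

// Returns the tags to push, or an empty list when nothing should be published.
function imageTagsToPublish(ctx: BuildContext): string[] {
  const isMainPush = ctx.event === "push" && ctx.ref === "refs/heads/main";
  if (!isMainPush || !ctx.ciPassed) return []; // PR builds never publish
  return [`${IMAGE}:${ctx.sha}`, `${IMAGE}:latest`];
}

// Targets from the benchmark: at least 50% faster than cold, and under 3 minutes.
function meetsCacheTargets(coldSeconds: number, cachedSeconds: number): boolean {
  const timeSaved = (coldSeconds - cachedSeconds) / coldSeconds;
  return timeSaved >= 0.5 && cachedSeconds < 180;
}
```

For example, a 400-second cold build that drops to 150 seconds on a cache hit saves 62.5% and lands under 180 seconds, so it meets both targets.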

Backend Reliability & Payments

Token & Payment Integrity

Decision: A security audit surfaced 4 separate financial bugs at once instead of 1, and I fixed each with an independent mechanism rather than patching symptoms: a starter-token race, a generation-guard race, a charge-after-work ordering flaw, and a payment webhook that resolved purchasers by a raw email string instead of their account ID.
What broke: The CCBill webhook credited tokens to a phantom user record built from the purchaser's email, not their actual account UUID, so real purchases never reached the paying user's balance. Separately, token balance was checked before generation but deducted after, meaning a balance that hit 0 mid-generation produced free content, and 2 concurrent generation requests on the same session could double-charge.
How I measured it: Wrote concurrent-request tests (Promise.all against the starter grant and the generation guard) to prove exactly 1 grant and 1 generation win under a race, plus webhook tests with both a known and an unknown email to confirm the unknown case logs for manual resolution and still returns 200 so the processor doesn't retry-storm the endpoint.
How I fixed it: The starter grant became an atomic INSERT ... ON CONFLICT DO NOTHING. The generation guard became a single conditional UPDATE ... RETURNING that the losing request sees as a 409. Token deduction moved before generation, with an automatic refund transaction on failure. The webhook now resolves the real account UUID via an admin user lookup before crediting anything, and the digest check moved to a timing-safe comparison.
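
The charge-before-work ordering can be illustrated with an in-memory model. This is a sketch, not the production code: a `Map` stands in for the database, `tryDeduct` stands in for a conditional UPDATE that only succeeds when the balance covers the cost, and all names are hypothetical. In a real system both the deduction and the refund would be database transactions.

```typescript
type Ledger = Map<string, number>;

// Stand-in for a conditional UPDATE: deduct only if the balance covers the cost.
function tryDeduct(ledger: Ledger, userId: string, cost: number): boolean {
  const balance = ledger.get(userId) ?? 0;
  if (balance < cost) return false;
  ledger.set(userId, balance - cost);
  return true;
}

// Charge first, then do the work; refund if the work fails.
function generateWithPrepay<T>(
  ledger: Ledger,
  userId: string,
  cost: number,
  generate: () => T,
): T {
  if (!tryDeduct(ledger, userId, cost)) {
    throw new Error("insufficient balance"); // nothing generated, nothing charged
  }
  try {
    return generate();
  } catch (err) {
    // Automatic refund: the user never pays for a failed generation.
    ledger.set(userId, (ledger.get(userId) ?? 0) + cost);
    throw err;
  }
}
```

Because the deduction happens before the generation call, a balance can no longer run out mid-generation and yield free content; the only remaining failure path is the refund, which restores exactly what was taken.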

Product & Payments Architecture

Monetization Reset: Killing the Token Economy

Decision: By the time I scoped the monetization reset, NovelFlame had 0 paying users and a billing model that competitor reviews (rival AI-fiction apps) named as the category's top complaint: counting currency per segment, image, and video instead of a plain subscription. I made the call to remove CCBill entirely and kill the user-facing token system rather than keep patching it, replacing both with a flat Stripe subscription plus Apple IAP for the iOS app.
What broke: The billing UI advertised the paid tier as unlimited while the code silently deducted tokens from subscribers on every generation and every image, a bait-and-switch that would have generated refunds and 1-star reviews the moment anyone subscribed. CCBill's compliance surface (refunds, chargebacks, admin purchase monitoring, a full webhook) was fully built and tested but was solving for a payment processor the product no longer needed once the content stayed PG-13.
How I measured it: The token rip-out ran as a tracked 4-PR arc with a verified impact matrix, and a follow-up sweep caught 6 dead operator scripts plus a staging seed script that had been silently failing for days because continue-on-error: true in the CI workflow masked the crash.
How I fixed it: CCBill's config, service, webhook handler, and 16 test files came out in a single dedicated removal PR, with the payment-provider schema columns deliberately preserved rather than dropped so the historical record survives. NovelFlame Plus now bills through Stripe on web ($8.99/mo or $69.99/yr) and Apple IAP on iOS, with no token, credit, or gem counter visible anywhere in the product.

AI Safety & Trust

2-Layer Content-Safety Filter

Decision: Every AI-generated story, chat message, and image prompt needed a safety gate before it reaches a reader, so I built a hybrid filter instead of relying only on the upstream models' own guardrails: the OpenAI Moderation API as the primary classifier, plus custom keyword and pattern rules for platform-specific categories the moderation API doesn't cover.
What broke: The upstream AI providers already run their own content safety, but those thresholds can change without notice and differ by provider, so a filter that only trusted a model's refusal behavior had no way to catch a policy gap consistently across every provider.
How I measured it: Set explicit target rates instead of a pass/fail gut check: 100% of explicit prohibited prompts blocked pre-generation, under 1% false-positive rate on legitimate content, and 95%+ catch rate on common evasion techniques like leetspeak and unicode substitution, each checked against a dedicated test set.
How I fixed it: Filter actions use a tiered response: a hard block with no override for the highest-risk category, and a soft warning with a suggested rephrase for lower-risk categories. Every action logs a content hash, the rule matched, and the outcome for audit purposes, and a user-facing report button files straight to GitHub Issues, so a solo operator with no admin dashboard still has a working review queue.
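
A minimal sketch of the custom-rules layer, showing how normalization can defeat simple leetspeak and unicode substitution before a tiered rule match. The rule names and patterns are placeholders, the leetspeak table is deliberately small, and the OpenAI Moderation API call, the content hash, and the audit log are omitted; none of this is the project's actual rule set.

```typescript
type Action = { kind: "allow" } | { kind: "warn" | "block"; rule: string };

interface Rule {
  name: string;
  tier: "block" | "warn";
  pattern: RegExp;
}

// Hypothetical leetspeak table; a real one would be broader.
const LEET: Record<string, string> = {
  "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s",
};

// Produces a matching-only copy of the text; the original is what gets displayed.
function normalize(text: string): string {
  return text
    .normalize("NFKD") // folds fullwidth and other compatibility forms
    .replace(/[\u0300-\u036f]/g, "") // strips combining accents
    .toLowerCase()
    .replace(/[013457@$]/g, (ch) => LEET[ch] ?? ch);
}

// Placeholder rules standing in for platform-specific categories.
const RULES: Rule[] = [
  { name: "placeholder-hard", tier: "block", pattern: /forbidden/ },
  { name: "placeholder-soft", tier: "warn", pattern: /risky/ },
];

function evaluate(text: string): Action {
  const clean = normalize(text);
  // Hard blocks take priority over soft warnings.
  const hit =
    RULES.find((r) => r.tier === "block" && r.pattern.test(clean)) ??
    RULES.find((r) => r.tier === "warn" && r.pattern.test(clean));
  return hit ? { kind: hit.tier, rule: hit.name } : { kind: "allow" };
}
```

Note that this normalization does not cover cross-script homoglyphs (for example, Cyrillic letters that look like Latin ones), which is one reason a dedicated evasion test set is needed to measure the catch rate rather than assume it.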