Details
Technical overview of Typo's architecture, tech stack, and spellcheck pipeline.
Typo is a multi-service application. The Next.js frontend talks to a FastAPI backend, which dispatches long-running crawl and spellcheck jobs to Celery workers via Redis.
- Frontend — Next.js 15, Tailwind CSS, shadcn/ui
- Backend — Python, FastAPI, async SQLAlchemy, Alembic
- Task Queue — Celery with Redis broker
- Crawling — Playwright (always JS-rendered)
- Spellcheck — LanguageTool (self-hosted, spelling + grammar categories)
- NLP — spaCy (en_core_web_lg) for proper noun recognition
- Database — PostgreSQL
- Infrastructure — Docker Compose, all services containerized
Every page goes through a multi-stage pipeline. Content is extracted, filtered for noise, then checked for spelling and grammar errors. Spelling findings are filtered through proper noun and custom dictionary checks to remove false positives. Grammar findings skip these spelling-specific filters.
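As a rough illustration of how that routing could work, the sketch below drops spelling findings that match a known name or custom term and leaves grammar findings untouched. The `Finding` class, the `"spelling"`/`"grammar"` category labels, and the `filter_findings` function are hypothetical names chosen for this example, not Typo's actual code, and the case-insensitive set lookup stands in for the real spaCy and dictionary checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    word: str      # the flagged text
    category: str  # "spelling" or "grammar" (hypothetical labels)


def filter_findings(findings, proper_nouns, custom_dictionary):
    """Drop spelling findings whose word is a known name or custom term."""
    allowed = {w.lower() for w in proper_nouns} | {w.lower() for w in custom_dictionary}
    kept = []
    for finding in findings:
        # Only spelling findings are eligible for removal.
        if finding.category == "spelling" and finding.word.lower() in allowed:
            continue  # treated as a false positive
        kept.append(finding)  # grammar findings and unmatched spelling findings
    return kept
```

For example, a spelling finding on "Toronto" would be dropped when "Toronto" is in the proper noun set, while a grammar finding on the same text would be kept.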
- User submits a URL — Typo fetches sitemap.xml to discover pages, falling back to a discovery crawl if no sitemap exists.
- User selects pages from the folder tree or scans all discovered pages.
- A Celery worker picks up the job, launches Playwright, and crawls pages concurrently with a shared browser instance.
- Each page's text is extracted, filtered, and sent to LanguageTool for spelling and grammar analysis.
- Spelling results pass through proper noun and dictionary filters to remove false positives. Grammar findings skip these spelling-specific filters.
- Cross-page deduplication groups identical errors from shared content (headers, footers).
- If a previous scan exists for the same domain, findings are diffed and tagged as new, fixed, or recurring.
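The final diff step can be pictured as set arithmetic over finding keys. This is a minimal sketch that assumes each finding can be reduced to a hashable key, here a hypothetical `(url, word)` tuple. How Typo actually identifies "the same" finding across scans is not stated above, so the key shape and the `diff_findings` name are assumptions.

```python
def diff_findings(previous, current):
    """Tag finding keys as new, fixed, or recurring between two scans."""
    previous, current = set(previous), set(current)
    return {
        "new": sorted(current - previous),        # only in the current scan
        "fixed": sorted(previous - current),      # only in the previous scan
        "recurring": sorted(current & previous),  # present in both scans
    }
```

A finding that appears only in the earlier scan lands under `"fixed"`, one that appears only in the later scan under `"new"`, and one present in both under `"recurring"`.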
Schedules run scans automatically on a weekly or monthly cadence. Each scheduled run rediscovers pages via the sitemap, executes the full spellcheck pipeline, and diffs the results against the previous scan.
- Frequency — weekly (pick a day of the week) or monthly (pick a day of the month, capped at the 28th).
- Time — configured in Eastern time (America/New_York), converted to UTC for storage and Celery Beat scheduling.
- Celery Beat — a periodic task checks every minute for schedules whose next run time has passed, then dispatches a scan job.
- Auto-pause — after 3 consecutive failures, the schedule is automatically paused to prevent repeated broken runs. Users can resume manually.
- Run Now — any schedule can be triggered on-demand without affecting the next scheduled run time.
- Scan diffs — each scheduled scan links to the previous scan for the same domain, so findings are tagged as new, fixed, or recurring.
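To make the timezone handling concrete, here is one way the next run time could be computed with the standard library's `zoneinfo`. It is a sketch under stated assumptions: the function names, the Monday=0 weekday convention, and enforcing the 28th cap with `min()` are illustrative choices, not Typo's actual implementation.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def next_weekly_run(now_utc, weekday, at):
    """Next weekly run as UTC. weekday uses Monday=0 ... Sunday=6."""
    now_local = now_utc.astimezone(EASTERN)
    days_ahead = (weekday - now_local.weekday()) % 7
    run_date = now_local.date() + timedelta(days=days_ahead)
    candidate = datetime.combine(run_date, at, tzinfo=EASTERN)
    if candidate <= now_local:
        # Today's slot has already passed, so move to next week.
        candidate = datetime.combine(run_date + timedelta(days=7), at, tzinfo=EASTERN)
    return candidate.astimezone(timezone.utc)


def next_monthly_run(now_utc, day, at):
    """Next monthly run as UTC, with the day of month capped at the 28th."""
    day = min(day, 28)  # assumption: the cap is enforced here
    now_local = now_utc.astimezone(EASTERN)
    candidate = datetime.combine(now_local.date().replace(day=day), at, tzinfo=EASTERN)
    if candidate <= now_local:
        # This month's slot has passed, so roll over to the next month.
        year, month = now_local.year, now_local.month
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        candidate = candidate.replace(year=year, month=month)
    return candidate.astimezone(timezone.utc)
```

Computing the candidate in Eastern time and converting to UTC only at the end means the stored UTC value shifts by an hour across daylight saving changes while the local wall-clock time stays fixed. Capping the day at 28 keeps the date valid in every month, including February.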
- Spelling + grammar — LanguageTool checks for spelling errors, grammar issues, confused words, casing, compounding, and punctuation. Style, redundancy, and other noisy categories remain disabled.
- JS rendering always — every page is rendered with Playwright. No lightweight HTTP fallback, ensuring accuracy on modern JS-heavy sites.
- Proper noun intelligence — spaCy NER + a geographic database (US & Canada, ~14K place names) + a brand dictionary prevent false positives on names and locations.
- Cross-page deduplication — identical errors in repeated content are reported once with a “found on N pages” count.
- Scan diffs — each scan links to the previous scan for the same domain, enabling new/fixed/recurring tagging.
- Scan retention — configurable retention period (default 90 days) with automated cleanup.
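The cross-page deduplication described above could be sketched as a grouping step. This example assumes findings arrive as `(page_url, word, context)` tuples and that two findings are "identical" when the word and its surrounding context match. Both the tuple shape and that matching rule are assumptions made for illustration.

```python
from collections import defaultdict


def dedupe_findings(findings):
    """Collapse identical errors into one entry with a page count."""
    pages_by_error = defaultdict(set)
    for page_url, word, context in findings:
        # A set ensures each page is counted once per error.
        pages_by_error[(word, context)].add(page_url)
    return [
        {"word": word, "context": context, "page_count": len(pages)}
        for (word, context), pages in pages_by_error.items()
    ]
```

A typo in a shared footer that appears on three pages would then produce a single entry with `page_count` equal to 3, which maps onto the "found on N pages" count.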
The Stats page compares Typo's hosting cost against what the same usage would cost on Spling's Enterprise plan. The comparison baseline is:
- Spling — $120/mo base for 8,000 pages, plus $0.01 per page over 8,000
- Typo — $24/mo flat (self-hosted infrastructure)
Monthly savings
Each month's savings is what Spling would have charged (base plan plus any overage) minus Typo's flat cost. For example, in a month where Typo scans 10,000 pages:
- Spling base cost: $120.00
- Spling overage: (10,000 − 8,000) × $0.01 = $20.00
- Total Spling cost: $140.00
- Typo cost: $24.00
- Savings that month: $116.00
Total (lifetime) savings
The lifetime figure is the sum of every month's Spling cost (base + overage) minus the total Typo cost across all active months. Months that exceed 8,000 pages contribute more to the total because of the per-page overage that Spling would have charged.
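The arithmetic can be written out as a short sketch. The prices come from the comparison baseline given earlier; the function names are hypothetical, and the code assumes `pages_per_month` lists only active months and that a month under 8,000 pages still incurs the full Spling base price with no overage.

```python
from decimal import Decimal

SPLING_BASE = Decimal("120.00")            # monthly base price
SPLING_INCLUDED_PAGES = 8_000              # pages covered by the base price
SPLING_OVERAGE_PER_PAGE = Decimal("0.01")  # price per page beyond the included pages
TYPO_MONTHLY = Decimal("24.00")            # flat self-hosting cost


def spling_cost(pages):
    """What Spling would have charged for one month at this page count."""
    overage_pages = max(0, pages - SPLING_INCLUDED_PAGES)
    return SPLING_BASE + overage_pages * SPLING_OVERAGE_PER_PAGE


def monthly_savings(pages):
    """Spling cost (base + overage) minus Typo's flat cost for one month."""
    return spling_cost(pages) - TYPO_MONTHLY


def lifetime_savings(pages_per_month):
    """Sum of monthly savings across all active months."""
    return sum((monthly_savings(p) for p in pages_per_month), Decimal("0"))
```

`Decimal` is used instead of floats so that currency amounts such as $0.01 are represented exactly.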