Florula | Chris Milligan

An open, openly-cited atlas of 31,673 plants — native status, invasive and wetland verdicts, pollinator value, and bloom calendars, every fact traced to a public source. Fast, free, and forkable.

What it is

Florula is an open-source, openly-cited plant atlas covering the full US flora: 31,673 species, each on one page. The name means the plant life of a small local area, and the locality is the point: native here, invasive here, grows near you. Type any plant and get a single page that brings together facts scattered across a dozen sites (native status, invasive and wetland verdicts, cold-hardiness, bloom calendars, pollinator and monarch value, growing conditions), plus the honest warnings plant tags omit: invasive here? hard to remove? feeds specialist bees? There’s live typeahead search over all 31,673 taxa, and browsing with filters including “hide invasive (plant freely).” Every rendered fact traces to an openly-licensed source. It’s free, never gated, and forkable.

Why I built it

This data exists everywhere and is aggregated nowhere usable — and the parts that matter most are the cross-cuts a nursery will never print. “Native to my area and good for pollinators and not invasive here” is a question every gardener has, and the answer lives in no single place. I wanted the reliable, free, one-stop version, scoped to your locality rather than a generic national average, and I wanted it to stay free and open enough that anyone could fork it.

How it works

A build-time ETL pipeline normalizes six open datasets into one taxonomy, which is pre-rendered to static HTML and served from object storage behind a thin Worker.

USDA PLANTS · GBIF · US-RIIS · NWPL · iNaturalist · Wikipedia
        └→ build-time ETL (Node, zero-dependency) → one taxonomy
              └→ ~31,700 static HTML pages → Cloudflare R2 → Worker → you
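The merge step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the actual Florula pipeline: the type names, the `buildTaxonomy` function, the "first source wins" precedence rule, and the sample records are all assumptions made for the example. What it tries to show is that each stored value keeps a pointer to the source it came from, so every rendered fact stays traceable.

```typescript
// Hypothetical shapes for this sketch; none of these names come from the Florula codebase.
type SourceId = "usda-plants" | "gbif" | "us-riis" | "nwpl" | "inaturalist" | "wikipedia";

interface SourceRecord {
  source: SourceId;
  acceptedName: string; // assumed to be already resolved to the accepted name
  fields: Record<string, string>;
}

interface CitedValue {
  value: string;
  source: SourceId; // provenance travels with the value
}

type TaxonPage = Map<string, CitedValue>;

// Merge per-source records into one page per accepted name.
// The first source to supply a field wins, so the order of `records`
// acts as the precedence order (an assumption made for this sketch).
function buildTaxonomy(records: SourceRecord[]): Map<string, TaxonPage> {
  const taxonomy = new Map<string, TaxonPage>();
  for (const record of records) {
    const page: TaxonPage = taxonomy.get(record.acceptedName) ?? new Map<string, CitedValue>();
    for (const [field, value] of Object.entries(record.fields)) {
      if (value.trim() === "") continue; // empty values are never recorded
      if (!page.has(field)) {
        page.set(field, { value, source: record.source });
      }
    }
    taxonomy.set(record.acceptedName, page);
  }
  return taxonomy;
}

// Illustrative sample records, not real pipeline output.
const taxonomy = buildTaxonomy([
  {
    source: "usda-plants",
    acceptedName: "Asclepias tuberosa",
    fields: { duration: "Perennial", growthHabit: "Forb/herb" },
  },
  {
    source: "wikipedia",
    acceptedName: "Asclepias tuberosa",
    fields: { duration: "perennial plant", description: "A species of milkweed." },
  },
]);
```

In this sketch the Wikipedia record cannot overwrite `duration`, because the USDA record was listed first, but it does contribute `description`, which no earlier source supplied.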

The front end is Astro 6 (static site generation) with Tailwind v4, vanilla-JS islands (no UI framework), and a dark “charcoal botanical” theme. Every page meets a quality floor of accepted and common name (99.85% coverage), family, growth habit, duration, native status, and US state distribution, with cold-hardiness zone, bloom, height, sun, soil, and toxicity added wherever USDA actually has them. CC photos come from iNaturalist via GBIF (credited per image) and descriptions come from Wikipedia (CC BY-SA); both are backfilled continuously.
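A quality floor like the one described above could be checked at build time along these lines. This is only a sketch under assumptions: the field names, `missingFloorFields`, and `coveragePercent` are hypothetical, and the sample pages are placeholders rather than real atlas data.

```typescript
// Hypothetical field names for the quality floor described in the text.
const FLOOR_FIELDS = [
  "acceptedName",
  "commonName",
  "family",
  "growthHabit",
  "duration",
  "nativeStatus",
  "states",
] as const;

type FloorField = (typeof FLOOR_FIELDS)[number];
type PlantPage = Partial<Record<FloorField, string | string[]>>;

// A value counts as present only if it is a non-blank string or a non-empty list.
function isPresent(value: string | string[] | undefined): boolean {
  if (value === undefined) return false;
  return Array.isArray(value) ? value.length > 0 : value.trim() !== "";
}

// List the floor fields a page is still missing.
function missingFloorFields(page: PlantPage): FloorField[] {
  return FLOOR_FIELDS.filter((field) => !isPresent(page[field]));
}

// Percentage of pages that have a given floor field.
function coveragePercent(pages: PlantPage[], field: FloorField): number {
  if (pages.length === 0) return 0;
  const have = pages.filter((page) => isPresent(page[field])).length;
  return (have / pages.length) * 100;
}

// Placeholder pages: three complete, one without a common name.
const complete: PlantPage = {
  acceptedName: "Genus species",
  commonName: "example plant",
  family: "Examplaceae",
  growthHabit: "Forb/herb",
  duration: "Perennial",
  nativeStatus: "Native",
  states: ["ME", "VT"],
};
const incomplete: PlantPage = { ...complete, commonName: "" };
const pages: PlantPage[] = [complete, complete, complete, incomplete];

console.log(missingFloorFields(incomplete)); // ["commonName"]
console.log(coveragePercent(pages, "commonName")); // 75
```

A per-field coverage figure such as the 99.85% quoted for names would fall out of a function like `coveragePercent` run over the whole catalog.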

Pre-rendering all ~31,700 pages to static HTML on Cloudflare R2 (unlimited objects, free egress) behind a thin Worker means a GitHub Action ships the whole atlas for about $0 — R2 free tier plus the Workers free plan — with incremental sync and a scheduled data refresh.
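One common way to implement an incremental sync is to compare a manifest of content hashes from the previous deploy with the current build and upload only what changed. The sketch below assumes that approach; the source does not say how Florula's sync actually works. `fnv1a`, `buildManifest`, and `planSync` are hypothetical names, and the FNV-1a hash is a dependency-free stand-in for whatever digest a real pipeline would use.

```typescript
// 32-bit FNV-1a over UTF-16 code units: a simple stand-in for a real content digest.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Map each output path to a hash of its rendered HTML.
function buildManifest(pages: Map<string, string>): Map<string, string> {
  const manifest = new Map<string, string>();
  pages.forEach((html, path) => manifest.set(path, fnv1a(html)));
  return manifest;
}

interface SyncPlan {
  upload: string[]; // new or changed paths
  remove: string[]; // paths that no longer exist in the build
}

// Diff two manifests into the minimal set of storage operations.
function planSync(previous: Map<string, string>, current: Map<string, string>): SyncPlan {
  const upload: string[] = [];
  const remove: string[] = [];
  current.forEach((hash, path) => {
    if (previous.get(path) !== hash) upload.push(path);
  });
  previous.forEach((_hash, path) => {
    if (!current.has(path)) remove.push(path);
  });
  return { upload: upload.sort(), remove: remove.sort() };
}

// Placeholder builds: a.html unchanged, b.html edited, c.html dropped, d.html added.
const previous = buildManifest(new Map([
  ["plants/a.html", "<h1>A</h1>"],
  ["plants/b.html", "<h1>B</h1>"],
  ["plants/c.html", "<h1>C</h1>"],
]));
const current = buildManifest(new Map([
  ["plants/a.html", "<h1>A</h1>"],
  ["plants/b.html", "<h1>B, updated</h1>"],
  ["plants/d.html", "<h1>D</h1>"],
]));

const plan = planSync(previous, current);
console.log(plan.upload); // ["plants/b.html", "plants/d.html"]
console.log(plan.remove); // ["plants/c.html"]
```

With a plan like this, a scheduled data refresh that touches a few hundred plants would re-upload only those pages rather than all ~31,700.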

What I learned

This is the evolution of an earlier build that resolved each plant on demand and cached it in a database. The bet that changed everything was the opposite: pre-render all of it. With ~31,700 pages that’s just static files on R2 — no live API to fall over, cost scales with the catalog rather than traffic, and the whole site is as fast and cheap as a flat folder.

The genuinely hard part was never fetching data. It was name resolution: one plant carries a dozen synonyms and spellings, and merging six sources into a single coherent taxonomy is most of the work. Two disciplines kept it honest. The first is a principle I call honest emptiness: a field with no open-licensed value is omitted, never guessed. The second is treating licensing as a first-class constraint: data is CC BY-SA 4.0, code is MIT, photos are credited per image. “Open and openly-cited” only means something if every field can point at where it came from.
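To make the two ideas concrete, here is a minimal sketch of synonym-based name resolution and of honest emptiness. It is an illustration under assumptions, not the real resolver: `normalizeName`, `buildSynonymIndex`, `resolveName`, and `omitEmpty` are hypothetical names, the normalization rules are deliberately simple, and the field values are placeholders. The one real-world fact used is that Aster novae-angliae is a synonym of Symphyotrichum novae-angliae.

```typescript
// Normalize a scientific name for matching: Unicode-normalize, trim,
// collapse runs of whitespace, and lowercase. A real resolver would need
// more rules (authorship strings, hybrid markers, spelling variants).
function normalizeName(raw: string): string {
  return raw.normalize("NFKC").trim().replace(/\s+/g, " ").toLowerCase();
}

interface NameEntry {
  accepted: string;
  synonyms: string[];
}

// Build a lookup from every normalized spelling to the accepted name.
function buildSynonymIndex(entries: NameEntry[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const entry of entries) {
    index.set(normalizeName(entry.accepted), entry.accepted);
    for (const synonym of entry.synonyms) {
      index.set(normalizeName(synonym), entry.accepted);
    }
  }
  return index;
}

// Resolve a raw name to its accepted name, or undefined when unknown.
// Unknown names are reported as unknown rather than guessed.
function resolveName(index: Map<string, string>, raw: string): string | undefined {
  return index.get(normalizeName(raw));
}

// Honest emptiness: drop fields with no value instead of filling them in.
function omitEmpty(fields: Record<string, string | undefined>): Record<string, string> {
  const kept: Record<string, string> = {};
  for (const [key, value] of Object.entries(fields)) {
    if (value !== undefined && value.trim() !== "") kept[key] = value;
  }
  return kept;
}

const synonymIndex = buildSynonymIndex([
  { accepted: "Symphyotrichum novae-angliae", synonyms: ["Aster novae-angliae"] },
]);

console.log(resolveName(synonymIndex, "  ASTER   novae-angliae ")); // "Symphyotrichum novae-angliae"
console.log(resolveName(synonymIndex, "Unknownus plantus")); // undefined
console.log(omitEmpty({ bloom: "", sun: "placeholder value", toxicity: undefined })); // { sun: "placeholder value" }
```

Returning `undefined` for an unmatched name and omitting blank fields are the same design choice applied at two levels: the page shows less, but nothing on it is invented.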

What’s next

Keep backfilling the CC photos and Wikipedia descriptions, let the scheduled refresh keep the data current, and take it to its public launch on the reserved florula.org domain.