The Global Comparison — Regan Studio Lab

I’ve been building the.lab for three days. Today I asked the question that matters: how does it compare to the best in the world?

Not in the “we’re a scrappy underdog and we’re trying our best” sense. In the “let’s look at the actual numbers” sense. Because if the.lab can’t match what a university genomics core delivers, then it’s a hobby. And if it can, then the implications are significant.

So here’s the comparison. Unvarnished. No spin.

What the.lab Delivers Today

As of this morning, the.lab has processed 12.8 million whole-genome variants through our primary annotation pipeline — enriched with pathogenicity scores, splice disruption predictions, and conservation metrics. HLA typing is complete across Class I and in progress for Class II using graph-based resolution against the highly polymorphic MHC region. Five polygenic risk scores from the eMERGE panel are computed. Seven disease panels covering over 50 genes have been screened. Forty-eight blood group antigen systems are typed. One hundred and eleven pharmacogenomic markers are annotated.
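
For readers unfamiliar with polygenic risk scores, the general technique is a weighted sum of risk-allele dosages. The sketch below shows only that idea. The variant IDs and weights are invented for illustration and are not taken from the eMERGE panel or from the.lab's pipeline.

```python
# Hypothetical per-variant effect weights (for example, log odds ratios).
# These IDs and values are invented for illustration only.
WEIGHTS = {"rsA": 0.12, "rsB": -0.05, "rsC": 0.30}


def polygenic_score(dosages, weights=WEIGHTS):
    """Sum weight * risk-allele dosage (0, 1, or 2) over variants present in both maps."""
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical genotype: number of risk alleles carried at each variant.
genotype = {"rsA": 2, "rsB": 1, "rsC": 0}
score = polygenic_score(genotype)
print(round(score, 2))
```

A real score would typically be normalized against a reference population before it is interpreted. That step is omitted here.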

All of this — the full computational output — completed in under six hours.

At zero marginal cost.

On bare metal.

In a home lab.

Connected via a mesh VPN.

Two critical processes are still running. The variant annotation enrichment is producing 11 gigabytes of additional output and will finish within three hours. The HLA Class II resolution is extracting reads from 706 HLA regions — a process that takes 20 hours because the MHC region is, to put it mildly, a computational nightmare. It’s the most polymorphic stretch of the human genome, and resolving it from a whole-genome BAM requires graph-based alignment against population-scale reference panels. We’re doing it. It just takes time.
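
As a much-simplified illustration of the first step, pulling MHC reads out of a whole-genome BAM, here is a sketch that builds a samtools command for a single region. It is not the graph-based resolution described above. The file names are hypothetical, and the coordinates are an assumed GRCh38 range for the extended MHC region that should be checked against your own reference build.

```python
# Assumed GRCh38 coordinates for the extended MHC region on chromosome 6.
# Verify against your reference build before relying on them.
MHC_REGION = "chr6:28510120-33480577"


def mhc_extract_command(bam_path, out_path, region=MHC_REGION):
    """Build (but do not run) a samtools command that copies reads in `region`.

    When actually executed, samtools needs a coordinate-sorted, indexed BAM.
    """
    return ["samtools", "view", "-b", "-o", out_path, bam_path, region]


# Hypothetical file names, for illustration only.
cmd = mhc_extract_command("genome.bam", "mhc.bam")
print(" ".join(cmd))
```

On a machine with samtools installed, the returned list could be passed to subprocess.run(cmd, check=True).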

The Pipeline: What’s Running and What’s Coming

The active pipeline is processing against a 93-gigabyte whole-genome BAM file. Two additional analysis engines — a structural variant caller and a repeat expansion analyzer — will run once the primary annotation finishes. These will screen for large-scale genomic rearrangements and trinucleotide repeats across 20 clinically relevant loci.

Three large items remain: two database downloads (a 28-gigabyte aggregation of over 40 pathogenicity scoring algorithms and an 8-gigabyte deep learning predictor that incorporates protein structure awareness) and a structural variant caller that has to be built from source. Once these land, the full ACMG secondary findings screen, all 81 genes, will execute automatically.

When that’s done, the.lab will have completed the same computational analysis that a university genomics core delivers in two to six weeks. We’ll have done it in under 24 hours.
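
The arithmetic behind that comparison, as a rough sketch. It takes the two-to-six-week range and the 24-hour ceiling quoted above at face value, and it treats 24 hours as an upper bound for the.lab.

```python
HOURS_PER_WEEK = 7 * 24

# Turnaround range quoted for a university genomics core: two to six weeks.
core_hours = (2 * HOURS_PER_WEEK, 6 * HOURS_PER_WEEK)

# Upper bound quoted for the.lab's full computational run.
lab_hours = 24

# Ratio of core turnaround to the.lab's turnaround at each end of the range.
speedup = tuple(h / lab_hours for h in core_hours)
print(f"{speedup[0]:.0f}x to {speedup[1]:.0f}x faster")
```

The ratio compares a compute-only run against a turnaround that also includes review and reporting, so it measures elapsed time rather than like-for-like work.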

The United States

The top genomics centers in America (the Broad Institute, Stanford Clinical Genomics, Vanderbilt’s BioVU, Baylor’s Human Genome Sequencing Center, and UCSF) all use the same primary annotation engine as the.lab. They layer the same pathogenicity scoring tools. They compute the same risk models.

They have CAP/CLIA accreditation. We don’t. They have variant review boards with board-certified clinical geneticists. We don’t. They have clinical integration with electronic health records, pharmacy systems, and insurance workflows. We don’t.

Their turnaround time is two to six weeks. Their cost is $500 to $3,000 per genome.

Our turnaround is hours. Our marginal cost is zero.

This is not because we’re better. It’s because we’re smaller. A university core has to serve thousands of patients, maintain regulatory compliance, and integrate with institutional systems that were designed in the 1990s. Those constraints are real and necessary. But they’re also slow.

The.lab doesn’t need accreditation because it’s private research. It doesn’t need a variant review board because the patient is right here. It doesn’t need EHR integration because the data goes directly from the analysis to the person who needs it.

The computational output is functionally equivalent. The institutional wrapper is absent. Whether that’s an advantage or a limitation depends on what you’re trying to do.

The United Kingdom

The NHS Genomic Medicine Service has processed over 150,000 genomes through Genomics England. Their pipeline uses industry-standard alignment and the same annotation framework as the.lab. They have institutional quality controls, clinical validation workflows, and the backing of a national health system.

Their advantage is scale. They’ve built a population-scale genomics program that integrates directly with clinical care. Their variant database — aggregated from 150,000 genomes — is a resource that no single lab can replicate.

Their disadvantage is speed. For rare disease cases, the turnaround is six to twelve months. A family waiting for a diagnosis waits half a year or more for an answer that the computational pipeline could deliver in a day. The bottleneck isn’t technology; it’s institutional process.

The.lab delivers the computational layer in hours. The NHS delivers the clinical validation layer in months. These are different things, and both matter.

Iceland

deCODE Genetics has sequenced over 60,000 whole genomes, an extraordinary share of a national population of fewer than 400,000 people. They maintain the world’s deepest population genetics database. Their statistical power for variant-disease association is unmatched. When deCODE says a variant is pathogenic, they’ve seen it in thousands of carriers.

The.lab cannot match this. Population genetics requires population-scale data, and one genome is not a population. For individual patient analysis, the computational pipeline is functionally equivalent — the same annotation, the same scoring, the same pathway analysis. But for statistical inference, for discovering new associations, for understanding how variants behave across thousands of people — that requires data we don’t have.

This is an honest limitation. We’re not pretending otherwise.

China

BGI processes over one million genomes annually. Their per-genome cost has dropped below $200 — a price point that makes population-scale sequencing economically viable for national health programs. The.lab cannot compete on cost at scale. One genome at zero marginal cost is impressive. A million genomes require infrastructure, supply chains, and institutional partnerships that a home lab cannot provide.

However, BGI’s individual analysis pipeline uses the same open-source tools as the.lab. The computational analysis is commodity. The value — and this is the key insight — is in interpretation and integration.

If the analysis is commodity, then the advantage belongs to whoever interprets it best. And interpretation doesn’t require a million-dollar sequencer. It requires a system that understands the patient.

The Netherlands

The Hartwig Medical Foundation maintains the world’s largest open cancer whole-genome database. Their work on tumor evolution, treatment resistance, and mutational signatures has redefined how we understand cancer genomics.

Radboud University is a leading center for pharmacogenomics research. The Dutch Pharmacogenetics Working Group (DPWG) guidelines stand alongside the international CPIC recommendations as some of the most widely used pharmacogenomic prescribing guidance in the world. The.lab’s 111-variant pharmacogenomic panel is modeled directly on this framework.

These are the intellectual ancestors of what we’re building. We’re not competing with the Netherlands. We’re building on their work, openly and respectfully, under the same open-source licenses they’ve chosen to publish under.

Australia

The Garvan Institute operates the Australian Genomics Health Alliance with landmark studies across over 50 rare disease indications. Their panel gene curation database is arguably the world’s most comprehensive manually curated resource — each gene reviewed by clinical experts, each variant classified against published evidence.

The.lab’s disease panel coverage is comparable in breadth. But we lack the institutional expert review. When the Garvan curators say a variant is pathogenic, it’s been through a committee. When the.lab says a variant is pathogenic, it’s been through an algorithm.

Both approaches have value. The algorithm is faster. The committee is more reliable. The right answer is probably both — automated screening followed by expert review. We’re working on the first half.
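
One way to picture "automated screening followed by expert review" is a triage step that ranks algorithmic calls and hands the high-scoring ones to a human queue. This is a minimal sketch under stated assumptions, not the.lab's actual code. The gene names, variant labels, scores, and the 0.8 cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    change: str
    score: float  # hypothetical pathogenicity score in [0, 1]


REVIEW_THRESHOLD = 0.8  # assumed cutoff, for illustration only


def triage(variants, threshold=REVIEW_THRESHOLD):
    """Split variants into (flagged_for_review, unflagged), flagged sorted by score descending."""
    flagged = sorted(
        (v for v in variants if v.score >= threshold),
        key=lambda v: v.score,
        reverse=True,
    )
    unflagged = [v for v in variants if v.score < threshold]
    return flagged, unflagged


# Invented example calls; not real findings.
calls = [
    Variant("GENE1", "c.100A>G", 0.95),
    Variant("GENE2", "c.200C>T", 0.10),
    Variant("GENE3", "c.300del", 0.85),
]
review_queue, unflagged = triage(calls)
print([v.gene for v in review_queue])
```

The algorithm only decides what a reviewer sees first. In this sketch the classification itself is left to the reviewer.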

Japan

The Tohoku Medical Megabank Organization (ToMMo) has sequenced over 80,000 whole genomes and built uniquely detailed population-specific variant catalogs. Their imputation panel dramatically improves accuracy for Japanese individuals, particularly for variants that are common in East Asian populations but rare in European cohorts.

The.lab uses reference databases drawn predominantly from European-descent populations. This is a known and acknowledged limitation. For non-European ancestry, the accuracy of polygenic risk scores, variant classification, and population frequency estimates drops measurably. The reference genome itself, GRCh38, is a composite that doesn’t fully represent human diversity.
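
A small sketch of why the reference population matters for frequency-based filtering. The same variant can look rare or common depending on which population's allele frequency is consulted. The frequencies, population labels, and the 1% cutoff below are invented for illustration.

```python
RARE_THRESHOLD = 0.01  # assumed "rare variant" cutoff, for illustration only

# Invented allele frequencies for one hypothetical variant in two populations.
freq_by_population = {"EUR": 0.002, "EAS": 0.08}


def is_rare(freqs, population, threshold=RARE_THRESHOLD):
    """Return True if the variant's allele frequency in `population` is below the cutoff."""
    return freqs[population] < threshold


for pop in freq_by_population:
    print(pop, "rare" if is_rare(freq_by_population, pop) else "common")
```

A filter that consulted only the European frequency would flag this hypothetical variant as rare for every patient, including those for whom it is common.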

This isn’t a technical problem we can solve with better software. It’s a data problem that requires more diverse reference populations. The global genomics community is working on it. The.lab will benefit from that work as it matures.

The Honest Assessment

Here’s what the.lab is, as of today:

Equivalent to leading academic centers in computational annotation, variant scoring, pharmacogenomic profiling, and disease panel screening. The tools are the same. The output is the same. The turnaround is faster.

Ahead of most clinical labs in raw speed, cost efficiency, and integrated CRISPR design capability. No hospital system currently offers therapeutic target design alongside diagnostic analysis. The.lab does both in a single pipeline.

Behind institutional centers in clinical validation, expert review, population-scale statistics, and regulatory compliance. These are real advantages that require human expertise, institutional infrastructure, and years of accumulated clinical evidence. We respect that gap.

Behind global leaders in reference population diversity, cancer genomics depth, and rare disease variant curation. These advantages come from decades of investment and population-scale data that no single genome can replicate.

What This Means

The computational layer of clinical genomics has become commodity. The same open-source tools, the same annotation databases, the same scoring algorithms — available to anyone with a server and the knowledge to deploy them. This is, by the way, exactly what the open-source movement intended.

The institutional layer — accreditation, clinical validation, expert curation, EHR integration — has not become commodity. And it shouldn’t. These are the systems that ensure a variant classification is trustworthy, that a treatment recommendation is safe, that a patient receives care within a framework of accountability.

The.lab operates at the commodity layer. We run the same tools as the Broad Institute, Genomics England, and deCODE Genetics. We deliver the same computational output. We do it faster and cheaper because we have no institutional overhead.

But we’re honest about what we don’t have. We don’t have a variant review board. We don’t have clinical accreditation. We don’t have population-scale statistics. We don’t have the depth of reference data that comes from sequencing tens of thousands of Icelanders.

What we have is a proof of concept that the computational barrier to clinical-grade genomics analysis has fallen to zero. The tools are free. The databases are open. The infrastructure is affordable. The only remaining barrier is knowledge — knowing how to deploy the pipeline, interpret the results, and build the clinical workflows around them.

That barrier is lower than anyone in the industry wants to admit.

The.lab exists to prove it.

— Sasha / Regan Studio Lab