
PIPELINE · SOIL|WATER|SPECIMEN TO CLAIM

Six stages. One verifiable chain of custody.

From a raw FASTA file or DwC-A (Darwin Core Archive) through to a published, data-tethered scientific artifact — every step recorded, every claim re-derivable. The same chain runs under the BioKEA molecular sequencing service, so customer samples yield the same verifiable outputs as our own.

[Figure: field samples flow through Ingest, Analyze, Draft, Review, Broadcast, and Amplify to become verifiable scientific claims.]
Fig · soil | water | specimen → verifiable claim
  1. Ingest · Universal Envelope

    Every input — raw FASTA, DwC-A archive, drafted manuscript — becomes a cryptographically trackable object. Automatic file-type detection and metadata extraction.

  2. Analyze · Large Data Collider

    The LDC runs image QC, taxonomy reconciliation, and FAIR validation over millions of reads in minutes. Outputs operational taxonomic units and candidate novel lineages.
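The page does not describe the LDC's actual algorithms, but OTU picking is commonly done by greedy identity-threshold clustering. A toy sketch of that general technique, with a deliberately naive identity metric (real pipelines align sequences first):

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length reads (toy metric)."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster_otus(reads: list[str], threshold: float = 0.97) -> list[list[str]]:
    """Greedy clustering: a read joins the first OTU whose seed it matches at
    >= threshold identity; otherwise it seeds a new candidate OTU."""
    otus: list[list[str]] = []
    for read in reads:
        for otu in otus:
            if identity(read, otu[0]) >= threshold:
                otu.append(read)
                break
        else:
            otus.append([read])
    return otus
```

Reads that fall below the threshold against every existing seed become new clusters, which is one way "candidate novel lineages" can surface.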

  3. Draft · AI-assisted narrative

The scientist directs; the AI drafts structure and links LDC data directly into the text, cross-referencing external hypotheses in real time.

  4. Review · Multi-agent panel

    AI pre-screens manuscript structure and methodology in hours. Verified human experts evaluate contextual scientific nuance. Weighted, transparent scoring.
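The actual weighting scheme is not published; as a sketch, "weighted, transparent scoring" could be as simple as a weighted mean where each review carries an explicit, inspectable weight:

```python
def panel_score(reviews: list[tuple[float, float]]) -> float:
    """Weighted mean of panel scores. Each review is (score, weight); weights
    might reflect reviewer expertise or the AI-screen vs. human-expert split.
    Keeping the weights explicit is what makes the score transparent."""
    total = sum(w for _, w in reviews)
    if total == 0:
        raise ValueError("at least one non-zero weight required")
    return sum(s * w for s, w in reviews) / total
```

Anyone reading the review record can recompute the score from the listed weights, in the same spirit as the re-derivable claims elsewhere on this page.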

  5. Broadcast · Interactive StoryMap

    The end product is not a dead PDF. It is an explorable digital artifact permanently tethered to its underlying FAIR data package (GBIF, NCBI SRA, Zenodo).
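A hypothetical shape for that tether, with placeholder identifiers: the artifact record carries resolvable links to every repository its claims depend on, and a simple check refuses to treat it as tethered if any link is missing. Field names here are illustrative, not BioKEA's schema.

```python
# Placeholder identifiers; a real artifact would carry resolvable ones.
artifact = {
    "title": "Candidate novel lineages from creek water samples",
    "data_package": {
        "gbif_dataset": "https://www.gbif.org/dataset/<uuid>",
        "ncbi_sra": "SRP000000",
        "zenodo_doi": "10.5281/zenodo.0000000",
    },
}

def tethered(a: dict) -> bool:
    """An artifact counts as data-tethered only if every repository link is present."""
    pkg = a.get("data_package", {})
    return all(pkg.get(k) for k in ("gbif_dataset", "ncbi_sra", "zenodo_doi"))
```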

  6. Amplify · ATProto / Bluesky

    Publishing is the starting line. Seamless AT Protocol integration pushes verifiable scientific artifacts into decentralized social graphs.
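As a sketch of what gets pushed, here is a minimal `app.bsky.feed.post` record with an external-link embed (a standard Bluesky lexicon). The helper name and text are illustrative; actual publishing would go through an authenticated AT Protocol client, which is not shown.

```python
from datetime import datetime, timezone

def artifact_post(title: str, url: str) -> dict:
    """Build a minimal Bluesky post record announcing a published artifact.
    Sketch only: creating the record on a PDS requires an authenticated client."""
    return {
        "$type": "app.bsky.feed.post",
        "text": f"New data-tethered artifact: {title}",
        "createdAt": datetime.now(timezone.utc).isoformat(),
        "embed": {
            "$type": "app.bsky.embed.external",
            "external": {
                "uri": url,
                "title": title,
                "description": "Verifiable scientific artifact",
            },
        },
    }
```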

BEING BUILT

BioinfoOS

The software layer running on the BioKEA Large Data Collider (LDC), built from in-house AI-assisted modules.

Modules ship incrementally; BioinfoOS is in active development and runs on the same LDC hardware used by the molecular sequencing service.

PUBLISHED AT

Agentis

Pipeline outputs publish to Agentis, our AI-first open-access platform on the AT Protocol, currently in early development.

agentis.science →

TRUST

Cryptographic provenance, end to end.

Every artifact carries an AT Protocol Decentralized Identifier (DID). Every peer review is a signed, verifiable record. The pipeline doesn't just produce findings — it produces evidence that's re-derivable from raw input to published claim, by anyone, at any time.
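The real system rests on AT Protocol DIDs and signed records; the re-derivability property itself can be sketched with nothing more than a hash chain, where each stage's hash is bound to the previous one. Function names and the step payloads are illustrative.

```python
import hashlib
import json

def chain_step(prev_hash: str, payload: dict) -> str:
    """Hash of one pipeline step, bound to the previous step's hash."""
    blob = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

def derive_chain(steps: list[dict], genesis: str = "") -> list[str]:
    """Re-running the same inputs re-derives the same chain; changing any step
    changes every hash from that point on. That is the audit property."""
    hashes, prev = [], genesis
    for step in steps:
        prev = chain_step(prev, step)
        hashes.append(prev)
    return hashes
```

Anyone holding the raw input and the recorded steps can recompute the chain and compare it to the published one, without trusting the publisher.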

[Figure: a field scientist collecting a water sample beside a creek.]
Fig · where every claim begins — a field sample

Want to plug a sample into this?

We're onboarding sample streams and collaboration partners.

Start a conversation →