De novo Assembly

We provide high-quality de novo genome assembly services using state-of-the-art assemblers matched to each sequencing platform (Illumina, Oxford Nanopore, PacBio). Our pipeline covers quality filtering, assembly, scaffolding, polishing, and genome quality assessment, reconstructing the genome directly from sequencing reads without a reference sequence.

Workflow Summary

---
config:
    theme: 'base'
    themeVariables:
        fontFamily: 'verdana'
        fontSize: '25px'
---
flowchart LR
    subgraph Preprocessing["`**Preprocessing**`"]
        direction TB
        Raw["Raw Reads"]
        QC["Quality Control"]
        Trim["Adapter Trimming"]
        Error["Error Correction"]
        Raw --> QC --> Trim --> Error
    end

    subgraph Assembly["`**Assembly**`"]
        direction TB
        Contig["Contig Assembly"]
        Scaffold["Scaffolding"]
        GapFill["Gap Filling"]
        Polish["Polishing"]
        Contig --> Scaffold --> GapFill --> Polish
    end

    subgraph Evaluation["`**Evaluation & Annotation**`"]
        direction TB
        Assess["Assembly Quality Assessment"]
        BUSCO["Completeness Check (BUSCO)"]
        Annot["Structural/Functional Annotation"]
        Report["Summary & Visualization"]
        Assess --> BUSCO --> Annot --> Report
    end

    Preprocessing --> Assembly --> Evaluation

Preprocessing

  1. Raw Reads: Accept raw reads from short-read (Illumina) or long-read (Oxford Nanopore, PacBio) sequencers.
  2. Quality Control: Remove low-quality reads and reads from contaminant sources.
  3. Adapter Trimming: Trim sequencing adapters and low-quality read ends.
  4. Error Correction: Correct sequencing errors, which matters most for long reads because of their higher raw error rates.
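
The quality filtering and end trimming steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual implementation: the function names, the Phred+33 encoding, and the thresholds (`min_q=20`, `min_len=5`) are all assumptions chosen for the example, and adapter removal is not covered.

```python
from typing import Iterable, Iterator, List, Tuple

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding

Record = Tuple[str, str, str]  # (read name, sequence, quality string)


def parse_fastq(lines: List[str]) -> Iterator[Record]:
    """Yield (name, sequence, quality) from strict 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:].strip(), lines[i + 1].strip(), lines[i + 3].strip()


def trim_low_quality_end(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_q:
        end -= 1
    return seq[:end], qual[:end]


def mean_quality(qual: str) -> float:
    """Mean Phred score of a quality string (0.0 for an empty string)."""
    if not qual:
        return 0.0
    return sum(ord(c) - PHRED_OFFSET for c in qual) / len(qual)


def filter_reads(records: Iterable[Record], min_q: int = 20,
                 min_len: int = 5) -> List[Record]:
    """Trim each read, then keep it only if it is long and accurate enough."""
    kept = []
    for name, seq, qual in records:
        seq, qual = trim_low_quality_end(seq, qual, min_q)
        if len(seq) >= min_len and mean_quality(qual) >= min_q:
            kept.append((name, seq, qual))
    return kept


# Hypothetical two-read input: r1 has a poor 3' tail, r2 is poor throughout.
example = ["@r1", "ACGTACGTAC", "+", "IIIIIIII##",
           "@r2", "ACGTACGTAC", "+", "##########"]
print(filter_reads(parse_fastq(example)))
```

In this toy input the first read loses its two low-quality terminal bases and is kept, while the second is trimmed to nothing and discarded.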

Assembly

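  1. Contig Assembly: Assemble high-quality reads into contigs using an assembler suited to the data, such as SPAdes (Illumina short reads), Flye (Oxford Nanopore or PacBio long reads), or hifiasm (PacBio HiFi reads).
  2. Scaffolding: Order and orient contigs into scaffolds using paired-end or long-read data.
  3. Gap Filling: Close the gaps left between contigs within scaffolds.
  4. Polishing: Improve base-level accuracy with tools like Pilon (typically with short reads) or Racon (typically with long reads).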
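
To make the relationship between scaffolds, contigs, and gaps concrete, the sketch below locates gaps in a scaffold sequence. It assumes the common convention that unresolved gaps are written as runs of `N`; the function names are hypothetical and the example sequence is made up.

```python
import re
from typing import List, Tuple

GAP_PATTERN = re.compile(r"[Nn]+")  # assumption: gaps are encoded as runs of N


def find_gaps(scaffold: str) -> List[Tuple[int, int]]:
    """Return (0-based start, length) for every run of N in a scaffold."""
    return [(m.start(), m.end() - m.start())
            for m in GAP_PATTERN.finditer(scaffold)]


def split_into_contigs(scaffold: str) -> List[str]:
    """Recover the contig sequences by cutting the scaffold at its gaps."""
    return [piece for piece in GAP_PATTERN.split(scaffold) if piece]


scaffold = "ACGTNNNNGGCCNNA"  # toy scaffold: three contigs joined by two gaps
print(find_gaps(scaffold))
print(split_into_contigs(scaffold))
```

Gap filling aims to shrink the list returned by `find_gaps`; counting gaps before and after that step is one simple way to check its effect.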

Evaluation & Annotation

  1. Assembly Quality Assessment: Report contiguity and composition metrics such as N50, L50, and GC content.
  2. Completeness Check (BUSCO): Evaluate completeness using sets of conserved single-copy genes.
  3. Structural/Functional Annotation: Predict genes and annotate features with tools like Prokka (prokaryotic genomes) or MAKER (eukaryotic genomes).
  4. Summary & Visualization: Generate summary statistics and visualizations.
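
The contiguity metrics named above have simple definitions: N50 is the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly size, and L50 is the number of contigs needed to get there. The following is a minimal sketch with hypothetical function names and made-up contig lengths, not the reporting code used in the pipeline.

```python
from typing import Iterable, List, Tuple


def n50_l50(lengths: List[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a list of contig lengths; (0, 0) if empty."""
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 2 >= total:  # reached at least half of the assembly
            return length, count
    return 0, 0


def gc_content(sequences: Iterable[str]) -> float:
    """Fraction of G and C among unambiguous (A, C, G, T) bases."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    return gc / acgt if acgt else 0.0


contig_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy values, total 54
print(n50_l50(contig_lengths))         # (8, 3): 10 + 9 + 8 = 27, half of 54
print(gc_content(["GGCC", "AATT"]))    # 0.5
```

Excluding `N` from the GC denominator is a design choice made here so that gap padding in scaffolds does not dilute the estimate.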

Example

Assembly graphs provide a visual overview of contig connectivity and repeat structure.


BUSCO analysis evaluates assembly completeness using evolutionarily conserved single-copy orthologs.


Genome annotation identifies genes, coding sequences, tRNAs, and other functional elements.


For detailed options and customization, contact us.