HPC:NGSUtils

NGSUtils

NGSUtils is a suite of software tools for working with next-generation sequencing datasets. NGSUtils v0.5.7 is installed across all HPC nodes.

Usage

The entire NGSUtils suite, which includes the ngsutils, bamutils, bedutils, fastqutils and gtfutils commands, can be loaded with a single module:

[asrini@node062 ~]$ module show ngsutils-0.5.7
-------------------------------------------------------------------
/usr/share/Modules/modulefiles/ngsutils-0.5.7:

module-whatis	 NGSUtils is a suite of software tools for working with next-generation sequencing datasets
prepend-path	 PATH /opt/software/ngsutils/0.5.7/bin
-------------------------------------------------------------------

[asrini@node062 ~]$ module load ngsutils-0.5.7

[asrini@node062 ~]$ which ngsutils
/opt/software/ngsutils/0.5.7/bin/ngsutils

[asrini@node062 ~]$ ngsutils
Usage: ngsutils COMMAND

Commands
    update        - Updates NGSUtils from git repository
                    (http://github.com/ngsutils/ngsutils)
    repeat2fasta  - Extract repeatmasker flagged regions to a FASTA file
    strip_fasta   - Remove sequences from a FASTA file based on name
    tag_fasta     - Tag FASTA sequence names with a prefix or suffix
    tabixindex    - Index a tab-delimited file using Tabix and bgzip


Run 'ngsutils help CMD' for more information about a specific command
ngsutils 0.5.7-efb237d

[asrini@node062 ~]$ bamutils
Usage: bamutils COMMAND

Commands
  DNA-seq
    basecall      - Base/variant caller

  RNA-seq
    count         - Calculates counts/FPKM for genes/BED regions/repeats (also CNV)

  General
    best          - Filter out multiple mappings for a read, selecting only the best
    convertregion - Converts region mapping to genomic mapping
    export        - Export reads, mapped positions, and other tags
    expressed     - Finds regions expressed in a BAM file
    extract       - Extracts reads based on regions in a BED file
    filter        - Removes reads from a BAM file based on criteria
    innerdist     - Calculate the inner mate-pair distance from two BAM files
    junctioncount - Counts the number of reads spanning individual junctions.
    keepbest      - Parses BAM file and keeps the best mapping for reads that have multiple mappings
    merge         - Combine multiple BAM files together (taking best-matches)
    pair          - Given two separately mapped paired files, re-pair the files
    peakheight    - Find the size (max height, width) of given peaks (BED) in a BAM file
    renamepair    - Postprocesses a BAM file to rename pairs that have an extra /N value
    split         - Splits a BAM file into smaller pieces
    stats         - Calculates simple stats for a BAM file
    tag           - Update read names with a suffix (for merging)

  Conversion
    tobed         - Convert BAM reads to BED regions
    tobedgraph    - Convert BAM coverage to bedGraph (for visualization)
    tofasta       - Convert BAM reads to FASTA sequences
    tofastq       - Convert BAM reads back to FASTQ sequences

  Misc
    check         - Checks a BAM file for corruption
    cleancigar    - Fixes BAM files where the CIGAR alignment has a zero length element

Run 'bamutils help CMD' for more information about a specific command

[asrini@node062 ~]$ bedutils
Usage: bedutils COMMAND

Commands
  General
    clean        - Cleans a BED file (score should be integers)
    extend       - Extends BED regions (3')
    overlap      - Find overlapping BED regions from a query and target file
    reduce       - Merges overlapping BED regions
    refcount     - Given a number of BED files, calculate the number of samples that overlap regions in a reference BED file
    sizes        - Extract the sizes of BED regions
    sort         - Sorts a BED file (in place)
    stats        - Calculates simple stats for a BED file
    subtract     - Subtracts one set of BED regions from another

  Conversion
    annotate     - Annotate BED files by adding / altering columns
    frombasecall - Converts a file in basecall format to BED3 format
    fromprimers  - Converts a list of PCR primer pairs to BED regions
    fromvcf      - Converts a file in VCF format to BED6
    tobed3       - Removes extra columns from a BED (or BED compatible) file
    tobed6       - Removes extra columns from a BED (or BED compatible) file
    tobedgraph   - BED to BedGraph
    tofasta      - Extract BED regions from a reference FASTA file

  Misc
    cleanbg      - Cleans up a bedgraph file

Run 'bedutils help CMD' for more information about a specific command

[asrini@node062 ~]$ fastqutils
Usage: fastqutils COMMAND

Commands
  General
    barcode_split - Splits a FASTQ/FASTA file based on sequence barcodes
    filter        - Filter out reads using a number of metrics
    merge         - Merges paired FASTQ files into one file
    names         - Write out the read names
    properpairs   - Find properly paired reads (when fragments are filtered separately)
    revcomp       - Reverse compliment a FASTQ file
    sort          - Sorts a FASTQ file by name or sequence
    split         - Splits a FASTQ file into N chunks
    stats         - Calculate summary statistics for a FASTQ file
    tag           - Adds a prefix or suffix to the read names in a FASTQ file
    tile          - Splits long FASTQ reads into smaller (tiled) chunks
    trim          - Remove 5' and 3' linker sequences (slow, S/W aligned)
    truncate      - Truncates reads to a maximum length
    unmerge       - Unmerged paired FASTQ files into two (or more) files

  Conversion
    convertqual   - Converts qual values from Illumina to Sanger scale
    csencode      - Converts color-space FASTQ file to encoded FASTQ
    fromfasta     - Converts (cs)FASTA/qual files to FASTQ format
    fromqseq      - Converts Illumina qseq (export/sorted) files to FASTQ
    tobam         - Converts to BAM format (unmapped)
    tofasta       - Converts to FASTA format (seq or qual)

Run 'fastqutils help CMD' for more information about a specific command

[asrini@node062 ~]$ gtfutils
Usage: gtfutils COMMAND

Commands
  General
    add_isoform - Appends isoform annotation from UCSC isoforms file
    add_reflink - Appends isoform/name annotation from RefSeq/refLink
    add_xref    - Appends name annotation from UCSC Xref file
    annotate    - Annotates genomic positions based on a GTF model
    filter      - Filter annotations from a GTF file
    genesize    - Extract genomic/transcript sizes for genes
    junctions   - Build a junction library from FASTA and GTF model
    query       - Query a GTF file by coordinates

  Conversion
    tobed       - Convert a GFF/GTF file to BED format

Run 'gtfutils help CMD' for more information about a specific command
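
The session above checks only ngsutils with which. As a minimal sketch, the same check can be applied to all five commands at once. The function name check_ngsutils is hypothetical and is not part of NGSUtils. The sketch assumes the module has already been loaded in the current shell and only reports what it finds on PATH.

```shell
#!/bin/sh
# Sketch: report which NGSUtils entry points are visible on PATH.
# Assumption: `module load ngsutils-0.5.7` was run earlier in this shell.
# check_ngsutils is a hypothetical helper name, not an NGSUtils command.

check_ngsutils() {
    missing=0
    for cmd in ngsutils bamutils bedutils fastqutils gtfutils; do
        # command -v prints the resolved path and fails if nothing is found
        if path=$(command -v "$cmd" 2>/dev/null); then
            printf '%-11s %s\n' "$cmd" "$path"
        else
            printf '%-11s %s\n' "$cmd" "not found"
            missing=$((missing + 1))
        fi
    done
    # Summary line: how many of the five commands could not be resolved
    echo "missing: $missing"
}

check_ngsutils
```

If any command is reported as not found, load the module in the same shell with module load ngsutils-0.5.7 and run the check again.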


