Galaxy Platform Reference

About Galaxy Platform Reference

The Galaxy Platform Reference is a comprehensive quick-reference guide for the Galaxy bioinformatics platform, an open-source web-based system for accessible, reproducible, and transparent computational research. This reference covers the complete Galaxy ecosystem including workflow construction with the visual editor, subworkflows for modular pipeline design, and Markdown-based execution reports with embedded dataset previews and workflow diagrams.

The reference documents key bioinformatics tools available in Galaxy organized by analysis type: NGS QC tools (FastQC, MultiQC, Trimmomatic, Cutadapt), alignment tools (BWA-MEM2, HISAT2, STAR, Bowtie2), RNA-seq analysis (featureCounts, DESeq2, StringTie, Salmon), ChIP-seq analysis (MACS2, deepTools, HOMER), and variant analysis (FreeBayes, SnpEff, SnpSift, GEMINI). Each tool entry includes input/output formats, key parameters, and practical usage examples.

Beyond tools, this reference covers Galaxy data management (histories, dataset collections, upload methods including FTP and SRA), bioinformatics data formats (FASTQ, BAM/SAM, VCF/BCF, BED/GFF/GTF), programmatic access via the REST API and BioBlend Python library, server administration (galaxy.yml configuration, job scheduling with SLURM, Tool Shed installation), and complete workflow examples for RNA-seq, ChIP-seq, and WGS variant calling pipelines.

Key Features

  • Documents Galaxy workflow editor features including subworkflows, conditional execution, and Markdown-based reports with dataset directives
  • Covers NGS analysis tools: FastQC, MultiQC, Trimmomatic, Cutadapt, BWA-MEM2, HISAT2, STAR, Bowtie2, featureCounts, DESeq2, and Salmon
  • Includes ChIP-seq pipeline tools: MACS2 peak calling with FDR filtering, deepTools bamCoverage/computeMatrix/plotHeatmap, and HOMER motif analysis
  • Documents variant analysis tools: FreeBayes (Bayesian variant calling), SnpEff (variant annotation), SnpSift (VCF filtering), and GEMINI (family-based filtering)
  • Explains bioinformatics data formats: FASTQ (Phred+33/+64), BAM/SAM with indexing, VCF/BCF for variants, and BED/GFF/GTF for genomic intervals
  • Provides BioBlend Python API examples for uploading files, running tools, invoking workflows, and checking invocation status programmatically
  • Covers server administration: galaxy.yml configuration, job_conf.yml for SLURM/local runners, Tool Shed installation, and Planemo tool development
  • Includes complete step-by-step workflow examples for RNA-seq differential expression, ChIP-seq peak analysis, and WGS variant calling with GATK

Frequently Asked Questions

What analysis types does this Galaxy reference cover?

This reference covers five main analysis categories: NGS quality control (FastQC, MultiQC, Trimmomatic), read alignment (BWA-MEM2 for DNA-seq, HISAT2/STAR for RNA-seq, Bowtie2 for ChIP-seq), RNA-seq differential expression (featureCounts, DESeq2, StringTie, Salmon), ChIP-seq peak analysis (MACS2, deepTools, HOMER), and variant calling/annotation (FreeBayes, SnpEff, SnpSift, GEMINI). It also includes complete workflow examples for RNA-seq, ChIP-seq, and WGS variant calling.

How do I create and run workflows in Galaxy?

The reference documents the workflow creation process: navigate to Workflow > Create New Workflow, drag-and-drop tools onto the canvas, draw connections between tool outputs and inputs, configure parameters, then save and run. You can also extract workflows from existing analysis histories using History > Extract Workflow, and nest workflows as subworkflows for modular pipeline design.

What is BioBlend and how do I use it with Galaxy?

BioBlend is a Python library that provides programmatic access to the Galaxy REST API. The reference includes code examples for connecting to a Galaxy instance with GalaxyInstance(url, key), listing and creating histories, uploading files, running tools with specific inputs, invoking workflows, and checking invocation status. You need a Galaxy API key generated from User > Preferences > Manage API Key.
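
The steps described above can be sketched roughly as follows. This is a minimal illustration, not the reference's own example: the helper names (connect, upload_and_invoke, invocation_state), the default history name, and the assumption that the workflow takes a single data input at step index 0 are all hypothetical. The BioBlend import is done inside connect so the other helpers can be read and tried without a live server.

```python
def connect(url, key):
    """Return a BioBlend client for the Galaxy server at `url`."""
    # Imported lazily so the helpers below do not require BioBlend to load.
    from bioblend.galaxy import GalaxyInstance

    return GalaxyInstance(url=url, key=key)


def upload_and_invoke(gi, file_path, workflow_id, history_name="bioblend-demo"):
    """Create a history, upload one file into it, and invoke a workflow."""
    history = gi.histories.create_history(name=history_name)
    upload = gi.tools.upload_file(file_path, history["id"])
    dataset_id = upload["outputs"][0]["id"]

    # Assumption: the workflow has exactly one data input, at step index 0.
    inputs = {"0": {"id": dataset_id, "src": "hda"}}
    invocation = gi.workflows.invoke_workflow(
        workflow_id, inputs=inputs, history_id=history["id"]
    )
    return invocation["id"]


def invocation_state(gi, invocation_id):
    """Return the current state string of a workflow invocation."""
    return gi.invocations.show_invocation(invocation_id)["state"]
```

With a real server, usage might look like gi = connect("https://galaxy.example.org", "YOUR_API_KEY") followed by upload_and_invoke(gi, "reads.fastq", workflow_id); the URL, key, and file name here are placeholders.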

What bioinformatics data formats are documented?

The reference covers FASTQ (sequencing reads with Phred+33/+64 quality encoding), BAM/SAM (aligned reads with automatic indexing), VCF/BCF (variant calls with INFO/FORMAT fields), and BED/GFF/GTF (genomic intervals and annotations). Each format includes Galaxy-specific handling details, supported conversions, and visualization options.
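
To make the Phred+33/+64 distinction concrete, here is a small sketch of how a FASTQ quality string decodes to numeric scores. The function names are illustrative and not part of Galaxy; the only facts relied on are that each quality character encodes a score as its ASCII code minus the offset, and that a Phred score Q corresponds to an error probability of 10^(-Q/10).

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    if offset not in (33, 64):
        raise ValueError("offset must be 33 or 64")
    scores = [ord(ch) - offset for ch in quality]
    # Negative scores mean the string was written with a smaller offset.
    if min(scores, default=0) < 0:
        raise ValueError("quality string is not valid for this offset")
    return scores


def error_probability(score):
    """Convert a Phred score Q into an error probability 10^(-Q/10)."""
    return 10 ** (-score / 10)
```

The same quality value appears as different characters under the two encodings: a score of 40 is written as "I" with Phred+33 and as "h" with Phred+64, which is why the encoding has to be known before quality filtering.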

How do I set up Galaxy server job scheduling?

The reference documents job_conf.yml configuration for defining runners (local, SLURM) and execution environments. You can configure SLURM destinations with native specifications like partition, memory, and CPU allocations, then assign specific tools to appropriate environments based on their computational requirements.
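
As a rough illustration of the idea, the sketch below builds the kind of SLURM option string a destination might carry and maps tools to environments by resource need. It is written in Python rather than as a job_conf.yml fragment, and the environment names, partition name, resource sizes, and tool assignments are all hypothetical; only the sbatch options --partition, --mem, and --cpus-per-task are standard SLURM flags. How the string is attached to an environment in job_conf.yml is not shown here.

```python
def slurm_native_spec(partition, mem_gb, cpus):
    """Build an sbatch-style option string for a SLURM destination."""
    if mem_gb <= 0 or cpus <= 0:
        raise ValueError("memory and CPU counts must be positive")
    return f"--partition={partition} --mem={mem_gb}G --cpus-per-task={cpus}"


# Hypothetical environments: names and resource sizes are illustrative only.
ENVIRONMENTS = {
    "slurm_small": slurm_native_spec("main", 8, 2),
    "slurm_large": slurm_native_spec("main", 64, 16),
}

# Hypothetical tool-to-environment assignments.
TOOL_ENVIRONMENTS = {"fastqc": "slurm_small", "rna_star": "slurm_large"}


def environment_for(tool_id, default="slurm_small"):
    """Return the environment name a tool should run in."""
    return TOOL_ENVIRONMENTS.get(tool_id, default)
```

The design mirrors the approach described above: lightweight tools share a small default environment, while memory-hungry tools such as aligners are routed to a larger one.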

What does the RNA-seq workflow example include?

The complete RNA-seq workflow consists of: 1) QC with FastQC and MultiQC, 2) adapter/quality trimming with Cutadapt, 3) alignment with HISAT2, 4) read counting with featureCounts (Salmon can replace steps 3 and 4 by quantifying transcripts directly from the reads), 5) differential expression with DESeq2, 6) visualization with volcano plots and heatmaps, and 7) functional analysis with goseq for GO enrichment. The reference also points to Galaxy Training Network tutorials.

How do dataset collections work in Galaxy?

Dataset collections group multiple files as a single unit for batch processing. Three types are available: List (simple file list), Paired (paired-end FASTQ pair), and List of Pairs (multiple samples of paired FASTQs). Create collections by selecting multiple datasets and clicking Build Dataset List or Build List of Dataset Pairs. Collections enable running the same tool across many files simultaneously in workflows.
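
The List of Pairs structure can also be described programmatically. The sketch below groups paired-end files by sample and builds a nested description with collection type list:paired. It assumes a hypothetical naming scheme (sample_R1.fastq / sample_R2.fastq, optionally gzipped) and placeholder dataset IDs; the function name is illustrative.

```python
import re
from collections import defaultdict

# Assumption: paired files are named <sample>_R1.<ext> and <sample>_R2.<ext>.
_PAIR_RE = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.(fastq|fq)(\.gz)?$")


def list_of_pairs(name, datasets):
    """Build a list:paired collection description.

    `datasets` maps file names to Galaxy history dataset IDs.
    """
    samples = defaultdict(dict)
    for filename, dataset_id in datasets.items():
        match = _PAIR_RE.match(filename)
        if match is None:
            raise ValueError(f"cannot pair file name: {filename}")
        side = "forward" if match["read"] == "1" else "reverse"
        samples[match["sample"]][side] = dataset_id

    elements = []
    for sample in sorted(samples):
        pair = samples[sample]
        if set(pair) != {"forward", "reverse"}:
            raise ValueError(f"sample {sample} is missing a mate")
        elements.append({
            "name": sample,
            "src": "new_collection",
            "collection_type": "paired",
            "element_identifiers": [
                {"name": "forward", "src": "hda", "id": pair["forward"]},
                {"name": "reverse", "src": "hda", "id": pair["reverse"]},
            ],
        })
    return {
        "name": name,
        "collection_type": "list:paired",
        "element_identifiers": elements,
    }
```

A dictionary of this shape is the kind of description that BioBlend's gi.histories.create_dataset_collection(history_id, description) accepts, so the result could be passed to it once the files have been uploaded and their dataset IDs are known.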

How do I install tools from the Galaxy Tool Shed?

Navigate to Admin > Install or Uninstall Tool Shed tools, search for the desired tool, select it, and choose Install to Galaxy with a target tool panel section. Dependencies are automatically resolved via conda or containers. For custom tool development, the reference includes Planemo commands: planemo tool_init for scaffolding, planemo lint for validation, and planemo test for automated testing.