BSPC Training
These pages are developed by NICHD’s Bioinformatics and Scientific Programming Core (BSPC) as a resource to help train NICHD staff and fellows in various computational and bioinformatics topics.
Note: the opinions on these pages do not reflect those of NICHD or NIH.
There’s a lot of excellent training material out there, so rather than repeat it, this website acts as a central location of curated resources. We’ve found the good stuff so you can get right to learning.
Start here
Bioinformatics and scientific programming is a big field, and it can be difficult to know where to start.
If you are at NIH, you can schedule a meeting (ryan.dale@nih.gov) to discuss your training goals and expectations so we can help develop a customized training plan.
Ready? Head to First steps.
Changelog
If you haven’t been here in a while and want to know what’s new, see the Changelog.
Currently-written topics
First steps gives you an introduction to the content here and some context. If you’re just starting out, head there first.
Initial training
These sections help you learn the basics of programming, and include some examples of beginner/intermediate/advanced skills to help you figure out where you are in your learning and how to advance.
Next steps
Once you have the basics of programming, these sections will broaden your skills.
Genomics
These sections point to resources to learn about some specific genomics topics:
Additional resources
TODO
Scattered throughout the documentation are .. todo:: entries, which are
collected here for reference. This list shows which parts of the
documentation are still in progress and serves as a one-stop shop for
which topics to write about next.
Todo
Identify what should stay on the rnaseq page and what should be moved here
(The original entry is located in /home/runner/work/training/training/source/deseq2.rst, line 6.)
Todo
Add info and links about Emacs, especially org-mode
(The original entry is located in /home/runner/work/training/training/source/emacs.rst, line 6.)
Todo
To tie everything together, add examples of figures from papers, and explain how all of these steps come together.
(The original entry is located in /home/runner/work/training/training/source/genomics-formats.rst, line 324.)
Todo
For genomics, write the following:
Aligners (Bowtie2, HISAT2, BWA, STAR)?
Links to example RNA-seq and ChIP-seq workflows (possibly from https://hbctraining.github.io/main/)
bedGraph, wig, bigBed, bigWig, chromsizes
Example RNA-seq and ChIP-seq bash scripts; scale those up to Snakemake workflows?
(The original entry is located in /home/runner/work/training/training/source/genomics-formats.rst, line 329.)
Todo
Describe our workflow more (using issues, merge requests, etc.)
(The original entry is located in /home/runner/work/training/training/source/gitlab.rst, line 32.)
Todo
Write about lcdb-wf, why we use it, how to learn it
(The original entry is located in /home/runner/work/training/training/source/lcdb-wf.rst, line 6.)
Todo
write data science learner profile
(The original entry is located in /home/runner/work/training/training/source/learner-profiles.rst, line 64.)
Todo
write bioinformatician learner profile
(The original entry is located in /home/runner/work/training/training/source/learner-profiles.rst, line 71.)
Todo
The learner profiles link to the following pages, many of which still need writing:
DESeq2
Galaxy
Reproducibility
Installation
Collaborating with BSPC
Biowulf
lcdb-wf
RNA-seq
(The original entry is located in /home/runner/work/training/training/source/learner-profiles.rst, line 94.)
Todo
Needs more content!
(The original entry is located in /home/runner/work/training/training/source/programming.rst, line 21.)
Todo
Need to write on the following topics for reproducibility:
conda
git
requirements.txt (make sure you add anything you install to it)
conda create -p ./env (for shared directories)
(The original entry is located in /home/runner/work/training/training/source/reproducibility.rst, line 6.)
Todo
Topics needed for UCSC:
track hubs
udcTimeout=1
hosting files on datashare
custom tracks and tracklines
useful built-in tracks
(The original entry is located in /home/runner/work/training/training/source/ucsc.rst, line 6.)
Todo
This section could use some better organization
(The original entry is located in /home/runner/work/training/training/source/variant-calling.rst, line 177.)
Todo
Visualization topics needed:
genome browser screenshots
Inkscape
Tufte-isms (chartjunk; data-ink ratio)
why not pie charts
colormaps (jet vs viridis)
colorblindness
(The original entry is located in /home/runner/work/training/training/source/visualization.rst, line 29.)