How many times have you wanted to pull the sequences out of a BAM file and write them to a FASTA file? It has happened to me many times!
Extracting the sequences from a BAM file into a FASTA or FASTQ file can be done with samtools, in a single command line. Please enjoy.
Using Samtools to Convert a BAM into FASTA
All the Sequences from BAM to FASTA
First and foremost, here is the single command line that extracts every sequence from a BAM file into a FASTA file.
$ samtools fasta {YOUR_BAM} > {YOUR_OUTPUT_FASTA}
Only Unmapped sequences from BAM to FASTA
Moreover, the samtools command can be restricted to reads that carry a specific SAM flag. For example, if you want ONLY the unmapped reads, pass -f 4, which keeps the reads that have flag bit 4 (read unmapped) set.
$ samtools fasta -f 4 {YOUR_BAM} > {YOUR_OUTPUT_FASTA}
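The value 4 works here because the SAM FLAG field is a bit mask, and bit 4 (0x4) is the one that marks a read as unmapped. Below is a minimal sketch of that bit test in plain shell arithmetic. The function name `is_unmapped` is hypothetical and only there for illustration; samtools performs the equivalent check internally.

```shell
# is_unmapped FLAG: succeed (exit status 0) if bit 0x4 is set in the SAM FLAG.
# Hypothetical helper for illustration only, not part of samtools.
is_unmapped() {
    [ $(( $1 & 4 )) -ne 0 ]
}

# 77 = 1 + 4 + 8 + 64: paired, read unmapped, mate unmapped, first in pair
if is_unmapped 77; then echo "flag 77: unmapped"; else echo "flag 77: mapped"; fi

# 99 = 1 + 2 + 32 + 64: paired, proper pair, mate on reverse strand, first in pair
if is_unmapped 99; then echo "flag 99: unmapped"; else echo "flag 99: mapped"; fi
```

A read with flag 77 would end up in the output of the command above, while a read with flag 99 would not.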
Only mapped sequences from BAM to FASTA
Similarly, you can also get ONLY the mapped sequences by switching to -F 4 (capital F), which drops the reads that have flag bit 4 set.
$ samtools fasta -F 4 {YOUR_BAM} > {YOUR_OUTPUT_FASTA}
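The two options can also be combined: -f keeps a read only if all of the given bits are set, and -F drops a read if any of the given bits are set. Here is a rough sketch of that rule in shell arithmetic, so you can try flag values before running samtools on a large file. The function name `keep_read` is hypothetical, and the example pairing of -f 4 with -F 8 (unmapped reads whose mate is mapped) is only an illustration.

```shell
# keep_read FLAG REQUIRE EXCLUDE
# Mirrors the -f / -F rule: keep the read when every REQUIRE bit is set
# and no EXCLUDE bit is set. Hypothetical helper, not part of samtools.
keep_read() {
    [ $(( $1 & $2 )) -eq "$2" ] && [ $(( $1 & $3 )) -eq 0 ]
}

# Try "-f 4 -F 8" on three example flags:
#   69 = 1 + 4 + 64      paired, read unmapped, first in pair
#   77 = 1 + 4 + 8 + 64  paired, read unmapped, mate unmapped, first in pair
#   99 = 1 + 2 + 32 + 64 paired, proper pair, mate reverse, first in pair
for flag in 69 77 99; do
    if keep_read "$flag" 4 8; then
        echo "$flag kept"
    else
        echo "$flag dropped"
    fi
done
```

Only flag 69 passes both tests; 77 is dropped because bit 8 is set, and 99 is dropped because bit 4 is not set.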
Using Samtools to Convert a BAM into FASTQ
samtools can also convert from BAM to FASTQ. All you need to do is use the command lines above, but replace the subcommand “fasta” with “fastq”.
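Since the only difference between the two conversions is the subcommand, a small wrapper can take the format as an argument. This is just a sketch under a few assumptions: the function name `bam_to_reads` and the `SAMTOOLS` override variable are hypothetical, and any extra options (such as -f 4) are passed through to samtools untouched.

```shell
# bam_to_reads FORMAT BAM OUT [extra samtools options...]
# Hypothetical wrapper: FORMAT selects the samtools subcommand (fasta or fastq).
# SAMTOOLS is an assumed override variable, handy for a dry run with echo.
bam_to_reads() {
    if [ "$#" -lt 3 ]; then
        echo "usage: bam_to_reads FORMAT BAM OUT [options...]" >&2
        return 2
    fi
    fmt=$1 bam=$2 out=$3
    shift 3
    case "$fmt" in
        fasta|fastq) ;;
        *) echo "format must be fasta or fastq, got: $fmt" >&2; return 2 ;;
    esac
    # Same shape as the command lines above: samtools FORMAT [options] BAM > OUT
    "${SAMTOOLS:-samtools}" "$fmt" "$@" "$bam" > "$out"
}

# Usage (needs samtools on your PATH):
#   bam_to_reads fastq {YOUR_BAM} {YOUR_OUTPUT_FASTQ}         # all reads
#   bam_to_reads fastq {YOUR_BAM} {YOUR_OUTPUT_FASTQ} -f 4    # unmapped only
#   bam_to_reads fasta {YOUR_BAM} {YOUR_OUTPUT_FASTA} -F 4    # mapped only
```

Setting `SAMTOOLS=echo` before calling the function writes the command arguments to the output file instead of running samtools, which is a cheap way to check what would be executed.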
More Resources
Here are three of my favorite Python bioinformatics books, in case you want to learn more.
- Python for the Life Sciences: A Gentle Introduction to Python for Life Scientists by Alexander Lancaster
- Bioinformatics with Python Cookbook by Tiago Antao
- Bioinformatics Programming Using Python: Practical Programming for Biological Data by Mitchell L. Model