How many times have you found an amazing bioinformatics tool but had trouble installing it on your personal computer or server? This happened to me every week before I learned about Bioconda, and I’m sure it has happened to you too.
Bioconda is a channel for the conda package manager that focuses on bioinformatics software. This tutorial uses the words Bioconda and conda interchangeably.
This blog post demonstrates how you can use Bioconda to install pretty much any tool on your computer, and never again stress out trying to install a tool and its dependencies.
1. Why is Bioconda cool?
First and foremost, when interested in using a bioinformatics tool, in my opinion, the focus should be running the tool and interpreting the results, right? However, very often, a bioinformatics tool requires many other bioinformatics tools and Python packages to be installed.
Installing bioinformatics tools is like playing a game where you need to defeat many bosses (dependencies) before you can reach the promised land (running the tool). With one line, Bioconda installs all the dependencies for you and takes you to the promised land. Can you believe it?
2. pip vs Bioconda: I love pip, but I love Bioconda more.
Don’t get me wrong, I still use pip to install some Python packages. However, pip is a package manager for Python packages, and it does not install standalone non-Python dependencies, such as tools written in C/C++ or Java.
Bioconda lets you install Python packages and all the other non-Python tools and their dependencies with a single command. That’s why it is awesome.
3. Bioconda recipes
A Bioconda recipe contains the metadata for a package, such as the package name, the tool version, the download URL, and its requirements with the minimum versions needed to work.
For example, for FOCUS, a tool that I wrote, the recipe defines the tool’s dependencies, such as SciPy, Jellyfish, and unzip.
On the other hand, the recipe for Jellyfish (written in C++) lists all of the Jellyfish dependencies, and the same directory also contains a build.sh script with the instructions conda uses to compile the tool.
Don’t worry. If you are a bioinformatics user, you don’t need to write a recipe. Bioconda has an active community that writes them for you. I have written recipes for some of my tools and fixed some others.
4. How to Install Bioconda and Use it
Bioconda contains great documentation on how to use it. Please check here how this can be done.
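As a rough sketch of what that setup looks like: at the time of writing, the Bioconda documentation suggests adding the bioconda and conda-forge channels once conda itself is installed. The function name configure_bioconda_channels and the RUN_CONDA guard are hypothetical names used only for this illustration, so the script does not touch your configuration unless you ask it to. Check the official docs for the current channel list.

```shell
#!/bin/sh
# Sketch of the one-time channel setup, assuming conda is already installed.
# configure_bioconda_channels and RUN_CONDA are hypothetical names for this example.
configure_bioconda_channels() {
    # Each --add puts the channel at the top of the list,
    # so conda-forge ends up with the highest priority.
    conda config --add channels bioconda
    conda config --add channels conda-forge
    conda config --set channel_priority strict
}

if command -v conda >/dev/null 2>&1; then
    status="conda found"
    # Only modify the user's .condarc when explicitly requested.
    if [ "${RUN_CONDA:-0}" = "1" ]; then
        configure_bioconda_channels
    fi
else
    status="conda not found: install it first (for example via Miniconda)"
fi
echo "$status"
```

Setting RUN_CONDA=1 before running the script applies the configuration. Without it, the script only reports whether conda is available.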
Once you have conda set up on your machine, all you need to do is call conda install tool_name. For example:
# install the latest jellyfish recipe version and its dependencies
conda install jellyfish
# install jellyfish version 2.2.6 and its dependencies
conda install jellyfish==2.2.6
# install BLAST version 2.9.0 and its dependencies
conda install blast==2.9.0
# install more than one tool and their dependencies
conda install blast jellyfish
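The same tools can also go into a dedicated environment instead of the base one, which keeps each project's tools isolated. The sketch below assumes a hypothetical environment name, bioinfo, and passes the channels explicitly with -c. It only prints the command unless the (also hypothetical) RUN_CONDA variable is set to 1.

```shell
#!/bin/sh
# Sketch: create a separate environment that holds blast and jellyfish.
# "bioinfo" and RUN_CONDA are hypothetical names chosen for this example.
env_name="bioinfo"
# Channels listed first take priority, so conda-forge comes before bioconda.
channels="-c conda-forge -c bioconda"
packages="blast jellyfish"

create_cmd="conda create --yes --name $env_name $channels $packages"

# Print the command so it can be reviewed before running it.
echo "$create_cmd"

# Run it only when conda is available and the user opted in.
if command -v conda >/dev/null 2>&1 && [ "${RUN_CONDA:-0}" = "1" ]; then
    # Left unquoted on purpose so the string splits into separate arguments.
    $create_cmd
fi
```

Once the environment exists, conda activate bioinfo makes the tools available in the current shell.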
How do you know what your target tool is called on Bioconda? Well… most of the time it is simply the tool name. I normally just google “bioconda tool_name” and it is the top hit. You can also look it up in the recipe index.
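The package name can also be checked from the command line with conda search. The sketch below uses jellyfish as the example tool and a hypothetical RUN_CONDA guard, so it only prints the commands unless you opt in. The --info variant additionally lists the dependencies recorded for each build, which can be a long listing.

```shell
#!/bin/sh
# Sketch: look up a package on the bioconda channel.
# The tool name is only an example; RUN_CONDA is a hypothetical opt-in variable.
tool="jellyfish"

# List the versions available on the bioconda channel.
search_cmd="conda search --channel bioconda $tool"
# Show detailed metadata, including dependencies, for each build.
info_cmd="conda search --channel bioconda --info $tool"

echo "$search_cmd"
echo "$info_cmd"

# Query the channel only when conda is available and the user opted in.
if command -v conda >/dev/null 2>&1 && [ "${RUN_CONDA:-0}" = "1" ]; then
    $search_cmd
    $info_cmd
fi
```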
Last but not least, if you use Bioconda, please make sure you cite their paper. Also, if you are interested in learning more about it, please check the video below.
More Resources
Here are three of my favorite Python Bioinformatics Books in case you want to learn more about it.
- Python for the Life Sciences: A Gentle Introduction to Python for Life Scientists by Alexander Lancaster
- Bioinformatics with Python Cookbook by Tiago Antao
- Bioinformatics Programming Using Python: Practical Programming for Biological Data by Mitchell L. Model
Conclusion
In summary, I hope you come to love Bioconda as much as I do (or more). Don’t get me wrong, I still use pip to install Python packages, but for anything outside of the pip repository, I use Bioconda every time.
Moreover, (bio)conda is great for making science reproducible when combined with Docker, but I guess that is a conversation for another blog post 🙂