Author: onestop_data

A data scientist who loves to share knowledge about the field.

Python

H5PY – A Python Package to Store Big Data Efficiently

This tutorial shows how to use h5py, a Python package for storing big data efficiently. It focuses mainly on creating and reading HDF5 files.

Bioinformatics

Multiple Sequence Alignment – Theory and Practice – Step-by-Step

This blog post describes Multiple Sequence Alignment (MSA), covering the theory and practice step by step using MAFFT and MUSCLE.

Docker

Kubernetes vs Docker – Apples vs Oranges

This blog post compares Kubernetes and Docker and discusses the similarities and differences between the two platforms.

Bioinformatics

The Easiest Way to Download Genomic Data from NCBI SRA, MG-RAST, etc

This tutorial will teach you how to download NGS data and metadata from repositories such as NCBI SRA, MG-RAST, iMicrobe, etc. – very helpful to download …

Bioinformatics

Fun Fact: What is so special about an odd k-mer length?

Have you noticed that most assemblers use an odd k-mer length? Do you know why? Don’t you think it is odd? This blog post explains below …

Machine Learning

Painless Random Forest Regression in Python – Step-by-Step with Sklearn

This tutorial demonstrates, step by step, how to use the Random Forest regressor in Python's Sklearn package to create a regression model.

Python

Insights on a Plethora of Python IDEs and Code Editors

Python is a powerful, easy-to-use, general-purpose programming language, popular among programmers for developing web applications, data science projects, software prototypes, etc. Being a widely …

Bioinformatics

Estimate Genome Size and Best k-mer Size for Assembly – Step-by-Step

This tutorial shows how you can estimate the genome size and the best k-mer length for genome assembly using KmerGenie. Moreover, the tutorial shows how to …

Bioinformatics

Reference-Free Metagenomic Datasets Comparison – Step-by-Step

Reference-free metagenomic methods are very useful for comparing datasets with high levels of unknown sequences. This tutorial teaches how this can be done going from metagenomic …

Python

Bloom Filter Made Simple: Theory and Code

This tutorial simplifies the Bloom filter in Python: it explains what a Bloom filter is, discusses its false positive and false negative rates, introduces some graphics …