This repo is dedicated to making bioinformatics resources available to anyone who wishes to enter the field. You may or may not find it useful, depending on your level. I am not an expert in the field; I am just posting things based on my own knowledge and what worked for me personally. They may or may not work for you, and that is totally fine.
Header | Value |
---|---|
Author | Ahmed Sameh |
Email | [email protected] |
Education | BSc in Biotechnology/Biomolecular Chemistry, Faculty of Science, Cairo University. |
Table of Contents:
- Road-Map
- Structural Bioinformatics Resources
- HTS/Omics Data Analysis Resources
- Community and Support
To embark on your journey in bioinformatics, you first need to decide which branch of bioinformatics interests you the most. To the best of my limited knowledge, there are two main branches:
- Structural Bioinformatics, which deals with the structure of biomolecules and structural data, whether they are proteins, nucleic acids, or inorganic molecules. This branch requires a background mainly in thermodynamics and biophysics.
- HTS/Omics Data Analysis, which is more concerned with high-throughput sequencing (HTS) and omics data (such as genomics and transcriptomics) and their analysis.

Below are road-maps and resources for both directions, whichever one you decide to start with.
- Programming skills.
- Thermodynamics background.
- Structural biology background.
- Mathematics background.
- Machine-Learning background.
Skill | Resources | ETA to study | Extra Notes |
---|---|---|---|
Programming Skills | 1- Bioinformatics specialization on Coursera 2- Code Academy course 3- Python official website 4- Elzero channel on YouTube and website - Python specialization on Coursera 5- Linux course on Future Learn 6- Introduction to R for Biologists 7- R for Biologists | 2-3 months | - Do not push yourself too hard; it is totally fine if you are just starting your journey. The journey demands time and patience. - The first 4 resources are for Python. It is recommended that you start with Python, as it is the most used programming language and is very easy to begin with. - Make sure that you do lots of exercises and practice, as this is the most important aspect of programming in general. The more you practice, the more fluent you become. - The Linux course can be started after the Python one, to avoid overlap. - Consider treating R as a lower priority (you may not use it at all in structural bioinformatics work). Here, R is mostly used for data analysis and visualization. |
Thermodynamics and Structural Biology Background | 1- Text books found here in that folder 2- This repo is amazing for both thermodynamics and structural biology | 2 months | - This step is critical. You may find that these resources are not useful for you. If so, feel free to share alternatives through a pull request and I will gladly merge it. - My advice here is to really enjoy this step: it is pure science, and you are laying the base for all of the skills that follow. |
Mathematics Background | 1- Linear Algebra course on Coursera 2- Linear Algebra course on Khan Academy 3- Multivariate Calculus course on Coursera 4- Differential Calculus course on Khan Academy 5- Linear Algebra lecture | 1 month for Linear Algebra and 1 month for Calculus | - I really recommend starting Linear Algebra before Calculus, as it may pave the way for some concepts that you are going to need in Calculus (matrix transformations, the inverse matrix, ...). |
Machine Learning Background | 1- Machine Learning specialization on Coursera 2- Machine Learning Specialization | 1-2 months | - You can start this after finishing Mathematics. - Andrew Ng is widely regarded as one of the best-known and clearest teachers of machine learning. |
- Do not rush; give the process time. I recommend that you study in the same order as the table above, but in the end do what you see as best for you!
- Practice makes perfect. The most important thing is to practice continuously.
- Also, do not be shy about asking experienced people for career advice, about things you do not understand, or even with a general inquiry. It is totally fine if they do not reply; be brave.
- Programming Skills.
- Biology Background.
- Genomics.
- Transcriptomics.
- Proteomics.
- Metabolomics.
- Statistics.
- Mathematics Background.
- Machine Learning Background.
Skill | Resources | ETA to study | Extra Notes |
---|---|---|---|
Programming Skills | 1- Bioinformatics specialization on Coursera 2- Code Academy course 3- Python official website 4- Elzero channel on YouTube and website - Python specialization on Coursera 5- Linux course on Future Learn 6- Introduction to R for Biologists 7- R for Biologists | 2-3 months | - These are the same resources as in the Structural Bioinformatics part, but here I would recommend starting with R instead of Python, as R is by far the more widely used of the two in this branch. - After R, you can start with Linux. - Python is the lowest priority here. You may not even have to use it, but I have to say that the future is with Python, as it is really catching up in terms of better visualization, so it is your call :). |
Genomics | 1- Khan Academy's Genomics and DNA Sequencing course 2- National Human Genome Research Institute's Learn Genetics website 3- Coursera's Introduction to Genomic Technologies Specialization 4- Bioinformatics for Biologists: Analysing and Interpreting Genomics Datasets - FutureLearn | 2-3 weeks | - Try as much as you can to understand the theory behind each technology so that you are more aware of each step. - It may also help to read papers on the same subject and see how the authors ran the experiment and interpreted the data. |
Transcriptomics | 1- edX's Introduction to Genomics and Transcriptomics course 2- DIY Transcriptomics site 3- Class Central Course | 2-3 weeks | - Same notes as for Genomics. |
Proteomics | 1- Nature Education's Introduction to Proteomics course 2- The Human Proteome Organization (HUPO) Education Committee 3- Udemy's The Complete Proteomics Masterclass 4- Experimental Methods in Systems Biology | 2-3 weeks | - The Experimental Methods in Systems Biology course includes modules related to Transcriptomics too. - The rest of the notes are the same as for Genomics. |
Metabolomics | 1- Metabolomics Society's Education page 2- The Metabolomics Wiki 3- Udemy's Introduction to Metabolomics | 2-3 weeks | - Same notes as for Genomics. |
Statistics | 1- Coursera's Introduction to Statistical Inference for Bioinformatics Studies 2- edX's Introduction to Genomics and Transcriptomics 3- Johns Hopkins University's Biostatistics for Beginners 4- Statistics Folder | 1 month | - Make sure you practice a lot, do lots of exercises, and try to integrate statistics into your work. Stats is very easy and fun, and it is the way you show and present your work. |
Mathematics Background | 1- Linear Algebra course on Coursera 2- Linear Algebra course on Khan Academy 3- Multivariate Calculus course on Coursera 4- Differential Calculus course on Khan Academy 5- Linear Algebra lecture | 1 month for Linear Algebra and 1 month for Calculus | - Same resources and notes as in the Structural Bioinformatics section. |
Machine Learning Background | 1- Machine Learning specialization on Coursera 2- Machine Learning Specialization | 1-2 months | - Same resources and notes as in the Structural Bioinformatics section. |
- First of all, you have to decide which track you are most interested in: Genomics, Transcriptomics, Proteomics, or Metabolomics.
- A very important repo with lots of beneficial resources (I even used it as a reference) is the following repo.
- Practice makes perfect. The most important thing is to practice continuously.
- Also, do not be shy about asking experienced people for career advice, about things you do not understand, or even with a general inquiry. It is totally fine if they do not reply; be brave.
If you have a resource you would like to add, or even just a question, do not hesitate to leave a comment or email me directly at [email protected], and I will be more than happy to help.