- If you haven't already, fill out this form and join our mailing list. This will keep you up to date on the club.
- Download the files in this repo by clicking Code (the green button near the top) -> Download ZIP and unzip the files into a folder. You can of course also fork the repo if you have experience with Git.
- Follow the general setup guide.
- Complete the Git setup guide.
For most people, this is the hardest part of the tutorial! If you feel frustrated, know it is normal. Come see us at tutorials or office hours and we will help you out.
If you have trouble with the general setup, you can follow the Google Colab setup guide and use Colab to complete the tutorials.
You can also use Deepnote or Hex. For the latter, you must not sign up with your umich.edu email address.
If you have trouble with the Git setup, you can upload your files to GitHub by going to your GitHub repository and selecting Add file -> Upload files.
Get started with tutorial0 and checkpoint0 in the tutorial0 folder, and then move on to tutorial1 and checkpoint1 in the tutorial1 folder. We recommend working through each tutorial before attempting the corresponding checkpoint.
The two challenges in the Optional Challenges folder are completely optional. You will find instructions about them in the submission section.
The Data-Visualization folder contains materials for those who want to get a head start. pandas.ipynb is a very brief introduction to the data visualization tools built into Pandas. The AnatomyofMatplotlib folder contains a comprehensive tutorial for the Matplotlib library, which most beginner projects use and which is foundational to other data visualization packages such as seaborn.
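As a small illustration of why Matplotlib is foundational, the sketch below shows that a plot made with the Pandas plotting tools is an ordinary Matplotlib object. The data and column names are made up for this example and do not come from the tutorials. The `matplotlib.use("Agg")` line is only there so the script runs without a display; in a notebook you would normally leave it out.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data purely for illustration.
df = pd.DataFrame({"week": [1, 2, 3, 4], "attendance": [30, 28, 25, 27]})

# DataFrame.plot draws with Matplotlib and returns a Matplotlib Axes object.
ax = df.plot(x="week", y="attendance", kind="line", title="Attendance by week")

# Because it is a regular Axes, the usual Matplotlib methods apply.
ax.set_xlabel("Week")
ax.set_ylabel("Attendance")

print(type(ax))
```

Anything you learn in the Matplotlib tutorial (labels, ticks, legends, and so on) can therefore be applied to plots created through Pandas.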
These checkpoints are not meant to be selective. Their sole purpose is to give you sufficient foundational knowledge about Python and some important packages so you can start contributing to a project.
The definition of success for us is to have everyone who begins the tutorials finish them. Thus, we will offer support in two ways:
- Sunday Tutorials: Live tutorials will be held from 12 to 3 on 1/21 and 1/28, in-person only, in one of the Fishbowl classrooms. These are the stand-alone rooms in the Fishbowl in Mason Hall. Tutorials will be a combination of short presentations and Q&A.
- Weekday Office Hours: We will offer office hours from 7 to 9 PM on 1/16, 1/23, and 1/30. These will be held in person on the third floor of the UGLI.
Neither tutorials nor office hours are mandatory.
We have also created a forum where you can ask questions.
Join the mailing list and monitor the join page for updates.
The tutorial submission form will be released soon.
In your submission, make sure to select the option saying you are a new member, and submit the link to your repository containing all your tutorial checkpoints. We are looking for:
- [REQUIRED] checkpoint 0 and checkpoint 1. These are assessed by completion and effort, not accuracy.
- [OPTIONAL] ML Challenge and Stats Challenge. These are assessed by merit. We usually put new members on beginning projects for their very first semester but you may want to work on advanced projects right away if you are experienced with data science. You will be able to demonstrate said experience in these two challenges. You can choose to complete one or both of them.
We strongly recommend completing at least one challenge if the project you are most interested in is labelled as an advanced project. This will give you the best chance of being placed on that team.
All technical or logistical questions MUST be posted on Piazza. We will not answer those questions over email.
If you have a personal question, email us at [email protected].
The following Python libraries are used extensively throughout the checkpoints, challenges, MDST projects, and beyond:
- NumPy: https://numpy.org/doc/stable/
- Pandas: https://pandas.pydata.org/docs/
- Matplotlib: https://matplotlib.org/stable/gallery/index
- Scikit-Learn: https://scikit-learn.org/stable/user_guide.html
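If you want a quick way to confirm that your setup can see these four libraries, a minimal sketch like the one below may help. It is not part of the official setup guide, and the helper name `check_libraries` is made up for this example. It only checks whether each package can be found, using the standard import names (note that Scikit-Learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Display name -> import name for the four libraries listed above.
# Scikit-Learn is installed as "scikit-learn" but imported as "sklearn".
LIBRARIES = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Matplotlib": "matplotlib",
    "Scikit-Learn": "sklearn",
}


def check_libraries(libraries=LIBRARIES):
    """Return a dict mapping each library name to True if Python can find it."""
    return {name: find_spec(module) is not None for name, module in libraries.items()}


if __name__ == "__main__":
    for name, installed in check_libraries().items():
        status = "installed" if installed else "MISSING"
        print(f"{name}: {status}")
```

If anything is reported as missing, revisit the general setup guide or bring the output to a tutorial or office hours.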