[wip] First pass proof-read of best practices lessons #42
Changes from 6 commits
d0607be
086ffd2
6057fcf
ba85f33
d589f33
e271bab
c530e66
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,10 @@ | ||
*.pyc | ||
*~ | ||
.DS_Store | ||
.idea | ||
.ipynb_checkpoints | ||
.sass-cache | ||
__pycache__ | ||
_site | ||
.Rproj.user | ||
.jekyll-cache/ | ||
.jekyll-cache/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,9 +14,10 @@ keypoints: | |
- "You can use the CMS CookieCutter to quickly create the layout for a Python package" | ||
--- | ||
|
||
For this workshop, we are going to create a Python package that performs analysis and creates visualizations for molecules. We will start from a Jupyter notebook which has some functions and analysis, which you should download on the [setup]. | ||
*TODO: Define "package". Distinguish from "module". Consider distinguishing w.r.t distribution, archive, source, installed...* | ||
For this workshop, we are going to create a Python package that performs analysis and creates visualizations for molecules. We will start from a Jupyter notebook which has some functions and analysis, which you should download on the [setup]. *<- wording?* | ||
|
||
The idea is that we would like to take this Jupyter notebook and convert the functions we have created into a Python package. That way, if anyone (a labmate, for example) would like to use our functions, they can do so by installing the package and importing it into their own scripts. | ||
The idea is that we would like to take this Jupyter notebook and convert the functions we have created into a Python package. That way, if anyone (a lab-mate, for example) would like to use our functions, they can do so by installing the package and importing it into their own scripts. | ||
|
||
To start, we will first use a tool called [CookieCutter](https://cookiecutter.readthedocs.io/en/latest/) which will set up a Python package structure and several tools we will use during the workshop. | ||
|
||
|
@@ -42,9 +43,9 @@ $ cookiecutter gh:molssi/cookiecutter-cms | |
~~~ | ||
{: .language-bash} | ||
|
||
This command runs the cookiecutter software (`cookiecutter` in the command) and tells cookiecutter to look at GitHub (`gh`) n the repository under `molssi/cookiecutter-cms`. This repository contains a template which cookiecutter uses to create your project, once you have provided some starting information. | ||
This command runs the cookiecutter software (`cookiecutter` in the command) and tells cookiecutter to look at GitHub (`gh`) in the repository under `molssi/cookiecutter-cms`. This repository contains a template that cookiecutter uses to create your project, once you have provided some starting information. | ||
|
||
You will see an interactive prompt which asks questions about your project. Here, the prompt is given first, followed by the default value in square brackets. The first question will be on your project name. You have very cleverly decided to give it the name `molecool` (it's like molecule, but with `cool` instead, because of your cool visualizations - get it?) | ||
You will see an interactive prompt which asks questions about your project. Here, the prompt appears first, followed by the default value in square brackets. The first question will be on your project name. You have very cleverly decided to give it the name `molecool` (it's like molecule, but with `cool` instead, because of your cool visualizations - get it?) | ||
|
||
Answer the questions according to the following. | ||
If nothing is given after the colon (`:`), hit enter to use the default value. | ||
|
@@ -82,10 +83,10 @@ The first two questions are for the project and repository name. The project nam | |
|
||
The next choice is about the first module name. Modules are the `.py` files which contain python code. The default for this is the `repo_name`, but we will change this to avoid confusion (the module `molecool.py` in a folder named `molecool` in a folder named `molecool`??). For now, we'll just name our first module `functions`, and this is where we will put all of our starting functions. | ||
|
||
Another thing the CookieCutter checks for is your email address. Be sure to provide a valid email address to the cookiecutter (it must have an `@` symbol followed by a domain name, or the cookiecutter will fail.). Note that your email address is not recorded or kept by the software. Your email is asked for insertion into created files so that people using your software will have contact information for you. | ||
Another thing that CookieCutter checks for is your email address. Be sure to provide a valid email address to `cookiecutter` (it must have an `@` symbol followed by a domain name, or `cookiecutter` will fail). Note that your email address is not recorded or kept by the CookieCutter software itself. `cookiecutter` inserts your email address into generated files so that people using your software will have contact information for you. | ||
|
||
#### License Choice | ||
Choosing which license to use is often confusing for new developers. The MIT license (option 1) is a very common license and the default on GitHub. It allows for anyone to use, modify, or redistribute your work with no restrictions (and also no warranty). | ||
Choosing which license to use is often confusing for new developers. The MIT license (option 1) is a very common license, and the default on GitHub. It allows for anyone to use, modify, or redistribute your work with no restrictions (and also no warranty). | ||
|
||
Here, we have chosen the `BSD-3-Clause`. The `BSD-3-Clause` license is an open-source, permissive license (meaning that few requirements are placed on developers of derivative works), similar to the MIT license. However, it adds a copyright notice with your name and requires redistributors of the code to keep the notice. It also prohibits others from using the name of the project or its contributors to promote derived products without written consent. | ||
|
||
|
@@ -95,7 +96,7 @@ You can see more detailed information on each license at [choosealicense.com](ht | |
1. [LGPLv3](https://choosealicense.com/licenses/gpl-3.0/) | ||
1. Not Open Source - In this case, the cookiecutter will not generate a license. You can add a custom license, or choose to not add a license. If there is no license in a repository, you should assume that the project is **not** open source, and [you cannot modify or redistribute the software](https://choosealicense.com/no-permission/). | ||
|
||
For most of your projects, it is likely that the license you choose will not matter a great deal. However, remember that if you ever want to change a license, you may have to get permission of all contributors. So, if you ever start a project that becomes popular or has contributors, be sure to decide your license early! | ||
For most of your projects, it is likely that the license you choose won't matter a great deal. However, remember that if you ever want to change a license, you may have to get permission of all contributors. So, if you ever start a project that becomes popular or has contributors, be sure to decide your license early! | ||
|
||
> ## Types of Open-Source Licenses | ||
> | ||
|
@@ -105,10 +106,10 @@ For most of your projects, it is likely that the license you choose will not mat | |
{: .callout} | ||
|
||
#### Dependency Source | ||
This determines some things in set-up for what will be used to install dependencies for testing. This mostly has consequence for the section on Continuous Integration. We have chosen to install dependencies from anaconda with pip fallback. Don't worry too much about this choice for now. | ||
This determines some things in set-up for what will be used to install dependencies for testing. This mostly has consequence for the section on [Continuous Integration]. We have chosen to install dependencies from anaconda with pip fallback. Don't worry too much about this choice for now. | ||
|
||
#### Support for ReadTheDocs | ||
This option is to choose whether you would like files associated with the documentation hosting service [ReadTheDocs](https://readthedocs.org/). Choose yes for this workshop. | ||
This option is to choose whether you would like files associated with the documentation hosting service [ReadTheDocs](https://readthedocs.org/). Choose "yes" for this workshop. | ||
|
||
### Reviewing directory contents | ||
Now we can examine the project layout the CookieCutter has set up for us. Navigate to the newly created `molecool` directory. You should see the following directory structure. | ||
|
@@ -164,9 +165,9 @@ Now we can examine the project layout the CookieCutter has set up for us. Naviga | |
``` | ||
{: .output} | ||
|
||
To visualize your project like above you will use "tree". If you do not have tree you can get using `sudo apt-get install tree` on linux, or `brew install tree` on Mac. Note - tree will not show you the helpful labels after '<-' (those were added by us). | ||
To visualize your project like above you will use *tree*. If you do not have *tree*, you can get it using `sudo apt-get install tree` on Linux, or `brew install tree` on Mac. Note - `tree` will not show you the helpful labels after `<-` (those were added by us). | ||
|
||
CookieCutter has created a lot of files! This can be thought of as three sections. In the top level of our project we have a folder for tools related to development (`devtools`), documentation (`docs`) and to the package itself (`molecool`). We will first be working in the `molecool` folder to build our package, and adding more things later. | ||
CookieCutter has created a lot of files! They can be thought of as three sections. In the top level of our project we have a folder for tools related to development (`devtools`), documentation (`docs`) and to the package itself (`molecool`). We will first be working in the `molecool` folder to build our package, and adding more things later. | ||
|
||
~~~ | ||
... | ||
|
@@ -183,10 +184,11 @@ CookieCutter has created a lot of files! This can be thought of as three section | |
~~~ | ||
{: .output} | ||
|
||
This the only folder we actually have to work with to build our package. The other folders relate to "best practices", which do not technically have to be used in order for your package to be working (but you should do them, and we will talk about them later). You could build this directory structure by hand, but we have just used cookiecutter to set it up for us. This directory will contain all of our python code for our project, as well as sample data (in the `data` folder), and tests (in the `tests` folder.) | ||
This is the only folder we actually have to work with to build our package. The other folders relate to "best practices", which do not technically have to be used in order for your package to work (but you should follow them, and we will talk about them later). You could build this directory structure by hand, but we have just used `cookiecutter` to set it up for us. This directory will contain all of our Python code for our project, as well as sample data (in the `data` folder), and tests (in the `tests` folder). | ||
|
||
> ## Packages and modules | ||
> | ||
> *TODO: Rewrite. Separate discussion of packages vs. modules from discussion of importable entities and scoping.* | ||
> | ||
> What 'packages' or 'modules' are in Python may be confusing. | ||
> In general, 'module' refers to a single `.py` file containing Python definitions and statements. It may be imported for use in another module or script. The module name is determined by the file name. A function defined in a module is used (once the module is imported) using the syntax `module_name.function_name()`. | ||
> 'Package' refers to a collection of Python modules. The package may also have an `__init__.py` file. | ||
|
@@ -205,11 +207,14 @@ $ cd molecool | |
### The `__init__.py` file | ||
|
||
The `__init__.py` file is a special file recognized by the Python interpreter which makes a directory into a package. This file can be blank in some cases, however, we will use it to define how the user interacts with the functions in our package. | ||
*TODO: Cite section on defining the interface, where we can also mention `__all__` and `_` prefixed names.* | ||
This may be more appropriate in the section "Deciding Package Structure".
Maybe what he means is adding hyperlink to
Yes, cross-linking with the later more thorough discussions would be good. The terseness of the description here seems appropriate, but
|
||
|
||
Contents of `molecool/molecool/__init__.py`: | ||
~~~ | ||
""" | ||
molecool | ||
A Python package for analyzing and visualizing xyz files. For MolSSI Workshop. | ||
Analyze and visualize xyz files. | ||
This section will be autopopulated based on answers to the cookiecutter prompts.
Then it should be consistent with the generated material. Is the docstring intentionally deviant from the PEP257 guidelines? Such as to make a point in the later lessons? If not, it seems like we should avoid mixed messages and
@eirrgang This docstring is automatically generated by the cookiecutter-cms. Therefore the above example comes automatically from the following template. However, I think it is still possible to add this material on Docstring section in Python Coding Style.
Thanks. Opened MolSSI/cookiecutter-cms#130 |
||
|
||
For MolSSI Workshop. | ||
""" | ||
|
||
# Add imports here | ||
|
@@ -224,7 +229,7 @@ del get_versions, versions | |
~~~ | ||
{: .language-python} | ||
|
||
The very first section of this file contains a string opened and closed with three quotations. This is a docstring, and has a short description of the file. | ||
The very first section of this file contains a string opened and closed with three quotations. This is a [docstring](https://www.python.org/dev/peps/pep-0257/), and has a short description of the file. | ||
|
||
The section we will be concerned with is under `# Add imports here`. This is how we define the way functions from modules are used. | ||
|
||
|
@@ -235,44 +240,51 @@ from .functions import * | |
~~~ | ||
{: .language} | ||
|
||
goes to the `molecool.py` file, and brings everything that is defined there into the file. When we use our function defined in `functions.py`, that means we will be able to just say `molecool.canvas()` instead of giving the full path `molecool.functions.canvas()`. If that's confusing, don't worry too much for now. We will be returning to this file in a few minutes. For now, just note that it exists and makes our directory into a package. | ||
goes to the `functions.py` file, and brings everything that is defined there into the file. When we use our function defined in `functions.py`, that means we will be able to just say `molecool.canvas()` instead of giving the full path `molecool.functions.canvas()`. If that's confusing, don't worry too much for now. We will be returning to `__init__.py` in a few minutes. For now, just note that it exists and makes our directory into a package. | ||
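To see what the star import in `__init__.py` does, here is a minimal sketch that builds a throwaway package on disk and imports it. The package name `demo_pkg` and the string returned by `canvas` are made up for illustration; only the layout (a `functions.py` module plus an `__init__.py` containing `from .functions import *`) mirrors the lesson.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temporary directory.
# "demo_pkg" is a hypothetical name standing in for the lesson's package.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()

# The module that holds the function.
(pkg_dir / "functions.py").write_text(
    "def canvas():\n"
    "    return 'placeholder quote'\n"
)

# The package's __init__.py re-exports everything from functions.py.
(pkg_dir / "__init__.py").write_text("from .functions import *\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

# Both spellings reach the same function object.
short_result = demo_pkg.canvas()
full_result = demo_pkg.functions.canvas()
same_object = demo_pkg.canvas is demo_pkg.functions.canvas
print(short_result, full_result, same_object)
```

The short form works because the star import copies the public names from `functions.py` into the package namespace; the full path still works because importing a submodule also attaches it to its parent package.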
|
||
### Our first module | ||
Once inside of the `molecool` folder (`molecool/molecool`), examine the files that are there. View the first module (`functions.py`) in a text editor. We see a few things about this file. The top begins with a description of this module surrounded by three quotations (`"""`). Right now, that is the file name, followed by our short description, then the sentence "Handles the primary functions". We will change this to be more descriptive later. CookieCutter has also created a placeholder function in called `canvas`. At the start of the `canvas` function, we have a `docstring` (more about this in [documentation]), which describes the function. | ||
Once inside the `molecool` folder (`molecool/molecool`), examine the files that are there. View the module (`functions.py`) in a text editor. We see a few things about this file. The top begins with a description of this module surrounded by three quotations (`"""`). Right now, that is the file name, followed by our short description, then the sentence "Handles the primary functions". We will change this to be more descriptive later. CookieCutter has also created a placeholder function called `canvas`. At the start of the `canvas` function, we have a `docstring` (more about this in [documentation]), which describes the function. | ||
|
||
We will be moving all of the functions we defined in the Jupyter notebook into python modules (`.py` files) like these. | ||
|
||
We will be moving all of the functions we defined in the jupyter notebook into python modules (`.py` files) like these. | ||
### Installing from local source. | ||
|
||
### Python local installs | ||
You may be accustomed to `pip` automatically retrieving packages from the internet. You can also install packages from local sources that contain a `setup.py` file. | ||
|
||
To develop this package, we will want to something called a developmental install so that we can try out our functions and package as we develop it. | ||
To develop this package, we will want to use what is called "development mode" or an "editable install" so that we can try out our functions and package as we develop it. We access development mode using the `develop` command to `setup.py`, or the `-e` option to `pip`. | ||
|
||
*TODO: Note that "editable" install is not (yet) standard and may even go away in the future.* | ||
Can you add a reference for this?
From what I know calling setup.py directly (for install, test, etc.) is deprecated. Pip comes bundled with Python 3.4+ so let's stick with pip. And as a side note from what I learned, Poetry install your local package in editable mode as default, CMIIW. @eirrgang could you elaborate more what does it mean by "not (yet) a standard"? From what I know there are lots of way to develop and test your project depending on what kind of project that you working on. For example in Web app framework like Django you can use the demo server. Or you can also directly deploy your code to a docker container every time you make the change. But I think installing in editable mode gives the freedom to check the API, or function/module without having to uninstall and install the package that we develop every time we make the change, which is very practical.
There have been rumblings about deprecating non PEP517/518 behaviors, and I was thinking of the note at https://setuptools.readthedocs.io/en/latest/setuptools.html#setup-cfg-only-projects But it is probably just something to keep an eye on, and doesn't warrant an update to the material at this time.
@eirrgang Yes this is very true, I've been reading Bernat Gabor's posts and watching some of his videos related to PEP517/518. From what I know there is a traction to move from As for now, IMHO I think it is still best to go with pip install as the PEP517/518 implementation is still work in progress and yet to be matured. Edit note: added Pyproject.toml reference on Stackoverflow |
||
|
||
#### Reviewing `setup.py` | ||
Return to the top directory (`molecool`). One of the files CookieCutter generated is a `setup.py` file. `setup.py` is the build script for [setuptools]. It tells setuptools about your package (such as the name and version) as well as which code files to include. We'll be using this file in the next section. | ||
|
||
#### Installing your package | ||
A developer install will allow you to import your package and use it from anywhere on your computer. You will then be able to import your package into scripts in the same way you import `matplotlib` or `numpy`. | ||
A development install will allow you to import your package and use it from anywhere on your computer. You will then be able to import your package into scripts in the same way you import `matplotlib` or `numpy`. | ||
|
||
A local install uses the `setup.py` file to install your package by inserting a link to your new project into your Python site-packages folder. To find the location of your site packages folder, you can check your Python path. Open Python (type `python` into your terminal window), and type | ||
A development installation uses the `setup.py` file to install your package by inserting a link to your new project into your Python site-packages folder. To find the location of your site-packages folder, you can check your Python path. Open Python (type `python` into your terminal window), and type | ||
|
||
*TODO: update.* | ||
~~~ | ||
>>> import sys | ||
>>> sys.path | ||
~~~ | ||
{: .language-python} | ||
|
||
This will give a list of locations python looks for packages when you do an import. One of the locations should end with `python3.7/site_packages`. The site packages folder is where all of your installed packages for a particular environment are located. | ||
This will give a list of locations Python searches for packages when you do an import. One of the locations should end with `python3.7/site-packages`. The site-packages folder is where all of your installed packages for a particular environment are located. | ||
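Rather than scanning `sys.path` by eye, you can ask Python where it installs pure-Python packages. This is a small sketch using the standard library `sysconfig` module; the variable names are ours, and on some platforms the folder is named differently (for example `dist-packages` on some Linux distributions), so treat the output as a hint rather than a guarantee.

```python
import sys
import sysconfig

# "purelib" is the install location for pure-Python packages
# in the current environment.
site_packages = sysconfig.get_paths()["purelib"]

# Entries of sys.path that look like a site-packages folder.
candidates = [p for p in sys.path if p.endswith("site-packages")]

print("install location:", site_packages)
print("matching sys.path entries:", candidates)
```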
|
||
To do a local install, type | ||
To do a development mode install, type | ||
|
||
~~~ | ||
$ pip install -e . | ||
~~~ | ||
{: .language-bash} | ||
|
||
Here, the `-e` indicates that we are installing this project in 'editable' mode (i.e. setuptools "develop mode"), while `.` indicates to install from the local directory (you could also specify a path here). Now, if you examine the contents of your site packages folder, you should see a link to `molecool` (`molecool.egg-link`). The folder has also been added to your path (check `sys.path` again.) | ||
Here, the `-e` indicates that we are installing this project in *editable* mode (i.e. setuptools [*development mode*](https://setuptools.readthedocs.io/en/latest/userguide/commands.html#develop-deploy-the-project-source-in-development-mode)), while `.` indicates to install from the local directory (you could also specify a path here). Now, if you examine the contents of your site packages folder, you should see a link to `molecool` (`molecool.egg-link`). The folder has also been added to your path (check `sys.path` again.) | ||
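One way to check where an installed package is actually loaded from is to ask the import system directly. The sketch below uses `importlib.util.find_spec` from the standard library; the helper name `where_is` is hypothetical, and `json` is used as a stand-in because it is always available. After an editable install, calling `where_is("molecool")` should point into your project folder rather than into site-packages, although the exact mechanism depends on your versions of `pip` and setuptools.

```python
import importlib.util


def where_is(package_name):
    """Return the file a top-level package would be loaded from, or None.

    None is returned when the package cannot be found. Namespace
    packages have no single origin file, so they may also give None.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return None
    return spec.origin


# A standard library package is always importable.
json_origin = where_is("json")

# A name that is (presumably) not installed anywhere.
missing_origin = where_is("no_such_package_xyz")

print(json_origin)
print(missing_origin)
```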
|
||
Now, we can use our package from any directory, similar to how we can use other installed packages like `numpy`. Open Python, and type | ||
|
||
*TODO: Consider using doctest-compliant examples (with expected output).* | ||
|
||
~~~ | ||
>>> import molecool | ||
>>> molecool.canvas() | ||
|
@@ -295,6 +307,8 @@ This should work from anywhere on your computer. | |
> {: .solution} | ||
{: .challenge} | ||
|
||
*TODO: Consider removing, move to a separate lesson, mention in the context of an existing package, or just cite Python Packaging Guide for optional components.* | ||
|
||
Optional dependencies can be installed as well with `pip install -e .[docs,tests]` | ||
|
||
|
||
|
I'm not sure about this to do item. You can see the information the Python documentation has on modules here - https://docs.python.org/3/tutorial/modules.html
I think the definition given here is correct and also appropriate for the level intended. Can you clarify what you'd like to discuss in a rewrite?
I was thinking of https://docs.python.org/3/glossary.html#term-module vs. https://docs.python.org/3/glossary.html#term-package
and https://docs.python.org/3/reference/import.html#regular-packages
Further muddying the waters are overloaded usages like https://packaging.python.org/glossary/#term-Distribution-Package and related terms.
There is a relationship between the filesystem layout and the import system that is really important to convey clearly and concisely. But it also seems important to recognize that what I `import` is a "module" (optionally, something nested in a modular namespace), whether or not that module (or submodule) is implemented as a "package".
It makes total sense to cite a more thorough outside reference like that tutorial, as long as the material presented in this lesson doesn't introduce confusion with respect to terminology at python.org.
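The distinction above can be sketched with the standard library: whatever `import` returns is a module object, and a package is a module that also carries a `__path__` attribute. The helper name `is_package` is ours, and `json` is used only because it ships with Python and happens to be implemented as a package.

```python
import types

import json           # implemented as a package (a directory with __init__.py)
import json.decoder   # a plain submodule inside that package


def is_package(module):
    """A package is a module that has a __path__ attribute."""
    return hasattr(module, "__path__")


# Both imports yield ordinary module objects.
both_are_modules = isinstance(json, types.ModuleType) and isinstance(
    json.decoder, types.ModuleType
)

print(both_are_modules)
print(is_package(json))
print(is_package(json.decoder))
```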
@eirrgang I'm sorry I can not fully grasp your idea. Maybe you could show a simple example of how the following Package and Module box should be changed?
In my opinion this explanation already fulfills its purpose, as it is just a short explanation (or keynote). And the importable entity and scoping is there to explain how your module is used in real life, so the student can understand the connection between what they create and the consequences when they use it. Surely there is more than one way to define `Python package`, but IMHO making it too comprehensive could lead to confusion among students.
Certainly, that is a concern.
It is worth a try to see if a few words could be chosen more carefully. I'll give it a try at some point. Maybe the best and easiest thing is to just review 6-6.1 of the Python hosted tutorial (though that doc might get updated less frequently than this workshop material. ;-) )
I'll try out this Jekyll thing, though. Maybe some side-bars would ease my concerns.
Yes, what is so great about this workshop material is that it reflect the cms-cookiecutter. For example there are 2 notable changes in cms-cookiecutter since last year, the first one is switch autoformatting from yapf to black, and second one is the CI migration from Travis to Github Action. And this workshop material keep up with those changes. Which is awesome.
Yes, I can relate to the lack of sidebar, and it is hard to get used to. But it uses the template from Software Carpentry, and I found that it is not simple to edit the appearance.