Replace data space with workspace in docstrings #743
Comments
What are the differences in semantics between "project workspace" and "project data space"? Or are they synonyms?
Yeah, my proposed change is intended to address this problem in a slightly different way. Essentially, both a Job and a Project are directories. A directory has a path. Therefore, both of them should have a path, which fixes the analogy. The concept of a workspace is a little more specific, relating to the exact directory layout currently used by signac. The data model can be roughly described as "A root directory, which we call a Project, contains a subdirectory called its workspace. That workspace directory in turn contains one subdirectory per data point, each of which is called a Job." A Job therefore does not have a workspace. In fact, the solution that you proposed (allowing jobs to themselves contain workspaces) is precisely what @csadorf and I were trying to get at when we discussed the long-term roadmap and I made the case for both Job and Project subclassing a generic Directory! Both Job and Project are Directories, so they have a path, and that is independent of a particular layout. A given Project needs to have a well-defined layout, which is a higher-level concept that currently encompasses the workspace as well. By encoding that layout in a standalone "data model" concept, we would allow users to define different data layouts such as the nesting that you proposed. The |
@joaander I've found 2 definitions of "data space" in the docs:
@vyasr I made the connection between fixing the double meaning of workspace and your future "data model" after thinking about how to clarify the definition of workspace. I wrote out the future directory structure to show myself how clarifying the definition helps resolve some of my confusion around your idea. I will clarify my initial example that I was applying "my proposal" to the idea I had heard you discuss. I think we are mostly on the same page!
What's a "data model"? I prefer the other term you use, "data layout". But I could see other options too, like "project structure/template/layout" or "file/directory layout". You use "file layout" in the roadmap.
Inheritance relationships like In that sense, a Project or Job “is-a” Directory under the proposed class hierarchy. |
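To make the "is-a" relationship concrete, here is a minimal sketch of the proposed hierarchy. It is not signac's actual implementation: the class bodies, the `workspace_name` attribute, and the `job()` helper are hypothetical names chosen for illustration, and the layout (one `workspace` subdirectory holding one subdirectory per job) is assumed from the description earlier in this thread.

```python
from pathlib import Path


class Directory:
    """Hypothetical base class: anything that lives at a path on disk."""

    def __init__(self, path):
        self._path = Path(path)

    @property
    def path(self):
        # The path identifies the directory; it is independent of any layout.
        return self._path

    def fn(self, filename):
        # Build the path of a file stored inside this directory.
        return self._path / filename


class Job(Directory):
    """A Job is a Directory holding one data point; it has no workspace."""


class Project(Directory):
    """A Project is a Directory that also knows its data layout."""

    # Assumed layout: jobs live under a fixed subdirectory of the project.
    workspace_name = "workspace"

    @property
    def workspace(self):
        return self.path / self.workspace_name

    def job(self, job_id):
        # Hypothetical helper: one subdirectory of the workspace per job.
        return Job(self.workspace / job_id)


project = Project("my_project")
job = project.job("abc123")
```

In this sketch both objects answer `.path` and `.fn(...)` the same way, while only the Project carries the layout-specific notion of a workspace.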
Fumbled buttons on my phone. Reopening. edit: … twice.
Thank you for clarifying that!! I'll add a note about it to my comment but preserve my expressed confusion.
I brought this up as it became an issue when writing the workflow tutorial for hoomd: https://hoomd-blue.readthedocs.io/en/v3.0.1/tutorial/05-Organizing-and-Executing-Simulations/01-Organizing-Data.html The signac tutorials use the word "data space" a lot, so I introduced that concept first. But then signac mandates that the directory name is "workspace". It is confusing for users (especially new users) when more than one word describes the same thing. If they are the same, it would be good to use only one: "workspace", since that is the required directory name. If they are different, then they need to be defined clearly and used consistently.
Totally agree! Issue tracking glossary: glotzerlab/signac-docs#59
@cbkerr could you update this issue in case there were any important/useful/relevant points made in the meeting today that you think would help contribute to this discussion?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@stale-bot this is not ready to be closed. This should remain open because "job workspace" still returns many hits in the next branch.
I made a more specific issue to track usage of "data space": #809
@cbkerr any activity (including your comment) will cause stalebot to remove the |
@cbkerr could you revisit this now and see what you would like to change? IIUC the remaining action item is to remove all references to a job's "workspace" in docs in favor of a job's "directory" or the "path to a job" or something along those lines, is that correct? Would you be able to make that change?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
All references to job workspace will be gone after glotzerlab/signac-docs#185. I'm changing the name of the issue to better track that we need to resolve this comment: #743 (comment)
New focus is on pinning down what "data space" means
#743 (comment)
Original issue description
Summary
Consider the following analogy: "The directory of the job's workspace is to job as the directory of the project's workspace is to project."
It is currently false!! Fixing this would break things, making it a good candidate for 2.0.
The fix would make the following analogies true:
Problem Details
What we have now is: "Directory of the job's workspace is to job as directory of the project is to project."
(If your head is spinning like mine was, read those again after reading the rest of the issue)
Some example usages in the documentation:
Here is an illustration of the problem. When developing dashboard, to display an image from a job or project, you need to get the job or project directory in a general way.
The way I found to do this was job_or_project.fn(""), because currently the separate syntax is job.workspace() or project.root_directory(). Both are aliased to .path().
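The workaround above can be sketched without installing signac. The two classes below are hypothetical stand-ins, not the real signac classes; they only mimic the assumed behavior that fn(name) joins the object's directory with a file name. The helper name directory_of is also made up for this illustration.

```python
import os


class FakeJob:
    """Hypothetical stand-in for a signac Job (not the real class)."""

    def __init__(self, workspace_dir, job_id):
        self._dir = os.path.join(workspace_dir, job_id)

    def fn(self, filename):
        # Assumed behavior: join the job directory with a file name.
        return os.path.join(self._dir, filename)


class FakeProject:
    """Hypothetical stand-in for a signac Project (not the real class)."""

    def __init__(self, root):
        self._dir = root

    def fn(self, filename):
        # Assumed behavior: join the project directory with a file name.
        return os.path.join(self._dir, filename)


def directory_of(job_or_project):
    # fn("") yields the directory itself (with a trailing separator);
    # normpath removes that separator so both cases look the same.
    return os.path.normpath(job_or_project.fn(""))


project = FakeProject("my_project")
job = FakeJob(os.path.join("my_project", "workspace"), "abc123")
```

Here directory_of works for either object because it relies only on the shared fn method, which is the point of giving both a common way to ask for their directory.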
Solution
Replace job.workspace() with "job directory", or job.path() (also accessible with job.fn("")). The job directory is a directory containing files associated with a signac job.
Currently, job.workspace() is an alias for the job path. I prefer writing "job directory" rather than "job path" in the documentation, even if you would write job.path() in the code, because a directory is a container, which is a distinct concept from the path that identifies the container. This deprecation is announced in Additional deprecations #685.
Schematic
Benefits
Signac roadmap for context
I then realized that @vyasr already mentioned this idea in the tentative signac roadmap coming at it from a different angle. I think that means it's a good time to open a focused discussion on it. He suggested:
Does this writeup capture your idea @vyasr?