api: create documentation #463

Closed
2 tasks
jorgeorpinel opened this issue Jun 27, 2019 · 25 comments · Fixed by #908
Labels
A: docs Area: user documentation (gatsby-theme-iterative) p0-critical Affects users in a bad way at the moment type: enhancement Something is not clear, small updates, improvement suggestions

Comments

@jorgeorpinel
Contributor

jorgeorpinel commented Jun 27, 2019

See src @ https://github.com/iterative/dvc/blob/master/dvc/api.py

dvc.api.get_url
dvc.api.open
dvc.api.read

Please add more insights here. More details in #463 (comment).

Also, please update the one mention of the API in the data registry (which will be merged with #818) per #818 (comment).

UPDATE:

@shcheklein shcheklein added A: docs Area: user documentation (gatsby-theme-iterative) type: enhancement Something is not clear, small updates, improvement suggestions p1-important Active priorities to deal within next sprints labels Jun 28, 2019
@shcheklein
Member

@Suor can you give a summary, or is there a link? We should probably put docstrings around the APIs before we release them.

@Suor
Contributor

Suor commented Jul 23, 2019

They have some short docstrings, I will update them based on future docs or discussion here if we decide to do that.

So, what is this about? There are three public things in dvc.api now:

  1. read(path, repo=None, rev=None, remote=None, mode="r", encoding=None) - returns the contents of an artifact as a bytes object or a string.
  2. get_url(path, repo=None, rev=None, remote=None) - returns a URL of an artifact.
  3. open(path, repo=None, rev=None, remote=None, mode="r", encoding=None) - opens an artifact as a file; may only be used as a context manager:
    with dvc.api.open("path/to/data.csv", remote="my-s3", encoding="utf-8") as f:
        for line in f:
            process(line)
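
To illustrate the "context manager only" rule for open(), here is a minimal sketch built on contextlib. Note that open_artifact is a hypothetical stand-in that wraps an in-memory string; it is not DVC code and makes no claim about how dvc.api.open is implemented.

```python
import contextlib
import io


@contextlib.contextmanager
def open_artifact(text):
    """Hypothetical stand-in for dvc.api.open: yields a file-like object."""
    fd = io.StringIO(text)
    try:
        yield fd
    finally:
        # The resource is released as soon as the with block exits.
        fd.close()


# Intended usage: iterate over the file inside a with block.
with open_artifact("a,b\n1,2\n") as f:
    lines = [line.rstrip("\n") for line in f]

# Called without "with", the result is a context manager, not a file,
# so it has no read() method to call directly.
manager = open_artifact("unused")
has_read = hasattr(manager, "read")
```

With this shape, cleanup is tied to leaving the with block, which is one plausible reason to allow only that usage.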

Arguments always mean the same:

path - a path to an artifact, relative to the repo root,
repo - a path or Git URL of a repo,
rev - a revision, i.e. a branch, a tag, or a SHA. This only works when repo is a URL,
remote - the name of a remote to fetch the artifact from or to give a URL for,
mode - the mode in which we open a file; the only sensible options are r/rt and rb,
encoding - the encoding used to decode contents to a string.

mode and encoding mirror the parameters of the same names in the builtin open().
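
A small sketch of what mirroring the builtin open() means for the return type. Here read_local is a hypothetical local stand-in that reads from the filesystem; it only assumes that mode and encoding are passed through unchanged, as described above, and it is not the dvc.api.read implementation.

```python
import os
import tempfile


def read_local(path, mode="r", encoding=None):
    """Hypothetical stand-in for dvc.api.read that reads a local file."""
    with open(path, mode=mode, encoding=encoding) as fd:
        return fd.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")
    with open(path, "wb") as fd:
        fd.write("café\n".encode("utf-8"))

    # Text mode decodes with the given encoding and returns a str.
    as_text = read_local(path, mode="r", encoding="utf-8")
    # Binary mode skips decoding and returns bytes.
    as_bytes = read_local(path, mode="rb")
```

In text mode the result is a str decoded with the given encoding; in binary mode it is the raw bytes, and encoding must be left as None.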

@shcheklein
Member

shcheklein commented Jul 23, 2019

k, thanks @Suor. @naba7 now we need to come up with a good place and a format for it. Probably, we need a separate top-level section. API reference similar to command reference we have.

@dnabanita7
Contributor

How about a format like the one used by AngularJS, Microsoft, or Azure?

We can use one of these formats and write an introductory page listing all the APIs and linking them to GitHub.

@jorgeorpinel
Contributor Author

jorgeorpinel commented Jul 27, 2019

@Suor are repo=None, rev=None, remote=None default values actually None? Or do they get turned into repo='.', rev='HEAD', remote=(read from config file)? Probably important to document (both in docstring and) in the API ref.

@shcheklein re

Probably, we need a separate top-level section. API reference similar to command reference we have.

Agreed, perhaps in the docs path /api-reference, and the index page for that section could explain what the API is and how to start using it. Actually, it's not that obvious! I'm not 100% sure what we mean by "the DVC API", for example. Is it a Python library people can install separately?

$ pip install dvc
...
$ python
...
>>> from dvc import api as dvcapi
>>> dvcapi
<module 'dvc.api' from '/.../dvc/dvc/api.py'>
>>> # etc

@shcheklein
Member

that's right. I would say that it's not separate though, it's the same DVC package.
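
Since the API ships inside the same dvc package, a script could check whether it is importable before relying on it. This is a minimal sketch using only the standard library; the helper name dvc_api_available is made up for illustration.

```python
import importlib.util


def dvc_api_available():
    """Return True if the dvc package, and with it dvc.api, can be found."""
    # find_spec returns None when the top-level package is not installed.
    if importlib.util.find_spec("dvc") is None:
        return False
    return importlib.util.find_spec("dvc.api") is not None


available = dvc_api_available()
print("dvc.api available:", available)
```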

@dnabanita7
Contributor

dnabanita7 commented Jul 28, 2019

@shcheklein Okay. I get it now. I thought we needed to download the DVC API separately.
@jorgeorpinel repo is also an API; we should include it in the intro.
And

  1. ... some of the core functions of DVC such as add, push, pull, commit, checkout, etc., ...

  2. Writing one-liners for read, open, get_url, repo(?) such as:

  • read[link] - returns the contents of an artifact as a bytes object or a string.
  • get_url[link] - returns a URL of an artifact.
  • open[link] - opens an artifact as a file.
  • repo[link] - assigns the dvc-root-directory that is used in Python scripts.
  3. Follow-up pages may contain a detailed description followed by examples. What more can we include here?

@jorgeorpinel
Contributor Author

jorgeorpinel commented Jul 28, 2019

I think that's good enough to start a PR. Please let us know, thanks!

@Suor
Contributor

Suor commented Jul 29, 2019

@jorgeorpinel

I would say simply use import dvc.api instead of from dvc import api as dvcapi, more straightforward and almost the same length:

import csv
import pickle
import dvc.api

# Loading from content (binary mode, since pickle.loads() needs bytes)
model = pickle.loads(dvc.api.read("some-model.pkl", repo="https://github.com/...", mode="rb"))

# Loading using file descriptor
with dvc.api.open("dataset.csv", repo=...) as fd:
    reader = csv.reader(fd)
    for row in reader:
        ...  # process each row

# Obtaining a URL
resource_url = dvc.api.get_url("path/to/resource.ext", repo=..., remote="s3")
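
One detail worth spelling out for the pickle case: pickle.loads() accepts only bytes-like input, so a pickled artifact has to be read in binary mode ("rb") rather than the default text mode. The sketch below shows this with an in-memory payload instead of a DVC artifact; the dictionary is made-up sample data.

```python
import pickle

# Stand-in for the raw bytes of a pickled model file.
model_bytes = pickle.dumps({"weights": [0.1, 0.2]})

# Bytes input works.
restored = pickle.loads(model_bytes)

# A str, which is what a text-mode read returns, is rejected.
try:
    pickle.loads(model_bytes.decode("latin-1"))
    text_input_accepted = True
except TypeError:
    text_input_accepted = False
```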

@Suor
Contributor

Suor commented Jul 29, 2019

@naba7 I would start with some Usage section, with short and most common examples, then continue with complete API listing.

Or another layout: Install, Usage, Methods sections. Then each method goes on its own page linked from the Methods section, with a full description of its operation and params, plus more examples. The point is making it glanceable and copy-pastable, while providing all the ins and outs too.

@dnabanita7
Contributor

dnabanita7 commented Jul 30, 2019

I think the layout of Install, Usage, Methods, with a description of each method, is better.
@shcheklein @jorgeorpinel If you agree to this, I will start working on it.
@Suor I don't quite understand what you mean by "copy-pastable, while providing all the ins and outs too."
Since you don't need to install any other package, we can mention that in one line and link to the DVC install instructions.
So, starting with the Usage section for now.

@shcheklein
Member

@naba7 yep, I like the idea. So, we can start with three levels:

Python API is the topmost level.

It includes Install, Usage, Method Reference.

Method Reference includes one page per method with a simple example. And we need to discuss the structure for it.

@jorgeorpinel any thoughts on this?

@dnabanita7
Contributor

I am sorry. I won't be able to work further on this PR.

@shcheklein
Member

@naba7 np! thank you for all your contributions ;)

jorgeorpinel added a commit that referenced this issue Dec 11, 2019
move diagram lower
For PR #818
but also related to #463
@jorgeorpinel jorgeorpinel changed the title from "document DVC API" to "api: create documentation" Dec 12, 2019
@jorgeorpinel jorgeorpinel added p0-critical Affects users in a bad way at the moment and removed p1-important Active priorities to deal within next sprints labels Dec 19, 2019
jorgeorpinel added a commit that referenced this issue Jan 8, 2020
@jorgeorpinel jorgeorpinel self-assigned this Jan 8, 2020
4 participants