add dataproxy access to ebrains_drive #15
@apdavison or @appukuttan-shailesh I have tentatively added dataproxy support in ebrains-drive. Is there any appetite to incorporate this functionality into ebrains-drive, or would it be more suitable to create a new package, in your opinion? |
I personally think it makes sense to include it here. (I myself was intending to implement this at some point; so I am happy to see this contribution! Thanks @xgui3783 ) |
We will also add support for HDG datasets in this PR in future commits. I will convert this PR to a draft for the time being. Please feel free to provide feedback all the same! I will ping you all when the PR is ready for review. |
@i-Zaak I have pushed a commit that should add support for HDG (see the README example). I would welcome your review of the PR, as a user as well. Whilst not in scope, would support for public {collab,dataset} buckets be something we are interested in? |
Hello, thanks for these developments. We just need to ensure that the lib will work properly on the different environments (prod, int, dev); putting dataproxy and buckets on int too is under discussion. |
Would be lovely if so. I don't think I have access to the -int/-dev environments of data-proxy, so currently it will raise an exception if an env other than the empty string is passed. |
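For illustration, the behaviour described above could look roughly like the sketch below. The function name, the constant and the host are all hypothetical placeholders and are not taken from ebrains-drive; the only thing assumed from the discussion is that the empty string means production and that anything else currently raises.

```python
# Hypothetical sketch: names and the host below are illustrative only,
# not the actual ebrains-drive implementation.
SUPPORTED_ENVS = {""}  # only production (empty string) is reachable for now


def dataproxy_host(env: str = "") -> str:
    """Return a data-proxy host for the given environment suffix."""
    if env not in SUPPORTED_ENVS:
        raise NotImplementedError(
            f"data-proxy environment {env!r} is not supported yet"
        )
    # Placeholder host; substitute the real endpoint.
    return "data-proxy.example.org"
```

Once -int/-dev deployments exist, extending the set (or mapping each env to its own host) would be the only change needed.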
Were you able to try your change, and do you want a new release for it? |
@xgui3783: I notice that you have provided some examples in the README.md file. But could you also list out all the functionality available inside doc.md? |
Also, I do not use buckets much, so probably best to have someone who actively uses the same to test this out. Thanks again. |
I will add some doc there. I did not see
I believe @i-Zaak uses the buckets? He promised me that he would check it once he is back from vacation =)
I will make some update the |
Dear @appukuttan-shailesh @Cracky5457, let's wait for feedback from @i-Zaak on the functionality (or the lack thereof) of buckets. |
Hey, sorry for the delays. I've tested the Buckets and datasets and I think this is definitely going in the right direction! I've tested
PS: there is a missing dependency for |
Re: missing dependency, that was there before I got here =S I will investigate the HDG issue again. Thanks for spending time on this! Greatly appreciated! |
I have had a chance to look into the code again, re @i-Zaak 's comments:
Unfortunately, I am not able to reproduce this, but I have a few suspicions as to how it could have happened:
I added a new commit that checks token expiry and will raise if the token has expired. Unfortunately, there's not much I can do about the second one oO
Certainly reasonable. I would recommend adding this as an iterative improvement in a separate PR. I can imagine we could even create a directory abstraction for data proxy buckets.
I will need a little help with this one. Should we use the content type to decide how to decode the file, or the file extension, or a mixture? I feel much more comfortable letting the user of the library determine how the files should be parsed. Another approach may be to maintain a list of lambda functions that, given a file handle, know how to decode the file content.
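To make the "list of lambda functions" idea concrete, here is a minimal sketch keyed on file extension. `DECODERS` and `decode_content` are hypothetical names, not part of ebrains-drive, and the two registered formats are only examples. Unknown extensions fall back to raw bytes, so the user of the library still decides how to parse anything not registered.

```python
import json
import os
from typing import Callable, Dict

# Hypothetical registry: maps a lowercase file extension to a decoder.
DECODERS: Dict[str, Callable[[bytes], object]] = {
    ".json": lambda raw: json.loads(raw.decode("utf-8")),
    ".txt": lambda raw: raw.decode("utf-8"),
}


def decode_content(name: str, raw: bytes, decoders=DECODERS):
    """Decode raw bytes with the decoder registered for the file extension.

    Returns the bytes unchanged when no decoder is registered,
    leaving the parsing to the caller.
    """
    ext = os.path.splitext(name)[1].lower()
    decoder = decoders.get(ext)
    return decoder(raw) if decoder else raw
```

A caller could pass their own mapping via `decoders=` to add or override formats; a content-type based lookup would follow the same shape with a different key.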
ping @appukuttan-shailesh, I think this dependency is in the original seafile python client? I think it is only used when building the PyPI artefact and is not needed otherwise? Still, this causes issues when a user tries to install the package via git, for example. Could we explicitly add it as a dep? |
I believe this dependency was added by Axel here. I think we can/should add it to the deps to avoid such problems. |
I just revisited my experiments and it looks like "token expiring at the wrong time" was the cause. However I'm suspicious that the "wrong time" might not be completely independent of the interaction with HDG. Is the token maybe reset after the user is granted access to a HDG dataset (wasn't able to reproduce)? Is there a way to get some info on the current token (e.g. expiry date etc.)? |
The latest commit now also checks the token expiry and will raise if the token is expired. Note that this does not check the authenticity of the token, simply its expiry. I don't think new tokens are generated pre or post granting access to HDG. The access is controlled purely at the dataproxy level, assuming a valid token is presented. |
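For anyone wanting to inspect a token's expiry by hand, a rough sketch follows. It assumes the token is a JWT carrying a standard `exp` claim; the function names are hypothetical and this is not the code from the commit. As with the check described above, nothing here verifies the signature.

```python
import base64
import json
import time


def token_expiry(token: str) -> float:
    """Return the `exp` claim (seconds since the epoch) of a JWT, unverified."""
    try:
        payload_b64 = token.split(".")[1]
    except IndexError:
        raise ValueError("token does not look like a JWT") from None
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return float(claims["exp"])


def ensure_not_expired(token: str, now=None) -> None:
    """Raise if the token's `exp` claim is in the past (signature not checked)."""
    now = time.time() if now is None else now
    if token_expiry(token) <= now:
        raise RuntimeError("token has expired")
```

Subtracting `time.time()` from `token_expiry(token)` gives the remaining lifetime in seconds.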
My bad - I've seen the code, but misread it. I've seen that the tokens have a ~2.5h lifetime, so it is easy to get caught out if one plays with the code for a while... |
@xgui3783, @i-Zaak : Token life of ~2.5 hours seems unusually short! I don't have much experience using the dataproxy... does it use the same token as would apply for the SeaFile client? i.e. would a token obtained via the @Cracky5457 : is there a known workaround for getting longer duration tokens? |
To correct myself, the expiry is actually 30m; I was getting the token for testing from https://kg-editor.humanbrainproject.eu/ (bad habit from previous times). A token from the Collaboratory is substantially more durable (almost a week). |
side note: it is probably feasible to check the scope check on
Token expiry depends entirely on how long it is requested. kg creates tokens of short expiry. Since the client does not fetch tokens, there is nothing much I can do. |
From what I remember:
    client = ebrains_drive.connect('hbp_username', 'password')
The same should be possible for |
IMO, authentication is out of scope for this PR. But since the auth logic has been moved to ClientBase, I believe the following code should work:
    from ebrains_drive import BucketApiClient
    client = BucketApiClient(username="user", password="password")
I did not modify the |
That's fine. The above is what I was proposing. Could @i-Zaak check if the token expiration problem persists when the token is fetched via the above method, i.e. login using credentials (username, password), as opposed to providing the token obtained from elsewhere. |
Hello, yes - the token obtained through username and password has a much longer lifetime (a week-ish). For my use case, I'm mainly looking into using the tokens obtained within the Collaboratory ( And one more thing - to be able to call
I had to move around the
|
haha oops... Lemme fix that 😅 |
Just pushed a commit that should fix the iam_url issue |
That's great... as good as it gets, I guess. So the token issue seems resolved 👍 |
@appukuttan-shailesh and @Cracky5457, is there anything else we are waiting on for this PR? I would love to see it merged and, hopefully, a new PyPI package published, so we can patch our dependency appropriately. |
All ok from my side (I haven't had a chance to test it, but @i-Zaak seems to have tried it out sufficiently). |
Thanks to all of you, version 0.5.0 is released. I hope it works well ^^ |