Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow http file to be closed early #5405

Merged
merged 1 commit into from
Oct 17, 2024

Conversation

tom-seqera
Copy link
Contributor

@tom-seqera tom-seqera commented Oct 15, 2024

Proof of concept for #5360

This uses Java's OpenOption mechanism to pass a configuration param to the XFileSystemProvider, allowing a script to declare that it doesn't care if the input stream is closed before being fully read:

import nextflow.file.http.HttpOpenOption

def firstLine(path) {
    path.withInputStream(HttpOpenOption.ALLOW_EARLY_CLOSE) {
        def r = new BufferedReader(new InputStreamReader(it))
        return r.readLine()
    }
}

This approach is quite flexible and generic, since it would also allow other OpenOptions to be passed to other FileSystemProviders, and as a result might be a little in conflict with the formalisation of a nextflow script syntax. An alternative could be to add a single-purpose method to XPath, eg:

InputStream newLenientInputStream() {
   ...
}

Copy link

netlify bot commented Oct 15, 2024

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 6be448b
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/6710edfd872cae0007c8f0f3

@pditommaso
Copy link
Member

The problem I see in this approach is that only works for withInputStream, there are plenty of other ways to open a stream.

What about if we something much more crazy, for example when opening (or closing the connection) inspect the current stack trace and determine the request is coming from a file path operation.

@pditommaso
Copy link
Member

Ok, forget about the stack trace thing, because still does not solve the problem (you may still want to read partially a stream via a Path).

I think instead, we can apply something very similar to what you are suggesting using a reverse logic, i.e. enforcing the size check only when file transfer operation are made (not for generic stream operations).

All file copies are made by the FilePorter here .

The fileCopy take some options that could be extend as you have implemented.

What do you think?

@tom-seqera tom-seqera force-pushed the 5360-fixedinputstream-early-close branch from 33458a4 to 6830712 Compare October 16, 2024 14:46
@tom-seqera
Copy link
Contributor Author

I tried reversing the logic, so it only applies the check on file copy.

It generally seems to work, but the Jimfs handler doesn't like it very much, and throws a java.lang.UnsupportedOperationException here - so we'd potentially have to add logic so the option is only added for http/ftp inputstreams

@pditommaso
Copy link
Member

If jimfs is blocking we could also consider something more dirty as a thread local flag

@pditommaso
Copy link
Member

@tom-seqera any eta or blocking issue for this ?

@tom-seqera tom-seqera force-pushed the 5360-fixedinputstream-early-close branch from 6830712 to a46231f Compare October 17, 2024 10:21
@tom-seqera
Copy link
Contributor Author

I managed to make it work by implementing copy() on the XFileSystemProvider. It's not the cleanest because it relies on provider exceptions to determine their supported operations. I'm not sure if it's better or worse than a thread local.

@pditommaso
Copy link
Member

Even if the OpenOption is a nice solution, I think it would be better to not change the copy implementation to make it work with a testing library. A thread local could be a better hack here

@tom-seqera tom-seqera force-pushed the 5360-fixedinputstream-early-close branch from a46231f to 6be448b Compare October 17, 2024 10:59
@pditommaso
Copy link
Member

Neat! likely "Ready for review" ?

@tom-seqera tom-seqera marked this pull request as ready for review October 17, 2024 12:22
@pditommaso pditommaso merged commit 718dcbe into master Oct 17, 2024
22 checks passed
@pditommaso pditommaso deleted the 5360-fixedinputstream-early-close branch October 17, 2024 13:13
alberto-miranda pushed a commit to alberto-miranda/nextflow that referenced this pull request Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants