task object does not work with custom Groovy functions inside of process directives #4215
Comments
At the risk of making this Issue thread too verbose, I also wanted to note these other inconsistent cases of using task, with a modified pipeline that looks like this;
main.nf
// $ nextflow run main.nf --queue my-aws-queue -work-dir "s3://my_bucket"
nextflow.enable.dsl=2
params.resourceLabels = [fooKey:"barValue"]
include { BAR } from './modules/bar.nf'
workflow {
    BAR("Sample1")
}
nextflow.config
params {
    queue = null
}
aws {
    region = 'us-east-2'
    batch {
        cliPath = "/home/ec2-user/miniconda/bin/aws"
    }
}
process {
    executor = 'awsbatch'
    queue = params.queue
}
manifest {
    name = 'workflow-introspection-demo'
    author = 'Stephen Kelly'
    description = 'Demo workflow script'
    mainScript = 'main.nf'
}
lib/Utils.groovy
class Utils {
    public static String customMessage (String label) {
        return "customMessage-from-${label}"
    }
    public static Map customTaskLabels (nextflow.processor.TaskConfig task) {
        def newLabels = [
            pipelineCustomKey: "customValue"
        ]
        newLabels = newLabels + [pipelineProcess: task.process]
        return newLabels
    }
    public static Map customMapLabels (Map taskLabel) {
        def newLabels = [
            pipelineCustomKey: "customValue"
        ]
        newLabels = newLabels + taskLabel
        return newLabels
    }
}
modules/bar.nf
process BAR {
    container "ubuntu:latest"
    // resourceLabels params.resourceLabels // THIS WORKS
    // resourceLabels pipelineTask: task.process // ERROR ~ No such variable: task
    // resourceLabels pipelineTask: {task.process} // Unable to marshall request to JSON: MarshallingType not found for class class Script_9d318d7e$_runScript_closure1$_closure2
    // resourceLabels pipelineTask: "${task.process}" // THIS WORKS
    // resourceLabels params.resourceLabels + [pipelineTask: "${task.process}"] // THIS WORKS
    // resourceLabels Utils.customTaskLabels(task) // ERROR ~ No such variable: task
    // resourceLabels Utils.customMapLabels([barProcessKey:"barProcessValue"]) // THIS WORKS
    // resourceLabels Utils.customMapLabels([barProcessKey:task.process]) // ERROR ~ No such variable: task
    resourceLabels Utils.customMapLabels([barProcessKey:"${task.process}"]) // THIS WORKS
    input:
    val(id)
    script:
    println Utils.customTaskLabels(task) // THIS WORKS
    """
    """
}
I have added two new methods here, and I included some notes there about combinations that do and do not work, notably;
So this makes the whole thing even more confusing, in that sometimes you can access the |
Hi @stevekm, thank you for bringing up this issue. It confused me for a long time and I'm only now understanding it as a result of studying the codebase.
The short answer is to wrap the failing expressions in a closure:
tag { Utils.customMessage("${task.process}") }
resourceLabels { [customLabel: Utils.customMessage("${task.process}")] }
The long answer is...
If you don't wrap the value in a closure, it will be evaluated once when the script is executed rather than each time a task is executed. The difference here is important: executing the script only defines the process, so variables like task do not exist yet.
If the value is a closure, it will be "lazily" evaluated each time a task is executed, so that you can use task-specific variables.
But... if the value is a dynamic string, Nextflow will wrap it in a closure when parsing the script (i.e. as a syntax transformation), so that the dynamic string is also lazily evaluated.
But... if the dynamic string is nested in something else like a function call, then it isn't wrapped in a closure.
So you can see why this syntax sugar has caused a lot of confusion over what is and isn't allowed in the process definition... because it wasn't applied comprehensively. We might be able to fix it by wrapping the value in a closure if it contains a dynamic string, but maybe we should just document the current behavior better.
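To make the eager vs. lazy distinction concrete, here is a minimal plain-Groovy sketch. It is only a model of the behavior described above, not Nextflow's actual code: evaluateDirective is a hypothetical stand-in for "resolve a directive value for one task", the task is modelled as a plain Map, and customMessage is a local copy of the method from Utils.

```groovy
// Plain Groovy model of the two evaluation times; NOT Nextflow internals.
// evaluateDirective is a hypothetical helper, and the task is just a Map here.
def evaluateDirective(Object value, Map task) {
    if (value instanceof Closure) {
        // Lazy case: run the closure once per task, with 'task' visible to it
        def c = (Closure) value.clone()
        c.delegate = [task: task]
        c.resolveStrategy = Closure.DELEGATE_FIRST
        return c.call()
    }
    // Eager case: the value was already computed when the script was executed
    return value
}

// Local copy of Utils.customMessage, returning a plain String
def customMessage(String label) { "customMessage-from-${label}".toString() }

// Wrapped in a closure: nothing touches 'task' until a task is supplied
def lazyTag = { customMessage("${task.process}") }
assert evaluateDirective(lazyTag, [process: 'BAR']) == 'customMessage-from-BAR'
assert evaluateDirective(lazyTag, [process: 'BAZZ']) == 'customMessage-from-BAZZ'

// Not wrapped: the call runs immediately, and 'task' is not defined at that point
def failedEagerly = false
try {
    def eagerTag = customMessage("${task.process}")
} catch (MissingPropertyException e) {
    failedEagerly = true
}
assert failedEagerly
```

The same closure gives a different result for each task it is evaluated against, while the unwrapped call fails before any task exists, which matches the "No such variable: task" errors reported in this thread.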
As for |
wow that helps a lot, somehow I missed a few things here
following this line of thought, I went back to my original goal, of getting task and workflow to work from the process scope in nextflow.config;
process {
    cpus = 1
    memory = 250.MB
    executor = 'awsbatch'
    queue = params.queue
    resourceLabels = {[ pipelineProcess: task.process, pipelineMemory: task.memory.toString(), pipelineName: workflow.manifest.name ]} // THIS WORKS
    // resourceLabels = {[ pipelineProcess: "${task.process}", pipelineMemory: "${task.memory}", pipelineName: "${workflow.manifest.name}" ]} // Unable to marshall request to JSON: MarshallingType not found for class class org.codehaus.groovy.runtime.GStringImpl
}
So it looks like the reason things were originally breaking for me was that despite using a closure, I was also using the syntax for
I am glad that simply using
It's very counter-intuitive that using a closure would be required here, but at the same time, using string interpolation breaks the effect of using the closure. All told, it seems like there are multiple different situations where you should vs. shouldn't use string interpolation and/or closures to get the desired effect. For advanced usages, I find it would be easier to be able to interact with the
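For anyone else hitting the GStringImpl marshalling error, the underlying Groovy detail is that an interpolated string is a GString object rather than a java.lang.String until it is converted. Below is a small plain-Groovy sketch of that, plus a hypothetical stringifyLabels helper (my own name, not part of Nextflow) that converts every label value to a plain String, assuming no value is null.

```groovy
def processName = 'BAR'

def interpolated = "${processName}"   // interpolation produces a GString
def plain = interpolated.toString()   // explicit conversion to java.lang.String

assert interpolated instanceof GString
assert !(interpolated instanceof String)
assert plain instanceof String

// Hypothetical helper: force every key and value to a plain String.
// Assumes the map has no null values.
Map<String, String> stringifyLabels(Map labels) {
    labels.collectEntries { k, v -> [(k.toString()): v.toString()] }
}

def labels = stringifyLabels([pipelineProcess: "${processName}", pipelineMemory: '250 MB'])
assert labels.pipelineProcess instanceof String
assert labels.pipelineProcess == 'BAR'
```

This is consistent with the config above, where task.memory.toString() works and "${task.memory}" does not. Whether such a helper is preferable to calling toString() inline is a matter of taste.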
Right, this makes sense, until you try to do this and instead of getting a Overall I think |
oh one follow up to this, might want to make liberal use of
Regarding the Process config settings work exactly the same way as process directives. If the value is a closure, it will be evaluated for each task, which is why you can use the
Good point, Nextflow should throw an error in this case.
The main issue here is that AWS Batch does not allow the resource label value to be null. In cases where a label might be null, you might want to consider whether you want to set it to an empty string or not set it at all. With |
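The two options mentioned for null label values (leave the label out, or send an empty string) can be sketched in plain Groovy as follows. Both helper names are hypothetical and are not part of Nextflow or the AWS SDK. They only show how a label map could be cleaned before it is returned from a resourceLabels closure.

```groovy
// Hypothetical helper: leave out any label whose value is null
Map dropNullLabels(Map labels) {
    labels.findAll { k, v -> v != null }
}

// Hypothetical helper: keep every key, replacing null values with an empty string
Map blankNullLabels(Map labels) {
    labels.collectEntries { k, v -> [(k): v == null ? '' : v] }
}

def raw = [pipelineProcess: 'BAR', pipelineName: null]

assert dropNullLabels(raw) == [pipelineProcess: 'BAR']
assert blankNullLabels(raw) == [pipelineProcess: 'BAR', pipelineName: '']
```

Dropping the key keeps the submitted labels free of placeholders, while the empty string keeps the set of keys identical across tasks. Which one is appropriate depends on how the labels are consumed downstream.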
Bug report
I am exploring some more advanced methods of Nextflow workflow and process introspection, and I am encountering difficulties in using the workflow and task implicit objects in different contexts where they seemingly should work.
Expected behavior and actual behavior
I would expect to be able to use the task object with my custom Groovy methods as process directives, since it's already commonly used with other pipeline directives. However, I get errors such as ERROR ~ No such variable: task
Steps to reproduce the problem
I have a demo workflow set up like this;
main.nf
nextflow.config
lib/Utils.groovy
modules/bazz.nf
The important parts here are the tag and resourceLabels process directives under the BAZZ process scope. I have listed in the comments there a couple of variations to illustrate some of the ways that things are broken.
Note also the usage of id and someVar, which are included to show further confusing discrepancies in regards to the scoping of variables within the process.
Program output
When running the above workflow with tag "${task.process}.${id}.tag", it works as expected;
Changing it to tag Utils.customMessage("foobarbazz") also works as expected;
This shows that you can use the task object in a process directive, such as tag, and you can also use a custom Groovy method's output in the process directive as well.
However, if you combine these, things break. When you change it to tag Utils.customMessage("${task.process}"), it no longer works;
The task variable does not work here when you try to pass it to the custom Groovy method.
You get the same error when you use it with resourceLabels customLabel: Utils.customMessage("${task.process}") as well, which is ultimately the process directive I wanted to use in the first place.
It seems like something really weird is going on with the scoping of this task variable, which allows it to be used in some process directives and not in others. Some more strange combinations;
- tag {task.process} works, but tag task.process does not (ERROR ~ No such variable: task)
- tag Utils.customMessage(task.process) doesn't work (ERROR ~ No such variable: task)
- tag Utils.customMessage({task.process}) and tag Utils.customMessage({ "${task.process}" }) both give an error
Ultimately, the usages of the task variable for process and workflow introspection here have been really confusing and unclear. It's not clear why task works in some cases but not others. It feels like maybe there is some kind of "magic" happening surrounding these variables behind the scenes that could be influencing these behaviors? Or is it some kind of complicated Groovy variable scoping and initialization discrepancy?
I am not sure if this is a "bug" or if this behavior is an oversight or just inherent in the design of the framework. Regardless, it's very counter-intuitive that you can use e.g. tag "${task.process}" but you can't use tag Utils.myMethod("${task.process}") or even tag Utils.myMethod(task), and from my experience so far this seems to apply to all (?) of the Nextflow process directives.
Ultimately, what I really want is to be able to use both task and workflow from within the nextflow.config scope for process configs, so I could have something like this;
But this obviously does not work either. If you cannot use task indiscriminately inside the process scope, I am not sure how you would be able to use it from the nextflow.config scope.
No matter the solution, it would be great to have more documentation on how this all works, and maybe some advanced examples.
Environment