Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: add support for label/tags #2853

Merged
merged 12 commits into from
Sep 1, 2022
Merged

Conversation

jorgeaguileraseqera
Copy link
Contributor

Allow a new syntax in the label directive in key-value format

  label k: 'value'
  label b: "${params.user}"

when this syntax is used we store them in a new Map variable so executors can access them and tag the process if present

closes #2845

Signed-off-by: Jorge Aguilera [email protected]

@jorgeaguileraseqera jorgeaguileraseqera linked an issue May 6, 2022 that may be closed by this pull request
@pditommaso
Copy link
Member

Didn't we agree to go with label 'name=value' as first iteration"?

@jorgeaguileraseqera
Copy link
Contributor Author

yes, but once started with the implementation I've found this syntax more powerful and wanted to propose it

(I'm still working on the issue so I can include also our first idea about name=value)

@jorgeaguileraseqera jorgeaguileraseqera marked this pull request as draft May 6, 2022 13:47
@jorgeaguileraseqera jorgeaguileraseqera force-pushed the 2845-add-support-for-labeltags branch from 4dbad10 to 01e2d49 Compare May 6, 2022 13:51
@jorgeaguileraseqera
Copy link
Contributor Author

This PR allows the following syntax:

process foo {
      label 'bravo'   //<--- a label as previously
      label 'department=floor 3' 
      label region:'eu-west-1' 
      label region:'eu-west-1' , department:'floor 3'
}

@jorgeaguileraseqera jorgeaguileraseqera marked this pull request as ready for review May 6, 2022 14:06
@pditommaso
Copy link
Member

Ok, I'll check soon 👍

Copy link
Member

@pditommaso pditommaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, nice job. The main comment is about naming, let's call these resourceLabels

closes #2845

Signed-off-by: Jorge Aguilera <[email protected]>
include key=value syntax as valid format for labels

Signed-off-by: Jorge Aguilera <[email protected]>
rename sticker as resourceLabels

Signed-off-by: Jorge Aguilera <[email protected]>
@jorgeaguileraseqera jorgeaguileraseqera force-pushed the 2845-add-support-for-labeltags branch from 01e2d49 to fdcec30 Compare May 9, 2022 11:06
@jorgeaguileraseqera jorgeaguileraseqera self-assigned this May 11, 2022
@jorgeaguileraseqera jorgeaguileraseqera marked this pull request as draft May 11, 2022 11:29
@jorgeaguileraseqera jorgeaguileraseqera marked this pull request as ready for review May 13, 2022 10:53
@dougnukem
Copy link
Contributor

This might be a different feature, but there could be value in allowing users to specify labels at the top-level like on the executor.

Then have those labels propagate down to the process to be set. Otherwise we'd have to set those labels/params on every single process.

#2845

google {
    labels = [ 'user-id': params.user_id, 'project-id': params.project_id ]
    project = 'theprojectid'
    zone = 'europe-west1-b'
}

@jorgeaguileraseqera jorgeaguileraseqera force-pushed the 2845-add-support-for-labeltags branch from 1bc50ac to f274322 Compare May 21, 2022 08:48
@pditommaso pditommaso added this to the 22.10.0 milestone May 26, 2022
@pditommaso
Copy link
Member

pditommaso commented May 26, 2022

@dougnukem thanks a good point, however, think the current proposal still works, because when setting something like in the config, it would apply to all processes

process.labels = [ 'user-id': params.user_id, 'project-id': params.project_id ]

@pditommaso pditommaso force-pushed the 2845-add-support-for-labeltags branch from cb92a61 to fef3801 Compare August 30, 2022 21:31
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Copy link
Member

@pditommaso pditommaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this looks good a few comments to address. It should also be address the support for Google Batch

@@ -1674,6 +1674,19 @@ in the configuration file to select and configure subset of processes having sim

See the :ref:`config-process-selectors` documentation for details.

The ``label`` directive can be also expressed as a `Map<key-value>` or a `key=value` sentence:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now has been implemented with a new resourceLabels directive. Please update the docs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pditommaso
Copy link
Member

I was looking how to set labels with Google batch but cannot find it, but I'm quite sure it's possible. @bentsherman any clue about that?

@bentsherman
Copy link
Member

docs/process.rst Outdated Show resolved Hide resolved
@jorgeaguileraseqera jorgeaguileraseqera force-pushed the 2845-add-support-for-labeltags branch from 946edef to 2004783 Compare August 31, 2022 18:48
@pditommaso pditommaso force-pushed the 2845-add-support-for-labeltags branch from 2004783 to 6184178 Compare September 1, 2022 09:27
@pditommaso
Copy link
Member

Still could not figure out how to use Google Batch API to setup labels. I've open a separate issue for that

@pditommaso pditommaso merged commit 5d0b7c3 into master Sep 1, 2022
@pditommaso pditommaso deleted the 2845-add-support-for-labeltags branch September 1, 2022 09:43
@beichen1024
Copy link

Still could not figure out how to use Google Batch API to setup labels. I've open a separate issue for that

Do you have an issue for this?

@bentsherman
Copy link
Member

Do you have an issue for this?

Fixed in #3170

@beichen1024
Copy link

Do you have an issue for this?

Fixed in #3170

Actually, yes.
NF version: 22.09.4-edge
executor: google-lifesciences
I added

process.labels = [ 'user-id': params.user_id, 'project-id': params.project_id ]

as mentioned in the discussion. And I used gcloud beta lifesciences operations describe to check the worker process operation, and in the resources section, I am not seeing the labels I added. And nothing showed up in .nextflow.log either.

resources:
      regions:
      - us-central1
      virtualMachine:
        bootDiskSizeGb: 500
        bootImage: projects/cos-cloud/global/images/family/cos-stable
        disks:
        - name: nf-pipeline-work
        labels:
          goog-pipelines-worker: 'true'  // only the google default labels
        machineType: custom-1-1024
        nvidiaDriverVersion: 450.51.06
        serviceAccount:
          email: default
          scopes:
          - https://www.googleapis.com/auth/cloud-platform
        volumes:
        - persistentDisk:
            sizeGb: 500
          volume: nf-pipeline-work
    timeout: 604800s

I think I have a couple of questions:

  1. If it is successfully passed to all process, how do I know? where should I check it? (like what log files)
  2. Are those labels suppose to show up in the resource.labels section?

Additional information:
If I include a script block in the process itself and use $task.labels I can print those values to an output file.

@bentsherman
Copy link
Member

@beichen1024 it should be process.resourceLabels

@dougnukem
Copy link
Contributor

@bentsherman The GCP labeling being added appears to be done at the Google Lifesciences Pipeline level, for this to apply to the GCP VMs for cost monitoring this needs to be done at the VirtualMachine.labels level.

https://github.com/nextflow-io/nextflow/pull/2853/files#diff-bd0a55afab4a626d773cac6787aef85e2d2756a3a25d438df70bb98ae0127f1fR479

I believe instead this should be applied instead or in addition at the Resource/VirtualMachine level.

https://cloud.google.com/life-sciences/docs/reference/rpc/google.cloud.lifesciences.v2beta#runpipelinerequest

RunPipelineRequest
labels | map<string, string>User-defined labels to associate with the returned operation. These labels are not propagated to any Google Cloud Platform resources used by the operation, and can be modified at any time.To associate labels with resources created while executing the operation, see the appropriate resource message (for example, VirtualMachine).

https://cloud.google.com/life-sciences/docs/reference/rpc/google.cloud.lifesciences.v2beta#virtualmachine

VirtualMachine
labels | map<string, string>Optional set of labels to apply to the VM and any attached disk resources. These labels must adhere to the name and value restrictions on VM labels imposed by Compute Engine.Labels keys with the prefix 'google-' are reserved for use by Google.Labels applied at creation time to the VM. Applied on a best-effort basis to attached disk resources shortly after VM creation.

Optional set of labels to apply to the VM and any attached disk resources. These labels must adhere to the name and value restrictions on VM labels imposed by Compute Engine.

Labels keys with the prefix 'google-' are reserved for use by Google.

Labels applied at creation time to the VM. Applied on a best-effort basis to attached disk resources shortly after VM creation.

dougnukem added a commit to dougnukem/nextflow that referenced this pull request Sep 23, 2022
Apply GCP resourceLabels to the VirtualMachine to propagate into the VM's so that the labels can be used for things like GCP billing tracking etc.

Related to nextflow-io#2853 (comment)

Signed-off-by: Doug Daniels <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for label/tags
5 participants