Run DAG on schedule unless it has been run since last scheduled run #708
-
Hey Chris, thanks for the detailed questions! You've brought up some really good points about how we can make DAG management in Dagu even better. Preventing a DAG from rerunning if it has already completed is a smart idea. I'm thinking of adding a
Regarding your other questions:
That
-
I've found these which lead me to wonder if there may be more undocumented ones. I do wonder however since
Then steps could be written like this:
    steps:
      - name: get value
        executor: jq
        command: '.schedule'
        script: "echo $DAG_DETAILS_JSON"
        output: DAG_SCHEDULE
That isn't to say having bespoke environment variables wouldn't be handy for more frequently accessed values. It would save needing a step simply to fetch the details.
Sub-DAGs are one of the primary ways I am planning to use this tool. I have a media ingestion script that is currently a monolith, but there's no reason it could not be split up so that parts of it can be run on demand. I plan to break the major parts of this script into DAGs and have a meta-DAG that runs them together on a schedule.
There doesn't appear to be a straightforward way from the API to know which DAG is running from the request id alone. It seems it would require hitting
We would need
Thank you! To add more context as to why
At the end of the media ingestion script mentioned above, I run a backup command. This backup command is computationally expensive and also wakes up spinning hard drives that I would prefer to keep spun down if possible. My media ingestion script doesn't run every day, however, and if there is nothing to ingest I don't wish to run the backup command. I would still like to run the backup command periodically, though. My goal is to have the backup command in a DAG on a schedule, and have my meta-DAG run this backup DAG. However, the next time the schedule triggers, I don't want the backup DAG to run if it has already run since the last time it was supposed to run on schedule. For example:
Thank you for this software! I'm still evaluating it, but it looks like it is going to fit my needs. I have further ideas, but I'll wait to share them until I have done more than read the docs and the code 😅
-
Thanks for the detailed response and insightful suggestions! This is super helpful. Just to clarify,
I'm also glad to know about your plan to use sub-DAGs! That's great feedback. You're spot on about
The extra context on
I really appreciate your enthusiasm and detailed feedback! It's incredibly helpful as we develop Dagu. I'd love to hear any other ideas you have as you keep exploring. Every bit of feedback helps!
-
Unfortunately, it turns out I won't be able to use Dagu after all. My use case for Dagu is running in a container and orchestrating a collection of tasks, some of which are related. However, all my tasks are broken into their own Docker containers for ease of maintenance. I discovered as I was trying to actually deploy my tasks that Dagu does not support environment variables / interpolation in anything other than a
I am off to look for other solutions, but I have some suggestions for the future that would make this tool amazing.
I think ultimately what I wish existed is a task runner that is as powerful as go-task, with the ability to schedule tasks and a web UI to monitor task runs and manually invoke a task outside its schedule. Dagu is very close, but the limitations I ran up against with Docker ended up being show stoppers. Thank you for this wonderful project! I'll keep an eye on it to see if it will one day suit my needs.
-
One thing I have not been able to figure out is how to prevent a DAG from running on schedule if it has already been run since the previous scheduled run.
I'm wondering if there is a cleaner way than using the HTTP executor to get the finishedAt attribute, getting the current time from the shell, parsing the cron string, and doing the math to see whether it has run since the last scheduled run?
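To make "the math" concrete, here is a minimal Python sketch of that check. It is not Dagu's API or a built-in feature: the function names (`parse_field`, `latest_match`, `ran_since_previous_schedule`) are hypothetical, it assumes a plain five-field cron expression (`*`, lists, ranges, and steps only), and it assumes the finishedAt value has already been fetched and parsed into a timezone-aware `datetime`. It also assumes the check runs as part of the scheduled trigger, so the most recent matching minute is the current trigger and the match before it is the previous one.

```python
from datetime import datetime, timedelta, timezone

# Allowed (low, high) values for minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def parse_field(field, low, high):
    """Expand one cron field such as '*/15', '1-5' or '0,30' into a set of ints."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = int(base)
            # 'N/step' means "from N to the top of the range"; plain 'N' is just N.
            end = high if "/" in part else start
        values.update(range(start, end + 1, step))
    return values


def latest_match(cron, at_or_before):
    """Return the most recent whole minute at or before the given time that matches cron.

    Simplification: day-of-month and day-of-week must both match, whereas
    standard cron treats them as alternatives when both are restricted.
    """
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError("expected a five-field cron expression")
    minutes, hours, days, months, weekdays = (
        parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    candidate = at_or_before.replace(second=0, microsecond=0)
    # Walk back one minute at a time, giving up after roughly one year.
    for _ in range(366 * 24 * 60):
        cron_weekday = (candidate.weekday() + 1) % 7  # cron counts Sunday as 0
        if (candidate.minute in minutes and candidate.hour in hours
                and candidate.day in days and candidate.month in months
                and cron_weekday in weekdays):
            return candidate
        candidate -= timedelta(minutes=1)
    raise ValueError("no matching time found within the past year")


def ran_since_previous_schedule(cron, finished_at, now):
    """True if the last run finished at or after the previous scheduled trigger."""
    if finished_at is None:  # the DAG has never finished a run
        return False
    current_trigger = latest_match(cron, now)
    previous_trigger = latest_match(cron, current_trigger - timedelta(minutes=1))
    return finished_at >= previous_trigger


if __name__ == "__main__":
    now = datetime(2024, 5, 10, 2, 0, 5, tzinfo=timezone.utc)  # daily 02:00 trigger firing
    finished = datetime(2024, 5, 9, 15, 0, tzinfo=timezone.utc)  # ran yesterday afternoon
    print(ran_since_previous_schedule("0 2 * * *", finished, now))  # True, so skip this run
```

A step wrapping this check could exit non-zero (or set an output) when the function returns `True`, so the rest of the DAG is skipped. Walking back minute by minute is slow but simple; a dedicated cron library would be the more robust choice.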
Some questions:
I would like to share this "delay run if run before scheduled run" between DAGs, and it seems like it's going to be a little cumbersome.
I believe the script will need to:
Is there a simpler way?