How many scheduled tasks were used for the benchmark? #209
Hi! Could you describe a bit more what type of tasks you have? 14 million recurring tasks? How often are they running? For the benchmark I created synthetic executions scheduled to run …
Scaling depends a bit on the tasks as well. Up to the point where the database becomes the bottleneck you can increase throughput by adding instances. If the task does nothing database-related, tests indicate you should be able to reach 10k executions/s.
Hi @kagkarlsson, sorry for my delayed reply. Let me explain our use case. We have different clients that can come to our application and create/update/delete a trigger to run at any time. The number of triggers just keeps growing and growing. Please let me know if you have any other questions.
The limiting factor will be the number of triggers running to completion per second. If these triggers/tasks take say 10s to run, and there are 1000 running in parallel, that is approximately 100 completions/second (also referred to as executions/s). If you have long-running tasks like that, you will likely first be limited by the size of the thread pool. That can be increased both per instance (configurable; see the sketch below) and by adding more instances. If you reach a point where you need to run more than say 10,000 completions/s, you might need to use multiple databases and split the triggers/executions between them (i.e. sharding). How long does a typical trigger / execution / task run?
Are these one-time tasks or recurring on a schedule? If recurring, what is the typical schedule?
Is it one trigger created per user?
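As a side note, here is a minimal sketch of the per-instance thread-pool knob mentioned above, assuming the plain builder API; the value 100 is an arbitrary illustration, not a recommendation, and dataSource/knownTasks are assumed to exist elsewhere in the application:

```java
import javax.sql.DataSource;

import com.github.kagkarlsson.scheduler.Scheduler;
import com.github.kagkarlsson.scheduler.task.Task;

class SchedulerSetup {

    // Illustration only: dataSource and the task definitions are created elsewhere.
    static Scheduler buildScheduler(DataSource dataSource, Task<?>... knownTasks) {
        return Scheduler
            .create(dataSource, knownTasks)
            .threads(100) // example value; size the pool to the number of long-running executions expected in parallel
            .build();
    }
}
```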
What is the schedule? Are they evenly spread in time, or are there peaks? I still feel that I don't have the complete picture here. Currently, at what threshold of executions/s are you starting to experience problems? And how far are you hoping to push that using db-scheduler? Keep in mind that the key metric here is executions/s.
Hi @kagkarlsson, I've started the POC and I have a question. I've tried to follow the examples from here:

This is the code I'm using to create the task (note: scheduler is the Scheduler injected in the constructor):

```java
@Service
public class SchedulerService {

    private final ExecutionRunner executionRunner;
    private final CronTriggerBuilder cronTriggerBuilder;
    private final Scheduler scheduler;

    public SchedulerService(
            final ExecutionRunner executionRunner,
            final CronTriggerBuilder cronTriggerBuilder,
            final Scheduler scheduler) {
        this.executionRunner = executionRunner;
        this.cronTriggerBuilder = cronTriggerBuilder;
        this.scheduler = scheduler;
    }

    public void create(final DummyPojo pojo) {
        String idOne = pojo.getIdOne();
        String idTwo = pojo.getIdTwo();
        SerializableSchedule serializableSchedule =
                new SerializableSchedule(idOne, idTwo, cronTriggerBuilder.build(pojo));
        // Defines a new task (with a random name) on every call -- see the discussion below.
        RecurringTask<SerializableSchedule> task =
                Tasks.recurring(UUID.randomUUID().toString(), serializableSchedule, SerializableSchedule.class)
                     .execute(executionRunner);
        Instant newNextExecutionTime =
                serializableSchedule.getNextExecutionTime(ExecutionComplete.simulatedSuccess(Instant.now()));
        TaskInstance<SerializableSchedule> instance = task.instance(idOne);
        scheduler.schedule(instance, newNextExecutionTime);
    }
}
```
This is the execution runner class:

```java
@Component
public class ExecutionRunner implements VoidExecutionHandler<SerializableSchedule> {

    private final SQSService sqsService;

    public ExecutionRunner(final SQSService sqsService) {
        this.sqsService = sqsService;
    }

    @Override
    public void execute(final TaskInstance<SerializableSchedule> taskInstance, final ExecutionContext executionContext) {
        SerializableSchedule serializableSchedule = taskInstance.getData();
        if (serializableSchedule != null) {
            // epoch seconds, matching the setter name on SQSMessage
            long scheduledTimeEpochSeconds = executionContext.getExecution().executionTime.getEpochSecond();
            SQSMessage message = new SQSMessage();
            message.setIdOne(serializableSchedule.getIdOne());
            message.setIdTwo(serializableSchedule.getIdTwo());
            message.setRandomId(UUID.randomUUID().toString());
            message.setScheduledTimeEpochSeconds(scheduledTimeEpochSeconds);
            sqsService.send(message);
        }
    }
}
```
This is the SerializableSchedule class:

```java
public class SerializableSchedule implements Serializable, Schedule {

    private final String idOne;
    private final String idTwo;
    private final String cronPattern;

    public SerializableSchedule(final String idOne, final String idTwo, final String cronPattern) {
        this.idOne = idOne;
        this.idTwo = idTwo;
        this.cronPattern = cronPattern;
    }

    @Override
    public Instant getNextExecutionTime(ExecutionComplete executionComplete) {
        return new CronSchedule(cronPattern).getNextExecutionTime(executionComplete);
    }

    @Override
    public boolean isDeterministic() {
        return true;
    }

    public String getIdOne() {
        return idOne;
    }

    public String getIdTwo() {
        return idTwo;
    }

    public String getCronPattern() {
        return cronPattern;
    }

    @Override
    public String toString() {
        return "SerializableSchedule pattern=" + cronPattern;
    }
}
```
You only do this once, at scheduler construction and startup. Inject a reference to the …
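For what it's worth, here is a rough sketch of that "define the task once, inject the Scheduler" idea applied to the SchedulerService above. The fixed task name and the defaultSchedule parameter are illustrative assumptions, not from this thread, and note the caveat in the comments about per-client schedules:

```java
// Rough sketch of the "define once" advice applied to the SchedulerService above.
// Assumptions: a fixed task name (instead of a random UUID per call) and a defaultSchedule
// supplied at startup. Note that a plain recurring task is rescheduled using the schedule given
// at definition time, not the one carried in the data -- per-client cron patterns are what the
// recurringWithPersistentSchedule API mentioned later in this thread (11.0) addresses.
import java.time.Instant;

import org.springframework.stereotype.Service;

import com.github.kagkarlsson.scheduler.Scheduler;
import com.github.kagkarlsson.scheduler.task.helper.RecurringTask;
import com.github.kagkarlsson.scheduler.task.helper.Tasks;
import com.github.kagkarlsson.scheduler.task.schedule.Schedule;

@Service
public class SchedulerService {

    private final Scheduler scheduler;
    private final RecurringTask<SerializableSchedule> clientTriggerTask;

    public SchedulerService(final ExecutionRunner executionRunner,
                            final Schedule defaultSchedule,
                            final Scheduler scheduler) {
        this.scheduler = scheduler;
        // The task definition is created exactly once, at construction/startup.
        this.clientTriggerTask = Tasks.recurring("client-trigger", defaultSchedule, SerializableSchedule.class)
                                      .execute(executionRunner);
    }

    public void create(final SerializableSchedule schedule, final Instant firstExecutionTime) {
        // At runtime only instances are created, through the injected Scheduler.
        scheduler.schedule(clientTriggerTask.instance(schedule.getIdOne(), schedule), firstExecutionTime);
    }
}
```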
I have gotten a couple of other questions along these lines, which have made it clear that I need a better Spring Boot example for tasks with dynamic schedules that are added at runtime.
Also, for more robust serialization, you may want to consider setting a custom JsonSerializer (also something I need to add an example for).
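A hedged outline of what such a JSON-based serializer could look like; it assumes a Serializer interface with serialize/deserialize methods and a serializer(..) hook on the builder, so check the project README for the exact, version-specific API before relying on it:

```java
// Hedged sketch of a JSON-based Serializer (replacing default Java serialization for task_data).
// Assumes the Serializer interface and the builder's serializer(..) hook; exact package names can
// differ between versions, and newer releases may ship a ready-made Jackson-based serializer,
// so treat this as an outline rather than the canonical API.
import com.fasterxml.jackson.databind.ObjectMapper;
import com.github.kagkarlsson.scheduler.serializer.Serializer;

public class JsonSerializer implements Serializer {

    private final ObjectMapper objectMapper = new ObjectMapper();

    @Override
    public byte[] serialize(Object data) {
        try {
            return objectMapper.writeValueAsBytes(data);
        } catch (Exception e) {
            throw new RuntimeException("Failed to serialize task data", e);
        }
    }

    @Override
    public <T> T deserialize(Class<T> clazz, byte[] serializedData) {
        try {
            return objectMapper.readValue(serializedData, clazz);
        } catch (Exception e) {
            throw new RuntimeException("Failed to deserialize task data", e);
        }
    }
}

// Registered when building the scheduler, for example:
// Scheduler.create(dataSource, knownTasks).serializer(new JsonSerializer()).build();
```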
This is just setting up the implementation, I see that …
Hey @kagkarlsson, many thanks for your help. 🙌
Np. Will be interesting to hear the results. Sounded like a very-high-throughput use-case.
Hi @kagkarlsson, I've finally managed to find the time and come back with results.

The POC numbers: …
Application behaviour: …
Running the tasks: …

The aim of this POC was to check whether db-scheduler would be able to handle millions of schedules (tasks) without delaying their execution (the main issue we have with Quartz today). After making a few changes to the configs below: …

Also, by adjusting the number of pods to handle the 14 million tasks saved in our DB, we managed to avoid delays. We kept the POC running for a month, and from our logs it was clear that db-scheduler was able to run with multiple pods, distributing the load equally among them, with no delays. We will soon start a new project to provide a scalable scheduler solution for our company, and db-scheduler is the way to go. Many thanks for your support @kagkarlsson, and also for building this incredible solution.
Good to hear! And just to let you know, I am working on an improvement for your use-case, many instances of the same recurring task with variable schedules: #257
@kagkarlsson that's great, thanks for the feedback.
Improved API released in 11.0.
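For later readers, a hedged sketch of what that improved API looks like, paraphrased from memory of the README; the class and method names used here (recurringWithPersistentSchedule, PlainScheduleAndData, schedulableInstance) and the exact signatures should be verified against the current documentation:

```java
// Hedged sketch of the 11.0-style API for "many instances of the same recurring task, each with
// its own persisted schedule". Verify names and signatures against the docs before use.
RecurringTaskWithPersistentSchedule<PlainScheduleAndData> clientTriggerTask =
        Tasks.recurringWithPersistentSchedule("client-trigger", PlainScheduleAndData.class)
             .execute((taskInstance, executionContext) -> {
                 // e.g. publish an SQS message, as the ExecutionRunner above does
             });

// At runtime, one instance per client trigger, each carrying its own cron schedule in the data
// (cronPattern is a placeholder for the client's pattern):
scheduler.schedule(
        clientTriggerTask.schedulableInstance("client-id", new PlainScheduleAndData(new CronSchedule(cronPattern))));
```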
I missed your comment here, sorry. If you have such code that you think might be valuable for people to see, how about pushing it to your own GitHub repo, and I can link to it from the README? I can also add a link to this issue where you are describing your setup. Also, if you are happy users, you are welcome to add your company to the list here:
I followed this guide to create the schedule this way, but I can't cancel the task in my Spring Boot project. Can anyone help me?

```java
@PostMapping(path = "stop", headers = {"Content-type=application/json"})
```
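For reference, a hedged sketch of one way cancellation can be done, assuming the cancel(TaskInstanceId) method on the injected Scheduler; the task name "client-trigger", the endpoint and the request body are placeholders, not names from this thread:

```java
// Hedged sketch of cancelling a scheduled instance from a Spring controller.
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

import com.github.kagkarlsson.scheduler.Scheduler;
import com.github.kagkarlsson.scheduler.task.TaskInstanceId;

@RestController
public class SchedulerController {

    private final Scheduler scheduler;

    public SchedulerController(final Scheduler scheduler) {
        this.scheduler = scheduler;
    }

    @PostMapping(path = "stop", headers = {"Content-type=application/json"})
    public void stop(@RequestBody final String instanceId) {
        // Removes the scheduled execution (task name + instance id) if it is not currently running.
        scheduler.cancel(TaskInstanceId.of("client-trigger", instanceId));
    }
}
```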
Thanks for providing a detailed explanation about your POC. We also have a similar use case. Is it possible for you to share the example code you used in your POC? Thanks in advance!
@kagkarlsson Could you please share which example we can follow for a similar use case, to achieve very high throughput with short-running jobs that just post a message to a message broker?
I think you will get the best throughput using PostgreSQL and …
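For illustration, a hedged sketch of a throughput-oriented setup on PostgreSQL with the plain builder API, assuming the lock-and-fetch polling strategy is what is being referred to here; the method names (pollUsingLockAndFetch, threads, pollingInterval) and all values are assumptions to verify and tune against your own version and workload:

```java
// Hedged sketch of a throughput-oriented setup (PostgreSQL). Values are arbitrary examples.
// Also make sure the scheduled_tasks table has the indexes from the project's table definitions,
// as mentioned below.
import java.time.Duration;
import javax.sql.DataSource;

import com.github.kagkarlsson.scheduler.Scheduler;
import com.github.kagkarlsson.scheduler.task.Task;

class HighThroughputSchedulerSetup {

    static Scheduler build(DataSource dataSource, Task<?>... knownTasks) {
        return Scheduler
            .create(dataSource, knownTasks)
            .threads(50)                            // example value
            .pollingInterval(Duration.ofSeconds(1)) // example value
            .pollUsingLockAndFetch(0.5, 3.0)        // lower/upper limits as fractions of the thread pool
            .build();
    }
}
```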
Thanks a lot. We will use these settings in our PoC.
Also make sure you have the necessary indices.
Hi there,
I'm looking for an alternative to Quartz and I think your solution could be the one.
Today we use Quartz heavily and can have over 14 million triggers in our DB. Quartz is not behaving well at this scale, and adding more instances to the cluster doesn't bring any benefit; the triggers are getting delayed a lot.
I would like to know what the limits of db-scheduler are, and whether we can add more instances to scale with the growing number of scheduled tasks.