-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide Serializable Interface for Creating WorkerRuns #2232
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of class indirections and classes with similar names or that are using Assembly because Factory was taken already.
I think the PR in its current state is going to make the code harder to follow.
airbyte-scheduler/src/main/java/io/airbyte/scheduler/SchedulerWorkerRunAssembly.java
Outdated
Show resolved
Hide resolved
* This class is a runnable that give a job id and db connection figures out how to run the | ||
* appropriate worker for a given job. | ||
*/ | ||
public class SchedulerWorkerRunAssembly { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you call it an assembly? It seems to be a factory.
...src/main/java/io/airbyte/scheduler/worker_run_factories/CheckConnectionWorkerRunFactory.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/airbyte/scheduler/worker_run_factories/CheckConnectionWorkerRunFactory.java
Outdated
Show resolved
Hide resolved
...eduler/src/main/java/io/airbyte/scheduler/worker_run_factories/DiscoverWorkerRunFactory.java
Outdated
Show resolved
Hide resolved
...heduler/src/main/java/io/airbyte/scheduler/worker_run_factories/GetSpecWorkerRunFactory.java
Outdated
Show resolved
Hide resolved
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
public class SyncWorkerRunFactories { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you decide to group them together instead of splitting them up in separate classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because they are essentially just the same logic with very slight changes in configuration. i wanted the code close together.
...yte-scheduler/src/main/java/io/airbyte/scheduler/worker_run_factories/WorkerRunAssembly.java
Outdated
Show resolved
Hide resolved
@@ -26,6 +26,7 @@ | |||
|
|||
import pendulum as pendulum | |||
from base_python import BaseClient | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uh oh
|
||
public class WorkerRunFactoryUtils { | ||
|
||
public static IntegrationLauncher createLauncher(long jobId, int attempt, final String image, ProcessBuilderFactory pbf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could put this as a static method AirbyteIntegrationLauncher.create(...)
instead of having a new utils file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
airbyte-scheduler/src/main/java/io/airbyte/scheduler/worker_run/WorkerRunFactory.java
Outdated
Show resolved
Hide resolved
airbyte-scheduler/src/main/java/io/airbyte/scheduler/worker_run/WorkerRunFactory.java
Outdated
Show resolved
Hide resolved
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
public class SchedulerWorkerRunWithEnvironmentFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the only value of this class vs just including it inside of JobSubmitter
for testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish the name just didn't resemble WorkerRunWithEnvironmentFactory
. JobToWorkerRunConverter
or anything really feels better so they have clearer roles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, the naming really has a bad code smell right, but if I understand from your PR desc, only one will remain right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly. this is to get us over the hump from scheduler doing stuff to temporal doing it. will go away.
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
public class SyncWorkerRunFactories { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have that pattern anywhere in the code and it doesn't match the pattern of the other factories. you're basically doing "modules" with static classes. If the logic is the same, you can use inheritance or your can use helpers.
// T must be serializable. The generated json pojos do not implement serializable (but they are | ||
// serializable). It means we can't force Serializable as a constraint in this interface | ||
// unfortunately. | ||
public interface WorkerRunFactory<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can enforce serializable
on T and you can make sure the generated classes implement Serializable. jsonschema2pojo supports it.
https://github.com/joelittlejohn/jsonschema2pojo/tree/master/jsonschema2pojo-gradle-plugin
serializable = true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙌 thanks!
|
||
public class WorkerRunFactoryUtils { | ||
|
||
public static IntegrationLauncher createLauncher(long jobId, int attempt, final String image, ProcessBuilderFactory pbf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
@@ -93,11 +94,12 @@ public SchedulerApp(Path workspaceRoot, | |||
public void start() throws IOException { | |||
final ExecutorService workerThreadPool = Executors.newFixedThreadPool(MAX_WORKERS, THREAD_FACTORY); | |||
final ScheduledExecutorService scheduledPool = Executors.newSingleThreadScheduledExecutor(); | |||
final WorkerRunFactory workerRunFactory = new WorkerRunFactory(workspaceRoot, pbf); | |||
final SchedulerWorkerRunWithEnvironmentFactory schedulerWorkerRunWithEnvironmentFactory = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: You can probably shorten the var name to avoid the multiline statement. I think it is still ok to call it workerRunFactory
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
public class SchedulerWorkerRunWithEnvironmentFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, the naming really has a bad code smell right, but if I understand from your PR desc, only one will remain right?
this.workerRunFactory = workerRunFactory; | ||
} | ||
|
||
public WorkerRun create(final long jobId, final int attempt, final T config) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this logic of creating path can at some point belong to the workerrun right (once we are 100% on temporal)? it is a bit overkill at the moment to have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can redefine the worker run concept a bit going forward to simplify this, but i think figuring that out will be easier once we're on temporal because we'll know exactly what the constrains are.
4115f46
to
7749367
Compare
…n/WorkerRunFactory.java Co-authored-by: Jared Rhizor <[email protected]>
…n/WorkerRunFactory.java Co-authored-by: Jared Rhizor <[email protected]>
418e9ce
to
d83b6e0
Compare
closes #2177
What
How
Pre-merge Checklist
Recommended reading order
SchedulerWorkerRunAssembly
WorkerRunFactory
(and classes that implement it)