Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DefaultingType to PodTopologySpreadArgs #95048

Merged

Conversation

alculquicondor
Copy link
Member

@alculquicondor alculquicondor commented Sep 24, 2020

/sig scheduling

What type of PR is this?

/kind feature

What this PR does / why we need it:

This new enum (System, List) allows to distinguish between k8s defined and operator provided defaultConstraints for PodTopologySpread plugin.

Which issue(s) this PR fixes:

Ref #94863

Special notes for your reviewer:

Follow up to #94864

Does this PR introduce a user-facing change?:

New parameter `defaultingType` for `PodTopologySpread` plugin allows to use k8s defined or user provided default constraints

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://git.k8s.io/enhancements/keps/sig-scheduling/1258-default-pod-topology-spread
- [Usage]: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/#cluster-level-default-constraints

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API labels Sep 24, 2020
@alculquicondor
Copy link
Member Author

/assign @adtac

@fejta-bot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

Copy link
Member

@Huang-Wei Huang-Wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit. LGTM otherwise.

Copy link
Member

@adtac adtac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit, LGTM otherwise

pkg/scheduler/apis/config/types_pluginargs.go Outdated Show resolved Hide resolved
@chendave
Copy link
Member

LGTM, seem like this change still needs review from API owner.

@alculquicondor
Copy link
Member Author

/assign @liggitt
for API review

@chendave
Copy link
Member

/retest

@alculquicondor
Copy link
Member Author

/unassign @liggitt

/assign @lavalamp

for API review

@k8s-ci-robot k8s-ci-robot assigned lavalamp and unassigned liggitt Sep 30, 2020
@alculquicondor alculquicondor force-pushed the disable-default-spread branch from 71cb85d to 6b7f7a1 Compare October 6, 2020 14:04
// `.defaultConstraints[*].labelSelectors` must be empty, as they are
// deduced from the Pod's membership to Services, ReplicationControllers,
// ReplicaSets or StatefulSets.
// If empty, the default constraints prefer to spread Pods across Nodes and Zones
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if you don't want that behavior either, you use the next field? (add comment)

// Defaults to false. When set to true, DefaultConstraints must be empty
// or nil.
// +optional
DisableDefaultConstraints bool `json:"disableDefaultConstraints,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you flip the polarity, e.g. UseDefaultConstraints defaulted true?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading your implementation, I think if we were starting from scratch, you'd actually want a union/one-of?

E.g. the discriminator (this field) is "DefaultingType" and can be one of "SystemDefault", "List", or "None". The DefaultConstraints list may only be set if this field is "List".

Since there's existing files we want to not break (right?), I think we can default this discriminator to "List" if the list has items in it, or to "SystemDefault" otherwise. I hate to do this but I think it's OK since it's for backwards compatibility, and this is just a config file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like DefaultingType. Is this just a regular enum that you called union because of its interpretation?

Since there's existing files we want to not break (right?), I think we can default this discriminator to "List" if the list has items in it, or to "SystemDefault" otherwise.

It's alpha, so I assume it's fine to break (with the appropriate release note). However, is it normal to have to be this verbose?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we're eventually going to get unions as first-class citizens of the API.

It's a little verbose but it makes it very clear what you want the system to do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, isn't "None" unnecessary? "List" + empty slice should be enough.

@@ -200,7 +200,7 @@ func SetDefaults_PodTopologySpreadArgs(obj *v1beta1.PodTopologySpreadArgs) {
// SelectorSpread plugin.
return
}
if obj.DefaultConstraints == nil {
if len(obj.DefaultConstraints) == 0 && !obj.DisableDefaultConstraints {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess people were providing an empty list to avoid getting the default? Yeah, that's bad.

This conditional defaulting is going to give you other problems though.

I actually recommend deleting this defaulting completely and making the consumer of the config do this logic.

In particular I think you should avoid mixing system defaults and user provided defaults in a single field in this layer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess people were providing an empty list to avoid getting the default? Yeah, that's bad.

Yeah, my bad. OTOH, the system defaulting is an alpha feature.

I actually recommend deleting this defaulting completely and making the consumer of the config do this logic.

Even though this code is within pkg/scheduler, i.e. the consumer?

This conditional defaulting is going to give you other problems though.

Could you explain?

Also, having this in defaulting makes it easier to share what configuration is the plugin actually using, once we resolve #93718

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this code is within pkg/scheduler, i.e. the consumer?

The defaulting function is called during deserialization, which means original user intent is gone by the time the eventual actual consumer gets the go structures.

This conditional defaulting is going to give you other problems though.

Could you explain?

We're adding support for static defaulting, but it won't be able to replace conditional defaulting which depends on multiple fields. I don't know if I can make a great argument for this off the top of my head, but we've found that complicated (e.g. conditional or multi-field) defaulting logic is almost always regretted later.

It's less important because this is a config file, but I'd still try to avoid it.

In my mind there's two concepts that are mixed here, "what value does this field have if it's not specified", and "what does the application do given certain input". This distinction isn't imposed on us by the laws of the universe, but imposing it on ourselves makes the codebase easier to understand, e.g. less spooky action at a distance.

If the system were not ever going to evolve then I'd be fine with the first version of this I saw. But given that it will evolve, I think it's easier to evolve this logic when it's in the application layer and not the defaulting (deserialization) layer.

I hope that makes sense, sorry :)

easier to share what configuration is the plugin actually using

I'm not sure how? It sounds like stuff that'd go in a "status" section except we don't have those for config files.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 6, 2020
// SelectorSpread plugin.
if feature.DefaultFeatureGate.Enabled(features.DefaultPodTopologySpread) {
if obj.DefaultingType != "" {
obj.DefaultingType = v1beta1.SystemDefaulting
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lavalamp help me out here.

Even though "system defaulting" is alpha (graduating to beta in 1.20), "list defaulting", i.e. the slice, is a beta API.
Once we flip the feature gate, users of "list defaulting" would break because of this line.

Should I still have the "if slice is not empty, set to ListDefaulting" marked for removal in v1beta2, which we are indeed planing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't do something that would cause a beta feature to change behavior. It's OK for the next API version to do something different, but an existing config file shouldn't change behavior, I think. Does that help?

// Defaults to false. When set to true, DefaultConstraints must be empty
// or nil.
// +optional
DisableDefaultConstraints bool `json:"disableDefaultConstraints,omitempty"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, isn't "None" unnecessary? "List" + empty slice should be enough.

@alculquicondor alculquicondor force-pushed the disable-default-spread branch from beab504 to de20874 Compare October 8, 2020 14:14
@alculquicondor
Copy link
Member Author

@lavalamp ready for API review again :)

Copy link
Member

@adtac adtac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a couple of minor nits

@alculquicondor
Copy link
Member Author

/retest

@Huang-Wei
Copy link
Member

/lgtm

@alculquicondor
Copy link
Member Author

Ping @lavalamp

// Pods that don't define any in `pod.spec.topologySpreadConstraints`.
// `.defaultConstraints[*].labelSelectors` must be empty, as they are
// deduced from the Pod's membership to Services, ReplicationControllers,
// ReplicaSets or StatefulSets.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that this only applies if .defaultingType is "List".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// - "System": Use kubernetes defined constraints that spread Pods among
// Nodes and Zones.
// - "List": Use constraints defined in .defaultConstraints.
// +optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention the default of this field?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on the feature gate. I guess that's fair to mention until it gets deleted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think so.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@lavalamp
Copy link
Member

lavalamp commented Oct 9, 2020

/approve

... and so you can fix the comments:
/hold

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2020
Change-Id: Ibf6a4fdb39a31fe9deed68de7e7cb24a9bf9d06a
@alculquicondor alculquicondor force-pushed the disable-default-spread branch from f9407a6 to c8a0b9e Compare October 9, 2020 21:12
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 9, 2020
@alculquicondor
Copy link
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 9, 2020
@Huang-Wei
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 9, 2020
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants