Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #529: Partition API requests to CloudWatch into separate concur… #540

Merged
merged 3 commits into from
Jul 4, 2017

Conversation

alpeb
Copy link
Contributor

@alpeb alpeb commented May 30, 2017

…rent batches to circumvent the 20 data per request limit that they have

@alpeb
Copy link
Contributor Author

alpeb commented May 30, 2017

@cam-stitt Looking forward for your comments.

MetricData: datums,
})
return err
var tokens = make(chan struct{}, cw.numConcurrentRequests)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the value of tokens is. I don't really see it being used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It limits the number of concurrent requests. When the channel gets filled, tokens <- struct{}{} will block until a go routine finishes and releases a token through <-tokens

var errors = make(chan error)
var n int

for len(datums) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal preference would be to loop in jumps of numConcurrentRequests and then slicing datums up each time. Rough example: https://play.golang.org/p/slrBko_QdI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you step through my code, you'll see it's pretty much equivalent to what you're proposing.

}(batch)
}

var firstErr error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By returning only the first error, there may be some context lost as to what the overall problem was. Maybe we can create an Error type that actually contains multiple errors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Would you agree on limiting the number of errors, or shall we return everything?

@cam-stitt
Copy link
Contributor

Thanks a lot for this. It's a great start. I've put a few of my thoughts on there, but would love to get @peterbourgon's opinion.

lim := min(len(datums), cw.numConcurrentRequests)
batch, datums = datums[:lim], datums[lim:]
n++
go func(batch []*cloudwatch.MetricDatum) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's better to not make each PutMetricData in a goroutine, and just fire them and return the error that comes up. This means that the first request to error will stop any future requests from occurring, which might be nicer behaviour.

func (cw *CloudWatch) SetConcurrency(numConcurrentRequests int) *CloudWatch {
cw.numConcurrentRequests = numConcurrentRequests
return cw
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an option in the constructor, rather than a method on the struct. I don't see any reason to make it modifiable during runtime. (It's also racy as implemented.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I wasn't sure how to handle optional parameters in go. Would this do?

func New(namespace string, svc cloudwatchiface.CloudWatchAPI, logger log.Logger, numConcurrentRequests ...int) *CloudWatch {
         numConcurrentRequestsValue := 10
         if len(numConcurrentRequests) > 0 {
                 numConcurrentRequestsValue = numConcurrentRequests[0]
         }
          return &CloudWatch{
                  namespace:             namespace,
                 numConcurrentRequests: numConcurrentRequestsValue,
                  svc:        svc,
                  counters:   map[string]*counter{},
                  gauges:     map[string]*gauge{},
                  histograms: map[string]*histogram{},
                  logger:     logger,
          }
  }

Copy link
Member

@peterbourgon peterbourgon Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's not the way. If I were you I'd just make it a non-optional parameter, and document a reasonable default in the function comment.

// New returns a blah blah. The client will emit at most 
// numConcurrent simultaneous requests to Amazon.
// A good default value is 10.
func New(namespace string, svc CloudWatchAPI, numConcurrent int, logger log.Logger) {
    ...
}

MetricData: datums,
})
return err
var tokens = make(chan struct{}, cw.numConcurrentRequests)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, this should be global per CloudWatch struct (so a member of the struct), not per-metric (scoped to NewCounter), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand what you mean. This is inside Send(), why are you referring to NewCounter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my mistake—but my point stands: putting the semaphore here means there are numConcurrentRequest simultanous requests per Send invocation, so if the user invokes Send 10 times then you get 10x numConcurrentRequests going out. So in reality I think we want the semaphore to be in the struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha 💯

for len(datums) > 0 {
var batch []*cloudwatch.MetricDatum
lim := min(len(datums), cw.numConcurrentRequests)
batch, datums = datums[:lim], datums[lim:]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading this right, you're sizing the batch as the minimum of (number of datums remaining) and (number of allowable concurrent requests)? That doesn't seem right—batch size and concurrent batches have nothing to do with each other, do they?

Copy link
Contributor Author

@alpeb alpeb Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I should have instead gotten the min between len(datums) and 20 (the maximum number of datums allowed per request).

})
<-tokens
errors <- err
}(batch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you construct the batch within the loop scope, there's no reason to pass it as a parameter to this anonymous function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed 👍

MetricData: batch,
})
<-tokens
errors <- err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you decrement the semaphore before returning the error along the errors chan? If you didn't need to do that, it would be nicer to express this as

tokens <- struct{}{}
defer func() { <-tokens }()
_, err := cw...
errors <- err

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's better.

if err := <-errors; err != nil && firstErr != nil {
firstErr = err
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little fragile, for a couple reasons:

  • n is only loosely coupled to the number of goroutines you launched, easy to get out of sync w/ maintenance
  • If you forget to do precisely as many reads from errors as goroutines you launched, you'll leak goroutines

Suggest precomputing batches, constructing errors as make(chan error, len(batches)), and only then launching the goroutines to process the batches. Finally, changing this loop to e.g.

for i := 0; i < cap(errors); i++ {
    if err := <-errors; err != nil ... { ... }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I'll follow suit.

@alpeb
Copy link
Contributor Author

alpeb commented Jun 5, 2017

Thanks for the detailed review! I'll respond to comments inline, and then I'll submit a new patch.

@peterbourgon
Copy link
Member

examples/addsvc/cmd/addcli/main.go:155: multiple-value 
transportFactory.GetTransport() in single-value context

I think if you rebase you might get this fixed for you?

alpeb added 3 commits July 3, 2017 15:30
…rent batches to circumvent the 20 data per request limit that they have
…ved semaphore into the cw struct and use defer when using it. Fixed data partitioning logic and separate batch creation from goroutines launching. Improved tests.
@alpeb
Copy link
Contributor Author

alpeb commented Jul 3, 2017

Yep, after rebasing checks are passing 🎉

@peterbourgon
Copy link
Member

@cam-stitt I just want to get a +1 from you before merging. All good from your POV?

@cam-stitt
Copy link
Contributor

Yeah, looks good. Thanks @alpeb for jumping on this.

@peterbourgon peterbourgon merged commit 6a894fe into go-kit:master Jul 4, 2017
@peterbourgon
Copy link
Member

Thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants