Added storage size based retention method and new metrics #343
Conversation
Prometheus PR: prometheus/prometheus#4230
I just left comments on things that stood out to me; don't treat it as an exhaustive review though, since I'm not a TSDB expert :) (could you update the PR description to say …)
@juliusv Hey Julius, thanks for looking over it. I made a few of the changes you (and simonpasquier) recommended. There are still one or two things I'm not sure about; I know you said you're not a TSDB expert, so who would it be best for me to talk to?
I am not an expert either, but I can have a look in a few days.
(Force-pushed 2555b08 to ba05566)
I rebased and changed my code to better match how deletions are being done now. I fixed one or two things in GitHub's editor to resolve conflicts and didn't add a signature to those commits. I'm not sure how to fix that now, because it's still complaining about them.
@mknapphrt No need to worry about it now; you can squash the commits once the PR is ready to go.
So it turns out that @juliusv was right before: even though ULIDs are lexicographically sortable identifiers, as was pointed out, the time they are generated doesn't coincide with the time range of the block itself, because when blocks are compacted they get a new, more recent ULID. I changed it so that it first sorts the blocks by their time ranges and then decides which blocks to delete.
Any opinions? @brian-brazil Is this more what you were thinking of instead of walking over the directory? |
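The point above can be demonstrated with a small sketch. `BlockMeta` here is a hypothetical, trimmed-down stand-in for the tsdb block metadata, not the real type: a compacted block can carry old data under a brand-new ULID, so ordering must come from the block's time range, not its identifier.

```go
package main

import (
	"fmt"
	"sort"
)

// BlockMeta is a hypothetical stand-in for the tsdb block metadata.
type BlockMeta struct {
	ULID    string // lexicographically sortable, but reassigned on compaction
	MinTime int64
	MaxTime int64
}

// sortByTimeRange orders blocks newest-first by their actual time range,
// rather than by ULID, since compacted blocks receive fresh (newer) ULIDs.
func sortByTimeRange(blocks []BlockMeta) {
	sort.Slice(blocks, func(i, j int) bool {
		return blocks[i].MaxTime > blocks[j].MaxTime
	})
}

func main() {
	blocks := []BlockMeta{
		{ULID: "01C...NEW", MinTime: 0, MaxTime: 100},   // old data, recently compacted
		{ULID: "01A...OLD", MinTime: 200, MaxTime: 300}, // newer data, older ULID
	}
	sortByTimeRange(blocks)
	fmt.Println(blocks[0].ULID) // → 01A...OLD (newest samples first, despite the older ULID)
}
```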
The general approach seems fine, I'll leave the detailed review to the maintainers of this repo. Thanks for your work!
Thanks for all the feedback, guys, I really appreciate it. I've gotten a couple of responses saying they're not the TSDB expert, so I was wondering who the TSDB expert is that I could bother to look over this? @juliusv @krasi-georgiev Thanks again
@mknapphrt The biggest experts here would be @fabxc and @gouthamve, according to https://github.com/prometheus/tsdb/graphs/contributors
It doesn't seem there's been much activity around here recently, but I just wanted to try nudging @fabxc and @gouthamve to see if you guys could take a look over this. Thanks
Maybe @gouthamve has more time after https://twitter.com/putadent/status/1026013754251653120 now :)
@mknapphrt I can have a look again after PromCon, as I already have a use case for this, so it would be great to see it merged.
Congrats @gouthamve! And thanks @krasi-georgiev, I appreciate it
Yep, not a bad idea.
A warning should be added in the Prometheus code, right? Done at the time of parsing the flags or on creation of the tsdb, I would think. Or do you think it would fit better in the tsdb?
Aah, yeah, definitely in Prometheus. I don't want to delay this one any longer; it would be great if @gouthamve or @fabxc could have a final look, but if not I'll wait a few more days and merge.
OK, are we still going to be doing any benchmarking, or at this point should I take out the tsdb changes in the Prometheus PR?
Let's merge, and we'll move to Prometheus next.
I'm confused by the whole corrupted-block handling at the moment. I left some comments, but other than that, the actual size-based retention bits look good. I'll take another look once the handling of corrupted blocks is fixed.
db.go (Outdated)
    continue

for ulid, err := range corrupted {
    if _, ok := deletable[ulid]; !ok {
        return errors.Wrap(err, "unexpected corrupted block")
This will always trigger an error, no? deletable := db.deletableBlocks(loadable) means the deletable list will only have blocks from the loadable list, which doesn't contain any corrupted blocks?
Yes: when there is a corrupted block that is not set for deletion (by size/date retention, or replaced by a parent after a compaction), we should return an error.
I guess an extra comment here to clarify would be good.
  sort.Slice(blocks, func(i, j int) bool {
-     return blocks[i].Meta().MinTime < blocks[j].Meta().MinTime
+     return blocks[i].Meta().MaxTime > blocks[j].Meta().MaxTime
Why was this changed? Curious if it changes anything...
I really can't remember, but reverting it makes the TimeRetention test fail.
for ulid, block := range blocks {
    if block != nil {
        if err := block.Close(); err != nil {
            level.Warn(db.logger).Log("msg", "closing block failed", "err", err)
Should we return here?
This was the default behaviour before the refactoring; I'm not sure why.
Maybe it at least allows deleting the block under Linux even when closing fails.
Signed-off-by: Krasi Georgiev <[email protected]>
One final thing: I still don't think the handling of corrupted blocks is right. This block expects that … but from here, the … I think the best way to handle …
I changed the logic a bit. We still want to ignore corrupted blocks replaced by parents (a crash during compaction), and possibly, in a future PR, also ignore corrupted blocks that would fall outside the retention policy anyway, but for now this is good enough, as I can't think of a nice way to implement this.
(Force-pushed 843d846 to 33f2846)
@mknapphrt Thanks for the great work, and sorry for hijacking the last few commits; we just wanted to have this ready for the Prometheus 2.7 release.
@krasi-georgiev No worries! I'm happy to see it got finished up. I've been following along quietly but haven't had the time to do much in the last few days, and you definitely seemed to have a handle on it. Is there an expected release date for 2.7? I'd be happy to work on the Prometheus side of this; I don't think it should take quite as long as this change haha
After merging #374 it will be time to implement it in Prometheus. 2.7 should happen in the next 1-2 days, so it would be perfect if you want to update tsdb and implement it in Prometheus.
@mknapphrt With #374 and #501 merged, you may now go ahead and open a PR for this in Prometheus.
I'm actually in the process of doing a PR based on prometheus/prometheus#4230. Sorry for hijacking it, but there is some time pressure to get Prometheus 2.7 out today.
Thanks for all the help, guys! Glad to see this finally got somewhere. I'd be happy to help in the future, as I'm assuming WAL size is a goal down the road.
(Merged as …-junkyard#343)
Sure, I'll ping you if we need to get to that. Thanks for your help as well.
Added the methods needed to retain data based on a byte limit rather than time. The limit is only applied if the flag is set (it defaults to 0). Both blocks that are older than the retention period and blocks that make the storage too large are removed.
Two new metrics keep track of the size of the local storage folder and the number of times data has been deleted because the size restriction was exceeded.
This PR is in conjunction with a PR in prometheus/prometheus. The idea is to allow a user to set a maximum number of bytes that the local storage can take up, in conjunction with the retention period; whenever that limit is exceeded, the oldest block is deleted to regain space.
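The deletion idea described above can be sketched as follows. This is a simplified illustration, not the PR's actual code: `block` and `beyondSizeRetention` are hypothetical names, and the real implementation works on tsdb block handles and metadata. Blocks are walked newest-first; once the running total exceeds the byte limit, that block and every older one are marked for deletion.

```go
package main

import (
	"fmt"
	"sort"
)

// block is a hypothetical stand-in carrying only what the sketch needs.
type block struct {
	ulid    string
	maxTime int64
	sizeB   int64
}

// beyondSizeRetention walks blocks newest-first, accumulating their sizes;
// once the running total exceeds maxBytes, that block and all older blocks
// are marked deletable. maxBytes <= 0 disables the limit, matching the
// flag's default of 0 described above.
func beyondSizeRetention(blocks []block, maxBytes int64) map[string]bool {
	deletable := map[string]bool{}
	if maxBytes <= 0 {
		return deletable
	}
	sort.Slice(blocks, func(i, j int) bool {
		return blocks[i].maxTime > blocks[j].maxTime // newest first
	})
	var total int64
	for i, b := range blocks {
		total += b.sizeB
		if total > maxBytes {
			// Everything from this block onward is too old to keep.
			for _, old := range blocks[i:] {
				deletable[old.ulid] = true
			}
			break
		}
	}
	return deletable
}

func main() {
	blocks := []block{
		{"old", 100, 400},
		{"mid", 200, 400},
		{"new", 300, 400},
	}
	// 3 x 400 bytes against a 1000-byte limit: only the oldest block goes.
	fmt.Println(beyondSizeRetention(blocks, 1000)) // → map[old:true]
}
```

Deleting whole blocks (rather than trimming within one) keeps the operation cheap, at the cost of reclaiming space in block-sized steps.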
closes #124
closes prometheus/prometheus#3684
Signed-off-by: Mark Knapp [email protected]