feat : data bucket for persisting data to gcs #6170
9fc1158 to 953247d
It would be great if we could configure the data directory and data bucket together (instead of passing one as config vars and another as a constructor function). What do you think about grouping storage/persistence related functionality in a single place and dependency-injecting it similar to the logger and activity client?
For example, we could have a runtime/storage package:
// runtime/storage/storage.go
type Client struct {...}
func New(dataDir, tmpDir string, bucketCfg map[string]any) *Client {...}
func NewTmp() *Client {...}
func (c *Client) WithPrefix(instanceID string) *Client {...}
func (c *Client) DataDir(elem ...string) string {...}
func (c *Client) TempDir(elem ...string) string {...}
func (c *Client) OpenBucket() (*blob.Bucket, bool, error) {...}
Which we create in start.go and pass to the runtime, like runtime.New(ctx, opts, logger, ..., storage). In tests, we can have storage.NewTmp() similar to zap.NewNop().
Then when opening a driver, it can pass r.storage.WithPrefix(instanceID) to it.
The advantages I see to this are:
- All disk persistence is managed in one place, making it simpler to understand and easier to track storage for billing and periodic cleanup
- In the future, enables exposing buckets in other ways than *blob.Bucket. This might be useful for e.g. caching exports where writing directly from the OLAP might be faster.
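To make the proposal concrete, here is a minimal sketch of how the path-related part of such a client could behave. It is an illustration under assumptions, not the code in this PR: the field names, the `prefix` field, and the directory layout (prefix appended below the root) are hypothetical, and the bucket config, NewTmp, and OpenBucket are left out so the example depends only on the standard library.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Client is a hypothetical storage client. Field names are assumptions
// made for this sketch.
type Client struct {
	dataDirPath string
	tempDirPath string
	prefix      string // assumed: the instance ID used to scope paths
}

// New creates a client rooted at the given data and temp directories.
func New(dataDir, tmpDir string) *Client {
	return &Client{dataDirPath: dataDir, tempDirPath: tmpDir}
}

// WithPrefix returns a new client scoped to one instance.
// The receiver is left unchanged.
func (c *Client) WithPrefix(instanceID string) *Client {
	return &Client{
		dataDirPath: c.dataDirPath,
		tempDirPath: c.tempDirPath,
		prefix:      filepath.Join(c.prefix, instanceID),
	}
}

// DataDir joins the data root, the prefix, and any extra path elements.
func (c *Client) DataDir(elem ...string) string {
	return filepath.Join(append([]string{c.dataDirPath, c.prefix}, elem...)...)
}

// TempDir joins the temp root, the prefix, and any extra path elements.
func (c *Client) TempDir(elem ...string) string {
	return filepath.Join(append([]string{c.tempDirPath, c.prefix}, elem...)...)
}

func main() {
	root := New("/data", "/tmp/rill")
	inst := root.WithPrefix("inst1")
	fmt.Println(inst.DataDir("duckdb"))
	fmt.Println(inst.TempDir("downloads"))
	fmt.Println(root.DataDir()) // the unscoped client still points at the root
}
```

With this shape, a driver that receives the prefixed client never sees the root paths directly, which is what would make per-instance cleanup and accounting straightforward.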
Yeah sounds good.
Co-authored-by: Benjamin Egelund-Müller <[email protected]>
runtime/storage/storage.go
Outdated
newClient := &Client{
	dataDirPath:  c.dataDirPath,
	bucketConfig: c.bucketConfig,
}
Forgot to handle tempDirPath here
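One way to avoid this class of omission is to copy the whole struct by value and then override only the field that changes. The sketch below is an assumption about how that could look, not the fix that was applied in this PR: the `prefixes` field and the `WithPrefix` signature are hypothetical.

```go
package main

import "fmt"

// Client mirrors the fields mentioned in the review; prefixes is a
// hypothetical field added for this sketch.
type Client struct {
	dataDirPath  string
	tempDirPath  string
	bucketConfig map[string]any
	prefixes     []string
}

// WithPrefix shallow-copies the receiver so every scalar field
// (including tempDirPath) is carried over, then gives the copy its own
// prefixes slice. Note that bucketConfig is a map, so the copy shares
// it with the original.
func (c *Client) WithPrefix(prefix string) *Client {
	newClient := *c
	newClient.prefixes = append(append([]string(nil), c.prefixes...), prefix)
	return &newClient
}

func main() {
	base := &Client{dataDirPath: "/data", tempDirPath: "/tmp/rill"}
	scoped := base.WithPrefix("inst1")
	fmt.Println(scoped.tempDirPath, scoped.prefixes, len(base.prefixes))
}
```

The trade-off is that a shallow copy also shares reference-typed fields such as maps, so this only suits fields that are treated as read-only after construction.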
OOPS. Thanks :)
PrefixedBucket closes the underlying bucket, so no instance will be able to read another instance's data.