feat: Added Timestamp/Binary/Float to fuzz #13280
),
ColumnDescr::new("binary", DataType::Binary),
ColumnDescr::new("large_binary", DataType::LargeBinary),
ColumnDescr::new("binaryview", DataType::BinaryView),
I think we can put `binary` near the `string` types instead of placing it in the middle of some fixed-size primitive types.
use rand::Rng;

/// Randomly generate binary arrays
pub struct BinaryArrayGenerator {
👍
@@ -171,6 +172,22 @@ fn baseline_config() -> DatasetGeneratorConfig {
ColumnDescr::new("time32_ms", DataType::Time32(TimeUnit::Millisecond)),
ColumnDescr::new("time64_us", DataType::Time64(TimeUnit::Microsecond)),
ColumnDescr::new("time64_ns", DataType::Time64(TimeUnit::Nanosecond)),
// TODO: randomize timezones for timestamp types
Let's create a ticket instead of a TODO.
    Vec::new()
} else {
    let len = rng.gen_range(1..=max_len);
    (0..len).map(|_| rng.gen()).collect()
Wondering whether `len` differs from `max_len`?
It seems that `len` is the actual length of the value, which is drawn from `1..=max_len` (inclusive), while `max_len` is only the upper bound.
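To make the `len` versus `max_len` distinction concrete, here is a minimal sketch of the same pattern. It is not the PR's code: `XorShift64` is a dependency-free stand-in for `StdRng`, `random_binary` is a hypothetical helper name, and the `max_len == 0` condition for the empty branch is an assumption, since the snippet above only shows the `else` arm.

```rust
/// Tiny xorshift64 generator, used only as a dependency-free stand-in
/// for the `StdRng` held by the generator in the PR (assumption).
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in the inclusive range `lo..=hi` (slightly biased; fine for a sketch).
    fn range_inclusive(&mut self, lo: usize, hi: usize) -> usize {
        lo + (self.next_u64() % (hi - lo + 1) as u64) as usize
    }
}

/// Hypothetical helper mirroring the reviewed snippet: `max_len` is the
/// upper bound, `len` is the length actually drawn for this one value.
fn random_binary(rng: &mut XorShift64, max_len: usize) -> Vec<u8> {
    if max_len == 0 {
        // Assumed condition for the empty branch.
        Vec::new()
    } else {
        let len = rng.range_inclusive(1, max_len);
        (0..len).map(|_| rng.next_u64() as u8).collect()
    }
}
```

With `max_len = 8`, every generated value has between 1 and 8 bytes, and different calls will generally produce different lengths.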
    pub rng: StdRng,
}

impl BinaryArrayGenerator {
Love it. I'm wondering whether we should add tests for this generator?
the generator itself is part of a test 🤔 What would we test? Maybe that the distinct values are as specified?
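If such a check were added, it might look roughly like the sketch below. Everything here is hypothetical: `Lcg` is a dependency-free stand-in for a seeded RNG, and `gen_binary_column` only imitates a generator that samples rows from a pool of at most `num_distinct` values; it does not reproduce `BinaryArrayGenerator` or its fields.

```rust
use std::collections::HashSet;

/// Simple linear congruential generator, a stand-in for a seeded RNG (assumption).
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Drop the weak low bits of the LCG state.
        self.0 >> 33
    }
}

/// Hypothetical generator: builds a pool of `num_distinct` random values,
/// then draws `num_rows` rows from that pool.
fn gen_binary_column(
    rng: &mut Lcg,
    num_rows: usize,
    num_distinct: usize,
    max_len: usize,
) -> Vec<Vec<u8>> {
    assert!(num_distinct >= 1 && max_len >= 1);
    let pool: Vec<Vec<u8>> = (0..num_distinct)
        .map(|_| {
            let len = 1 + (rng.next_u64() as usize) % max_len;
            (0..len).map(|_| rng.next_u64() as u8).collect()
        })
        .collect();
    (0..num_rows)
        .map(|_| pool[(rng.next_u64() as usize) % pool.len()].clone())
        .collect()
}

/// Number of distinct values in a generated column.
fn count_distinct(values: &[Vec<u8>]) -> usize {
    values.iter().collect::<HashSet<_>>().len()
}
```

A test could then generate a column with, say, `num_distinct = 10` and assert that `count_distinct` never exceeds 10. The count can be lower, because pool entries may collide or never be sampled.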
Thanks @jonathanc-n -- I think this looks great in my opinion
@@ -210,6 +226,9 @@ fn baseline_config() -> DatasetGeneratorConfig {
// low cardinality columns
ColumnDescr::new("u8_low", DataType::UInt8).with_max_num_distinct(10),
ColumnDescr::new("utf8_low", DataType::Utf8).with_max_num_distinct(10),
ColumnDescr::new("binary", DataType::Binary),
We could potentially remove the `todo binary` a few lines above.
Thanks again @jonathanc-n -- and thanks to @comphead @LeslieKid for the reviews
* Added Timestamp/Binary/Float to fuzz
* clippy fix
* small fix
* remove todo
* remove todo
Which issue does this PR close?
Closes #13279.
What changes are included in this PR?
Added timestamp, binary, and float types to the fuzz testing.