Skipping the timestamp check for permission failures #877
@pallavia7 you could run |
9eb30a1 to b9e9f72
if (!skipTimestampCheck) {
  val gbTables = ListBuffer[String]()
  joinConf.joinParts.toScala.foreach { part =>
// join parts
val joinPart = Builders.GroupBy(
  sources = Seq(getTestGBSourceWithTs(namespace=namespace)),
I am not quite sure how access denial is replayed here. Could you explain? Thanks
I used Mockito.spy on the tableUtils object. when(tableUtils.checkTablePermission(any(), any())).thenReturn(false) overrides the behavior of checkTablePermission, so whenever tableUtils.checkTablePermission is invoked it returns false, which is what happens when there is no access to the table.
b9e9f72 to d582acf
runTablePermissionValidation((gbTables.toList ++ List(joinConf.left.table)).toSet)
} else Set()

if (!skipTimestampCheck && noAccessTables.isEmpty) {
Can we log a warning when we skip the timestamp check because some tables have permission issues?
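One way the requested warning could look, as a minimal sketch: `SkipWarningSketch` and `skipWarning` are hypothetical names that are not part of this PR, and the message wording is only an illustration. The helper returns a message only when the timestamp check was requested but is being skipped because of tables without access.

```scala
// Hypothetical helper, not taken from the PR: builds the warning text for a skipped timestamp check.
object SkipWarningSketch {
  // Some(message) only when the timestamp check was requested (skipTimestampCheck = false)
  // but cannot run because at least one table failed the permission check.
  def skipWarning(skipTimestampCheck: Boolean, noAccessTables: Set[String]): Option[String] =
    if (!skipTimestampCheck && noAccessTables.nonEmpty) {
      val tables = noAccessTables.toSeq.sorted.mkString(", ")
      Some(s"Skipping the timestamp check: no access to tables $tables")
    } else None
}
```

A caller could emit it with whatever logger the analyzer already uses, for example `SkipWarningSketch.skipWarning(skipTimestampCheck, noAccessTables).foreach(msg => println(s"WARN: $msg"))`.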
def testJoinAnalyzerInvalidTablePermissions(): Unit = {
  val spark: SparkSession = SparkSessionBuilder.build("AnalyzerTest" + "_" + Random.alphanumeric.take(6).mkString, local = true)
  val tableUtils = spy(TableUtils(spark))
  when(tableUtils.checkTablePermission(any(), any())).thenReturn(false)
Consider mocking tableUtils.sql to throw a runtime exception to mimic a table permission issue, since the timestamp check logic will try to access the table data using tableUtils.sql, and we want to ensure that this is not triggered and is gated by the permission check first.
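The idea in this comment can be sketched without Spark or Mockito by using a hand-written stub. Everything below is an assumption made for illustration: `TableAccess` is a hypothetical one-argument stand-in for the much larger TableUtils, and `analyze` is a simplified stand-in for the analyzer step, not the project's code. The stub fails the permission check and throws from `sql`, so the test passes only if the permission gate prevents any data access.

```scala
// Hypothetical minimal interface; the real TableUtils has a larger surface and different signatures.
trait TableAccess {
  def checkTablePermission(table: String): Boolean
  def sql(query: String): Seq[String]
}

object PermissionGateTestSketch {
  // Stub mimicking tables without read access: the permission check fails
  // and any attempt to read data throws, as the reviewer suggests.
  val deniedTables: TableAccess = new TableAccess {
    def checkTablePermission(table: String): Boolean = false
    def sql(query: String): Seq[String] =
      throw new RuntimeException(s"permission denied while running: $query")
  }

  // Simplified analyzer step: the data-reading check runs only when every table is accessible.
  def analyze(tables: Set[String], access: TableAccess): Set[String] = {
    val noAccessTables = tables.filterNot(t => access.checkTablePermission(t))
    if (noAccessTables.isEmpty) {
      tables.foreach(t => access.sql(s"SELECT ts FROM $t LIMIT 1"))
    }
    noAccessTables
  }
}
```

If `analyze` called `sql` before consulting `checkTablePermission`, the stub's exception would surface and the test would fail, which is the regression the reviewer wants covered.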
Summary
Moved the permission check before the timestamp check. If the permission check fails, the timestamp check is skipped.
Why / Goal
A user reported an issue in the analyzer: the job failed open when users don't have permission to certain Hive tables. The correct behavior is for any table permission issue to be caught in the analyzer step.
The root cause is that the timestamp check is enabled by default and runs before the table permission check. Because the timestamp check requires accessing table data, it failed open.
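The reordering described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the analyzer's real code: `AnalyzerGateSketch`, `validate`, and the `canRead` and `timestampCheck` parameters are hypothetical. Only the names `skipTimestampCheck`, `noAccessTables`, and `runTablePermissionValidation` come from the diff, and the signature used here for the last one is assumed.

```scala
// Hypothetical, simplified stand-in for the analyzer's validation flow.
object AnalyzerGateSketch {
  // Returns the subset of tables the caller cannot read (signature assumed for this sketch).
  def runTablePermissionValidation(tables: Set[String], canRead: String => Boolean): Set[String] =
    tables.filterNot(canRead)

  // The permission check runs first; the timestamp check runs only if every table is readable.
  def validate(
      tables: Set[String],
      skipTimestampCheck: Boolean,
      canRead: String => Boolean,
      timestampCheck: Set[String] => Unit
  ): Set[String] = {
    val noAccessTables = runTablePermissionValidation(tables, canRead)
    if (!skipTimestampCheck && noAccessTables.isEmpty) {
      // Reads table data, so it must not run when a permission failure was found.
      timestampCheck(tables)
    }
    noAccessTables
  }
}
```

With this ordering, a table without read access is reported by the permission check rather than surfacing as a failure inside the timestamp check.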
Test Plan
[+ ] Added Unit Tests
Checklist