Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for postgres - Part 1 #153

Merged
merged 10 commits into from
Jul 20, 2022
Merged

Conversation

saxenakshitiz
Copy link
Contributor

@saxenakshitiz saxenakshitiz commented Jul 14, 2022

This PR,

  • adds support for adding Postgres implementation
  • few functions are custom functions that should be part of the setup (e.g date)

@saxenakshitiz saxenakshitiz requested a review from a team July 14, 2022 10:48
@codecov
Copy link

codecov bot commented Jul 14, 2022

Codecov Report

Merging #153 (b24b978) into main (72b0c88) will decrease coverage by 10.92%.
The diff coverage is 46.00%.

@@              Coverage Diff              @@
##               main     #153       +/-   ##
=============================================
- Coverage     82.66%   71.73%   -10.93%     
- Complexity      630      761      +131     
=============================================
  Files            65       82       +17     
  Lines          2359     3361     +1002     
  Branches        243      374      +131     
=============================================
+ Hits           1950     2411      +461     
- Misses          312      818      +506     
- Partials         97      132       +35     
Flag Coverage Δ
integration 71.73% <46.00%> (-10.93%) ⬇️
unit 69.77% <45.70%> (-10.37%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ostgres/DefaultResultSetTypePredicateProvider.java 0.00% <0.00%> (ø)
.../service/postgres/PostgresBasedRequestHandler.java 0.00% <0.00%> (ø)
...query/service/postgres/PostgresResultAnalyzer.java 0.00% <0.00%> (ø)
...core/query/service/postgres/TableColumnFilter.java 0.00% <0.00%> (ø)
...ervice/postgres/PostgresRequestHandlerBuilder.java 20.00% <20.00%> (ø)
...postgres/converters/PostgresFunctionConverter.java 25.60% <25.60%> (ø)
...es/converters/PostgresFunctionConverterConfig.java 37.50% <37.50%> (ø)
...ore/query/service/postgres/PostgresColumnSpec.java 50.00% <50.00%> (ø)
...hypertrace/core/query/service/postgres/Params.java 60.65% <60.65%> (ø)
.../query/service/postgres/PostgresClientFactory.java 63.26% <63.26%> (ø)
... and 9 more

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@@ -18,4 +18,5 @@ public interface QueryFunctionConstants {
String QUERY_FUNCTION_STRINGEQUALS = "STRINGEQUALS";
String QUERY_FUNCTION_CONDITIONAL = "CONDITIONAL";
String QUERY_FUNCTION_CONCAT_OR_NULL = "CONCATORNULL";
String DATA_TIME_CONVERT = "DATETIMECONVERT";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: we will define this function at Postgres side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we will define it on postgres side

query-service-impl/src/test/resources/application.conf Outdated Show resolved Hide resolved

assertSQLQuery(
builder.build(),
"Select distinct encode(span_id, 'hex'), span_name, service_name FROM public.\"span-event-view\" "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of multiple selections, do distinct apply to the first field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distinct will apply to selection across multiple columns, and not just the first one.

void testSQLiWithStringValueFilter() {
QueryRequest queryRequest =
buildSimpleQueryWithFilter(
createEqualsFilter("Span.displaySpanName", "GET /login' OR tenant_id = 'tenant2"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference with testQueryWithStringFilter? It compares entire string, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to do with PrepareStatement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from pinot. Need to check this part.

+ TENANT_ID
+ "' "
+ "and ( start_time_millis > 1570658506605 and end_time_millis < 1570744906673 )"
+ " group by service_name, span_name order by service_name, avg(duration_millis) desc , count(*) desc limit 20",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SQL support alias in ORDER by clause. So, as follow-up, we should replace it with alias.

count(*) as count
avg(duration_millis) as avg_duration_millis

So order_by will become,
order by service_name, avg_duration_millis desc , count desc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my to-do list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will handle alias related changes in separate PR

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.


assertSQLQuery(
queryRequest,
"select count(distinct encode(span_id, 'hex')) FROM public.\"span-event-view\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are missing alias distinctcount_span_id in the final query, would that be an issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will handle alias related changes in separate PR


assertSQLQuery(
queryRequest,
"select encode(span_id, 'hex'), count(distinct encode(span_id, 'hex')) FROM public.\"span-event-view\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing alias distinctcount_span_id in the final query

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will handle alias related changes in separate PR

+ TENANT_ID
+ "' "
+ "and ( start_time_millis > 1570658506605 and end_time_millis < 1570744906673 )"
+ " group by span_id order by count(distinct span_id) limit 15",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we should use alias in order by. Like in this example, we might have two eval for count(distinct span_id) in order clause and count(distinct encode(span_id, 'hex')) in select clause as expressions are different. However, we can confirm this by explain_plan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will handle alias related changes in separate PR


assertSQLQuery(
builder.build(),
"SELECT span_name FROM public.\"span-event-view\" WHERE "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Shouldn't such a query be an exception? Are we facing some issues with PrepareStatement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to check this test. Copied from Pinot.

+ " = '"
+ TENANT_ID
+ "' "
+ "AND span_id like decode('042e5523ff6b2506', 'hex')",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encode function is used only with bytes fields, right? So, if the field would have service_name, query would have been service_name like 'frontend' or ``service_name like frontend`?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, encode/decode is only for bytes field

Expression tag = createStringArrayLiteralValueExpression(List.of("FLAGS", "0"));
Filter likeFilter =
Filter.newBuilder()
.setOperator(Operator.CONTAINS_KEYVALUE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: CONTAINS_KEYVALUE is deprecated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

+ " = '"
+ TENANT_ID
+ "' "
+ "AND ( parent_span_id != '' ) limit 5",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: is '' default for bytea?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have checked that it works. However IS NULL or IS NOT NULL is a better check. Will add it in follow up PR.


assertSQLQuery(
queryRequest,
"select PERCENTILETDIGEST99(duration_millis) FROM public.\"span-event-view\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will depend on the configuration of the handler, right?

// Create a Postgres Client.
public static PostgresClient createPostgresClient(String postgresCluster, String path)
throws SQLException {
if (!get().containsClient(postgresCluster)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: we are creating a client per handler as per RequestHandler build, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, one connection per handler

*
* <p>A query can usually be handled by Postgres handler if the Postgres view of this handler has
* all the columns that are referenced in the incoming query. If the Postgres view is a filtered
* view on some view column filters, the incoming query has to have those filters to match the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Postgres view is a filtered view on some view column filters, the incoming query has to have those filters to match the view.

Didn't get fully. It should be all referenced columns including selection, where, group_by, order_by, etc, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw viewDefinitionSupportsFilter, I think, it is for additional configuration support.

expressionValues.add(String.valueOf(value.getInt()));
break;
case INT_ARRAY:
expressionValues.addAll(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q:are we treating INT_ARRAY as a set in other impl too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, same as Pinot

* query types we might be able to get rid of this in the future and have a single flow to parse the
* Postgres Response.
*/
public interface ResultSetTypePredicateProvider {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read above comment, but guess, need to check with you offline to understand this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix it as part of integration tests

case QUERY_FUNCTION_CONDITIONAL:
throw new UnsupportedOperationException("Unsupported function " + function);
default:
// TODO remove once postgres-specific logic removed from gateway - this normalization
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pinot specific, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need changes in gateway before we can change here

return String.format(
"%s(%s, %d)",
config.getDataTimeConvertFunction(),
argumentConverter.apply(argumentsList.get(0)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious Q: what would the second argument mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

column name

delim = ", ";
}

pqlBuilder.append(" FROM public.\"").append(tableDefinition.getTableName()).append("\"");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that we will need config for namespacepublic? For now, I think, it will be fine. lets see with our final setup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should take it from config

pqlBuilder.append(delim);
pqlBuilder.append(
columnRequestConverter.convertSelectClause(expr, paramsBuilder, executionContext));
delim = ", ";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delim mgmt is a bit tricky, we can first collect all select clause transformations in the stream, and join at the end. we can do this as follow-up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be done in follow up PR

throws SQLException {
List<Builder> rowBuilderList = new ArrayList<>();
if (resultSet.next()) {
// Postgres has different Response format for selection and aggregation/group by query.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the case with pinot. Postgres will always return as a list of columns in response, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a single parse function in Postgres.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That part is correct. Will fix it in follow up PR.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@@ -520,28 +520,27 @@ private void handleSelection(
} while (resultSet.next());
}

// TODO - Need to validate and fix this function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we will need only single parse function in Postgres impl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As part of follow up

@@ -1,7 +1,6 @@
{
name = raw-service-view-events-handler
type = pinot
clientConfig = broker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: In real config, we will have clientConfig with connectionString of postgres URL. we will need a way to accept password via config too.

Copy link
Contributor Author

@saxenakshitiz saxenakshitiz Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to redefine it as granular as we want

kotharironak
kotharironak previously approved these changes Jul 20, 2022
Copy link
Contributor

@kotharironak kotharironak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good to me. In the follow-up:

  • handle connection string with password
  • integration test
  • single way of parsing ResultSet
  • nits related to the name of functions (e.g toPostgresPercentile, POSTGRES_CONCAT_FUNC, etc)

@github-actions

This comment has been minimized.

}
}
String physName = tableDefinition.getPhysicalColumnName(logicalName);
logicalNameToPhysicalNameIndex.put(logicalName, physicalNameColIndexMap.get(physName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can physicalNameColIndexMap.get(physName) be null? Do we need to check for contains? Ideally, it shouldn't be, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be given it is based on selections

@saxenakshitiz saxenakshitiz merged commit bc8d70c into main Jul 20, 2022
@saxenakshitiz saxenakshitiz deleted the add_postgres_support_part1 branch July 20, 2022 12:10
@github-actions
Copy link

Unit Test Results

  37 files  +  1    37 suites  +1   7s ⏱️ -1s
238 tests +35  238 ✔️ +35  0 💤 ±0  0 ❌ ±0 

Results for commit bc8d70c. ± Comparison against base commit 72b0c88.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants