Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PC-1429: Open EPC API Error handling #403

Merged
merged 4 commits into from
Nov 28, 2024

Conversation

samyou-softwire
Copy link
Contributor

Link to Jira ticket

Description

adds some error handling to the epc interface

  • 404: this was confirmed to mean no recommendations
    • don't log as this is reasonably expected
    • default to no recommendations
  • 500: try up to 3 times if 500 responses are returned
    • log the error. if there are many 500 errors we would like to know
    • default to no recommendations to allow the user to continue through the flow

Checklist

  • I have made any necessary updates to the documentation
  • I have checked there are no unnecessary IDE warnings in this PR
  • I have checked there are no unintentional line ending changes
  • I have added tests where applicable
  • If I have made any changes to the code, I have used the IDE auto-formatter on it
  • If I have made any changes to Python files (even if not changing any content strings), I have run make extract-translations

@@ -63,7 +63,7 @@ phonenumbers = "^8.13.38"

[tool.black]
line-length = 120
target-version = ['py38']
target-version = ['py311']
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed to allow for the match syntax to be parsed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh we should have changed this anyway. Would you mind have a quick look around the codebase for any other references to old Python versions? I can see there's one to 3.9 in the README, for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had a look through and I think that's the only remaining reference, so have fixed both

@samyou-softwire samyou-softwire force-pushed the PC-1429-open-epc-api-error-handling branch from d91193f to 90c8d70 Compare November 25, 2024 15:31
Comment on lines +640 to +643
# raise this to logs, though let user continue with no recommendations
logger.exception(requestException)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could go either way on this behaviour, though I think logging & returning blank allows the user to proceed while giving us visibility if too many errors occur

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean vs throwing the error? I think this is more correct in that case: it's something we're expecting might happen, and we're handling it here rather than replying on something else further up the chain,

epc_details_response, epc_recommendations_response = interface.api.epc.get_epc(lmk)
epc_details = epc_details_response["rows"][0]
recommendations = epc_recommendations_response["rows"]
epc_details, recommendations = interface.api.epc.get_epc(lmk)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the extracting of data from response to interface, where it makes a bit more sense

# else if not on last try, try again
continue

# 404 means no recommendations, so no need to retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but we want to swallow the exception as it's expected behaviour from the API. In general I think it's worth documenting why we're doing this and what we've learned about the API.

def _get_epc_details(self, epc_api, lmk):
return epc_api.get_epc_details(lmk)["rows"][0]

RECOMMENDATION_RETRIES_COUNT = 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would probably bump up to 5, but that's just my instinct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

return []

# else if not on last try, try again
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have some exponential backoff here - there's a good article here which explains why it's a good idea. It's probably not a strict necessity at this scale, and I don't think we need to worry about jitter, but it's good practice and also gives us some more resilience against short-term problems with the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have added some

RECOMMENDATION_RETRIES_COUNT = 3

def _get_epc_recommendations(self, epc_api, lmk):
for i in range(self.RECOMMENDATION_RETRIES_COUNT):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really necessary to change, but I'd instinctively have expected this sort of thing to be a while loop with an incrementing counter. Suppose it means we don't have to create this range (technically more scalable if we were doing colossal numbers of retries).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for what it's worth range in python 3 is lazy, so it'll probably be using a counter behind the scenes. I don't think it'll ever construct the range unless we call list() or similar?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was wondering whether it ends up being equivalent once you get down to internals.

Comment on lines +640 to +643
# raise this to logs, though let user continue with no recommendations
logger.exception(requestException)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean vs throwing the error? I think this is more correct in that case: it's something we're expecting might happen, and we're handling it here rather than replying on something else further up the chain,

@samyou-softwire samyou-softwire force-pushed the PC-1429-open-epc-api-error-handling branch from 90c8d70 to b14c9f3 Compare November 26, 2024 15:20
@@ -63,7 +63,7 @@ phonenumbers = "^8.13.38"

[tool.black]
line-length = 120
target-version = ['py38']
target-version = ['py311']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh we should have changed this anyway. Would you mind have a quick look around the codebase for any other references to old Python versions? I can see there's one to 3.9 in the README, for example.



@unittest.mock.patch("help_to_heat.frontdoor.interface.EPCApi", MockRecommendationsInternalServerErrorEPCApi)
def test_on_recommendations_internal_server_error_response_recommendations_are_empty_list():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easy to have a test asserting specifically that we retry? e.g. mock an error the first time and then resolve the second?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think so; the EPC API classes are stateless (theyre reconstructed each time theyre used) & using some sort of global state could have issues with this state leaking out of the test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining something like

class MockRecommendationsTransientInternalServerErrorEPCApi(MockEPCApi):
    number_of_calls = 0
    def get_epc_recommendations(self, lmk_key):
        number_of_calls++
        if number_of_calls == 1:
            raise InternalServerErrorRequestException()

        return something normal

Would that work? We're retrying calls on the same instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes it can store an epc api per request, have added one now

…point

on 404: this means no results

on 500: could be due to a timeout. try 3 more times, if fails after 3rd raise error and return no results
@samyou-softwire samyou-softwire force-pushed the PC-1429-open-epc-api-error-handling branch from b14c9f3 to b94763e Compare November 27, 2024 11:21
@samyou-softwire samyou-softwire force-pushed the PC-1429-open-epc-api-error-handling branch from b94763e to 68c0c2b Compare November 27, 2024 12:36
@samyou-softwire samyou-softwire merged commit f86cdbf into develop Nov 28, 2024
4 checks passed
@samyou-softwire samyou-softwire deleted the PC-1429-open-epc-api-error-handling branch November 28, 2024 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants