Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend Zayo parser to handle additional examples #109

Merged
merged 5 commits into from
Nov 15, 2021

Conversation

glennmatthews
Copy link
Contributor

Handle some additional Zayo email notifications that were failing the current parser.

I'm marking this as a draft for now as I have some questions about how to handle information that may be omitted from certain Zayo notification emails (see TODO comments in zayo.py).

circuit_maintenance_parser/parsers/zayo.py Outdated Show resolved Hide resolved
circuit_maintenance_parser/parsers/zayo.py Outdated Show resolved Hide resolved
circuit_maintenance_parser/parsers/zayo.py Outdated Show resolved Hide resolved
@glennmatthews glennmatthews marked this pull request as ready for review November 12, 2021 15:32
Copy link
Collaborator

@chadell chadell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -77,7 +77,8 @@ def init_from_emailmessage(cls: Type["NotificationData"], email_message) -> Opti
# Adding extra headers that are interesting to be parsed
data_parts.add(DataPart(EMAIL_HEADER_SUBJECT, email_message["Subject"].encode()))
data_parts.add(DataPart(EMAIL_HEADER_DATE, email_message["Date"].encode()))
return cls(data_parts=list(data_parts))
# Ensure the data parts are processed in a consistent order
return cls(data_parts=sorted(data_parts, key=lambda part: part.type))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why order is important?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where we have information of different accuracy/precision available from different data parts. For example with Zayo we can always pull an (approximate) stamp from the email date, but for some notifications we may have a more precise stamp explicitly stated in the email body. Without this change it was unpredictable as to which one of these would be processed last and therefore win out - see the CI failures on 7970ecc above for an example of this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, good catch.

circuit_maintenance_parser/parser.py Show resolved Hide resolved
@@ -24,6 +43,14 @@ def parse_html(self, soup):
self.parse_bs(soup.find_all("b"), data)
self.parse_tables(soup.find_all("table"), data)

if data:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering why do we need this check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case where we might decide that there's no relevant content in the email (and hence return an empty data dict) I wouldn't want to just fill in these additional fields in an otherwise empty dict.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, otherwise you could just rely on Pydantic validation once the Maintenance output is instantiated.

@glennmatthews glennmatthews merged commit 9d1c6b7 into develop Nov 15, 2021
@glennmatthews glennmatthews deleted the gfm-improve-zayo branch November 15, 2021 17:01
glennmatthews added a commit that referenced this pull request Nov 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants