Add domain validation in DomainSpecificString #4351
Signed-off-by: Mathias Koehler <[email protected]>
Codecov Report
@@             Coverage Diff             @@
##           develop    #4351      +/-   ##
===========================================
- Coverage    73.73%   73.72%   -0.01%
===========================================
  Files          300      300
  Lines        29752    29957     +205
  Branches      4879     4933      +54
===========================================
+ Hits         21937    22086     +149
- Misses        6390     6430      +40
- Partials      1425     1441      +16
A couple of thoughts here:
- we already have `synapse.http.endpoint.parse_and_validate_server_name`, which does some of this stuff. We should be using (and possibly extending) that rather than inventing a new set of checks.
- The code you are changing here is on a hot path - we'll end up spending a non-trivial amount of resources validating server names which are already known to be valid. I'd prefer it if we validated identifiers at the point of entry to the system - for example, "createRoom fails with a 500 error if a badly-formatted mxid is in the `invitees` list" (#4088) could be fixed with better checks in the RoomCreationHandler.
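To illustrate the "validate at the point of entry" suggestion, here is a minimal, self-contained sketch. It does not use Synapse's own code: `check_invitees` and the regular expression are hypothetical, and the pattern is deliberately looser than the grammar in the Matrix spec. The idea is only that a malformed ID is rejected once, up front, so the caller can turn it into a client error instead of a 500.

```python
import re

# Hypothetical, simplified pattern: '@', a non-empty localpart without ':',
# then ':' and a non-empty server name. The real spec grammar is stricter.
_MXID_RE = re.compile(r"^@[^:\s]+:[^\s]+$")


def check_invitees(invitees):
    """Return the list unchanged, or raise ValueError naming the bad entries."""
    bad = [i for i in invitees if not isinstance(i, str) or not _MXID_RE.match(i)]
    if bad:
        raise ValueError("malformed user IDs: %r" % (bad,))
    return invitees
```

A request handler could call this once on the incoming `invitees` list and map the `ValueError` to a 400 response, leaving the hot parsing path untouched.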
try:
    localpart, domain = s[1:].split(':', 1)
    if ':' in domain:
        domain, port = domain.split(':')
this looks like it will fail on IPv6 addresses.
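To make the IPv6 concern concrete: a server name such as `[::1]:8448` contains several colons, so `"[::1]:8448".split(':')` yields four pieces and unpacking them into `domain, port` raises `ValueError`. The following is a rough sketch of a split that tolerates bracketed IPv6 literals. The function name `split_host_port` is hypothetical and is not the existing Synapse helper; it assumes IPv6 literals are always written in square brackets.

```python
def split_host_port(server_name):
    """Split 'host', 'host:port', '[v6]' or '[v6]:port' into (host, port or None)."""
    if server_name.startswith("["):
        # Bracketed IPv6 literal: the host runs up to and including ']'.
        end = server_name.find("]")
        if end == -1:
            raise ValueError("unterminated IPv6 literal: %r" % (server_name,))
        host, rest = server_name[: end + 1], server_name[end + 1 :]
        if rest == "":
            return host, None
        if not rest.startswith(":"):
            raise ValueError("unexpected text after ']': %r" % (server_name,))
        port = rest[1:]
    elif ":" in server_name:
        # Plain hostname or IPv4 address: at most one colon is allowed.
        host, port = server_name.rsplit(":", 1)
        if ":" in host:
            raise ValueError("unbracketed IPv6 literal: %r" % (server_name,))
    else:
        return server_name, None
    if not port.isdigit():
        raise ValueError("invalid port in %r" % (server_name,))
    return host, int(port)
```

This only separates host from port; it does not check that the host itself is a well-formed hostname or address.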
Good point! Completely forgot IPv6.
Wrong button ;)
That raises the question of why UserIDs (& co) get converted back to a normal string and parsed a second time?! The only place where we should get one as a string rather than as an instance is on IO. The only IO you can trust is the database (assuming the data in the database was already validated). In my opinion we should also only handle the domain in one form (IDNA or unicode). Currently both forms are accepted, which could perhaps cause weird issues if someone mixes them. I need to make myself more familiar with the code, spec and ecosystem. It could also be that I am seeing problems because of the Dunning-Kruger effect. ;)
well yes, it's specifically the path from the database that I was thinking of. However, I don't think it's as simple as you imagine: some parts of the code expect strings, and some expect objects, which naturally leads to a reasonable amount of conversion between the two. You could make this more consistent, but it's not obvious to me that doing so would reduce the amount of conversion to be done. For instance, there are a lot of simple CRUD paths which would end up converting string to instance and back to string again for no real value.
https://matrix.org/docs/spec/appendices.html#server-name is pretty clear that they should be ASCII (i.e., IDNA). Anywhere that accepts them as unicode is probably a bug.
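As a small illustration of the "one form only" point, the sketch below normalises a hostname to its ASCII form and checks whether a given name is already ASCII. Both function names are hypothetical. It assumes Python 3 and uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules; whether that is the right IDNA variant for Matrix server names is not something this thread settles.

```python
def to_ascii_hostname(hostname):
    """Encode a hostname to its ASCII (IDNA) form with the built-in codec."""
    return hostname.encode("idna").decode("ascii")


def is_ascii(value):
    """Return True if the string contains only ASCII characters."""
    try:
        value.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True
```

For example, `to_ascii_hostname("bücher.example")` gives `"xn--bcher-kva.example"`, while a name that is already ASCII passes through unchanged. A strict entry-point check could simply reject any server name for which `is_ascii` is false.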
appears abandoned by contributor
FIXES #4088
This adds a validation check that the domain/hostname meets the criteria for a valid hostname.