Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Change display names/avatar URLs to None if they contain null bytes before storing in DB #11230

Merged
merged 7 commits into from
Nov 12, 2021
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions changelog.d/11230.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Fix a long-standing bug wherein display names or avatar URLs containing null bytes cause an internal server error
when stored in the DB.
10 changes: 6 additions & 4 deletions synapse/storage/databases/main/events.py
Original file line number Diff line number Diff line change
Expand Up @@ -1640,8 +1640,8 @@ def _store_event_reference_hashes_txn(self, txn, events):
def _store_room_members_txn(self, txn, events, backfilled):
"""Store a room member in the database."""

def str_or_none(val: Any) -> Optional[str]:
return val if isinstance(val, str) else None
def non_null_str_or_none(val: Any) -> Optional[str]:
return val if isinstance(val, str) and "\u0000" not in val else None

self.db_pool.simple_insert_many_txn(
txn,
Expand All @@ -1653,8 +1653,10 @@ def str_or_none(val: Any) -> Optional[str]:
"sender": event.user_id,
"room_id": event.room_id,
"membership": event.membership,
"display_name": str_or_none(event.content.get("displayname")),
"avatar_url": str_or_none(event.content.get("avatar_url")),
"display_name": non_null_str_or_none(
event.content.get("displayname")
),
"avatar_url": non_null_str_or_none(event.content.get("avatar_url")),
}
for event in events
],
Expand Down
54 changes: 54 additions & 0 deletions tests/storage/test_roommember.py
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,60 @@ def test_get_joined_users_from_context(self):
)
self.assertEqual(users.keys(), {self.u_alice, self.u_bob})

def test_store_room_members_txn(self):
def _store_room_members_txn(txn, events, backfilled):
"""Store a room member in the database."""

def non_null_str_or_none(val) -> str:
return val if isinstance(val, str) and "\u0000" not in val else None

self.store.db_pool.simple_insert_many_txn(
txn,
table="room_memberships",
values=[
{
"event_id": event.event_id,
"user_id": "random_test_value",
"sender": event.user_id,
"room_id": event.room_id,
"membership": event.membership,
"display_name": non_null_str_or_none(
event.content.get("displayname")
),
"avatar_url": non_null_str_or_none(
event.content.get("avatar_url")
),
}
for event in events
],
)

room = self.helper.create_room_as(self.u_alice, tok=self.t_alice)
bob_event = self.get_success(
event_injection.inject_member_event(
self.hs, room, self.u_bob, Membership.JOIN
)
)

# Create an event with a null-containing string to pass to _store_room_members_txn
event, context = self.get_success(
event_injection.create_event(
self.hs,
room_id=room,
sender=self.u_alice,
prev_event_ids=[bob_event.event_id],
type="m.test.1",
content={"displayname": "bad\u0000null", "membership": Membership.JOIN},
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually prefer to avoid injecting events if you can avoid it (reserve that for when you need invalid events to be persisted for some reason, or to pretend you received one over federation).

Since you have alice's access token and everything, you can use something like

        new_profile = {"displayname": "ali\u0000ce"}
        self.helper.change_membership(
            room, self.u_alice, self.u_alice, "join", extra_data=new_profile, tok=self.t_alice
        )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to change_membership does indeed break without the change I made to _store_room_members_txn and passes with the change, so hopefully this is sufficient? Thanks for pointing out the use cases for injecting events, it's helpful to be told these sorts of things :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be worth also querying the state of the database and checking that they contain Nones instead of something else.
(It just feels weird to have an assert-less test in some strange way, but it's also worth noting that the behaviour is replacing the name with None rather than e.g. leaving it as it was before, if that makes sense?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally makes sense, I shall give that a go.


# pass event to _store_room_members_txn and verify that it doesn't blow up
self.get_success(
self.store.db_pool.runInteraction(
"store room members", _store_room_members_txn, [event], False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"store room members", _store_room_members_txn, [event], False
"store room members", self.hs.get_storage().persistence.persist_events_store._store_room_members_txn, [event], False

That one was hard to dig up, but doing this lets you avoid copying the function (so there won't be any chance we forget to copy it into the test if we change/break it in the future)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you really need to test this function explicitly/individually if you properly demonstrate that self.helper.change_membership( (as discussed in another thread) works (assuming it didn't work before!).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for showing how to access _store_room_members_txn-even though it seems like it's not necessary to call now-I really scratched my head over how to get access to it so I appreciate being shown the answer!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, no worries, it took me a little bit of digging too :). Everytime I look in the data storage stuff I find out about a new angle I hadn't seen before.
I think the reason is that the event persistence store is separate from the rest, since it was hoped at once point that it could be split out into a separate database?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting-thanks for pointing that out.

)
)


class CurrentStateMembershipUpdateTestCase(unittest.HomeserverTestCase):
def prepare(self, reactor, clock, homeserver):
Expand Down