Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log when a key backup replaces an existing key #17013

Closed
andybalaam opened this issue Mar 20, 2024 · 3 comments · Fixed by #17266
Closed

Log when a key backup replaces an existing key #17013

andybalaam opened this issue Mar 20, 2024 · 3 comments · Fixed by #17266
Assignees
Labels

Comments

@andybalaam
Copy link
Member

andybalaam commented Mar 20, 2024

This issue was created by the Crypto team because we believe it will help us to identify the source of broken key backups. We expect to work on it and hope the Synapse team will assist us in deciding what exactly to do, and helping us with review and merge assuming we can agree on a sensible approach.

Recently the Crypto team has identified broken key backups as a common source of "unable to decrypt" errors (UTDs). Some clients appear to be uploading invalid data, and other clients have sometimes been more strict than the spec in validating key backups. Both of these problems result in UTDs, and it can be difficult when debugging to identify which client uploaded the key, because it may have been replaced after the UTD happened.

So, we propose to write a log message whenever a key is replaced in backup, allowing us to line up UTD reports (rageshakes) with server logs to figure out what happened.

Outcome of this task

  • Synapse writes a log message whenever a key is replaced in key backup, with enough information to align it with the relevant rageshake
@kegsay
Copy link
Contributor

kegsay commented Mar 20, 2024

Related: matrix-org/synapse#13704

@reivilibre
Copy link
Contributor

In handlers/e2e_room_keys.py

            for room_id, room in room_keys["rooms"].items():
                for session_id, room_key in room["sessions"].items():
                    if not isinstance(room_key["is_verified"], bool):
                        msg = (
                            "is_verified must be a boolean in keys for session %s in"
                            "room %s" % (session_id, room_id)
                        )
                        raise SynapseError(400, msg, Codes.INVALID_PARAM)

                    log_kv(
                        {
                            "message": "Trying to upload room key",
                            "room_id": room_id,
                            "session_id": session_id,
                            "user_id": user_id,
                        }
                    )
                    current_room_key = existing_keys.get(room_id, {}).get(session_id)
                    if current_room_key:
                        if self._should_replace_room_key(current_room_key, room_key):
                            log_kv({"message": "Replacing room key."})
                            # updates are done one at a time in the DB, so send
                            # updates right away rather than batching them up,
                            # like we do with the inserts
                            await self.store.update_e2e_room_key(
                                user_id, version, room_id, session_id, room_key
                            )
                            changed = True
                        else:
                            log_kv({"message": "Not replacing room_key."})
                    else:
                        log_kv(
                            {
                                "message": "Room key not found.",
                                "room_id": room_id,
                                "user_id": user_id,
                            }
                        )
                        log_kv({"message": "Replacing room key."})
                        to_insert.append((room_id, session_id, room_key))
                        changed = True

We seem to log to the opentracing, but do you just want regular log lines for these things as well?

(I think that second 'Replacing room key' line might be a mistake as well, actually..)

@richvdh
Copy link
Member

richvdh commented May 28, 2024

We seem to log to the opentracing, but do you just want regular log lines for these things as well?

Yes please. The trouble with opentracing is that it's disabled for most users.

reivilibre added a commit that referenced this issue Jun 7, 2024
…they are replacing other room keys. (#17266)

Fixes: #17013

Add logging for whether room keys are replaced
This is motivated by the Crypto team who need to diagnose crypto issues.

The existing opentracing logging is not enough because it is not enabled
for all users.
Mic92 pushed a commit to Mic92/synapse that referenced this issue Jun 14, 2024
…they are replacing other room keys. (element-hq#17266)

Fixes: element-hq#17013

Add logging for whether room keys are replaced
This is motivated by the Crypto team who need to diagnose crypto issues.

The existing opentracing logging is not enough because it is not enabled
for all users.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants