This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

joining #neb:matrix.org often fails #2956

Open
richvdh opened this issue Mar 7, 2018 · 4 comments
Labels
  • A-Federated-Join: joins over federation generally suck
  • O-Uncommon: Most users are unlikely to come across this or unexpected workflow
  • S-Major: Major functionality / product severely impaired, no satisfactory workaround.
  • T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@richvdh
Member

richvdh commented Mar 7, 2018

The first attempt gives:

2018-03-06 17:48:50,126 - synapse.handlers.federation - 1578 - WARNING - POST-237286- Rejecting $1416421111999gtomZ:matrix.org because Invalid ID: 'matrix.org'

and then

2018-03-06 17:48:50,973 - synapse.http.server - 183 - ERROR - POST-237286- Failed handle request synapse.http.server._async_render on <synapse.rest.ClientRestResource object at 0x7f428604b190>: <XForwardedForRequest at 0x7f423878ba70 method=POST uri=/_matrix/client/r0/join/%23neb:matrix.org? clientproto=HTTP/1.1 sit
Failure: twisted.internet.defer.FirstError: FirstError[#0, [Failure instance: Traceback: <type 'exceptions.IndexError'>: list index out of range
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/internet/defer.py:653:_runCallbacks
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/internet/defer.py:1442:gotResult
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/internet/defer.py:1384:_inlineCallbacks
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/python/failure.py:419:throwExceptionIntoGenerator
--- <exception caught here> ---
/opt/synapse/synapse/synapse/storage/events.py:143:handle_queue_loop
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/internet/defer.py:1384:_inlineCallbacks
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/python/failure.py:419:throwExceptionIntoGenerator
/opt/synapse/synapse/synapse/storage/events.py:288:persisting_queue
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/internet/defer.py:1384:_inlineCallbacks
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/python/failure.py:419:throwExceptionIntoGenerator
/opt/synapse/synapse/synapse/storage/events.py:184:f
/opt/synapse/env/local/lib/python2.7/site-packages/twisted/internet/defer.py:1386:_inlineCallbacks
/opt/synapse/synapse/synapse/storage/events.py:421:_persist_events
/opt/synapse/synapse/synapse/server.py:181:is_mine_id
]]

$1416421111999gtomZ:matrix.org looks like:

  {
    "origin": "matrix.org",
    "signatures": {
      "matrix.org": {
        "ed25519:auto": "M/HQNUSCDEmmzdwGFFsAi9lD/V6RoS/tNavEW6w2bM41+RQq6CFm7jdriXpzK2QoZqN9qCuv3sILMZr1Df/lDw"
      }
    },
    "origin_server_ts": 1410782729923,
    "sender": "matrix.org",
    "event_id": "$1416421111999gtomZ:matrix.org",
    "stream_ordering": 5144,
    "prev_events": [
      [
        "$1416421111992SoSWp:matrix.org",
        {
          "sha256": "iJAK2dEcpDiCysSaChnsUJkbJB6sF/B8g3hzCDj3BHo"
        }
      ]
    ],
    "unsigned": {
      "age": 103937416446
    },
    "state_key": "matrix.org",
    "content": {
      "aliases": [
        "#neb:matrix.org"
      ]
    },
    "depth": 2,
    "prev_state": [],
    "room_id": "!aZkanAnWEdxcRIQkWn:matrix.org",
    "auth_events": [],
    "hashes": {
      "sha256": "2wva9kILrW7ngfYQLcwqFhvu+Pc+HgZ+bQIyzRprahI"
    },
    "type": "m.room.aliases"
  }

(note the invalid sender).

However, that is enough for all of the events to be persisted in the database. A second attempt then looks at those events to determine other servers in the room, and has a good chance (note: not guaranteed) of picking one that's not matrix.org to do the join.

Most other servers will not have $1416421111999gtomZ:matrix.org (since they will have rejected it), so they do not return it in the send_join response, and the join therefore succeeds.

(An alternative approach is to join via a different alias, such as #neb:t2l.io - which, again, means that you do not receive the problem event and the join succeeds.)
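
The IndexError at the bottom of the traceback is consistent with the sender having no ":" separator. A minimal sketch of that failure mode, assuming the check takes everything after the first colon as the domain (is_mine_id_sketch and the example.org hostname are illustrative stand-ins, not Synapse's actual code):

```python
def is_mine_id_sketch(string, hostname="example.org"):
    # Assumption: the domain is whatever follows the first ":".
    return string.split(":", 1)[1] == hostname


# A well-formed user ID splits into a localpart and a domain.
print(is_mine_id_sketch("@alice:example.org"))

# The sender of the problem event has no ":" at all, so the split
# yields a one-element list and indexing [1] raises IndexError.
try:
    is_mine_id_sketch("matrix.org")
except IndexError as e:
    error_message = str(e)
    print("IndexError:", error_message)
```

Under that assumption, any event whose sender is a bare server name would trigger the same exception while being persisted.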

@richvdh richvdh changed the title from "joining #neb:matrix.org only works on the second attempt" to "joining #neb:matrix.org often fails" on Mar 7, 2018
@richvdh
Member Author

richvdh commented Mar 7, 2018

The base problem here is that matrix.org is sending an invalid response to the send_join request. We could fix this in one of two ways:

  • Completely reject the send_join response, and hence (because we don't know of any other servers in the room) the join request. This won't help the user join the room, but it might let us return a more helpful error than Internal Server Error. It would also avoid leaving the database in a state where the logs show ERROR store_room with room_id=!aZkanAnWEdxcRIQkWn:matrix.org failed: duplicate key value violates unique constraint "rooms_pkey", which is a red herring.

  • Make the join code tolerate the invalid event. It is worth noting that the exception is being thrown from code which exists solely to update metrics (below), so we could easily be more resilient in how we parse the sender.

    if context.app_service:
        origin_type = "local"
        origin_entity = context.app_service.id
    elif self.hs.is_mine_id(event.sender):
        origin_type = "local"
        origin_entity = "*client*"
    else:
        origin_type = "remote"
        origin_entity = get_domain_from_id(event.sender)
    event_counter.inc(event.type, origin_type, origin_entity)
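
As a rough illustration of the second option, the metrics labelling could be written so that a malformed sender never raises. This is only a sketch: metrics_origin, domain_or_none and the "*invalid*" label are hypothetical names, not part of Synapse, and the snippet above is reduced to plain function arguments.

```python
def domain_or_none(entity_id):
    # Return the part after the first ":", or None when there is no ":".
    parts = entity_id.split(":", 1)
    return parts[1] if len(parts) == 2 else None


def metrics_origin(sender, local_hostname, app_service_id=None):
    """Return (origin_type, origin_entity) for the event counter."""
    if app_service_id is not None:
        return "local", app_service_id
    domain = domain_or_none(sender)
    if domain is None:
        # Malformed sender: use a placeholder label (hypothetical)
        # instead of letting the metrics code abort persistence.
        return "remote", "*invalid*"
    if domain == local_hostname:
        return "local", "*client*"
    return "remote", domain


print(metrics_origin("matrix.org", "example.org"))
print(metrics_origin("@alice:example.org", "example.org"))
```

Whether an invalid sender should be counted under a placeholder or skipped entirely is a design choice; the point is only that updating metrics need not fail the join.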

@babolivier
Contributor

babolivier commented Dec 4, 2018

I have the same issue when joining a different room I was invited to. I'm seeing the following error in Synapse's logs:

2018-12-04 09:56:59,834 - synapse.federation.federation_client - 512 - WARNING - POST-1303- Failed to send_join via matrix.org
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/synapse/federation/federation_client.py", line 494, in _try_destination_list
    res = yield callback(destination)
SynapseError: 400: Invalid ID: 'matrix.org'

People that are in the room see me join it, but on my side the join fails (Internal server error). This exact scenario has happened twice on two different homeservers, for the same room.

@localguru
Contributor

Hi, I get the same error, "Failed to send_join via any server", from my homeserver. The strange thing is that I see my user has joined #neb:matrix.org when I look into the channel with my matrix.org user, the same as @babolivier reported.

@clokep
Member

clokep commented Nov 18, 2020

We should probably do something like UserID.is_valid(event.sender) before trying to call get_domain_from_id on it.
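
A guard of that kind might look roughly like the following. The looks_like_user_id helper is a hypothetical, deliberately loose stand-in written for this example; it is not the UserID.is_valid implementation, and the event dict is trimmed to the fields that matter here.

```python
def looks_like_user_id(s):
    # Loose check (an assumption, not the spec grammar): an "@" sigil,
    # then a ":" separator followed by a non-empty domain.
    if not s.startswith("@"):
        return False
    _localpart, sep, domain = s[1:].partition(":")
    return bool(sep) and bool(domain)


event = {"sender": "matrix.org", "type": "m.room.aliases"}

if looks_like_user_id(event["sender"]):
    origin_entity = event["sender"].split(":", 1)[1]
else:
    # Invalid sender: skip domain extraction rather than raising.
    origin_entity = None

print(origin_entity)
```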

@MadLittleMods MadLittleMods added the T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. label May 16, 2022
@squahtx squahtx added S-Major Major functionality / product severely impaired, no satisfactory workaround. O-Uncommon Most users are unlikely to come across this or unexpected workflow and removed z-bug (Deprecated Label) z-p2 (Deprecated Label) labels Oct 5, 2022