
gossip: connectivity logging is extremely spammy in 128 node cluster #30088

Closed · a-robinson opened this issue Sep 11, 2018 · 13 comments
Labels: A-kv-gossip · C-bug · S-3-ux-surprise
Milestone: 2.1

@a-robinson (Contributor) commented Sep 11, 2018:

I spun up a 128 node cluster for tpc-c 50k testing (it was meant to be / will soon be 120, but my brain accidentally rounded up when creating it), and the gossip connectivity logs are overwhelming. First of all, they're huge, taking up almost my entire screen with just one log line:

I180911 18:58:34.645464 127 gossip/gossip.go:941  [n1] gossip connectivity
  n3 [sentinel];
  n1 -> n24; n1 -> n45; n1 -> n46; n1 -> n74; n1 -> n105; n1 -> n122; n2 -> n1; n2 -> n36; n2 -> n56; n2 -> n90; n2 -> n94; n3 -> n1; n3 -> n83; n4 -> n1; n4 -> n21; n4 -> n87; n4 -> n89; n4 -> n106; n4 -
> n109; n5 -> n2; n5 -> n71; n5 -> n79; n5 -> n120; n5 -> n127; n6 -> n4; n6 -> n14; n6 -> n65; n6 -> n68; n6 -> n98; n6 -> n99; n7 -> n4; n7 -> n57; n7 -> n91; n7 -> n94; n7 -> n115; n8 -> n2; n8 -> n35;
 n8 -> n63; n8 -> n100; n8 -> n103; n8 -> n119; n9 -> n3; n9 -> n87; n9 -> n96; n9 -> n99; n9 -> n115; n10 -> n4; n10 -> n30; n10 -> n60; n10 -> n83; n10 -> n127; n11 -> n2; n11 -> n17; n11 -> n29; n11 -> n30; n11 -> n45; n11 -> n84; n12 -> n3; n12 -> n42; n12 -> n87; n13 -> n8; n13 -> n21; n13 -> n22; n13 -> n25; n13 -> n34; n13 -> n76; n14 -> n11; n14 -> n37; n14 -> n62; n14 -> n78; n14 -> n110; n14 -> n117; n15 -> n3; n15 -> n14; n15 -> n49; n15 -> n59; n15 -> n74; n15 -> n92; n16 -> n3; n16 -> n5; n16 -> n79; n16 -> n112; n16 -> n126; n17 -> n9; n17 -> n25; n17 -> n53; n17 -> n70; n17 -> n120; n18 -> n11; n18 -> n40; n18 -> n64; n18 -> n114; n18 -> n128; n19 -> n6; n19 -> n9; n19 -> n37; n19 -> n40; n19 -> n66; n19 -> n77; n20 -> n15; n20 -> n19; n20 -> n48; n20 -> n61; n20 -> n67; n20 -> n116; n21 -> n12; n21 -> n83; n21 -> n96; n21 -> n114; n22 -> n7; n22 -> n51; n22 -> n55; n22 -> n63; n22 -> n76; n22 -> n109; n23 -> n7; n23 -> n20; n23 -> n26; n23 -> n60; n23 -> n62; n23 -> n113; n24 -> n7; n24 -> n10; n24 -> n15; n24 -> n26; n24 -> n31; n24 -> n59; n25 -> n7; n25 -> n40; n25 -> n44; n25 -> n50; n25 -> n119; n25 -> n120; n26 -> n8; n26 -> n47; n26 -> n55; n26 -> n57; n26 -> n111; n26 -> n121; n27 -> n10; n27 -> n33; n27 -> n38; n27 -> n60; n27 -> n62; n28 -> n4; n28 -> n52; n28 -> n54; n28 -> n93; n28 -> n117; n28 -> n126; n29 -> n1; n29 -> n22; n29 -> n65; n29 -> n72; n29 -> n90; n30 -> n3; n30 -> n54; n30 -> n66; n30 -> n79; n30 -> n81; n30 -> n128; n31 -> n29; n31 -> n46; n31 -> n48; n31 -> n59; n31 -> n75; n31 -> n117; n32 -> n29; n32 -> n43; n32 -> n60; n32 -> n73; n32 -> n113; n33 -> n12; n33 -> n51; n33 -> n82; n33 -> n114; n33 -> n126; n34 -> n10; n34 -> n17; n34 -> n18; n34 -> n24; n34 -> n50; n35 -> n7; n35 -> n27; n35 -> n36; n35 -> n66; n35 -> n76; n36 -> n22; n36 -> n30; n36 -> n71; n36 -> n83; n36 -> n96; n36 -> n109; n37 -> n12; n37 -> n16; n37 -> n46; n37 -> n71; n37 -> n101; n37 -> n118; n38 -> n19; n38 -> n29; n38 -> n64; n38 -> n84; n38 -> n99; n38 -> n108; n39 -> n9; n39 -> n37; n39 -> n59; n39 -> n86; n39 -> n99; n39 -> n105; n40 -> n16; n40 -> n22; n40 -> n98; n40 -> n104; n40 -> n119; n40 -> n121; n41 -> n18; n41 -> n29; n41 -> n35; n41 -> n54; n41 -> n94; n41 -> n120; n42 -> n2; n42 -> n20; n42 -> n26; n42 -> n31; n42 -> n94; n42 -> n97; n43 -> n8; n43 -> n18; n43 -> n63; n43 -> n67; n43 -> n117; n44 -> n18; n44 -> n38; n44 -> n39; n44 -> n82; n44 -> n95; n44 -> n113; n45 -> n14; n45 -> n25; n45 -> n28; n45 -> n53; n45 -> n124; n46 -> n27; n46 -> n32; n46 -> n56; n46 -> n65; n46 -> n66; n46 -> n75; n47 -> n15; n47 -> n32; n47 -> n35; n47 -> n41; n47 -> n44; n48 -> n16; n48 -> n25; n48 -> n73; n48 -> n98; n48 -> n121; n48 -> n123; n49 -> n6; n49 -> n13; n49 -> n34; n49 -> n35; n49 -> n37; n49 -> n94; n50 -> n42; n50 -> n55; n50 -> n62; n50 -> n65; n50 -> n73; n50 -> n125; n51 -> n11; n51 -> n20; n51 -> n32; n51 -> n54; n51 -> n57; n52 -> n8; n52 -> n33; n52 -> n76; n52 -> n80; n52 -> n95; n52 -> n102; n53 -> n10; n53 -> n23; n53 -> n41; n53 -> n58; n53 -> n67; n53 -> n105; n54 -> n16; n54 -> n23; n54 -> n47; n54 -> n77; n54 -> n104; n54 -> n123; n55 -> n15; n55 -> n56; n55 -> n101; n55 -> n110; n55 -> n128; n56 -> n21; n56 -> n24; n56 -> n28; n56 -> n60; n56 -> n100; n57 -> n12; n57 -> n28; n57 -> n30; n57 -> n39; n57 -> n49; n58 -> n9; n58 -> n28; n58 -> n38; n58 -> n56; n58 -> n62; n58 -> n94; n59 -> n27; n59 -> n41; n59 -> n44; n59 -> n91; n59 -> n119; n59 -> n120; n60 -> n15; n60 -> n81; n60 -> n96; n60 -> n107; n60 -> n115; n61 -> n2; n61 -> n10; n61 -> 
n17; n61 -> n41; n61 -> n64; n61 -> n101; n62 -> n30; n62 -> n43; n62 -> n44; n62 -> n88; n63 -> n16; n63 -> n26; n63 -> n47; n63 -> n77; n63 -> n121; n63 -> n123; n64 -> n20; n64 -> n45; n64 -> n77; n64 -> n98; n64 -> n104; n64 -> n106; n65 -> n18; n65 -> n28; n65 -> n32; n65 -> n48; n65 -> n59; n65 -> n84; n66 -> n6; n66 -> n25; n66 -> n57; n66 -> n90; n66 -> n95; n66 -> n119; n67 -> n4; n67 -> n24; n67 -> n49; n67 -> n56; n67 -> n93; n67 -> n105; n68 -> n3; n68 -> n103; n68 -> n109; n68 -> n122; n69 -> n1; n69 -> n34; n69 -> n44; n69 -> n97; n69 -> n109; n69 -> n127; n70 -> n29; n70 -> n39; n70 -> n55; n70 -> n82; n70 -> n111; n71 -> n45; n71 -> n65; n71 -> n69; n71 -> n81; n71 -> n83; n71 -> n106; n72 -> n50; n72 -> n68; n72 -> n87; n72 -> n93; n72 -> n102; n73 -> n16; n73 -> n57; n73 -> n77; n73 -> n98; n73 -> n104; n73 -> n128; n74 -> n2; n74 -> n17; n74 -> n28; n74 -> n52; n74 -> n73; n74 -> n104; n75 -> n41; n75 -> n58; n75 -> n80; n75 -> n95; n75 -> n101; n76 -> n9; n76 -> n20; n76 -> n50; n76 -> n65; n76 -> n74; n77 -> n5; n77 -> n44; n77 -> n51; n77 -> n70; n77 -> n84; n77 -> n118; n78 -> n8; n78 -> n24; n78 -> n39; n78 -> n73; n78 -> n95; n78 -> n127; n79 -> n13; n79 -> n34; n79 -> n35; n79 -> n37; n79 -> n90; n79 -> n113; n80 -> n15; n80 -> n19; n80 -> n67; n80 -> n112; n80 -> n128; n81 -> n7; n81 -> n18; n81 -> n53; n81 -> n80; n81 -> n96; n81 -> n124; n82 -> n68; n82 -> n84; n82 -> n86; n82 -> n105; n82 -> n107; n83 -> n8; n83 -> n33; n83 -> n47; n83 -> n58; n83 -> n70; n84 -> n19; n84 -> n39; n84 -> n52; n84 -> n78; n84 -> n92; n84 -> n100; n85 -> n38; n85 -> n42; n85 -> n69; n85 -> n79; n85 -> n102; n86 -> n11; n86 -> n63; n86 -> n72; n86 -> n76; n86 -> n116; n86 -> n117; n87 -> n31; n87 -> n64; n87 -> n73; n87 -> n95; n87 -> n110; n88 -> n25; n88 -> n34; n88 -> n46; n88 -> n69; n88 -> n74; n88 -> n110; n89 -> n13; n89 -> n42; n89 -> n61; n89 -> n69; n89 -> n81; n89 -> n92; n90 -> n9; n90 -> n19; n90 -> n23; n90 -> n31; n90 -> n36; n90 -> n107; n91 -> n68; n91 -> n86; n91 -> n93; n91 -> n108; n91 -> n112; n92 -> n23; n92 -> n27; n92 -> n69; n92 -> n78; n92 -> n105; n93 -> n61; n93 -> n63; n93 -> n88; n93 -> n89; n93 -> n92; n94 -> n32; n94 -> n38; n94 -> n54; n94 -> n55; n94 -> n66; n94 -> n117; n95 -> n6; n95 -> n30; n95 -> n34; n95 -> n97; n95 -> n107; n96 -> n27; n96 -> n40; n96 -> n43; n96 -> n70; n96 -> n79; n96 -> n92; n97 -> n20; n97 -> n32; n97 -> n36; n97 -> n57; n97 -> n61; n98 -> n5; n98 -> n23; n98 -> n53; n98 -> n75; n98 -> n112; n99 -> n31; n99 -> n51; n99 -> n82; n99 -> n91; n99 -> n110; n99 -> n111; n100 -> n50; n100 -> n61; n100 -> n68; n100 -> n108; n100 -> n115; n101 -> n6; n101 -> n22; n101 -> n92; n101 -> n107; n101 -> n112; n102 -> n38; n102 -> n48; n102 -> n55; n102 -> n76; n102 -> n81; n103 -> n85; n103 -> n89; n103 -> n101; n103 -> n111; n103 -> n115; n104 -> n5; n104 -> n13; n104 -> n53; n104 -> n78; n104 -> n116; n105 -> n14; n105 -> n17; n105 -> n35; n105 -> n52; n105 -> n81; n105 -> n88; n106 -> n33; n106 -> n54; n106 -> n62; n106 -> n70; n106 -> n74; n106 -> n82; n107 -> n21; n107 -> n22; n107 -> n24; n107 -> n31; n107 -> n47; n107 -> n51; n108 -> n14; n108 -> n63; n108 -> n71; n108 -> n85; n108 -> n97; n109 -> n40; n109 -> n58; n109 -> n70; n109 -> n85; n109 -> n108; n109 -> n120; n110 -> n10; n110 -> n19; n110 -> n37; n110 -> n40; n110 -> n43; n110 -> n72; n111 -> n11; n111 -> n27; n111 -> n43; n111 -> n64; n111 -> n90; n112 -> n78; n112 -> n79; n112 -> n88; n112 -> n111; n112 -> n113; n112 -> n126; n113 -> n33; n113 -> n43; n113 -> n64; 
n113 -> n67; n113 -> n86; n113 -> n100; n114 -> n60; n114 -> n77; n114 -> n98; n114 -> n104; n114 -> n123; n115 -> n89; n115 -> n91; n115 -> n97; n115 -> n106; n115 -> n116; n115 -> n123; n116 -> n26; n116 -> n49; n116 -> n85; n116 -> n114; n116 -> n127; n117 -> n53; n117 -> n68; n117 -> n71; n117 -> n96; n117 -> n116; n117 -> n127; n118 -> n13; n118 -> n36; n118 -> n49; n118 -> n91; n118 -> n124; n119 -> n12; n119 -> n42; n119 -> n106; n119 -> n110; n119 -> n114; n120 -> n48; n120 -> n67; n120 -> n71; n120 -> n86; n120 -> n100; n120 -> n118; n121 -> n5; n121 -> n51; n121 -> n84; n121 -> n89; n121 -> n109; n122 -> n42; n122 -> n75; n122 -> n78; n122 -> n88; n122 -> n89; n123 -> n5; n123 -> n17; n123 -> n56; n123 -> n66; n123 -> n101; n123 -> n119; n124 -> n12; n124 -> n48; n124 -> n52; n124 -> n97; n124 -> n103; n124 -> n107; n125 -> n58; n125 -> n69; n125 -> n85; n125 -> n99; n125 -> n112; n125 -> n115; n126 -> n1; n126 -> n50; n126 -> n74; n126 -> n93; n126 -> n111; n127 -> n52; n127 -> n59; n127 -> n88; n127 -> n106; n127 -> n113; n127 -> n126; n128 -> n4; n128 -> n6; n128 -> n13; n128 -> n47; n128 -> n61; n128 -> n100;

Secondly, they're being logged a few times per second, making it impossible to read anything else out of the logs.

The second point is probably indicative of some amount of instability in the gossip network that should be investigated in its own right. However, even if it were only being logged once a minute as part of the gossip status, this would be quite annoying. The entire "gossip status" message literally doesn't fit on the screen of my 15" MacBook:

[screenshot: the full "gossip status" output, 2018-09-11 2:15 PM]

Also, wow that's a lot of data being transmitted over gossip (the server indicates 25GB sent / 12GB received). I suspect a lot of it is the connectivity map being resent in its entirety over and over.

It'd be reasonable to say we should first understand/fix the fact that there are frequent changes happening in the gossip network. I'm still worried about how things will look after that, though.

@petermattis

@a-robinson added this to the 2.1 milestone Sep 11, 2018
@petermattis (Collaborator) commented:

I think we can remove the every minute logging of gossip connectivity. The other logging only occurs when gossip connectivity changes. It shouldn't be changing frequently and we should figure out why it is.

@a-robinson (Contributor, Author) commented:

Yeah, I'll check out what's going on with gossip after the tpc-c runs are done, or more likely I'll just spin up a cluster using smaller nodes later on to focus on the gossip in a more cost effective way.

However, there will always be a lot of logging triggered every time one or more nodes get restarted as the gossip network adjusts to the change(s) and stabilizes, which will certainly add up during an event like a rolling upgrade.

@petermattis (Collaborator) commented:

On a 3 node cluster we gossip at ~2KB/sec. On a 9 node cluster we gossip at ~12KB/sec. On a 27 node cluster we gossip at ~55KB/sec. None of these numbers scale up to 37GB gossiped over 2 hours. I think there is something to investigate there.
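(As a rough extrapolation from those idle-cluster rates: even at the 27 node figure, 55 KB/sec × 7,200 sec ≈ 0.4 GB over 2 hours, roughly two orders of magnitude short of 37 GB.)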

@petermattis (Collaborator) commented:

> I think there is something to investigate there.

We both gossip the cluster-id as well as include it in every gossip.Request. That seems excessive, though I doubt it has a significant impact.

@petermattis (Collaborator) commented:

On a 3 node cluster, over the course of 1 minute we send/receive the following keys via gossip:

                 key    count    bytes
    gossip-clients:1      138     6348
          cluster-id       57     3363
          cluster-id       53     3339
         first-range       35     2870
            sentinel       38     2394
    gossip-clients:3       45     2115
             store:1        6     2100
    gossip-clients:2       35     1680
          liveness:2       15      975
          liveness:3       13      845
          liveness:1       11      715
             store:3        2      708
             store:2        2      708

The second column is the number of times the key was sent/received. The third column is the size of the gossiped value times the number of times it was gossiped (providing an indication of what fraction of the gossiped bytes it is consuming).

Note that the sizes above are only measuring gossip.Info. In my testing, this only accounts for ~40% of the gossip traffic. The other 60% is gossip.{Request,Response} overhead. I suspect most of the extra data is in gossip.HighWaterStamps, though I haven't confirmed that.
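
For illustration, a minimal Go sketch of the kind of per-key tally shown above (hypothetical helper, not the actual instrumentation; the keys and sizes in main are placeholders):

```go
// Hypothetical per-key accumulator: count how often each gossip key is
// sent/received in a window and how many bytes of gossiped values it
// accounts for, then print the biggest consumers first.
package main

import (
	"fmt"
	"sort"
)

type keyStats struct {
	count int
	bytes int
}

type tally map[string]*keyStats

// record would be called once per gossip info sent or received, with the
// key and the encoded size of its value.
func (t tally) record(key string, size int) {
	s, ok := t[key]
	if !ok {
		s = &keyStats{}
		t[key] = s
	}
	s.count++
	s.bytes += size
}

func (t tally) print() {
	keys := make([]string, 0, len(t))
	for k := range t {
		keys = append(keys, k)
	}
	// Sort by total bytes, descending, to surface the largest consumers.
	sort.Slice(keys, func(i, j int) bool { return t[keys[i]].bytes > t[keys[j]].bytes })
	for _, k := range keys {
		fmt.Printf("%20s %8d %8d\n", k, t[k].count, t[k].bytes)
	}
}

func main() {
	t := tally{}
	// Placeholder samples only.
	t.record("gossip-clients:1", 46)
	t.record("cluster-id", 59)
	t.record("sentinel", 63)
	t.print()
}
```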

@petermattis (Collaborator) commented:

We need to gossip the sentinel frequently, but there is no need to be gossiping the cluster-id and first-range so fast.

@petermattis (Collaborator) commented:

The numbers I posted above were the sent/received gossip infos from a single node. The mathematically inclined will notice that we sent/received 138 gossip-clients:1 keys. I looked into this a bit more and it turns out we sent 138 keys during that minute time frame. But something is off here because we are only supposed to send that key every 2 seconds, so we should have gossiped it 60 times. A bit more instrumentation shows that every time AddInfo is called for that key, we send it to each connected node twice and sometimes three times. I'm not sure why. Definitely a bug.

@petermattis (Collaborator) commented:

I'm pretty sure the problem is due to how we update highWaterStamps. For server connections, we only update highWaterStamps upon receipt of a new request from the remote node. But that means that if we have a new info to send, we'll keep sending it until the remote node sends another request. I'm pretty sure the same problem occurs for the client-side of the connection. Rather than waiting for an update from the remote node, I think we need to update our view of the remote's highWaterStamps as soon as we send them something.
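
For illustration, a minimal Go sketch of the fix being described (simplified types and names, not the actual gossip code):

```go
// Hypothetical sketch of per-connection high water stamp ratcheting.
package gossipsketch

// Info is a gossiped value originating from some node at some timestamp.
type Info struct {
	NodeID    int
	Timestamp int64
	Key       string
}

// conn tracks, per remote connection, the highest timestamp per origin node
// that we believe the remote has already seen.
type conn struct {
	highWater map[int]int64 // origin node ID -> highest stamp sent or acked
}

func newConn() *conn {
	return &conn{highWater: map[int]int64{}}
}

// delta returns the infos the remote has not yet seen and ratchets the
// per-connection stamps as if they had been sent. Ratcheting at send time,
// rather than waiting for the remote's next reply, closes the window during
// which the same info would be re-sent on every local update.
func (c *conn) delta(local []Info) []Info {
	var out []Info
	for _, info := range local {
		if info.Timestamp > c.highWater[info.NodeID] {
			out = append(out, info)
			c.highWater[info.NodeID] = info.Timestamp
		}
	}
	return out
}

// onReply merges the stamps the remote reports back, only ever moving them
// forward (they can also advance because of infos the remote learned from
// other connections).
func (c *conn) onReply(remoteStamps map[int]int64) {
	for nodeID, ts := range remoteStamps {
		if ts > c.highWater[nodeID] {
			c.highWater[nodeID] = ts
		}
	}
}
```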

petermattis added a commit to petermattis/cockroach that referenced this issue Sep 12, 2018
The previous handling of high water stamps left a gap during which an
info could be resent over a connection multiple times. The general
gossip send loop is to wait for a new info to be added to the local info
store, to compute a delta between the local info store and the remote
info store using the remote's high water stamps and to update the
remote's high water stamps when the remote sends a reply on the
connection. The problem here is that updating the remote's high water
stamps when the remote sends a reply leaves a window where an info added
to the local store would be sent multiple times (every time an
additional info was added) until the remote replied. The fix is to keep
track of a per connection high water stamp map and ratchet the contained
stamps both when we receive a reply from the remote and when we send the
remote an info. The result is a 30-40% reduction in gossip
traffic (bytes sent/received) as measured on idle 3, 9 and 27 node
clusters.

See cockroachdb#30088

Release note: None
petermattis added a commit to petermattis/cockroach that referenced this issue Sep 12, 2018

@petermattis (Collaborator) commented:

See #30126. I think that is an explanation for the gigabytes of gossip traffic.

craig bot pushed a commit that referenced this issue Sep 13, 2018
30106: gossip: remove unnecessary gossip traffic r=spencerkimball,a-robinson,bdarnell a=petermattis

The previous handling of high water stamps left a gap during which an
info could be resent over a connection multiple times. The general
gossip send loop is to wait for a new info to be added to the local info
store, to compute a delta between the local info store and the remote
info store using the remote's high water stamps and to update the
remote's high water stamps when the remote sends a reply on the
connection. The problem here is that updating the remote's high water
stamps when the remote sends a reply leaves a window where an info added
to the local store would be sent multiple times (every time an
additional info was added) until the remote replied. The fix is to keep
track of a per connection high water stamp map and ratchet the contained
stamps both when we receive a reply from the remote and when we send the
remote an info. The result is a 30-40% reduction in gossip
traffic (bytes sent/received) as measured on idle 3, 9 and 27 node
clusters.

See #30088

Release note: None

Co-authored-by: Peter Mattis <[email protected]>
petermattis added a commit to petermattis/cockroach that referenced this issue Sep 13, 2018
Make gossip.Request.{Addr,ClusterID} nullable and only send them on the
first request over a connection. Remove gossip.Response.Addr which was
unused.

Only gossip the cluster ID once per server. The cluster ID never
expires, so there was no need to be re-gossiping it frequently.

Keep track of which high water stamps have been sent to the remote side
of a connection and only send stamps that are newer than those already
sent. This has a dramatic impact on gossip traffic as a cluster gets
larger as usually only 1 or 2 stamps have changed on each gossip message
which results in reducing an O(n) map size to a constant.

In a local 27-node cluster, these changes reduce gossip traffic by
~60%.

Fixes cockroachdb#30126
See cockroachdb#30088

Release note: None
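
For illustration, a sketch of the stamp-delta idea from that commit message (names are illustrative, not the actual implementation): rather than shipping the full node-to-stamp map on every message, track what has already been sent on this connection and include only the entries that moved.

```go
// Hypothetical sketch: send only the high water stamps that changed since the
// last message on this connection, instead of the full O(n) map every time.
package gossipsketch

type stampConn struct {
	sentHighWater map[int]int64 // stamps already sent to this remote
}

func newStampConn() *stampConn {
	return &stampConn{sentHighWater: map[int]int64{}}
}

// stampDelta returns only the entries of cur that are newer than what has
// already been sent, and records them as sent. On a quiescent cluster this is
// typically one or two entries per message rather than one per node.
func (c *stampConn) stampDelta(cur map[int]int64) map[int]int64 {
	delta := map[int]int64{}
	for nodeID, ts := range cur {
		if ts > c.sentHighWater[nodeID] {
			delta[nodeID] = ts
			c.sentHighWater[nodeID] = ts
		}
	}
	return delta
}
```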
justinj pushed a commit to jordanlewis/cockroach that referenced this issue Sep 14, 2018

petermattis added a commit to petermattis/cockroach that referenced this issue Sep 14, 2018

petermattis added a commit to petermattis/cockroach that referenced this issue Sep 14, 2018

a-robinson added a commit to a-robinson/cockroach that referenced this issue Sep 16, 2018
As discussed on cockroachdb#30088. It can be absolutely massive on large clusters,
and isn't needed because we still log the connectivity info whenever it
changes.

Release note: None
craig bot pushed a commit that referenced this issue Sep 17, 2018
30035: builder: upgrade to go1.11 r=bdarnell,raduberinde,nvanbenschoten a=benesch

Go 1.11 brings us some nice performance improvements on single-node
clusters. See #29827 for benchmark results.

Fix #29827.

Release note: None

30285: gossip: Don't include gossip connectivity in periodic status logs r=a-robinson a=a-robinson

As discussed on #30088. It can be absolutely massive on large clusters,
and isn't needed because we still log the connectivity info whenever it
changes.

Release note: None

I still think that logging on every change in connectivity is going to be a mess in large clusters during rolling restarts even if gossip is otherwise perfectly stable, but this is useful regardless of whether we do anything about that.

Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Alex Robinson <[email protected]>
@knz changed the title from "gossip connectivity logging is extremely spammy in 128 node cluster" to "gossip: connectivity logging is extremely spammy in 128 node cluster" Sep 17, 2018
@knz added the C-bug and A-kv-gossip labels Sep 17, 2018
a-robinson added a commit to a-robinson/cockroach that referenced this issue Sep 17, 2018

@nstewart added the S-3-ux-surprise label Sep 18, 2018
@petermattis (Collaborator) commented:

@a-robinson Do you think there is anything else to address here for 2.1? My understanding is that gossip stabilized on your 100+ node cluster and this spam just affected startup of the cluster.

@a-robinson (Contributor, Author) commented:

Unfortunately, that doesn't seem to be the case. I saved the latest log file from the cluster before taking it down, and all the log files have a lot of the connectivity logs even though no nodes were being added/removed around that time. Two nodes were down due to the bad_alloc issue, but they had been down for well over an hour before the time period these logs cover. Let me know if you'd like to see them for yourself.

I imagine the cull logic in gossip that closes the least useful connection is causing a lot of this, and at a high level I'm not sure that connectivity logging on changes can happily coexist with the culling of an outgoing connection from every node every minute, or at least not on large clusters.

Also, orthogonally from the thrashing, the logging during startup is excessive. We log every time we get new connectivity info that isn't identical to the last thing we logged, but during startup a node can get gossip info piece by piece, leading to log output like the below that keeps going on and on until all the nodes' info is received:

I180918 20:02:51.829369 55 gossip/client.go:129  [n?] started gossip client to 35.229.39.61:26257
I180918 20:02:51.836827 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
I180918 20:02:51.837244 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
  n32 -> n95; n32 -> n96; n32 -> n122; n32 -> n132; n32 -> n135;
I180918 20:02:51.837609 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
  n32 -> n95; n32 -> n96; n32 -> n122; n32 -> n132; n32 -> n135; n66 -> n27; n66 -> n55; n66 -> n56; n66 -> n88; n66 -> n102; n81 -> n9; n81 -> n12; n81 -> n66; n81 -> n76; n81 -> n87;
I180918 20:02:51.837920 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
  n32 -> n95; n32 -> n96; n32 -> n122; n32 -> n132; n32 -> n135; n37 -> n8; n37 -> n64; n37 -> n65; n37 -> n97; n37 -> n101; n52 -> n5; n52 -> n19; n52 -> n68; n52 -> n79; n52 -> n106; n55 -> n12; n55 -> n58; n55 -> n68; n55 -> n75; n55 -> n96; n66 -> n27; n66 -> n55; n66 -> n56; n66 -> n88; n66 -> n102; n71 -> n43; n71 -> n75; n71 -> n87; n71 -> n88; n71 -> n128; n81 -> n9; n81 -> n12; n81 -> n66; n81 -> n76; n81 -> n87; n92 -> n10; n92 -> n26; n92 -> n75; n92 -> n94; n92 -> n132; n106 -> n2; n106 -> n10; n106 -> n61; n106 -> n70; n106 -> n93;
I180918 20:02:51.838131 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
  n32 -> n95; n32 -> n96; n32 -> n122; n32 -> n132; n32 -> n135; n37 -> n8; n37 -> n64; n37 -> n65; n37 -> n97; n37 -> n101; n52 -> n5; n52 -> n19; n52 -> n68; n52 -> n79; n52 -> n106; n55 -> n12; n55 -> n58; n55 -> n68; n55 -> n75; n55 -> n96; n66 -> n27; n66 -> n55; n66 -> n56; n66 -> n88; n66 -> n102; n71 -> n43; n71 -> n75; n71 -> n87; n71 -> n88; n71 -> n128; n81 -> n9; n81 -> n12; n81 -> n66; n81 -> n76; n81 -> n87; n92 -> n10; n92 -> n26; n92 -> n75; n92 -> n94; n92 -> n132; n104 -> n9; n104 -> n98; n104 -> n110; n104 -> n115; n104 -> n128; n106 -> n2; n106 -> n10; n106 -> n61; n106 -> n70; n106 -> n93;
I180918 20:02:51.838544 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
  n7 -> n37; n7 -> n40; n7 -> n42; n7 -> n71; n7 -> n78; n26 -> n29; n26 -> n58; n26 -> n118; n26 -> n120; n26 -> n127; n32 -> n95; n32 -> n96; n32 -> n122; n32 -> n132; n32 -> n135; n37 -> n8; n37 -> n64; n37 -> n65; n37 -> n97; n37 -> n101; n47 -> n11; n47 -> n13; n47 -> n23; n47 -> n76; n47 -> n78; n52 -> n5; n52 -> n19; n52 -> n68; n52 -> n79; n52 -> n106; n55 -> n12; n55 -> n58; n55 -> n68; n55 -> n75; n55 -> n96; n66 -> n27; n66 -> n55; n66 -> n56; n66 -> n88; n66 -> n102; n71 -> n43; n71 -> n75; n71 -> n87; n71 -> n88; n71 -> n128; n81 -> n9; n81 -> n12; n81 -> n66; n81 -> n76; n81 -> n87; n92 -> n10; n92 -> n26; n92 -> n75; n92 -> n94; n92 -> n132; n104 -> n9; n104 -> n98; n104 -> n110; n104 -> n115; n104 -> n128; n106 -> n2; n106 -> n10; n106 -> n61; n106 -> n70; n106 -> n93; n116 -> n12; n116 -> n18; n116 -> n69; n116 -> n73; n116 -> n76;
I180918 20:02:51.838867 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
  n7 -> n37; n7 -> n40; n7 -> n42; n7 -> n71; n7 -> n78; n26 -> n29; n26 -> n58; n26 -> n118; n26 -> n120; n26 -> n127; n30 -> n4; n30 -> n31; n30 -> n34; n30 -> n63; n30 -> n79; n32 -> n95; n32 -> n96; n32 -> n122; n32 -> n132; n32 -> n135; n37 -> n8; n37 -> n64; n37 -> n65; n37 -> n97; n37 -> n101; n45 -> n37; n45 -> n46; n45 -> n58; n45 -> n77; n45 -> n85; n47 -> n11; n47 -> n13; n47 -> n23; n47 -> n76; n47 -> n78; n52 -> n5; n52 -> n19; n52 -> n68; n52 -> n79; n52 -> n106; n55 -> n12; n55 -> n58; n55 -> n68; n55 -> n75; n55 -> n96; n66 -> n27; n66 -> n55; n66 -> n56; n66 -> n88; n66 -> n102; n71 -> n43; n71 -> n75; n71 -> n87; n71 -> n88; n71 -> n128; n72 -> n17; n72 -> n18; n72 -> n77; n72 -> n90; n72 -> n130; n81 -> n9; n81 -> n12; n81 -> n66; n81 -> n76; n81 -> n87; n92 -> n10; n92 -> n26; n92 -> n75; n92 -> n94; n92 -> n132; n94 -> n42; n94 -> n43; n94 -> n115; n94 -> n129; n94 -> n134; n104 -> n9; n104 -> n98; n104 -> n110; n104 -> n115; n104 -> n128; n106 -> n2; n106 -> n10; n106 -> n61; n106 -> n70; n106 -> n93; n116 -> n12; n116 -> n18; n116 -> n69; n116 -> n73; n116 -> n76; n130 -> n4; n130 -> n11; n130 -> n19; n130 -> n20; n130 -> n65;
I180918 20:02:51.839272 168 gossip/gossip.go:942  [n?] gossip connectivity
  n47 [sentinel];
  n7 -> n37; n7 -> n40; n7 -> n42; n7 -> n71; n7 -> n78; n26 -> n29; n26 -> n58; n26 -> n118; n26 -> n120; n26 -> n127; n30 -> n4; n30 -> n31; n30 -> n34; n30 -> n63; n30 -> n79; n32 -> n95; n32 -> n96; n32 -> n122; n32 -> n132; n32 -> n135; n37 -> n8; n37 -> n64; n37 -> n65; n37 -> n97; n37 -> n101; n45 -> n37; n45 -> n46; n45 -> n58; n45 -> n77; n45 -> n85; n47 -> n11; n47 -> n13; n47 -> n23; n47 -> n76; n47 -> n78; n52 -> n5; n52 -> n19; n52 -> n68; n52 -> n79; n52 -> n106; n55 -> n12; n55 -> n58; n55 -> n68; n55 -> n75; n55 -> n96; n63 -> n17; n63 -> n36; n63 -> n39; n63 -> n101; n63 -> n119; n66 -> n27; n66 -> n55; n66 -> n56; n66 -> n88; n66 -> n102; n71 -> n43; n71 -> n75; n71 -> n87; n71 -> n88; n71 -> n128; n72 -> n17; n72 -> n18; n72 -> n77; n72 -> n90; n72 -> n130; n81 -> n9; n81 -> n12; n81 -> n66; n81 -> n76; n81 -> n87; n92 -> n10; n92 -> n26; n92 -> n75; n92 -> n94; n92 -> n132; n94 -> n42; n94 -> n43; n94 -> n115; n94 -> n129; n94 -> n134; n104 -> n9; n104 -> n98; n104 -> n110; n104 -> n115; n104 -> n128; n106 -> n2; n106 -> n10; n106 -> n61; n106 -> n70; n106 -> n93; n116 -> n12; n116 -> n18; n116 -> n69; n116 -> n73; n116 -> n76; n130 -> n4; n130 -> n11; n130 -> n19; n130 -> n20; n130 -> n65;
...

@a-robinson (Contributor, Author) commented Sep 27, 2018:

I've been tossing around the idea of rate limiting this logging to happen no more than once a minute, which would alleviate most of the pain. Would that still accomplish most of the original goal? I realize we've discovered some gossip problems lately, but I'm not really convinced that this information belongs in non-verbose logs.

@petermattis (Collaborator) commented:

It's true that the gossip connectivity logs might be fighting yesterday's war. Perhaps we should re-add them to the once-per-minute logging, but only output the message if it has changed since the last log line.
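
For illustration, a generic sketch of the rate-limit-plus-dedupe approach being discussed (plain Go, not the CockroachDB logging API; the interval and string comparison are illustrative):

```go
// Hypothetical sketch: log gossip connectivity at most once per interval, and
// only when the rendered connectivity string differs from what was last logged.
package gossipsketch

import (
	"log"
	"time"
)

type connectivityLogger struct {
	interval   time.Duration
	lastLogged time.Time
	lastOutput string
}

func newConnectivityLogger(interval time.Duration) *connectivityLogger {
	return &connectivityLogger{interval: interval}
}

// maybeLog emits the connectivity string only if it changed since the last
// emission and at least interval has elapsed, bounding log volume even when
// the gossip network is churning (e.g. during a rolling restart).
func (cl *connectivityLogger) maybeLog(now time.Time, connectivity string) {
	if connectivity == cl.lastOutput || now.Sub(cl.lastLogged) < cl.interval {
		return
	}
	cl.lastLogged = now
	cl.lastOutput = connectivity
	log.Printf("gossip connectivity\n%s", connectivity)
}
```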

a-robinson added a commit to a-robinson/cockroach that referenced this issue Sep 27, 2018
Fixes cockroachdb#30088

This will still probably result in logging it once a minute on large
clusters due to the inherent instability caused by connection culling,
but once a minute is a lot more bearable than the current state of
things.

Release note: None
craig bot pushed a commit that referenced this issue Sep 27, 2018
30730: gossip: Log connectivity no more than once a minute r=a-robinson a=a-robinson

Fixes #30088

This will still probably result in logging it once a minute on large
clusters due to the inherent instability caused by connection culling,
but once a minute is a lot more bearable than the current state of
things.

Release note: None

Co-authored-by: Alex Robinson <[email protected]>
@craig (bot) closed this as completed in #30730 Sep 27, 2018
a-robinson added a commit to a-robinson/cockroach that referenced this issue Sep 29, 2018

petermattis added a commit to petermattis/cockroach that referenced this issue Nov 15, 2018