Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize diff() for bigger data sets #5013

Closed
Reinmar opened this issue Jan 11, 2019 · 9 comments · Fixed by ckeditor/ckeditor5-utils#274
Closed

Optimize diff() for bigger data sets #5013

Reinmar opened this issue Jan 11, 2019 · 9 comments · Fixed by ckeditor/ckeditor5-utils#274
Assignees
Labels
package:utils status:discussion type:improvement This issue reports a possible enhancement of an existing feature.
Milestone

Comments

@Reinmar
Copy link
Member

Reinmar commented Jan 11, 2019

When researching engine's performance issues I found out that calling the diff() function with a and b of length 1000 (use data set from #1398) will take diff() 10s to finish. That's a terrible result.

There's no way to significantly speed up the diff() function itself. It's a highly optimal minimum diff function. Its implementation is based on a popular paper and I don't think it's reasonable to look deeper into this. Especially that we need to improve it at least 100x.

What we could do, though, is resorting to a dumb but fast function when a and b are longer than some threshold (e.g. 100 items). We already have fastDiff() but it accepts a bit different params than diff() at the moment. If we'd align fastDiff() to diff() we could make a simple change in the diff() function:

if ( a.length > 100 && b.length > 100 ) {
    return fastDiff( a, b, comp );
}

RFC.

@f1ames
Copy link
Contributor

f1ames commented Jan 11, 2019

There's no way to significantly speed up the diff() function itself. It's a highly optimal minimum diff function. Its implementation is based on a popular paper and I don't think it's reasonable to look deeper into this. Especially that we need to improve it at least 100x.

What about the fast-diff implementation I have mentioned in https://github.com/ckeditor/ckeditor5-engine/issues/403#issuecomment-375282791? It is significantly faster so maybe we can switch from diff to this one. And then for longer strings which still takes too long use our simplified fastDiff() (btw. I think it should be called partialDiff(), it's faster because it only compares part of the strings).

From what I have posted in https://github.com/ckeditor/ckeditor5-engine/issues/403#issuecomment-375282791 it seems fast-diff is like 30x to 100x times faster than our current diff() (the longer the modified parts the greater the difference). However, it will still need some performance checks.

@f1ames
Copy link
Contributor

f1ames commented Jan 11, 2019

@pjasiun
Copy link

pjasiun commented Jan 11, 2019

Actually, we could do it even smarter. Diff algorithm if very fast (linear) if arrayA == arrayB and takes longer and longer if arrayA has less and less common parts with arrayB. For instance:

diff( 'aaaaaaaaaa', 'aaaaaaaaaa' ); // takes 10 comparisons
diff( 'aaaaaaaaaa', 'bbbbbbbbbb' ); // takes 100 comparisons
diff( 'aaaa', 'bbbb' ); // takes 16 comparisons

So it is not the matter of the string length, but the length of the different part.

Fortunately, how diff works is:

  • check if strings are the same,
  • check if it can find a match with 1 different character,
  • check if it can find a match with 2 different characters,
  • check if it can find a match with 3 different characters,
  • ...

We can basically stop it at some point and say that if there is more than (for instance) 30 different characters, fast diff should be used.

@Reinmar
Copy link
Member Author

Reinmar commented Jan 14, 2019

We can think about this in the future, but KISS now. We need a patch ASAP.

@f1ames
Copy link
Contributor

f1ames commented Feb 6, 2019

I did some testing when looking for proper threshold when to switch to fastDiff() checking execution time of the whole diff() function and number of internal snake() function calls (which is the main mechanism in diff()). It looks like this:

text1 length text2 length diff() times (ms) avg. diff() time (ms) snake() calls
100 90 11, 7, 5, 3, 4, 12, 6, 7, 4, 5 6.4 3111
499 450 218, 342, 232, 238, 300, 245, 178, 190, 189, 167 229.9 73656
599 570 368, 1368, 1314, 1435, 1437, 1424, 1383, 1353, 1364, 1335 1278.1 111680
699 680 2359, 2370, 2461, 2378, 2323, 2317, 2499, 2332, 2334, 2327 2370 152400
799 777 3536, 4644, 3845, 3982, 4395, 4069, 3737, 3608, 3466, 3639 3892.1 196128
200 1281 1132, 971, 904, 917, 913, 890, 895, 884, 921, 915 934.2 41366

So there is clearly an issue when both compared texts are long, the number of snake() calls seems to increase exponentially when length of both texts increases. I was thinking about having a safe threshold somewhere around 600 characters, but then there could be situations like 500 + 1500 or 400 + 2000 which still will take loooooooong to diff. So I think we should go with something like if smaller length is above 400 the sum above 1200 triggers a switch to fastDiff() (I have to still check those numbers and adjust them, but you get the idea).

One more thing about snake() function - each snake() call uses cmp() function (comparator function passed to diff()) at least once. While I have used === in tests (comparing strings), for DOM nodes/elements we use sameNodes() function is more complex and more costly itself so this should also be taken into account (I will check if it has some significant execution time itself).

@f1ames
Copy link
Contributor

f1ames commented Feb 6, 2019

It seems the sameNodes() takes only a fraction of snake() execution time which seems acceptable:

image

I also found one more interesting thing. It looks like the most of the exec time is spend in renderer._updateChildrenMappings() method which ofc calls diff() internally:

image

Without renderer._updateChildrenMappings() when setting very long content (data from #1398) there is only one diff() call on rather not complex content (estimated by number of snake() calls):

diff - snakes 2209

when I enabled renderer._updateChildrenMappings() and run the same scenario again (setting mentioned content two times in a row):

diff - snakes 2209
diff - snakes 1
diff - snakes 1
diff - snakes 9
diff - snakes 1
diff - snakes 1
diff - snakes 9
diff - snakes 1
diff - snakes 1
diff - snakes 9
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 1
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 1
diff - snakes 1
diff - snakes 144
diff - snakes 1
diff - snakes 1
diff - snakes 4225
diff - snakes 1
diff - snakes 1
diff - snakes 9
diff - snakes 1
diff - snakes 564001
diff - snakes 1
diff - snakes 1
diff - snakes 9
diff - snakes 1
diff - snakes 521284
diff - snakes 1
diff - snakes 1
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 1
diff - snakes 9
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 4
diff - snakes 1
diff - snakes 25
diff - snakes 1
diff - snakes 1
diff - snakes 1
diff - snakes 1
diff - snakes 4
diff - snakes 1

most calls are just comparing single nodes (like p), but there are two significant ones diff - snakes 521284 and diff - snakes 564001 which seems to be comparing the whole content. Not sure if we can do something about it (well, optimizing diff is ofc one of the solutions), but this looks like the main cause of this performance issue.

@Reinmar
Copy link
Member Author

Reinmar commented Feb 11, 2019

Not sure if we can do something about it (well, optimizing diff is ofc one of the solutions), but this looks like the main cause of this performance issue.

Optimising the rendering algorithm is, of course, one of the things we could look into, but I expect that it won't be easy. You know how much time we can spend fixing the renderer – months. That's why I didn't even want to think about it. Especially that renderer is just too complex already. Adding some optimisations there would most likely make it even more so.

@f1ames
Copy link
Contributor

f1ames commented Feb 11, 2019

Optimising the rendering algorithm is, of course, one of the things we could look into, but I expect that it won't be easy. You know how much time we can spend fixing the renderer – months...

That's true and also a reason we have https://github.com/ckeditor/ckeditor5-engine/issues/1456 :) I think going with fastDiff() is a good step for improving performance here and then, maybe one day, a time will come for more complex renderer optimization.

@f1ames
Copy link
Contributor

f1ames commented Feb 11, 2019

So I think we should go with something like: if smaller length is above 400 and the sum above 1200, switch to fastDiff().

I did some profiling and it looks like below:

Chrome

Testing... (t1 length - t2 length - avg time - snake() calls - times)
l1: 300 l2: 700 avg: 254ms snakes(): 66276 (15) [279, 226, 222, 202, 415, 423, 285, 198, 233, 218, 221, 221, 268, 197, 200]
l1: 350 l2: 700 avg: 321ms snakes(): 81600 (15) [269, 307, 323, 333, 304, 297, 297, 401, 402, 301, 289, 313, 452, 264, 266]
l1: 400 l2: 700 avg: 354ms snakes(): 94464 (15) [364, 333, 474, 321, 369, 356, 349, 360, 351, 341, 358, 352, 323, 317, 339]
l1: 450 l2: 700 avg: 460ms snakes(): 109691 (15) [422, 403, 643, 738, 418, 404, 411, 429, 397, 403, 475, 526, 409, 429, 393]
l1: 300 l2: 800 avg: 266ms snakes(): 69996 (15) [252, 293, 239, 240, 284, 254, 285, 367, 238, 242, 277, 240, 254, 280, 241]
l1: 350 l2: 800 avg: 335ms snakes(): 87016 (15) [314, 355, 343, 318, 345, 319, 328, 364, 320, 321, 351, 344, 328, 354, 316]
l1: 400 l2: 800 avg: 464ms snakes(): 104400 (15) [412, 449, 413, 416, 425, 410, 425, 419, 419, 466, 541, 452, 550, 557, 604]
l1: 450 l2: 800 avg: 540ms snakes(): 121475 (15) [559, 566, 538, 553, 528, 504, 527, 534, 536, 548, 510, 517, 602, 544, 527]
l1: 300 l2: 900 avg: 319ms snakes(): 74836 (15) [298, 352, 307, 307, 312, 334, 299, 312, 325, 362, 306, 326, 306, 335, 310]
l1: 350 l2: 900 avg: 438ms snakes(): 94944 (15) [416, 455, 416, 453, 429, 448, 416, 457, 442, 427, 437, 458, 419, 490, 407]
l1: 400 l2: 900 avg: 565ms snakes(): 112224 (15) [579, 530, 590, 553, 552, 556, 516, 612, 581, 626, 565, 533, 590, 548, 539]
l1: 450 l2: 900 avg: 732ms snakes(): 132559 (15) [699, 742, 691, 730, 692, 755, 760, 749, 726, 727, 744, 724, 715, 752, 770]
l1: 300 l2: 1000 avg: 435ms snakes(): 79101 (15) [445, 436, 399, 433, 448, 410, 421, 441, 434, 436, 457, 435, 439, 461, 426]
l1: 350 l2: 1000 avg: 618ms snakes(): 99584 (15) [583, 549, 585, 549, 564, 634, 626, 587, 541, 560, 599, 736, 741, 694, 725]
l1: 400 l2: 1000 avg: 799ms snakes(): 124369 (15) [800, 753, 766, 775, 793, 800, 906, 853, 858, 802, 762, 786, 824, 731, 769]
l1: 450 l2: 1000 avg: 934ms snakes(): 142464 (15) [957, 913, 932, 897, 893, 948, 928, 933, 941, 944, 942, 943, 951, 961, 933]
l1: 300 l2: 1200 avg: 758ms snakes(): 84796 (15) [755, 785, 758, 741, 764, 759, 750, 755, 776, 753, 777, 692, 764, 757, 787]
l1: 350 l2: 1200 avg: 1037ms snakes(): 108819 (15) [1069, 1033, 1136, 1139, 1086, 988, 1020, 1046, 1011, 1054, 1020, 992, 978, 961, 1015]
l1: 400 l2: 1200 avg: 1321ms snakes(): 133764 (15) [1355, 1285, 1327, 1355, 1331, 1313, 1310, 1312, 1332, 1308, 1268, 1336, 1306, 1316, 1355]
l1: 450 l2: 1200 avg: 1707ms snakes(): 162976 (15) [1697, 1736, 1719, 1623, 1662, 1668, 1745, 1706, 1769, 1685, 1714, 1747, 1707, 1720, 1701]
l1: 300 l2: 1500 avg: 1494ms snakes(): 82225 (15) [1285, 1339, 1326, 1384, 1316, 1547, 1786, 1505, 1449, 1785, 1360, 1321, 1356, 1814, 1836]
l1: 350 l2: 1500 avg: 2335ms snakes(): 111600 (15) [1861, 2905, 2154, 1924, 1997, 2433, 2334, 2921, 2279, 1835, 1995, 2603, 2269, 2910, 2608]
l1: 400 l2: 1500 avg: 2690ms snakes(): 150429 (15) [2544, 2545, 3293, 2704, 2636, 2609, 2610, 2448, 2425, 3155, 2811, 2770, 2863, 2523, 2413]
l1: 450 l2: 1500 avg: 2991ms snakes(): 177304 (15) [2960, 3392, 2853, 2934, 3338, 2862, 2791, 2800, 2784, 2816, 3026, 2905, 3180, 3288, 2935]

Firefox

Testing... (t1 length - t2 length - avg time - snake() calls - times)
l1: 300 l2: 700 avg: 633ms snakes(): 66276 Array(15) [ 943, 771, 444, 1007, 344, 894, 396, 843, 544, 459, … ]
l1: 350 l2: 700 avg: 757ms snakes(): 81600 Array(15) [ 867, 1124, 802, 723, 643, 1031, 653, 758, 635, 689, … ]
l1: 400 l2: 700 avg: 1094ms snakes(): 94464 Array(15) [ 1007, 854, 817, 1017, 827, 1030, 1283, 946, 1229, 822, … ]
l1: 450 l2: 700 avg: 1014ms snakes(): 109691 Array(15) [ 1151, 1436, 1318, 1037, 1089, 951, 885, 1154, 1098, 1203, … ]
l1: 300 l2: 800 avg: 673ms snakes(): 69996 Array(15) [ 714, 453, 573, 540, 815, 845, 682, 502, 520, 621, … ]
l1: 350 l2: 800 avg: 834ms snakes(): 87016 Array(15) [ 748, 819, 846, 652, 993, 761, 587, 1541, 752, 804, … ]
l1: 400 l2: 800 avg: 984ms snakes(): 104400 Array(15) [ 919, 1167, 1120, 857, 931, 1120, 860, 911, 1032, 909, … ]
l1: 450 l2: 800 avg: 1170ms snakes(): 121475 Array(15) [ 1466, 1220, 1198, 1016, 1311, 1065, 1134, 1146, 1301, 1145, … ]
l1: 300 l2: 900 avg: 714ms snakes(): 74836 Array(15) [ 652, 792, 500, 626, 713, 901, 673, 548, 887, 579, … ]
l1: 350 l2: 900 avg: 976ms snakes(): 94944 Array(15) [ 1123, 672, 1200, 1073, 938, 907, 894, 1113, 731, 1221, … ]
l1: 400 l2: 900 avg: 1128ms snakes(): 112224 Array(15) [ 1170, 941, 1079, 1247, 1122, 966, 1268, 1107, 1188, 1251, … ]
l1: 450 l2: 900 avg: 1339ms snakes(): 132559 Array(15) [ 1447, 1461, 1320, 1386, 1401, 1182, 1481, 1340, 1137, 1330, … ]
l1: 300 l2: 1000 avg: 785ms snakes(): 79101 Array(15) [ 644, 714, 814, 821, 846, 776, 704, 843, 665, 886, … ]
l1: 350 l2: 1000 avg: 1062ms snakes(): 99584 Array(15) [ 819, 1250, 843, 1162, 1338, 775, 1397, 1081, 842, 1070, … ]
l1: 400 l2: 1000 avg: 1296ms snakes(): 124369 Array(15) [ 1365, 1272, 1218, 1142, 1362, 1309, 1208, 1230, 1229, 1368, … ]
l1: 450 l2: 1000 avg: 1440ms snakes(): 142464 Array(15) [ 1386, 1530, 1505, 1435, 1254, 1495, 1549, 1452, 1409, 1418, … ]
l1: 300 l2: 1200 avg: 1085ms snakes(): 84796 Array(15) [ 864, 1305, 1010, 967, 979, 808, 1196, 1000, 815, 1260, … ]
l1: 350 l2: 1200 avg: 1700ms snakes(): 108819 Array(15) [ 1344, 1981, 1639, 1615, 974, 2583, 1815, 955, 2315, 1720, … ]
l1: 400 l2: 1200 avg: 2246ms snakes(): 133764 Array(15) [ 3217, 1958, 2203, 1847, 2141, 2843, 2669, 2197, 2395, 1840, … ]
l1: 450 l2: 1200 avg: 2381ms snakes(): 162976 Array(15) [ 2249, 2318, 2137, 2112, 2000, 2846, 2527, 2584, 2213, 3091, … ]
l1: 300 l2: 1500 avg: 1304ms snakes(): 82225 Array(15) [ 999, 1390, 1826, 1904, 944, 1561, 1487, 988, 1502, 815, … ]
l1: 350 l2: 1500 avg: 1703ms snakes(): 111600 Array(15) [ 1644, 1687, 1678, 2404, 1342, 2041, 1983, 2394, 1974, 1174, … ]
l1: 400 l2: 1500 avg: 1824ms snakes(): 150429 Array(15) [ 1848, 1366, 1961, 1473, 2245, 1653, 2190, 1421, 1849, 1731, … ]
l1: 450 l2: 1500 avg: 1985ms snakes(): 177304 Array(15) [ 1657, 2082, 2124, 2002, 1734, 2048, 1924, 1814, 1738, 2172, … ]

Safari

Testing... (t1 length - t2 length - avg time - snake() calls - times)
l1: 300 – "l2: 700" – "avg: 106ms" – "snakes(): 66276" – [150, 147, 107, …] (15)
l1: 350 – "l2: 700" – "avg: 124ms" – "snakes(): 81600" – [134, 120, 125, …] (15)
l1: 400 – "l2: 700" – "avg: 156ms" – "snakes(): 94464" – [176, 160, 163, …] (15)
l1: 450 – "l2: 700" – "avg: 183ms" – "snakes(): 109691" – [190, 184, 185, …] (15)
l1: 300 – "l2: 800" – "avg: 109ms" – "snakes(): 69996" – [113, 109, 110, …] (15)
l1: 350 – "l2: 800" – "avg: 137ms" – "snakes(): 87016" – [150, 144, 139, …] (15)
l1: 400 – "l2: 800" – "avg: 174ms" – "snakes(): 104400" – [180, 175, 174, …] (15)
l1: 450 – "l2: 800" – "avg: 216ms" – "snakes(): 121475" – [220, 217, 222, …] (15)
l1: 300 – "l2: 900" – "avg: 123ms" – "snakes(): 74836" – [133, 123, 125, …] (15)
l1: 350 – "l2: 900" – "avg: 167ms" – "snakes(): 94944" – [175, 173, 170, …] (15)
l1: 400 – "l2: 900" – "avg: 202ms" – "snakes(): 112224" – [213, 206, 201, …] (15)
l1: 450 – "l2: 900" – "avg: 263ms" – "snakes(): 132559" – [266, 255, 255, …] (15)
l1: 300 – "l2: 1000" – "avg: 167ms" – "snakes(): 79101" – [160, 145, 158, …] (15)
l1: 350 – "l2: 1000" – "avg: 198ms" – "snakes(): 99584" – [225, 202, 197, …] (15)
l1: 400 – "l2: 1000" – "avg: 261ms" – "snakes(): 124369" – [258, 269, 279, …] (15)
l1: 450 – "l2: 1000" – "avg: 321ms" – "snakes(): 142464" – [311, 327, 324, …] (15)
l1: 300 – "l2: 1200" – "avg: 235ms" – "snakes(): 84796" – [229, 226, 228, …] (15)
l1: 350 – "l2: 1200" – "avg: 345ms" – "snakes(): 108819" – [309, 318, 329, …] (15)
l1: 400 – "l2: 1200" – "avg: 422ms" – "snakes(): 133764" – [529, 411, 411, …] (15)
l1: 450 – "l2: 1200" – "avg: 518ms" – "snakes(): 162976" – [604, 499, 499, …] (15)
l1: 300 – "l2: 1500" – "avg: 334ms" – "snakes(): 82225" – [331, 339, 325, …] (15)
l1: 350 – "l2: 1500" – "avg: 492ms" – "snakes(): 111600" – [452, 480, 497, …] (15)
l1: 400 – "l2: 1500" – "avg: 698ms" – "snakes(): 150429" – [665, 677, 697, …] (15)
l1: 450 – "l2: 1500" – "avg: 852ms" – "snakes(): 177304" – [754, 755, 767, …] (15)

One visible thing is that the diff() time differs for different browsers (with Safari being the fastest 🤔) so we should also take this into consideration. I would consider exec time below 1500ms to be acceptable (taking into consideration that it is only diff() processing so there is other processing apart from that (whole conversion pipeline) and also the fact that since it runs in the browser it depends also on the end machine performance).

Looking at the data above and earlier findings (that running for text with both 600+ chars takes 1500ms+) I would go with threshold (switching to fastDiff()) when shorter string/array has above 400 elements and the length sum for both is above 1300 or if the length sum is above 2000 (for scenarios like 300 - 1800 or 200 - 2000 etc.).

Reinmar referenced this issue in ckeditor/ckeditor5-utils Feb 13, 2019
Other: Optimized `diff()` function to use `fastDiff()` function internally for large data sets. Closes #269.
@mlewand mlewand added this to the iteration 22 milestone Oct 9, 2019
@mlewand mlewand added status:discussion type:improvement This issue reports a possible enhancement of an existing feature. package:utils labels Oct 9, 2019
@mlewand mlewand transferred this issue from ckeditor/ckeditor5-utils Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
package:utils status:discussion type:improvement This issue reports a possible enhancement of an existing feature.
Projects
None yet
4 participants