
0.10.0-wip (was 0.9.3-wip) #73

Merged: rvagg merged 24 commits into master from 0.9.3-wip on Nov 18, 2013

Conversation

@rvagg (Member) commented Nov 12, 2013

No description provided.

@rvagg (Member Author) commented Nov 12, 2013

With [email protected] and this we should have MakeCallback on all our V8 function calls, which means we are properly domains-compatible and @tjfontaine's dtrace script should work on this. Just need some test code for pushing the batch "leak", anyone?
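(As a rough illustration of what that means at the native-addon level, with a made-up helper name rather than the actual leveldown/NAN code: callbacks into JS go through node::MakeCallback instead of a plain Function::Call, so domains and async tracing see them.)

#include <node.h>
#include <v8.h>

// Hypothetical helper for illustration only: node::MakeCallback wraps the call
// with domain enter/exit and top-level error handling (node 0.10-era API).
static void InvokeCallback (v8::Handle<v8::Object> context,
                            v8::Handle<v8::Function> callback,
                            int argc,
                            v8::Handle<v8::Value> argv[]) {
  node::MakeCallback(context, callback, argc, argv);
}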

@rvagg (Member Author) commented Nov 12, 2013

Check out test/leak-tester-batch.js.

chained batch:

writeCount = 1 , rss = 127% 12M ["0","0","0","0","0","0","0"]
writeCount = 101 , rss = 606% 56M ["2","1","1","0","0","0","0"]
writeCount = 201 , rss = 1117% 103M ["7","1","1","0","0","0","0"]
writeCount = 301 , rss = 1083% 99M ["8","10","1","0","0","0","0"]
writeCount = 401 , rss = 1109% 102M ["1","25","12","0","0","0","0"]
writeCount = 501 , rss = 1116% 103M ["6","22","15","0","0","0","0"]
writeCount = 601 , rss = 1118% 103M ["11","20","17","0","0","0","0"]
writeCount = 701 , rss = 1074% 99M ["4","38","23","0","0","0","0"]
writeCount = 801 , rss = 1109% 102M ["9","36","25","0","0","0","0"]
writeCount = 901 , rss = 1135% 104M ["2","38","49","0","0","0","0"]
writeCount = 1001 , rss = 1162% 107M ["7","36","51","0","0","0","0"]
writeCount = 1101 , rss = 1148% 105M ["12","32","54","0","0","0","0"]
writeCount = 1201 , rss = 1237% 114M ["5","41","70","0","0","0","0"]
writeCount = 1301 , rss = 1125% 103M ["10","38","73","0","0","0","0"]
writeCount = 1401 , rss = 1162% 107M ["3","44","95","0","0","0","0"]
writeCount = 1501 , rss = 1149% 105M ["8","41","98","0","0","0","0"]
writeCount = 1601 , rss = 1160% 107M ["1","42","120","0","0","0","0"]
writeCount = 1701 , rss = 1130% 104M ["6","41","121","0","0","0","0"]
writeCount = 1801 , rss = 1228% 113M ["11","39","123","0","0","0","0"]
writeCount = 1901 , rss = 1275% 117M ["4","43","145","0","0","0","0"]
writeCount = 2001 , rss = 1241% 114M ["9","41","147","0","0","0","0"]
writeCount = 2101 , rss = 1216% 112M ["2","48","154","16","0","0","0"]
writeCount = 2201 , rss = 1227% 113M ["7","46","155","16","0","0","0"]
writeCount = 2301 , rss = 1194% 110M ["12","44","157","16","0","0","0"]

array batch:

writeCount = 100 , rss = 690% 63M ["2","1","1","0","0","0","0"]
writeCount = 200 , rss = 1254% 115M ["7","1","1","0","0","0","0"]
writeCount = 300 , rss = 1856% 170M ["12","1","1","0","0","0","0"]
writeCount = 400 , rss = 2143% 196M ["1","25","12","0","0","0","0"]
writeCount = 500 , rss = 2612% 239M ["6","23","14","0","0","0","0"]
writeCount = 600 , rss = 3073% 281M ["11","21","16","0","0","0","0"]
writeCount = 700 , rss = 3557% 326M ["4","39","23","0","0","0","0"]
writeCount = 800 , rss = 4038% 370M ["9","37","25","0","0","0","0"]
writeCount = 900 , rss = 4523% 414M ["2","39","48","0","0","0","0"]
writeCount = 1000 , rss = 5008% 458M ["7","37","50","0","0","0","0"]
writeCount = 1100 , rss = 5459% 500M ["12","33","53","0","0","0","0"]
writeCount = 1200 , rss = 5884% 539M ["5","39","71","0","0","0","0"]
writeCount = 1300 , rss = 6405% 586M ["10","37","73","0","0","0","0"]
writeCount = 1400 , rss = 6869% 629M ["3","43","95","0","0","0","0"]
writeCount = 1500 , rss = 7310% 669M ["8","40","98","0","0","0","0"]
writeCount = 1600 , rss = 7786% 713M ["1","42","121","0","0","0","0"]
writeCount = 1700 , rss = 8280% 758M ["6","41","122","0","0","0","0"]
writeCount = 1800 , rss = 8742% 800M ["11","38","124","0","0","0","0"]
writeCount = 1900 , rss = 9297% 851M ["4","43","145","0","0","0","0"]
writeCount = 2000 , rss = 9720% 890M ["9","42","146","0","0","0","0"]
writeCount = 2100 , rss = 10223% 936M ["2","48","153","16","0","0","0"]
writeCount = 2200 , rss = 10708% 980M ["7","47","154","16","0","0","0"]
writeCount = 2300 , rss = 11190% 1024M ["12","45","156","16","0","0","0"]

Is this the problem we're looking for? The array-form batch keeps growing (past 1GB) while the chained batch levels off around 100-115M. Is chained batch immune to the leak in people's experience?

/cc @brycebaril @mcollina

@juliangruber (Member)

As far as I remember, I think I've seen similar memory growth in leveled.

@rvagg (Member Author) commented Nov 12, 2013

yeah baby! I pulled in #70 and did a complete job with it (you also need to DisposeBufferOrSlice() to clean up after you've made the Slice). I also found a missing delete batch; in the post-async batch() operation, so the LevelDB WriteBatch object was persisting whenever you used an array-form batch() operation. The delete batch; was present for the chained form, which explains the difference in memory ballooning between the two.
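(For anyone following along, a minimal sketch of the shape of that fix, with made-up names rather than the actual leveldown source: the array-form path allocates a leveldb::WriteBatch for the async write and has to delete it once the write has completed, exactly as the chained form already did.)

#include <leveldb/db.h>
#include <leveldb/write_batch.h>

// Hypothetical function for illustration: write an array-form batch, then clean up.
void ArrayBatchWrite (leveldb::DB* db, leveldb::WriteBatch* batch) {
  leveldb::Status status = db->Write(leveldb::WriteOptions(), batch);
  // ... hand `status` back to the JS callback via the async worker ...
  delete batch;  // the missing line: without it every array-form batch() leaked the WriteBatch
}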

I'm currently running both chained- and array-style batch operations in my batch leak tester; I'm up over 2G and memory usage is steady around the 70M mark, so WOOHOO!

This is going to be 0.10.0 I think; the Persistent removal is too big a change for a patch version. Thoughts?

@rvagg (Member Author) commented Nov 12, 2013

oh, and can someone who has a segfault or leak situation please try this branch and report back?

@juliangruber (Member)

aren't those changes only fixes?

@juliangruber (Member)

congrats!!

@mcollina (Member)

Congrats @rvagg for putting it all together!

👯 and 🍻 to everybody!

I'm up for 0.10, as it's too heavy a fix to be a patch release. This can have bugs, and if you are not experiencing the issue you do not want a possibly unsafe upgrade.

I'll give this a go in my segfault case to see if I still get problems.

@juliangruber (Member)

ok, 0.10 sounds good then

@mcollina (Member)

This solves my segfault on node v0.10. 🍻

@mcollina (Member)

I spoke too early. I still have a segfault in LevelGraph when using a chained batch, which I am not able to reproduce using just levelup/leveldown :(

@rvagg (Member Author) commented Nov 12, 2013

@mcollina can you do a node-gyp rebuild --debug, run the levelgraph case under gdb, try to cause the segfault, and then give us a complete backtrace so we can see where it's having trouble?

@mcollina (Member)

First run (without segfault):

(gdb) run writesStream.js
Starting program: /Users/matteo/.nvm/v0.10.20/bin/node writesStream.js
Reading symbols for shared libraries . done

events.js:72
        throw er; // Unhandled 'error' event
              ^
WriteError: Corruption: WriteBatch has wrong count
    at /Users/matteo/Repositories/levelgraph/node_modules/levelup/lib/batch.js:68:39

Program exited with code 010.

Second run, with segfault:

(gdb) run writesStream.js
Starting program: /Users/matteo/.nvm/v0.10.20/bin/node writesStream.js

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: 13 at address: 0x0000000000000000
[Switching to process 12220 thread 0x1a03]
0x0000000100ec77ec in leveldb::DecodeFixed32 (ptr=0xe00000001023000a <Address 0xe00000001023000a out of bounds>) at coding.h:62
62          memcpy(&result, ptr, sizeof(result));  // gcc optimizes this to a plain load
(gdb) backtrace
#0  0x0000000100ec77ec in leveldb::DecodeFixed32 (ptr=0xe00000001023000a <Address 0xe00000001023000a out of bounds>) at coding.h:62
#1  0x0000000100ec70d3 in leveldb::WriteBatchInternal::Count (b=0x102300c20) at ../deps/leveldb/leveldb-1.14.0/db/write_batch.cc:83
#2  0x0000000100ec7041 in leveldb::WriteBatch::Iterate (this=0x102300c20, handler=0x102209c68) at ../deps/leveldb/leveldb-1.14.0/db/write_batch.cc:75
#3  0x0000000100ec72b0 in leveldb::WriteBatchInternal::InsertInto (b=0x102300c20, memtable=0x102300400) at ../deps/leveldb/leveldb-1.14.0/db/write_batch.cc:133
#4  0x0000000100ea2ce7 in leveldb::DBImpl::Write (this=0x102300060, options=@0x102300d60, my_batch=0x102300c20) at ../deps/leveldb/leveldb-1.14.0/db/db_impl.cc:1192
#5  0x0000000100e88ebe in leveldown::Database::WriteBatchToDatabase (this=0x100c00800, options=0x102300d60, batch=0x102300c20) at ../src/database.cc:74
#6  0x0000000100e83e03 in leveldown::Batch::Write (this=0x102300ad0) at ../src/batch.cc:28
#7  0x0000000100e881da in leveldown::BatchWriteWorker::Execute (this=0x102300c40) at ../src/batch_async.cc:25
#8  0x0000000100e878d1 in NanAsyncExecute (req=0x102300c48) at nan.h:752
#9  0x000000010012c8d0 in worker (arg=<value temporarily unavailable, due to optimizations>) at ../deps/uv/src/unix/threadpool.c:74
#10 0x000000010012330b in uv__thread_start (ctx_v=<value temporarily unavailable, due to optimizations>) at ../deps/uv/src/uv-common.c:322
#11 0x00007fff8a752899 in _pthread_body ()
#12 0x00007fff8a75272a in _pthread_start ()
#13 0x00007fff8a756fc9 in thread_start ()

@rvagg (Member Author) commented Nov 12, 2013

from what I can see, the only possible way this can happen is if the WriteBatch object we're operating on has been deleted before Write() is called. At every stage where the internal rep_ representation of the batch data is initialised or reset, it is set to 12 bytes long, which should be enough to encode the "count" value that's causing this segfault.

@mcollina perhaps we have some odd V8 GC thing happening. Can you try holding on to a reference to the batch you're working on in your code and referencing it after you get the callback from the write()? Something like this ought to suffice, I think:

var batch = db.batch()
batch.put('foo', 'bar')
batch.write(function (err) {
  batch.dummy = 'foobar'
})

this is to ensure that V8 doesn't decide that batch can be garbage collected until after we're finished with it; perhaps it's getting cleaned up before the batch operation has completed.

@rvagg (Member Author) commented Nov 13, 2013

so @KyotoWeb helpfully picked up that we're not actually using the bloom filter on this branch! Oops, that's added now.
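(For reference, a minimal sketch of what wiring up the bloom filter looks like at the LevelDB API level; the helper name and the 10 bits-per-key value are illustrative, not necessarily what leveldown uses.)

#include <leveldb/db.h>
#include <leveldb/filter_policy.h>

// Hypothetical options factory: attaching a bloom filter policy lets reads skip
// table files that definitely don't contain the key being looked up.
leveldb::Options MakeOptions () {
  leveldb::Options options;
  options.create_if_missing = true;
  // the policy object must outlive the DB and be deleted by the caller
  options.filter_policy = leveldb::NewBloomFilterPolicy(10);  // bits per key (illustrative)
  return options;
}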

@rvagg (Member Author) commented Nov 13, 2013

Could I please get a contributor who is on OSX to check out the latest on this branch, go into deps/snappy/snappy-1.1.1/ and run ./configure, then copy the generated config.h into deps/snappy/mac/, and commit and push back to this branch?

@mcollina (Member)

Done :).

@mcollina (Member)

I think the segfault happens when LevelDB does compaction: the batch is 'stopped' and then the GC has time to collect it. I was removing a reference to the batch after starting the write.
Level/level-ws@e252652

The fact that this is triggered by LevelGraph is due to the kind of data: LG stores six very different sets of keys in the same batch.

@mcollina (Member)

The Travis build fails with my config.h commit. Did I do something wrong?

@rvagg (Member Author) commented Nov 13, 2013

sorry, my fault; I haven't fully wired up snappy-1.1.1 yet. It just needs a snappy.gyp change, which you're welcome to do, or I'll do it when I get back to the computer.

@mcollina (Member)

Done!

@rvagg (Member Author) commented Nov 14, 2013

this won't compile for me on Windows because of the Snappy upgrade, and I can't remember what I had to do to make it work last time ... so ... still "work in progress".

@rvagg (Member Author) commented Nov 15, 2013

My current thoughts on the GC issue: I think what we might do is assign a Persistent handle to the parent object each time we do an async operation; when the async work is finished, the Persistent can be freed. This should happen for Database, Iterator and Batch. Each of these could be dropped in JS-land while an async operation is in flight, and it's only because the operation is usually faster than the GC can get to it that we survive.

I'll investigate extending NanAsyncWorker in NAN to support this more easily, but it should just be a matter of writing a Persistent into the worker each time we make one and then ensuring that any Persistents in the worker are freed when it is cleaned up.
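(A minimal sketch of that idea against the node 0.10-era V8 API, with a made-up worker class rather than the actual leveldown/NAN code:)

#include <v8.h>

// Hypothetical worker for illustration: pin the parent object (Database /
// Iterator / Batch) in a Persistent for the lifetime of the async job and
// release it when the worker is destroyed after the callback has run.
class ExampleAsyncWorker {
 public:
  explicit ExampleAsyncWorker (v8::Handle<v8::Object> parent) {
    persistentParent = v8::Persistent<v8::Object>::New(parent);
  }

  ~ExampleAsyncWorker () {
    persistentParent.Dispose();  // the parent becomes collectable again only now
    persistentParent.Clear();
  }

 private:
  v8::Persistent<v8::Object> persistentParent;
};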

@mcollina (Member)

I agree with you... this leveldown release will be huge.

@rvagg (Member Author) commented Nov 15, 2013

Need to wait for nodejs/nan#41, but as soon as that's released I'll push my latest changes to do what I described above.

@No9 (Contributor) commented Nov 15, 2013

So the Snappy build looks broken on Windows anyhoo:
https://code.google.com/p/snappy/issues/detail?id=79

The lines in configure.ac that are mentioned there are present in our copy too.
https://github.com/rvagg/node-leveldown/blob/0.9.3-wip/deps/snappy/snappy-1.1.1/configure.ac#L21

Adding

#include <BaseTsd.h>
typedef SSIZE_T ssize_t;

here:
https://github.com/rvagg/node-leveldown/blob/0.9.3-wip/deps/snappy/snappy-1.1.1/snappy-stubs-public.h#L95

(Edit: it should go within the Windows if condition:
https://github.com/rvagg/node-leveldown/blob/0.9.3-wip/deps/snappy/snappy-1.1.1/snappy-stubs-public.h#L89)

fixes it.
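(Roughly, the placement being suggested; the _WIN32 guard shown here is an assumption and the exact condition in snappy-stubs-public.h may differ:)

#if defined(_WIN32)
// MSVC has no ssize_t, so map it to SSIZE_T from BaseTsd.h
#include <BaseTsd.h>
typedef SSIZE_T ssize_t;
#endif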
Seems like we might need some additional gyp foo to copy this in from our source tree?

Thoughts?

@rvagg (Member Author) commented Nov 17, 2013

@No9 my thoughts are that this worked fine on Windows with 1.1.0, and the changes for 1.1.1 are relatively minor, so we should back up and see what's changed; perhaps I had a manual fix for Windows in 1.1.0, or perhaps I'm not quite doing something right now!

@rvagg (Member Author) commented Nov 17, 2013

@No9 sorry, I've followed up your links and understand what you're talking about now! The ssize_t thing is indeed new in 1.1.1. I started off just throwing a -Dssize_t=int at it, but there's a suggestion that there's a perf impact from that (I seriously doubt it'd be a measurable difference!). So I've gone ahead and added the SSIZE_T -> ssize_t thing, and I've also fixed some other things up. Hopefully we have a more complete and efficient Snappy build for all operating systems now.

Anyone got anything else that needs to go into 0.10.0? I think we might be done here. Final comments?

@mcollina (Member)

I'm for getting leveldown 0.10 out. It's already a big release, with lots of new stuff and improvements.
I'm not sure about version handling in levelup/level, but we can discuss it there!

@rvagg (Member Author) commented Nov 17, 2013

@mcollina what we've done for leveldown minors is to bump levelup & level minors too even if there aren't any changes there. I think that's reasonable and so far nobody has suggested otherwise.

@mcollina (Member)

Cool, OK. I thought 0.18 was the one without WriteStream, but the PR is still not merged, so we should be OK.

rvagg added a commit that referenced this pull request on Nov 18, 2013: 0.10.0-wip (was 0.9.3-wip)
@rvagg merged commit a937180 into master on Nov 18, 2013
@rvagg deleted the 0.9.3-wip branch on November 18, 2013 00:09
@rvagg (Member Author) commented Nov 18, 2013

published as 0.10.0!
