Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow commits to be created without index #63

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

FrozenCow
Copy link

I wanted to make commits without a working directory (nor index), so I've added the minimal set of functions that I could find to create commits: createBlobFromDisk, createTree, createCommit. These 3 functions all reside in Repository.

These commits might not be following the guidelines of node-gitteh and feel a bit messy compared to the rest of the code. I couldn't find good examples for such 'big' functions. Please give me pointers to what should be changed.

Please note the following:

  • The input of createCommit and createTree isn't validated correctly at the moment. I couldn't find a good example of how to validate complete structures (array with objects with keys). It could also be that the inputs will need actual objects like Blob, instead of the oid of a Blob. Please give me some guidelines for this.
  • Also the results of the functions are oid, not actual objects. I used this approach, since I did not want to fill the objects like Blob with their values, which would require to load all bytes of the Blob into memory and pass it to JS. Making a Blob object that implements getters would be a better approach, but with my limited knowledge it would be best to first get some feedback.
  • There seemed to be a max line-width used, but I wasn't sure what it was set to, so that might need some formatting too.

The following code should give some indication on how the functions are used:

gitteh.initRepository(repoPath, true, function(err,repo) {
    repo.createBlobFromDisk('README.md',function(err,blob) {
        console.log('Blob',arguments);
        var treeEntry = {
            id: blob,
            name: 'README.md',
            mode: 0x81A4
        };
        repo.createTree([treeEntry],function(err,tree) {
            console.log('Tree',arguments);
            repo.createCommit({
                message: 'First commit',
                committer: {
                    name: 'FrozenCow',
                    email: '[email protected]'
                },
                tree: tree,
                parents: []
            },function(err,commit) {
                console.log('Commit',arguments);
            });
        });
    });
});

Based on http://libgit2.github.com/libgit2/#v0.18.0/group/blob/git_blob_create_fromdisk

CreateBlobFromDisk reads a file specified by its path and creates a
blob. The Result of this function is the oid of the blob and not a blob
object. This is to circumvent reading the bytes of the blob into memory.
CreateTree will create a tree object from an array of TreeEntry. A
TreeEntry should follow have:

* name: the name/path of the file
* id: the id of the blob object
* mode: the mode of the file (as described here: http://libgit2.github.com/libgit2/#v0.18.0/group/treebuilder/git_treebuilder_insert)

This makes use of git_treebuilder_* of libgit2 to create the tree.
CreateCommit is based on
http://libgit2.github.com/libgit2/#v0.18.0/group/commit/git_commit_create.

CreateCommit results in the oid of the commit that was created.
CreateCommit is called with the following keys:

* updateref: maps to update_ref and can be unused. (optional)
* author: a signature of the author (optional, committer is used)
* committer: a signature of the committer
* message: the message of the commit
* tree: the id of a tree object
* parents: an array of parents

A signature has the following keys:

* name: the name of the author/committer
* email: the email of the author/committer
* time: the time of the commit (optional)
* offset: timezone offset of the commit-time (optional)
@lushzero
Copy link

You need to patch bindings.js don't you?

@FrozenCow
Copy link
Author

@lushzero I've updated bindings.js to match the current implementation of the new functions. Though the documentation as a whole doesn't yet match the actual implemented functions, many of the functions that are in the documentation aren't implemented yet as far as I can see.

@lushzero
Copy link

I see that change but I'm still not able to see your new function exposed on the Gitteh object. I guess the cakefile or coffeescript that generates the binding also needs to be updated.

TypeError: Object # has no method 'createBlobFromDisk'
at /var/www/anot8/js/clone.js:11:10
at /var/www/anot8/js/node_modules/gitteh/lib/gitteh.js:789:14
at /var/www/anot8/js/node_modules/gitteh/lib/gitteh.js:56:17

@lushzero
Copy link

@FrozenCow I'm very interested in your code but I just can't seem to get the bindings on the JS side to work. Do you have any working example code or have you not tested this at all yet? Thanks.

@FrozenCow
Copy link
Author

Yea, I couldn't get things to work first too. I had a hunch there was some post-install stuff going on, but wasn't sure what it was. So I just changed in package.json:

"main": "./lib/gitteh"

to:

"main": "./build/Release/gitteh"

So that I called the C++ library (gitteh.node) directly from Javascript. What the Cakefile does is creating a wrapper that mostly passes things through.

But now that I know about the wrapper, I'll fix the coffeescript too.

@FrozenCow
Copy link
Author

An example to use these functions can be found here: https://gist.github.com/FrozenCow/6318468

Note that it is not yet possible to create a new commit based on a previous one, since there is no way to create a tree based on another tree, like libgit2 solves using http://libgit2.github.com/libgit2/#HEAD/type/git_treebuilder . The createTree function might need to be rewritten to a TreeBuilder. I'm not 100% sure whether this those functions will need to be async, since the libgit2 documentation states:

The tree builder can be used to create or modify trees in memory and write them as tree objects to the database.

So if it's indeed all in-memory, it can drop the callbacks, but if it does sneakily use IO somewhere in those functions it should have callbacks.

@lushzero
Copy link

@FrozenCow Tested your example and worked fine for me. Thanks for the excellent inline comments as well. I'm looking to implement "git_index_add_all" and related into bindings. My needs are to be able to create repos, make FS changes, commit those changes, and be able to pull versions from a timeline and revert to a specific version in the timeline. I have what I need testing in C already on libgit2 so there is library support and I just want to implement the bindings for node. I would like to contribute it to this project but it's getting close to the point for me where it would be easier to just implement a fresh simple bindings library than to struggle with all the other potential cleanup this project seems to need.

@FrozenCow
Copy link
Author

I don't think it'll be easier to implement your own library. This library is structured alright, doing it over would be quite a lot of work. Don't forget the complexity/work you need to get straightened to handle all functions asynchronously.

What this library needs is up-to-date and correct documentation. Also some guidelines/examples on how to contribute libgit2 bindings would be helpful. The functions that are already in gitteh are quite simple (simple arguments) and the commits I did could be a start some more complex functions, but I'm still not sure whether it's following the same mindset as the rest of gitteh. In particular 'createTree' does things different and might need a rewrite.

For my purposes it's nearly there. I want to retrieve files from any commit so that they can be viewed from a web-interface, this is already fully possible. Next to that I want to save new/changes/deleted files into new commits based on the one that is being viewed. Later I'll probably need some merging functionality, I don't know whether that's already implemented into libgit2, but we'll see.

I think for your usecases it's also close. Creating new repos is possible (initRepository), commiting based on a working directory is also already implemented (Index), retrieving commits is possible (GetObject) and resetting a branch to that commit is also already possible (I think, CreateSymReference). Not entirely sure I'm correct here, but it seems these functions are implemented in C++.

@FrozenCow
Copy link
Author

Here are some (new/updated) examples that uses the new functions:
https://gist.github.com/FrozenCow/6318468

It's now possible to create trees based on previous (retrieved) trees, since retrieved tree-entries and the createTree arguments both have the same properties now. However retrieved tree-entries do need to be cloned, since they are made immutable in coffeescript.

@mildsunrise
Copy link
Contributor

I just saw this and my first impression is 👍 👍 👍!
I'll look at the code later, but the examples are really
promising!

Unless you're writing a CLI, chances are you're working with
server-side bare repositories, so you'll unevitably need this.
Good work!

@FrozenCow
Copy link
Author

Unless you're writing a CLI, chances are you're working with server-side bare repositories, so you'll unevitably need this.

Exactly, this is what I'm using the functions for now and they seem to be working correctly. The only problem I'm having is having a dependency from my application to "git://github.com/FrozenCow/node-gitteh.git#master" will somehow be missing the Cakefile (among other files). I can't figure out what's up with that, but I guess it's also (hopefully) unrelated to the changes I made.

EDIT: It seems to be because ".npmignore" contains "Cakefile" and "cake build" is never called from npm when installing node-gitteh from a custom source (like git). The published version seems to be quite different since it doesn't have much entries in ".npmignore". I don't have much experience with "package.json" on this level: Coffeescript and native compilation. Could someone with more experience take a look at this?

@mildsunrise
Copy link
Contributor

@FrozenCow @lushzero @iamwilhelm

I suggest we agree on a temporary repository to work on Gitteh while
@samcday is busy relocating(?), so we can then push changes back
to this one when he returns.

Also, this repo is full of old issues so we may focus better on what to
do if these are not there.

I can delete my fork and recreate it so we have a clean master to
push changes to. I'll then add you as contributors.

What do you think?
PS: it doesn't have to be on my account, it was just an example.

@lushzero
Copy link

I don't want to overcomplicate things but I'm pushing ahead with my node-sgit rewrite and have discovered that for the most part the threading in libgit2 is now pretty good (which it was not when gitteh was started) so it looks like the whole async baton handling and cludgy lock/mutex that were done in C++ are no longer needed. Callbacks can be done directly from C++ on direct calls to the underlying C functions and any additional async handling can move to node which from the looks of it is going to make a much smaller, more streamlined library.

@mildsunrise
Copy link
Contributor

I don't want to overcomplicate things

It's just to make PRs to a common repo, and then push them here.
But if you don't like it, it's okay.

so it looks like the whole async baton handling and cludgy lock/mutex that were done in C++ are no longer needed.

agree. that was one of the first things I removed at my [abandoned] rewrite.

@mildsunrise
Copy link
Contributor

But it'll help if we can merge our improvements and work
on top of these. If we don't do that, our PRs will probably
conflict and it's gonna be a mess until @samcday returns.

@FrozenCow
Copy link
Author

@lushzero do you have an example of how functions are now implemented? It does sound good. The baton did indeed complicate things, especially since the code of one function is scattered around the project (baton there, async function here, coffeescript function over there, function registration also somewhere). Also, how much is compatible with gitteh?

@lushzero
Copy link

https://github.com/lushzero/node-sgit was just updated with the init_repository function working. I need to figure out mocha to do unit tests which I haven't worked with before but as soon as I do I think many of the other libgit2 functions should come along quickly because it's very linear now. I am still a little undecided on the naming conventions, I am a C guy, this mix of JS, C and C++ is just a clusterfuck in that regard, so I kept the naming in C style reflective of the underlying repository. init_repository instead of initRepository or initRepo.

Lot's to do still, I need to test around threading some more because the libgit2 documentation is a little fuzzy on exactly where the threading faults still lie and make sure that the build setup properly enables libpthread and checks if it is not. I need to implement a bindings wrapper in JS to handle argument checking and such. Async should be able to be handled now just using Node's async module but I need to include some examples and do a bit more testing. I need to figure out a way to check for memory leaks but this sandwich of C, C++ and Node is giving me a headache about the best way to do that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants