leah blogs

April 2006

11apr2006 · My DVCS wishlist

After last week’s intermezzo with Git, my curiosity about distributed version control systems (DVCS) was rekindled. I also imported the Ruby CVS history into Monotone, which has a pretty fast CVS importer, and into Mercurial, whose CVS importer (cvs20hg) seemed to be even faster but unfortunately is not complete yet. However, Mercurial can also import from Git, so I went that way.

My projects will continue to be kept in Darcs for the near future, but so far no DVCS has really convinced me. Wondering which lacked what, I thought it would be useful to write up what I want to have. So far, I have tried Darcs, Git/Cogito, Mercurial and Monotone. I also dabbled in Bazaar (seems to be discontinued), Bazaar-NG, FastCST (seems to be discontinued) and SVK (IMO just a hack).

So, here is my wishlist (roughly ordered in decreasing importance):

  • Prefer file storage over patch storage; it’s just easier to deal with in practice. It took me a long time to figure this out, but I actually think it’s the more pragmatic solution. I noticed this when I saw how the Git repository just merged with the Gitk repository, even though the two didn’t share a single revision. Darcs, on the other hand, even had problems doing merges which were factually the same, but just couldn’t be arranged the right way. The theory of patches sounds nice, but it doesn’t work out.

    Note that this doesn’t exclude diff storage, which of course should be used to save disk space and bandwidth.

    Provided by: Bazaar-NG (I think), Git/Cogito, Mercurial, Monotone.

  • Revisions need to be identified by a globally unique identifier, e.g. a SHA1-hash or a GUID.

    Provided by: Bazaar (theoretically), Bazaar-NG, Darcs, Git/Cogito, Mercurial, Monotone.

  • Revision storage should be implemented as write-once files. Once a file has been written, it should not be touched afterwards. This eases incremental backup and generally improves safety. Alternatively, if files are append-only, this is acceptable too. Changing files leaves a bad taste. (It’s okay for index files and other unessential information.)

    Provided by: Bazaar, Darcs, Git/Cogito, Mercurial.

  • File permissions must be saved, at least the executable bit. Also, the VCS shouldn’t touch the contents of the files at all (no newline conversion, no keywords by default).

    Provided by: Bazaar, Bazaar-NG, Git/Cogito, Mercurial, Monotone.

  • Easy setup of repositories: Setting up a new repository needs to be possible with a single command, usually xxx init, which turns the current directory into a fresh repository (or even imports the files of the current directory, as Cogito does).

    Provided by: Bazaar-NG, Darcs, Git/Cogito, Mercurial.

  • Support multiple heads of development in a single repository. This encourages microbranching and eases incremental development without keeping loads of working directories around.

    Provided by: Git/Cogito, Mercurial [Added 22apr2006, thanks to Daniel Néri for noticing], Monotone.

  • It has to be possible to export patches with full metadata (e.g. renames) as ASCII files, e.g. to send via mail or share in other ways. It needs to support binary files, too. (Think of contributing graphics to a game.)

    Provided by: Bazaar-NG, Darcs (very good), Git/Cogito (no binary, renames partly), Mercurial (bundles, but they are not ASCII, renames partly), Monotone (packets, good).

  • It needs to be possible to contribute patches via mail. This is the way most non-regular committers send patches.

    Provided by: Bazaar-NG, Darcs, Git/Cogito, Mercurial, Monotone.

  • Serving repositories over dumb HTTP: This is essential to allow people to easily set up repositories on their cheap webspace. Systems that require CGIs would be acceptable here too (Mercurial without old-http); opening new ports isn’t. It doesn’t need to be the most efficient way of accessing a repository, but it must not be unreasonably inefficient.

    Provided by: Bazaar-NG, Bazaar (slow), Darcs, Git/Cogito, Mercurial, Monotone (soon).

  • I definitely need good Emacs integration, preferably with DVC; alternatively, a good standalone mode can be enough.

    Provided by: Bazaar (DVC), Bazaar-NG (DVC), Darcs (own, partly DVC), Git/Cogito (own, DVC), Mercurial (own, DVC), Monotone (own).

  • It needs to provide a GUI repository viewer that can show change history as a tree and diffs for each revision. I’ve found such a tool indispensable since I discovered Gitk, especially if you microbranch a lot.

    Provided by: Bazaar-NG, Git/Cogito, Mercurial (hack), Monotone.

  • It needs a good and fast tool to import CVS trees. I’ve found this absolutely necessary to convert legacy repositories and capture the history of older projects locally.

    Provided by: Git/Cogito (git-cvsimport, parsecvs), Mercurial (cvs20hg, partly), Monotone (own, very good).

  • A library to access all features of the VCS from other tools. If a very comprehensive set of commands is available, this will be acceptable too.

    Provided by: Bazaar (shell), Bazaar-NG (Python), Darcs (shell, XML), Git/Cogito (shell, very good), Mercurial (Python), Monotone (Lua, shell).
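
To make the points about hash-based identifiers and write-once storage more concrete, here is a minimal sketch in Python. It is not how Git, Mercurial or any other system named above actually lays out its repository; the functions store_object and load_object and the two-character directory fan-out are assumptions made up for this illustration. The idea is simply that an object's name is the SHA-1 of its content, so a file, once written, never needs to be touched again.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def store_object(repo: Path, data: bytes) -> str:
    """Store data under its SHA-1 hex digest and return that digest.

    Hypothetical helper: an existing object is never rewritten.
    """
    object_id = hashlib.sha1(data).hexdigest()
    target = repo / object_id[:2] / object_id[2:]
    if target.exists():
        # Write-once: same content means same name, so there is nothing to do.
        return object_id
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same directory, then rename it into
    # place, so readers never see a half-written object.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_name, target)
    # Mark the object read-only to signal that it must not change.
    os.chmod(target, 0o444)
    return object_id


def load_object(repo: Path, object_id: str) -> bytes:
    """Read an object back and check that its content still matches its name."""
    data = (repo / object_id[:2] / object_id[2:]).read_bytes()
    if hashlib.sha1(data).hexdigest() != object_id:
        raise ValueError(f"object {object_id} is corrupt")
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        repo = Path(tmp_dir)
        first = store_object(repo, b"hello, world\n")
        second = store_object(repo, b"hello, world\n")  # no second write
        print(first == second)
        print(load_object(repo, first))
```

Because objects are only ever added, an incremental backup just has to copy the files it has not seen before, and two repositories that store the same content independently end up with the same identifier for it.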

If you find any mistakes or misattributions, please post a comment and I’ll correct them.

Writing a good DVCS is not that hard in theory, but very hard in practice, and not only for technical reasons. Implementing a DVCS is a community effort; I’d even say it’s pointless today to start yet another VCS, unless you are a celebrity who already has a big community behind you (cf. Git).

NP: The Smiths—You Just Haven’t Earned It Yet Baby

Copyright © 2004–2022