leah blogs: Euruko 2004

On Sunday there were mostly talks at the second European Ruby Conference, which I’d like to summarize here.

First, James Britt gave his talk about ruby-doc.org that was scheduled for Saturday, recycling his slides from the Fourth International Ruby Conference (hey, no bad thing!).

To start, he gave an overview of the site and showed what it provides: all kinds of Ruby documentation, ranging from the core documentation made using RDoc and the standard library, to various additional stuff like videos from Euruko 2003 (none made this year, sorry), translations of various Ruby tutorials into languages like German and French, and pictures of this year’s Euruko (I hope, at least :-)).

He went on to talk about the more technical details of the site: how it works and what different interfaces it provides. For example, you can query the ri documentation via RESTful HTTP, like this: http://www.ruby-doc.org/find/pickaxe/Array/.

After a short excursion on the history of the site and its various stages, he started showing some problems of ruby-doc.org, among them the broken ri support (ruby-doc.org still uses the “old” 1.6 ri, as it changed significantly for Ruby 1.8). Also, searching the site is still a problem; googling seems to be the best way to get what you want.

Another problem is that there is some kind of data duplication needed if you want to add new material to the site, making it harder to maintain all the stuff currently available online.

James freely admitted that he is lazy (not necessarily a bad thing for a programmer ;-)), so he is going to automate everything. That is: no handcoded pages, no duplicate information and, very interesting to me, his requirements for the blogging systems that power the site. He wants some kind of automatic categorization based on XFML and explained a bit about how it is supposed to work. This was the most important aspect of the talk to me; I’ll definitely need to spend some time on the concepts of XFML and faceted metadata.

The new ruby-doc.org will therefore be using categorization by tags and XFML. He went on to talk about aspects of post-modern blogging, the core of which he dubbed the annotated view. Basically, this means generating pages dynamically based on queries by the user and incorporating feeds and external data. Post-modern blogs also support multiple output formats, for example XHTML, RSS, XTM, XFML, and Atom. It is quite logical that such blogs will be accessed by RESTful queries.

Writing the software behind ruby-doc.org, he tries to keep parts of it as general-purpose libraries so they can be used for other stuff too; he prefers lightweight frameworks to tightly coupled classes (Don’t we all?).

At the end of his talk he asked what people want from ruby-doc.org, but I don’t think he got lots of new ideas, so ruby-doc.org seems to do its job pretty well.

The first talk on this Sunday ended with the rhetorical question of how to handle community input without getting big, bloated and hard to manage.

After this, Kingsley Hendrickse presented his website tool Staticweb in a very short talk.

He gave a quick overview of the project, which is meant to generate some static pages quickly, and he showed us interactively how to add new pages and how the templating works.

Kingsley actually wrote that software in two hours while he was sick and couldn’t work, and it was quite impressive for that. A funny thing about the code is that he named all classes after his ex-girlfriends. Whether this improves understanding of the code is questionable, but it’s a nice idea, no? :-)

The next talk, by Armin Roehrl, was titled Small World in year 2.

He calls Small World, a collaborative distributed contact and knowledge management system, an Eierlegende Wollmilchsau: a thing that is supposed to simply do everything.

The special thing about this talk was that Armin didn’t use slides to present his application; instead, he showed a mindmap drawn with FreeMind, and browsed it during his talk. This was a very nice way of presenting, because you always had an overview of the talk and of what he was currently talking about.

Armin made Small World to keep knowledge among members of a company that are distributed all over the world. He explained that Small World consists of lots of available software used together in a flexible way, and he said that the key to Small World was not a revolution, but heavy development.

Small World is not only used to share knowledge and coordinate projects but also to keep customer contacts and information about them. There were “too many interesting things to do” with it, he said. :-)

The general problem addressed by Small World is the problem of getting information. Armin can’t use a shared filesystem, and he is known for being “messy and lazy”. Versioning and backup of data is a problem too.

The primary path to finding the information you want is not to get the information directly, but to know people that know the information (or other people that may know it…).

He showed a few ways to find data, ranging from a simple grep -r, through collecting feedback from individuals and the masses (he gave the Amazon book recommendations as an example) and monitoring for changes, up to complex search engines like Clusty and Google.

Although Small World can be used for a personal semantic web, it benefits a lot from the network by making use of the “network effect” as seen on Wikis, for example.

He explained that there are very different types of information, ranging from short-term information like email, through forums and blogs, to files published on the net, which are mostly used for long-term storage. All these kinds of information need to be managed.

Small World features trivial text classification and collaborative knowledge.

In order to build Small World, Armin explained that he needed only a few things, the most important being cheap hard disks and standing on the shoulders of giants by making use of existing programs like Samizdat, Squish, Estraier, Graphviz, Wikipedia, FreeMind, and, of course, Ruby.

He went on with a list of things to do to improve Small World in the future: writing RDoc documentation, adding a WebDAV interface, building a better search engine, providing RSS with smart filtering, having better content annotation and so-called multiblogs, which really are blogs of blogs (I think this is comparable to Planets).

Unfortunately, Armin hasn’t released any code yet, since the project is still very much under development and they are also exploring business strategies using it.

After this, Mathieu Bouchard presented some hacks using MetaRuby, for example the InstanceVariableHash which is best described using a short example:

h  # => {}
@foo = 42
h  # => {"@foo" => 42}
h["@bar"] = :mumble
h  # => {"@foo" => 42, "@bar" => :mumble}
@bar  # => :mumble
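
To get a feeling for how such a thing could work, here is a minimal sketch in plain Ruby. It is not MetaRuby's code: the class name InstanceVariableView and its methods are made up for illustration, and it only assumes the standard instance_variable_get, instance_variable_set and instance_variables methods.

```ruby
# Hypothetical sketch, not MetaRuby's actual implementation: a Hash-like
# view whose entries are the instance variables of a target object.
class InstanceVariableView
  def initialize(target)
    @target = target
  end

  # Read an instance variable by name, e.g. view["@foo"].
  def [](name)
    @target.instance_variable_get(name)
  end

  # Writing through the view sets the variable on the target object.
  def []=(name, value)
    @target.instance_variable_set(name, value)
  end

  # Names of all instance variables currently set, as strings.
  def keys
    @target.instance_variables.map(&:to_s)
  end

  # Snapshot of the current state as an ordinary Hash.
  def to_h
    keys.map { |name| [name, self[name]] }.to_h
  end
end

obj = Object.new
h = InstanceVariableView.new(obj)
h.to_h                              # => {}
obj.instance_variable_set(:@foo, 42)
h.to_h                              # => {"@foo" => 42}
h["@bar"] = :mumble
obj.instance_variable_get(:@bar)    # => :mumble
```

The view never stores anything itself; every read and write goes straight to the target object, which is why both sides always agree.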

He also showed the UndoableArray, which provides an undo method to revert changes to the Array.
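
The idea of an undoable array can be sketched in a few lines. This is only an assumption about how one might do it (snapshotting before each change), not how MetaRuby does it; the class UndoableList and its small set of methods are invented for this example.

```ruby
# Hypothetical sketch: an Array wrapper that snapshots its contents
# before each mutation so that #undo can restore the previous state.
class UndoableList
  def initialize(items = [])
    @items = items.dup
    @history = []
  end

  def push(value)
    remember
    @items.push(value)
    self
  end

  def []=(index, value)
    remember
    @items[index] = value
  end

  # Revert the most recent change; does nothing if there is no history.
  def undo
    @items = @history.pop unless @history.empty?
    self
  end

  def to_a
    @items.dup
  end

  private

  # Save a copy of the current contents on the history stack.
  def remember
    @history.push(@items.dup)
  end
end

list = UndoableList.new([1, 2, 3])
list.push(4)
list[0] = :x
list.to_a        # => [:x, 2, 3, 4]
list.undo.to_a   # => [1, 2, 3, 4]
list.undo.to_a   # => [1, 2, 3]
```

Copying the whole array on every change is wasteful for large arrays; recording only the inverse of each operation would be the more economical design.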

An example of SubArray followed:

a  # => [1,2,3,4,5]
b = a.part(1,2)
b  # => [2,3]
b[1] = 42
a  # => [1,2,42,4,5]
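
A rough sketch of the same behaviour in plain Ruby follows. It is not the MetaRuby implementation: instead of adding a part method to Array, it uses a made-up ArrayWindow class that forwards reads and writes to the parent array.

```ruby
# Hypothetical sketch: a window onto part of another array. Reads and
# writes are forwarded to the parent, shifted by the window's offset.
class ArrayWindow
  def initialize(parent, start, length)
    @parent = parent
    @start = start
    @length = length
  end

  def [](index)
    check(index)
    @parent[@start + index]
  end

  def []=(index, value)
    check(index)
    @parent[@start + index] = value
  end

  # Current contents of the window, read from the parent.
  def to_a
    @parent[@start, @length]
  end

  private

  # Reject indices that would reach outside the window.
  def check(index)
    return if (0...@length).cover?(index)
    raise IndexError, "index #{index} outside window"
  end
end

a = [1, 2, 3, 4, 5]
b = ArrayWindow.new(a, 1, 2)
b.to_a      # => [2, 3]
b[1] = 42
a           # => [1, 2, 42, 4, 5]
```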

MetaRuby was written in 2001, and Mathieu didn’t yet have a chance to update it to Ruby 1.8, so the code only runs with 1.6.8 for now.

The memorable quote from this presentation was his answer to whether he wanted to distribute his own version of Ruby: “At least not yet,” he told us. ;-)

Next it was Claudius Link’s turn, who presented rake (Ruby make).

rake is used for build automation, that is, the creation of executables for multiple configurations, the creation of documentation, and building and running tests.

He compared rake to other systems like make and ant and explained that these other systems are limited to a static description of build rules or a special structure, which rake is not. In fact, rake scripts are ordinary Ruby programs, so you have the full flexibility and power of Ruby at your fingertips. rake basically is a Ruby framework and provides utilities for building.

This was followed by a “Getting started” section that included a simple Rakefile, to which dependencies were added in the second step. He showed how you can easily define file tasks that do a timestamp check to see if the task needs to be run. A generalization of this is expression rules, or even complex rules whose values can be computed using lambda {}.
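
To give an idea of what this looks like, here is a small Rakefile sketch. It is not the one from the talk: the file names notes.txt, summary.txt and summary.count are made up, and the tasks only shuffle text around so that no compiler is needed.

```ruby
require "rake"  # only needed when this file is loaded outside the rake command

# File task: runs only if summary.txt is missing or older than notes.txt.
file "summary.txt" => "notes.txt" do |t|
  File.write(t.name, File.read(t.prerequisites.first).upcase)
end

# Rule: any *.count file is derived from the *.txt file of the same name.
# The lambda computes the source file name from the target name.
rule ".count" => ->(target) { target.sub(/\.count\z/, ".txt") } do |t|
  File.write(t.name, File.read(t.source).lines.size.to_s)
end

# Running plain `rake` builds both files, in dependency order.
task default: ["summary.txt", "summary.count"]
```

Running rake a second time without touching notes.txt should do nothing, because the timestamp check finds both targets up to date.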

Advanced rake users often define new tasks or inherit from existing tasks to do new things and simplify the Rakefiles. If this is done correctly, the Rakefile is written in a domain specific language and even non-programmers (or programmers that don’t know Ruby) can write and adapt Rakefiles easily.

As a final example, Claudius showed how rake is used at his company. They have a quite complex build system there that needs to interact with Rational Rose Real-Time to do Model-Driven Architecture. (“I don’t want to say what I think about this,” he told us.) The build code needs to support multiple platforms. Additionally, he needs to automate running unit tests and publishing their results.

Having used rake in such a complex situation, he discovered some issues with it, among them the lack of “lazy dependencies”, which are expanded only if the given task is actually to be run (I showed him how to do that quite easily, though), of multiple dependencies and of more structured error handling. Other problems included the lack of parallel and distributed builds, and last but not least the “shaky” documentation, as he put it.

The next—and last—talk was given by Mathieu Bouchard again, this time about RubyX11.

He started by comparing RubyX11 to other implementations and their sizes: Xlib, the original and most used C library to speak to X servers, is about 3000kb of sources, which Mathieu thinks is far too much for a “rather simple” protocol like X11. Part of the reason is that it has a lot of comments, and also many shortcuts to the protocol.

Then he showed CLX, a Common Lisp implementation of Xlib which has 800kb, but still is too low-level code. To paraphrase him: “I thought all Common Lisp programmers were good programmers.” He mentioned the GNU Smalltalk X11 implementation with 400kb of source code and the Perl module X11::Protocol, which is 160kb in size but uses numbered parameters ($_[1]) and makes heavy use of pack and unpack, which makes it confusing and unreadable. All these alternatives (and RubyX11 too) do not depend on Xlib.

Now, he got to RubyX11, which is only 85kb in size and consists of three parts: 20kb implement RPC and marshalling code, 50kb reimplement xlib.h, and 15kb are mostly keyname data.

He gave an example of RubyX11, which was a nice screensaver that rendered curves using an iterated function system (IFS).

Then he focused on the implementation of RubyX11. He simplified some calls to X11 because they would take 11 arguments in Xlib and it would be nearly impossible to remember their order. :-)

RubyX11 basically works by defining a domain specific language inside Ruby to describe binary protocols; X11 is then defined on top of that.

Mathieu also pointed out some performance issues that still exist. Up to RubyX11 0.5 he used a generic encoder, which basically was a big function that would format a string; the CVS version now uses a “compiler”. On the first run it generates code that is evaled into a method, which can then be called. RubyX11 therefore is a general example of generating code at runtime using a domain specific language. This brought a large performance improvement, but Mathieu could even imagine generating C code at installation time…
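
The difference between the two approaches can be illustrated with a toy version. Everything here is an assumption made for the example and not RubyX11's real API: the Message class, the field DSL, the ExampleRequest layout and the choice of little-endian byte order are all invented.

```ruby
# Hypothetical sketch, not RubyX11's real API: a tiny DSL that declares
# the fields of a binary message, plus two ways to encode it.
class Message
  # Array#pack directives for the field types used here (little-endian).
  FORMATS = { u8: "C", u16: "v", u32: "V" }.freeze

  def self.fields
    @fields ||= []
  end

  # DSL entry point: `field :x, :u16` records a name and a type.
  def self.field(name, type)
    fields << [name, type]
  end

  # Generic encoder: walks the field list again on every call.
  def self.encode_generic(values)
    fields.map { |name, type| [values.fetch(name)].pack(FORMATS.fetch(type)) }.join
  end

  # "Compiler": builds Ruby source once and evals it into a method, so
  # later calls are a single pack with a precomputed format string.
  def self.compile!
    args = fields.map { |name, _type| name }.join(", ")
    format = fields.map { |_name, type| FORMATS.fetch(type) }.join
    class_eval <<~RUBY
      def self.encode(#{args})
        [#{args}].pack(#{format.inspect})
      end
    RUBY
  end
end

# A made-up request layout, not a real X11 request.
class ExampleRequest < Message
  field :opcode, :u8
  field :x, :u16
  field :y, :u16
  compile!
end

generic  = ExampleRequest.encode_generic(opcode: 7, x: 10, y: 20)
compiled = ExampleRequest.encode(7, 10, 20)
generic.bytes == compiled.bytes   # => true
compiled.bytes                    # => [7, 10, 0, 20, 0]
```

The generated method does the field lookup work once, at definition time, which is the kind of saving the “compiler” is after.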

The Sunday ended with some discussion, and people split up into groups to talk. They would leave the conference venue shortly after, and everyone would be trying to come again next year…


10oct2004 · Euruko 2004 - Day Two