Saturday, October 21, 2006

DivaScheme - Structural Editing for DrScheme

A small team consisting of Romain Legendre, Guillaume Marceau, Danny Yoo, Kathi Fisler and Shriram Krishnamurthi has created a set of alternative keybindings for DrScheme.

DrScheme has always had keybindings that let you work with whole s-expressions. As in Emacs, the normal keybindings require a lot of finger gymnastics due to the use of ctrl/alt/shift. Inspired by Vi, the DivaScheme bindings are unchorded. Instead of modifier keys, the editor now has two modes: a command/navigation mode and an insert mode. In command mode x is simply bound to cut, whereas x in insert mode inserts an x.

DivaScheme has several features that die-hard Emacs fans will appreciate, for example keyword completion a la etags from Emacs. The move-by-searching idiom is also supported via the s command.

It is difficult to explain how the keybindings work in practice, but fortunately Danny Yoo made this little video to introduce DivaScheme:

Read more on the DivaScheme homepage and in the DivaScheme documentation.

Scheme Search gets pattern matching

This week I had time to work on the search engine again.

From a user's point of view the most important change is the addition of pattern matching. Until now it was possible to find all documents in which a particular identifier occurs. If, on the other hand, you were unsure which identifier to search for, you were out of luck. Now you can search for occurrences of identifiers matching a regular expression.

Say you vaguely remember someone using a define-something-struct. A pattern match search for define alone gives 6993 hits. But a search for define-.*-struct gives only 22 hits (the first of which contains define-serializable-struct).

The implementation of the pattern matching search is kept very simple. There are only 150,000 search terms, so the naïve approach of simply matching all terms against the pattern one at a time is fast enough. At least for the moment... After the matching terms have been found, they are looked up in the index and finally the list of documents is ranked.
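To make the naïve approach concrete, here is a minimal Python sketch of the idea, not the search engine's actual code. The toy lexicon, the index layout, the file names and the ranking rule (count how many matching terms a document contains) are all assumptions made for illustration, as is the choice to require the pattern to match the whole term.

```python
import re
from collections import Counter

# Hypothetical toy data; the real lexicon and index are not shown in this post.
LEXICON = ["define", "define-serializable-struct", "define-struct",
           "define-values", "lambda"]
INDEX = {
    "define": ["a.scm", "b.scm", "c.scm"],
    "define-serializable-struct": ["c.scm"],
    "define-struct": ["a.scm", "b.scm"],
    "define-values": ["b.scm"],
    "lambda": ["a.scm"],
}

def matching_terms(pattern, lexicon):
    """Naive scan: test every term against the pattern, one at a time."""
    regex = re.compile(pattern)
    # Assumption: the pattern has to match the whole term.
    return [term for term in lexicon if regex.fullmatch(term)]

def search(pattern, lexicon=LEXICON, index=INDEX):
    """Look up each matching term in the index and rank the documents."""
    scores = Counter()
    for term in matching_terms(pattern, lexicon):
        for doc in index.get(term, []):
            scores[doc] += 1  # assumed ranking: number of matching terms
    # Highest score first; ties are broken by document name.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

print(matching_terms("define-.*-struct", LEXICON))
print(search("define.*"))
```

With 150,000 terms the scan is a single pass over the lexicon per query, which is why it stays fast enough without any cleverness.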

Under the hood the representation of the lexicon changed from an in-memory hash-table to a disk-based representation. This has two advantages: the web-server uses less memory and the lexicon uses less space on disk (although the old lexicon resided in memory, it was read in from disk when the web-server started). The disadvantage is that searches now require disk access. To keep disk access to a minimum, the lexicon is read in blocks of 100 terms, and a few recently used blocks are cached. If there is a need to look up several terms, it is now best to look them up in alphabetical order, since neighbouring terms are likely to sit in the same cached block.
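The block-and-cache scheme can be sketched roughly as follows. This is an illustration in Python, not the real implementation: the class name, the in-memory list standing in for the disk file, the cache size of 4 and the least-recently-used eviction rule are all assumptions. Only the block size of 100 terms comes from the description above.

```python
import bisect
from collections import OrderedDict

class BlockedLexicon:
    """Sorted lexicon stored in fixed-size blocks with a small LRU block cache."""

    def __init__(self, sorted_terms, block_size=100, cache_blocks=4):
        self._block_size = block_size
        # Stand-in for the file on disk: a list of blocks of sorted terms.
        self._disk = [sorted_terms[i:i + block_size]
                      for i in range(0, len(sorted_terms), block_size)]
        # The first term of each block stays in memory to locate the block.
        self._first_terms = [block[0] for block in self._disk]
        self._cache = OrderedDict()
        self._cache_blocks = cache_blocks  # "a few" blocks; 4 is a guess
        self.disk_reads = 0                # counts simulated disk accesses

    def _load_block(self, block_no):
        if block_no in self._cache:
            self._cache.move_to_end(block_no)  # mark as recently used
            return self._cache[block_no]
        self.disk_reads += 1
        block = self._disk[block_no]
        self._cache[block_no] = block
        if len(self._cache) > self._cache_blocks:
            self._cache.popitem(last=False)    # evict least recently used
        return block

    def lookup(self, term):
        """Return the term's position in the lexicon, or None if absent."""
        block_no = bisect.bisect_right(self._first_terms, term) - 1
        if block_no < 0:
            return None
        block = self._load_block(block_no)
        i = bisect.bisect_left(block, term)
        if i < len(block) and block[i] == term:
            return block_no * self._block_size + i
        return None

terms = [f"term{i:04d}" for i in range(1000)]  # already in sorted order
queries = ["term0001", "term0950", "term0002", "term0951"]
for order in (queries, sorted(queries)):
    lexicon = BlockedLexicon(terms, cache_blocks=1)
    for q in order:
        lexicon.lookup(q)
    print(order, "->", lexicon.disk_reads, "disk reads")
```

The usage lines at the end show why alphabetical order helps: with a cache of a single block, the unsorted queries bounce between two blocks, while the sorted queries finish with one block before touching the next.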

The plan is to use the disk space saved for more indexes. Perhaps an index over the documentation a la the HelpDesk? Unfortunately I haven't got enough disk space to implement "preview" of search results.

In other news, Google has released Google Code Search. It indexes source from SourceForge, Google Code and other public repositories.

To search for Scheme source add
to the beginning of your query.

Try the above and search for "srfi" to see how many Scheme projects they have found.