Everything Scheme
This blog will be about all things Scheme.
In the beginning, the main theme will be musings about building a search engine with Scheme. Having a concrete project to blog about will hopefully improve both the quality and the frequency of posts. I am easily amused, so expect diversions along the way.
But - why on earth write a search engine?
Recently I hacked together the PLT Source Browser, which lets you browse the source code of all packages submitted to the PLaneT Package Repository, as well as the source of the various built-in collections (libraries) of PLT Scheme. The source browser is a great tool for studying other people's code, that is, if you know where to look. Looking for usages of a particular function is difficult at the moment. Adding a Google SiteSearch helped a little, but only a little. There are three problems: the Google index is never completely up to date, Google interprets "-" as a delimiter, which leads to false hits, and there is no way to help Google rank the various hits.
Whether the need for a custom search engine is perceived or not, the inner workings of a search engine are a fun playground for both compression algorithms and data structures. What more can one wish for in a project?
3 Comments:
I'm glad to see this new Scheme blog.
About search engines... Large companies with search engines distribute their work to thousands of machines. I can see how Scheme could help with a search on one machine, but I'm trying to picture how the search could be distributed successfully using Scheme.
I am glad you like it.
How can Scheme be used to implement distributed search? Hmm. Just like any other language, I guess. I believe you don't need more than basic networking capabilities.
Thanks.