Clariti

Eliminating confusion

Clariti: August 2007

2007/08/18

 

Desktop search with Thunderbird and Beagle

Desktop search has quickly become an essential part of modern computing environments. For Windows operating systems there are a couple of alternatives. On Linux there's only one option: Beagle. Beagle is a reasonably mature desktop search engine, but it lacks support for Thunderbird. Fortunately this is improving. In this article I explore the current state of the art.

This was tested on Ubuntu Feisty (7.04).

Desktop search

My first experience with desktop search was Copernic Desktop search on Windows. I loved it from the first time I used it: from now on I could search through all my private information as easily as searching the web! Access to the right documents, e-mails, music... within seconds.

Ubuntu: great, but how do I find my e-mail?

In October last year I switched to Ubuntu Linux as my primary operating system. Setting up and using Ubuntu has been a great experience. Most things worked right away; quite a few required a bit of fiddling, but in the end my system was much more pleasant to use than my Windows system. Except for one thing: desktop search.

Beagle: Linux desktop search

Enter Beagle.
"Beagle is a search tool that ransacks your personal information space to find whatever you're looking for. More technically, Beagle is a Linux desktop-independent service which transparently and unobtrusively indexes your data in real-time."
Ergo, a desktop search solution for Linux, and from the description it's a pretty good one too. It lacked one essential feature for me however: it can't search Thunderbird messages. Integration with Thunderbird was removed a while ago.

Fortunately a fellow named Pierre Östlund took on the task of rewriting the integration between Beagle and Thunderbird. It takes the form of a Thunderbird extension that exports the mail data to a Beagle index, which the Thunderbird backend for Beagle can then parse quickly. This is still very much beta software; there's no release version yet.

Setting up Beagle to search Thunderbird e-mails

Setting it up takes four steps:
  1. Install Thunderbird 2.0, which is not in Feisty by default, but required by the extension.
  2. Install the Thunderbird extension.
  3. Build and install Beagle from the latest SVN.
  4. Let them index your stuff.

Installing Thunderbird 2.0 on Feisty

Back up your profile data first; it's in ~/.mozilla-thunderbird. Then follow the instructions on help.ubuntu.com to install TB 2.0. Use the preferred method with the third-party repository.
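For the backup step, here is a minimal sketch of how the profile could be archived before upgrading. The helper name backup_profile and the archive file name are my own inventions for illustration; only the profile path comes from the text above.

```shell
# backup_profile [PROFILE_DIR] [DEST_DIR]  (hypothetical helper)
# Packs the profile directory into a timestamped tar.gz and prints the
# archive path. Defaults assume the profile location mentioned above.
backup_profile() {
    src="${1:-$HOME/.mozilla-thunderbird}"
    dest_dir="${2:-$HOME}"
    if [ ! -d "$src" ]; then
        echo "no profile directory at $src" >&2
        return 1
    fi
    archive="$dest_dir/thunderbird-profile-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C stores paths relative to the profile's parent directory
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}
```

Calling backup_profile without arguments writes the archive to your home directory. Close Thunderbird first so the files are not changing while tar reads them.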

Download and install the extension to index your mail.

Building Beagle from source

I followed the installation guide. First, uninstall beagle if it's currently installed.
$ sudo aptitude remove beagle
Create a directory and download the source into it.
$ svn checkout http://svn.gnome.org/svn/beagle/trunk/beagle
To be able to build the source you'll need a lot of libraries. This list of packages for Edgy is helpful. Install them all using aptitude or apt-get.

One thing is left to do that's not in the manual: install automake 1.9.
$ sudo aptitude install automake1.9
Open autogen.sh in a text editor and change the REQUIRED_AUTOMAKE_VERSION to 1.9. This will prevent a critical problem with the sequence of compilation.
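If you prefer not to edit the file by hand, the change could be scripted roughly like this. It is a sketch that assumes autogen.sh sets the variable on a line of its own starting with REQUIRED_AUTOMAKE_VERSION=; check your copy first. The helper name set_automake_version is made up for this example.

```shell
# set_automake_version FILE VERSION  (hypothetical helper)
# Rewrites the REQUIRED_AUTOMAKE_VERSION assignment in FILE and keeps the
# untouched original next to it as FILE.orig.
set_automake_version() {
    file="$1"
    version="$2"
    cp "$file" "$file.orig" || return 1
    # Writing back into the existing file keeps its executable bit.
    sed "s/^REQUIRED_AUTOMAKE_VERSION=.*/REQUIRED_AUTOMAKE_VERSION=$version/" \
        "$file.orig" > "$file"
}
```

From the source directory you would run set_automake_version autogen.sh 1.9 and then compare autogen.sh with autogen.sh.orig to confirm that only that one line changed.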

Now run autogen.sh:
$ ./autogen.sh
This creates all configuration and make files required to compile the source. It checks all the prerequisites; if you miss any of the required packages it will report this. Install them and run autogen again until it reports all is fine and gives you an overview of which Beagle options will be included. Thunderbird support is included by default.

Then let it roll:
$ make
This will produce a lot of output including some warnings, but no critical errors. This took a couple of minutes on my system. You can check whether the compile was successful by looking for the Beagle binaries beagled, beagle-status, etc. in the beagle folder.
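Checking for the binaries can be done with a small helper along these lines. It is only a sketch: I don't rely on where exactly the build puts each file, so it searches the whole tree, and the name missing_binaries is invented for this example.

```shell
# missing_binaries DIR NAME...  (hypothetical helper)
# Prints every NAME that does not exist as a regular file somewhere below DIR.
# No output means everything was found.
missing_binaries() {
    dir="$1"
    shift
    for name in "$@"; do
        found=$(find "$dir" -type f -name "$name" -print 2>/dev/null | head -n 1)
        if [ -z "$found" ]; then
            echo "$name"
        fi
    done
}
```

Run from the top of the checkout, missing_binaries . beagled beagle-status should print nothing after a successful build.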

When I compile stuff from source I like to use checkinstall to install the package. This creates a neat .deb and registers it with the package manager like any other package, allowing for easy uninstallation. This failed, however. I'm not an expert in checkinstall; tips on how to get it to work will be appreciated.

For now we'll do it the not-so-neat way:
$ make install
The binaries are installed and we're ready to roll.

Indexing your stuff

Now that Beagle is installed, the Thunderbird extension can be put to work. There should be a menu option Tools > Beagle indexing settings. Enable indexing and set the speed to Very fast so that it works through your archive quickly. It went through my 41,000 e-mails in about 15 minutes.

Now start the beagle daemon. There's an option documented in the FAQ which makes it index at maximum speed. Enable this to index your complete archive on the first run. Note that this will keep your PC very busy, possibly for hours if you also configured Beagle to index many folders with documents.
$ beagle-shutdown
$ export BEAGLE_EXERCISE_THE_DOG=1
$ beagled
You can check how many Thunderbird messages it has left with the following command:
$ ls ~/.beagle/Indexes/ThunderbirdIndex/ToIndex/ | wc -l
This took under an hour on my machine. Indexing is finished; now let's search!
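If you don't want to rerun the count by hand while the daemon works, a small polling loop can do it. This is a sketch under the assumption that an empty ToIndex directory means Beagle has caught up; the helper names pending_count and wait_for_index are made up for this example.

```shell
# pending_count DIR  (hypothetical helper)
# Number of entries directly inside DIR; 0 if DIR is empty or missing.
pending_count() {
    ls -A "$1" 2>/dev/null | wc -l | tr -d ' '
}

# wait_for_index DIR [INTERVAL]  (hypothetical helper)
# Prints the queue size every INTERVAL seconds (default 60) until it is 0.
wait_for_index() {
    dir="$1"
    interval="${2:-60}"
    while :; do
        left=$(pending_count "$dir")
        echo "$(date +%H:%M:%S) $left left"
        [ "$left" -eq 0 ] && break
        sleep "$interval"
    done
}
```

For example, wait_for_index ~/.beagle/Indexes/ThunderbirdIndex/ToIndex 30 prints a line every 30 seconds and returns once the directory is empty.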

Start searching

The Beagle search tool is started with the beagle-search command or from the Applications menu: Accessories > Search.

There's another nice option though: GNOME contains a nifty tool called the Deskbar, which can be used to start Beagle but also to start applications, start writing an e-mail by typing the name of a person in your address book, etc. It's a bit like Quicksilver and AppRocket. Not nearly as neat, but pretty functional.

My experiences

The good:
  • Searching is blazingly fast.
  • It finds some good results.
  • Double clicking on a search result opens the right message in Thunderbird, but only if Thunderbird is not currently running.
The bugs:
  • Double clicking on a search result opens an empty message in Thunderbird if Thunderbird is already running.
  • It doesn't find all the results it should, and often it finds nothing. For example the query "triathlon holten" in the screenshot above gives 27 matches in Beagle search (no hidden matches). The same query in Thunderbird (subject or body match) gives 41 matches. I get similar results for other search terms; one example term gives 3 matches in Beagle and almost a hundred in Thunderbird.
The bad:
  • The Beagle search interface isn't very good.
    • It doesn't show the contents of the e-mails it finds. For other results, for example documents, it shows only a single line. It would be ideal to see the entire e-mail and document right away, but it should show at least a couple of lines in plain text.
    • It shows results from different sources in a fixed sequence, which isn't customizable. E-mail results always end up at the bottom; I'd always want them at the top because they're usually all I need.

Conclusion

The basics of the new Thunderbird extension and Beagle backend work well. Although it's unfinished software and setting it up requires installing development packages and compiling sources, this is relatively easy to do successfully. There are some bugs left to squash; I'm confident this will happen in the near future and we'll see a stable version enter major distributions. This is a great step forward for Linux desktop search.

The overall experience of using Beagle would benefit greatly from better usability, functionality and overall sexiness of the Beagle search tool. The technical core is well designed, but a great user experience is required to make end users benefit from it.
