2003-10-28: Wet evening in York

We now have many gigabytes of data on our computers. Sometimes we'd like to find some of that data without trawling through all the files at the time of the search. Luckily we know which bits of data we are likely to be looking for (e.g. the contents of My Documents).

The solution offered by Microsoft is Index Server. You can configure catalogs to act as indexes for your data. You can control how often this indexing takes place, how large the summary is and other things. It can get quite sophisticated, using XML parsers and its own query language to pull back data based on attributes.

Sadly, it just doesn't work. After taking the time to set up your Active Server Page to make the query to the relevant catalog, you'll notice that the results aren't what you expected. It gives false negatives - I've not yet seen a false positive.

As a simple example, you would expect

#filename *tree*.jpg
to return at least everything returned by
#filename *tree.jpg
#filename tree*.jpg
It doesn't. The first query returns files with tree in the name only if the name doesn't begin or end with tree. The two more restrictive queries behave as expected.
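For comparison, here is a rough sketch of what ordinary wildcard matching gives for the same three patterns. It uses Python's fnmatch module rather than Index Server, the file names are made up for illustration, and lower-casing both sides is my assumption that file name matching should ignore case.

```python
from fnmatch import fnmatchcase


def matching(pattern, names):
    """Return the set of names matching a shell-style wildcard pattern, ignoring case."""
    return {n for n in names if fnmatchcase(n.lower(), pattern.lower())}


# Hypothetical file names, invented for this example.
names = [
    "tree.jpg",
    "treehouse.jpg",
    "oak tree.jpg",
    "big tree house.jpg",
    "pond.jpg",
]

either_side = matching("*tree*.jpg", names)  # tree anywhere in the name
ends_with = matching("*tree.jpg", names)     # name ends in tree
starts_with = matching("tree*.jpg", names)   # name begins with tree

# With plain wildcard semantics, the broad pattern covers both narrow ones.
assert (ends_with | starts_with) <= either_side

print(sorted(either_side - (ends_with | starts_with)))
```

Here the broad pattern picks up every file the two narrower patterns do, plus any with tree in the middle of the name. That is the behaviour I was expecting from the catalog.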

Another example is my missing pigeons. I have a file called

Pigeon Perch.jpg
but the query
#filename Pigeon*.jpg
returns no results. I have yet to determine why this is.

And then of course there are the cryptic error messages, returned to users with no explanation of what might be wrong. If you submit

#filename = |*tree.jpg
by mistake then you get back
CreateRecordset error '80040e14' 

One or more errors occurred during processing of command. 
which doesn't really help anyone.