The Terrier Project is a document search engine developed by the IR Group at the University of Glasgow. The search engine can do a lot, especially with the TREC datasets, but since I don’t have access to those huge sets of data yet, I’ll be giving small updates on even smaller aspects of the search engine as I discover them.
It is written in Java and is open source (obviously). It’s quite simple to install and try out. If you are using Ubuntu, you will have OpenJDK installed as the default JDK. I personally prefer the Sun EULA-licensed JDK. If you do too, then you can install that from Synaptic; it’s easy and headache-free.
Download the code from the Terrier page. It comes as a .tar.gz archive, so extract it first, then use ant to compile.
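For anyone who prefers the terminal, the extract-and-compile step might look roughly like the sketch below. The helper name unpack and the file name terrier.tar.gz are my own placeholders (the real archive name depends on the version you download), and I am assuming the archive unpacks into a single top-level folder and that running ant with no arguments in that folder triggers the default build.

```shell
# Unpack a .tar.gz archive into the current directory and print the name of
# the top-level folder it created (assumes the archive has a single root folder).
# "unpack" is a hypothetical helper, not something shipped with Terrier.
unpack() {
    tar -xzf "$1" || return 1
    # List the archive and keep the first path component of the first entry.
    tar -tzf "$1" | head -n 1 | cut -d/ -f1
}

# Typical use; "terrier.tar.gz" is a placeholder, not the real file name:
#   cd "$(unpack terrier.tar.gz)" && ant
```

If ant complains that it cannot find a compiler, running java -version is a quick way to check which JDK is actually being picked up.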
As I wrote earlier, I’ll be joining the University of Glasgow in September of this year for an MSc in Information Retrieval Systems. I’m not sure what I’ll be getting into after the masters, though I am leaning heavily towards a PhD. Research and teaching are things I like doing, but finding funding is going to be challenging in this economic climate.
But putting thoughts of what will happen in more than a year’s time aside, I think I should just focus on learning as much as I can while I can and take it from there. From that point of view, I’ll be writing a number of posts about what I learn about the Terrier project over the days ahead. It could be interesting. (It could also be boring, but I’m too sunny a person for that 🙂 )