Low-cost parallel Text retrieval using PC-cluster



We present a parallel text retrieval prototype based on the vector space model, implemented on a low-cost PC cluster running the Linux operating system and using the PVM message-passing library. The prototype also incorporates an inverted file structure for fast retrieval. In several experiments on the standard TREC-9 collection, the prototype indexed up to 500,000 web pages per hour on a single ordinary x86 machine. Using 2 machines, we also obtained a query response time of 5.4 seconds when searching the one and a half million TREC-9 web pages. © Springer-Verlag Berlin Heidelberg 2001.
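
To illustrate how an inverted file supports vector space retrieval, the following is a minimal single-process sketch in Python. It is not the prototype's code: the function names build_index and search, the whitespace tokenizer, and the plain TF-IDF weighting without document length normalization are all assumptions made for illustration. The abstract does not say how the index is distributed across PVM nodes, so no parallelism is shown.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted file: term -> list of (doc_id, term_frequency).

    `docs` is a dict mapping a document id to its text (hypothetical input format).
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        for term, tf in Counter(text.lower().split()).items():
            index[term].append((doc_id, tf))
    return index


def search(index, n_docs, query, top_k=3):
    """Rank documents by the sum of tf * idf over the query terms.

    Only the postings lists of the query terms are read, which is the
    reason an inverted file makes retrieval fast.
    """
    scores = defaultdict(float)
    for term in set(query.lower().split()):
        postings = index.get(term, [])
        if not postings:
            continue  # term does not occur in the collection
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings:
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


docs = {
    1: "parallel text retrieval",
    2: "pc cluster linux",
    3: "text retrieval on a pc cluster",
}
index = build_index(docs)
results = search(index, len(docs), "linux cluster")
```

Here the query touches only the postings lists for "linux" and "cluster", so document 1 is never examined. A production system would typically add length normalization and store postings on disk.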

