That's one thing I've been meaning to look into. I thought there was a quick tool (Perl or C-based) that converted a PDF to plain text, but I can't find it right now. I currently use the
catdoc tools to convert Word/Excel files to text, which is then imported into MySQL for searching via
mnoGoSearch. (mnoGoSearch actually does all the hard work. I just tell it to shell out to catdoc.)
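
For anyone who wants to see the "shell out to a converter" idea outside of mnoGoSearch, here's a rough Python sketch. It is not how mnoGoSearch is configured, just an illustration. The function names (converter_command, to_text) are made up for this example. It assumes catdoc and xls2csv (both from the catdoc package) are on the PATH, and the PDF branch assumes pdftotext from xpdf/poppler is installed, which may or may not be the tool I was thinking of.

```python
import subprocess
from pathlib import Path


def converter_command(path):
    """Pick an external converter by file extension (hypothetical helper).

    Assumes catdoc, xls2csv and pdftotext are installed and on the PATH.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".doc":
        return ["catdoc", str(path)]          # Word -> text on stdout
    if suffix == ".xls":
        return ["xls2csv", str(path)]         # Excel -> CSV on stdout
    if suffix == ".pdf":
        return ["pdftotext", str(path), "-"]  # "-" sends the text to stdout
    raise ValueError(f"no converter known for {suffix!r}")


def to_text(path, run=subprocess.run):
    """Shell out to the converter and return whatever it prints."""
    result = run(converter_command(path),
                 capture_output=True, text=True, check=True)
    return result.stdout
```

Calling to_text("report.doc") would then hand back the plain text, ready to be inserted into whatever table the search engine indexes. The run parameter is only there so the function can be tried without the converters installed.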