INDRI - Language modeling meets inference networks
Indri is a new search engine from the Lemur project, a
cooperative effort between the University of Massachusetts
and Carnegie Mellon University
to build language modeling information retrieval tools.
Effective
- Best-in-class ad hoc retrieval performance
Flexible
- Supports popular structured query operators from INQUERY
- Open source, with a flexible BSD-inspired license
- Parses PDF, HTML, XML, and TREC documents
- Word and PowerPoint parsing (Windows only)
Usable
- Supports UTF-8 encoded text
- Language-independent tokenization of UTF-8 encoded documents
- Includes both command line tools and a Java user interface
- API can be used from Java, PHP, or C++
- Works on Windows, Linux, Solaris and Mac OS X
Powerful
- Can be used on a cluster of machines for faster indexing and retrieval
- Suffix-based wildcard term matching
- Field retrieval
- Passage retrieval
- Scales to terabyte-sized collections
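
To make the wildcard, field, and passage retrieval features above more concrete, here is a minimal sketch of what such queries can look like. It only builds query strings and does not call Indri. The helper names (wildcard, field_query, passage_query) are hypothetical, and the operator syntax is an assumption based on the Indri query language, so check it against the query language reference for your installed version.

```python
# Hypothetical helpers that only assemble Indri-style query strings.
# The operator syntax is assumed from the Indri query language and
# should be verified against the official reference before use.

def wildcard(prefix: str) -> str:
    """Suffix-based wildcard: match any term that starts with `prefix`."""
    return f"{prefix}*"


def field_query(field: str, *terms: str) -> str:
    """Field retrieval: rank extents of the named field for the given terms."""
    return f"#combine[{field}]({' '.join(terms)})"


def passage_query(width: int, step: int, *terms: str) -> str:
    """Passage retrieval: rank fixed-width passages of `width` terms,
    with a new passage starting every `step` terms."""
    return f"#combine[passage{width}:{step}]({' '.join(terms)})"


if __name__ == "__main__":
    print(wildcard("retriev"))
    print(field_query("title", "language", "model"))
    print(passage_query(100, 50, "inference", "network"))
```

The resulting strings could then be passed to one of the command line tools or to the API from Java, PHP, or C++.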