java.lang.Object
  org.kit.furia.index.AbstractIRIndex<O>
O
- The basic unit in which all the information is divided. In the
case of natural language documents, this would be a word.
public abstract class AbstractIRIndex<O extends org.ajmm.obsearch.OB>
AbstractIRIndex holds the basic functionality for an information retrieval system that works on OB objects (see www.obsearch.net). Using a distance function d, each query is rewritten in terms of the closest elements in the database. Once this transformation is done, an information retrieval system (Apache Lucene) performs the matching.
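
The query transformation described above can be pictured with a minimal sketch. It is not the class's actual implementation: `QueryNormalizationSketch`, `distance`, and `normalize` are hypothetical names, integers stand in for OB objects, the absolute difference stands in for d, and a linear scan stands in for whatever nearest-neighbor search OBSearch performs. The result has the same shape as the `normalizedQuery` parameter used by this class, a map from word id to an occurrence count.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names belong to the documented API. */
class QueryNormalizationSketch {

    /** Stand-in for the distance function d (assumption: absolute difference). */
    static int distance(int a, int b) {
        return Math.abs(a - b);
    }

    /**
     * Maps every query object to the id (list index) of its closest database
     * element and counts how often each id is hit, giving an id -> count map.
     */
    static Map<Integer, Integer> normalize(List<Integer> queryObjects, List<Integer> database) {
        Map<Integer, Integer> normalized = new HashMap<>();
        for (int q : queryObjects) {
            int bestId = -1;
            int bestDistance = Integer.MAX_VALUE;
            // Linear scan; the first element with the smallest distance wins.
            for (int id = 0; id < database.size(); id++) {
                int d = distance(q, database.get(id));
                if (d < bestDistance) {
                    bestDistance = d;
                    bestId = id;
                }
            }
            if (bestId >= 0) { // skipped only when the database is empty
                normalized.merge(bestId, 1, Integer::sum);
            }
        }
        return normalized;
    }

    public static void main(String[] args) {
        List<Integer> database = Arrays.asList(10, 20, 30);
        List<Integer> query = Arrays.asList(11, 12, 29);
        System.out.println(normalize(query, database));
    }
}
```

In this sketch the two query objects 11 and 12 both collapse onto the database element 10, so the normalized query counts that element twice. That count plays the role of a term frequency when the query is handed to the text index.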
Nested Class Summary | |
---|---|
protected static class |
AbstractIRIndex.FieldName
Lucene has the concept of fields in a document. |
protected class |
AbstractIRIndex.Word
Represents an OB object. |
Field Summary | |
---|---|
protected org.apache.lucene.index.IndexReader |
indexReader
This object is used to read different data from the index. |
protected org.apache.lucene.index.IndexWriter |
indexWriter
This object is used to add elements to the index. |
protected float |
mSetScoreThreshold
At least the given naive mset score must be obtained to consider a term in the result. |
protected org.apache.lucene.search.Searcher |
searcher
This object is used to search the index. |
protected float |
setScoreThreshold
At least the given naive set score must be obtained to consider a term in the result. |
protected boolean |
validationMode
Tells whether or not the index is in validation mode. |
Constructor Summary | |
---|---|
AbstractIRIndex(java.io.File dbFolder)
Creates a new IR index if none is available in the given path. |
Method Summary | |
---|---|
protected ResultCandidate |
calculateSimilarity(org.apache.lucene.document.Document document,
java.util.Map<java.lang.Integer,java.lang.Integer> normalizedQuery,
float score)
Calculates the ResultCandidate between a normalized query and a Lucene document. |
void |
close()
Closes the databases. |
protected java.util.PriorityQueue<AbstractIRIndex.Word> |
createPriorityQueue(java.util.Map<java.lang.Integer,java.lang.Integer> words)
Create a PriorityQueue from a word->tf map. |
int |
delete(java.lang.String documentName)
Deletes the given string document from the database. |
void |
freeze()
Freezes the index. |
float |
getMSetScoreThreshold()
The M-set score threshold is the minimum naive score for multi-sets that the index will accept. |
float |
getSetScoreThreshold()
The Set score threshold is the minimum naive score for Sets that the index will accept. |
int |
getSize()
Returns the number of documents in this DB. |
void |
insert(Document<O> document)
Inserts a new document into the database. |
boolean |
isValidationMode()
Tells whether or not the index is in validation mode. |
protected java.util.List<ResultCandidate> |
processQueryResults(java.util.Map<java.lang.Integer,java.lang.Integer> normalizedQuery,
short n,
Document query)
|
void |
setMSetScoreThreshold(float setScoreThreshold)
The M-set score threshold is the minimum naive score for multi-sets that the index will accept. |
void |
setSetScoreThreshold(float setScoreThreshold)
The Set score threshold is the minimum naive score for Sets that the index will accept. |
void |
setValidationMode(boolean validationMode)
Sets whether or not the index is in validation mode. |
boolean |
shouldSkipDoc(Document<O> x)
Returns true if the document corresponding to x's name exists in the DB. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.kit.furia.IRIndex |
---|
getIndex, getWordsSize |
Field Detail |
---|
protected org.apache.lucene.index.IndexWriter indexWriter
protected org.apache.lucene.index.IndexReader indexReader
protected org.apache.lucene.search.Searcher searcher
protected float mSetScoreThreshold
protected float setScoreThreshold
protected boolean validationMode
Constructor Detail |
---|
public AbstractIRIndex(java.io.File dbFolder) throws java.io.IOException
dbFolder
- The folder in which Lucene's files will be stored
java.io.IOException
- If the given directory does not exist or if some other IO
error occurs
Method Detail |
---|
public int delete(java.lang.String documentName) throws IRException
IRIndex
delete
in interface IRIndex<O extends org.ajmm.obsearch.OB>
IRException
- If something goes wrong with the IR engine or with
OBSearch.
public boolean shouldSkipDoc(Document<O> x) throws java.io.IOException
shouldSkipDoc
in interface IRIndex<O extends org.ajmm.obsearch.OB>
x
-
java.io.IOException
protected ResultCandidate calculateSimilarity(org.apache.lucene.document.Document document, java.util.Map<java.lang.Integer,java.lang.Integer> normalizedQuery, float score)
public int getSize()
getSize
in interface IRIndex<O extends org.ajmm.obsearch.OB>
protected java.util.List<ResultCandidate> processQueryResults(java.util.Map<java.lang.Integer,java.lang.Integer> normalizedQuery, short n, Document query) throws IRException
IRException
public void insert(Document<O> document) throws IRException
IRIndex
insert
in interface IRIndex<O extends org.ajmm.obsearch.OB>
document
- The document to be inserted.
IRException
- If something goes wrong with the IR engine or with
OBSearch.
public void freeze() throws IRException
IRIndex
freeze
in interface IRIndex<O extends org.ajmm.obsearch.OB>
IRException
- If something goes wrong with the IR engine or with
OBSearch.
public void close() throws IRException
IRIndex
close
in interface IRIndex<O extends org.ajmm.obsearch.OB>
IRException
- If something goes wrong with the IR engine or with
OBSearch.
protected java.util.PriorityQueue<AbstractIRIndex.Word> createPriorityQueue(java.util.Map<java.lang.Integer,java.lang.Integer> words) throws java.io.IOException
words
- A map from word to term frequency (tf). Per the method signature,
both the keys and the values are Integer objects.
java.io.IOException
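
As an illustration of turning a word->tf map into a queue, here is a minimal sketch. It makes several assumptions that this page does not confirm: `WordQueueSketch` and `WordEntry` are hypothetical names (the latter standing in for AbstractIRIndex.Word), the queue is ordered so that the highest tf comes out first, and ties are broken by the lower word id. The real method may order words differently and may consult the index, which would explain its IOException.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical sketch; not the documented implementation. */
class WordQueueSketch {

    /** Stand-in for AbstractIRIndex.Word: a word id and its term frequency. */
    static final class WordEntry {
        final int id;
        final int tf;

        WordEntry(int id, int tf) {
            this.id = id;
            this.tf = tf;
        }
    }

    /** Builds a queue whose head is the entry with the highest tf (assumed ordering). */
    static PriorityQueue<WordEntry> createPriorityQueue(Map<Integer, Integer> words) {
        Comparator<WordEntry> byTfDescending =
            Comparator.comparingInt((WordEntry w) -> w.tf).reversed()
                      .thenComparingInt(w -> w.id); // tie-break: lower id first
        // PriorityQueue requires an initial capacity of at least 1.
        PriorityQueue<WordEntry> queue =
            new PriorityQueue<>(Math.max(1, words.size()), byTfDescending);
        for (Map.Entry<Integer, Integer> e : words.entrySet()) {
            queue.add(new WordEntry(e.getKey(), e.getValue()));
        }
        return queue;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> words = new HashMap<>();
        words.put(7, 1);
        words.put(3, 5);
        words.put(9, 2);
        PriorityQueue<WordEntry> queue = createPriorityQueue(words);
        while (!queue.isEmpty()) {
            WordEntry w = queue.poll();
            System.out.println("word " + w.id + " tf " + w.tf);
        }
    }
}
```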
public float getMSetScoreThreshold()
IRIndex
getMSetScoreThreshold
in interface IRIndex<O extends org.ajmm.obsearch.OB>
public void setMSetScoreThreshold(float setScoreThreshold)
IRIndex
setMSetScoreThreshold
in interface IRIndex<O extends org.ajmm.obsearch.OB>
setScoreThreshold
- the new threshold
public float getSetScoreThreshold()
IRIndex
getSetScoreThreshold
in interface IRIndex<O extends org.ajmm.obsearch.OB>
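
This page does not define the naive set and multi-set scores, so the following is only a sketch of how two such thresholds could act as a filter. Everything in it is an assumption: the class and method names are hypothetical, the set score is taken to be the fraction of distinct query words found in a document, the multi-set score is taken to be the fraction of query word occurrences matched by the document, and a candidate is kept only when both scores reach their thresholds.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; the score definitions below are assumptions. */
class ThresholdSketch {

    /** Assumed set score: distinct query words present in the document / distinct query words. */
    static float naiveSetScore(Map<Integer, Integer> query, Map<Integer, Integer> doc) {
        if (query.isEmpty()) {
            return 0f;
        }
        int shared = 0;
        for (Integer word : query.keySet()) {
            if (doc.containsKey(word)) {
                shared++;
            }
        }
        return (float) shared / query.size();
    }

    /** Assumed multi-set score: matched occurrences / total query occurrences. */
    static float naiveMSetScore(Map<Integer, Integer> query, Map<Integer, Integer> doc) {
        int total = 0;
        int matched = 0;
        for (Map.Entry<Integer, Integer> e : query.entrySet()) {
            total += e.getValue();
            matched += Math.min(e.getValue(), doc.getOrDefault(e.getKey(), 0));
        }
        return total == 0 ? 0f : (float) matched / total;
    }

    /** Keeps a candidate only if both assumed scores reach their thresholds. */
    static boolean accept(Map<Integer, Integer> query, Map<Integer, Integer> doc,
                          float setScoreThreshold, float mSetScoreThreshold) {
        return naiveSetScore(query, doc) >= setScoreThreshold
            && naiveMSetScore(query, doc) >= mSetScoreThreshold;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> query = new HashMap<>();
        query.put(1, 2);
        query.put(2, 1);
        Map<Integer, Integer> doc = new HashMap<>();
        doc.put(1, 1);
        doc.put(3, 4);
        System.out.println(naiveSetScore(query, doc));
        System.out.println(naiveMSetScore(query, doc));
        System.out.println(accept(query, doc, 0.5f, 0.3f));
    }
}
```

Under these assumed definitions, raising either threshold can only shrink the set of accepted candidates, which matches the "at least the given score" wording of the field descriptions.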
public void setSetScoreThreshold(float setScoreThreshold)
IRIndex
setSetScoreThreshold
in interface IRIndex<O extends org.ajmm.obsearch.OB>
setScoreThreshold
- the new threshold
public boolean isValidationMode()
IRIndex
isValidationMode
in interface IRIndex<O extends org.ajmm.obsearch.OB>
public void setValidationMode(boolean validationMode)
IRIndex
setValidationMode
in interface IRIndex<O extends org.ajmm.obsearch.OB>
validationMode
- The new validation mode.