How to Parse Freetext String to Query Java



org.apache.lucene.queryParser
Class QueryParser

java.lang.Object
  extended by org.apache.lucene.queryParser.QueryParser
All Implemented Interfaces:
QueryParserConstants
Direct Known Subclasses:
MultiFieldQueryParser

public class QueryParser
extends Object
implements QueryParserConstants

This class is generated by JavaCC. The most important method is parse(String). The syntax for query strings is as follows: A Query is a series of clauses. A clause may be prefixed by:

  • a plus (+) or a minus (-) sign, indicating that the clause is required or prohibited respectively; or
  • a term followed by a colon, indicating the field to be searched. This enables one to construct queries which search multiple fields.
A clause may be either:
  • a term, indicating all the documents that contain this term; or
  • a nested query, enclosed in parentheses. Note that this may be used with a +/- prefix to require any of a set of terms.
Thus, in BNF, the query grammar is:
        Query  ::= ( Clause )*
        Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" )
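
To make the grammar concrete, here is a minimal sketch of a recursive-descent reader for the two BNF rules above. It is not Lucene's JavaCC-generated parser: the class name ClauseGrammarSketch, its tokenizer, and its string output format are assumptions made for illustration, and it ignores quoting, escaping, boosts, ranges and wildcards.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical recursive-descent reader for the BNF above; not Lucene's generated parser. */
final class ClauseGrammarSketch {
    private static final String PUNCT = "+-:()";
    private final List<String> tokens;
    private int pos = 0;

    private ClauseGrammarSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Splits input into the symbols + - : ( ) and runs of other non-space characters (terms). */
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (PUNCT.indexOf(c) >= 0) {
                out.add(String.valueOf(c));
                i++;
            } else {
                int start = i;
                while (i < input.length() && !Character.isWhitespace(input.charAt(i))
                        && PUNCT.indexOf(input.charAt(i)) < 0) {
                    i++;
                }
                out.add(input.substring(start, i));
            }
        }
        return out;
    }

    /** Parses the whole input and returns a bracketed description of the clause tree. */
    static String parse(String input) {
        ClauseGrammarSketch p = new ClauseGrammarSketch(tokenize(input));
        String result = p.query();
        if (p.pos < p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.tokens.get(p.pos));
        }
        return result;
    }

    // Query ::= ( Clause )*
    private String query() {
        List<String> clauses = new ArrayList<>();
        while (pos < tokens.size() && !peekIs(")")) {
            clauses.add(clause());
        }
        return "Query" + clauses;
    }

    // Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" )
    private String clause() {
        String prefix = "";
        if (peekIs("+") || peekIs("-")) {
            prefix = tokens.get(pos++);
        }
        String field = "";
        if (isTerm(pos) && pos + 1 < tokens.size() && tokens.get(pos + 1).equals(":")) {
            field = tokens.get(pos) + ":";
            pos += 2;
        }
        String body;
        if (peekIs("(")) {
            pos++;
            body = query();
            if (!peekIs(")")) {
                throw new IllegalArgumentException("Expected ) at token " + pos);
            }
            pos++;
        } else if (isTerm(pos)) {
            body = tokens.get(pos++);
        } else {
            throw new IllegalArgumentException("Expected a term or ( at token " + pos);
        }
        return prefix + field + body;
    }

    private boolean peekIs(String t) {
        return pos < tokens.size() && tokens.get(pos).equals(t);
    }

    private boolean isTerm(int i) {
        if (i >= tokens.size()) {
            return false;
        }
        String t = tokens.get(i);
        return !(t.length() == 1 && PUNCT.indexOf(t.charAt(0)) >= 0);
    }

    public static void main(String[] args) {
        // prints Query[+title:lucene, -Query[draft, old]]
        System.out.println(parse("+title:lucene -(draft old)"));
    }
}
```

The nested Query in the output corresponds to the parenthesized group, which carries the - prefix as a whole.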

Examples of appropriately formatted queries can be found in the query syntax documentation.

In TermRangeQuerys, QueryParser tries to detect date values, e.g. date:[6/1/2005 TO 6/4/2005] produces a range query that searches for "date" fields between 2005-06-01 and 2005-06-04. Note that the format of the accepted input depends on the locale. By default a date is converted into a search term using the deprecated DateField for compatibility reasons. To use the new DateTools to convert dates, a DateTools.Resolution has to be set.
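
The locale dependence can be illustrated with the JDK alone. The sketch below assumes a short-style java.text.DateFormat in lenient mode, which is one plausible way to read such input; QueryParser's own conversion may differ in detail. The class and method names are hypothetical.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

/** Hypothetical helper showing that the same date text is read differently per locale. */
final class LocaleDateSketch {
    /** Parses text with the locale's SHORT date style; rejects input that is not a date. */
    static Calendar parseShortDate(String text, Locale locale) {
        DateFormat format = DateFormat.getDateInstance(DateFormat.SHORT, locale);
        format.setLenient(true);
        try {
            Calendar calendar = new GregorianCalendar();
            calendar.setTime(format.parse(text));
            return calendar;
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date in locale " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        // With Locale.US the text is read month-first, i.e. as 1 June 2005.
        Calendar us = parseShortDate("6/1/2005", Locale.US);
        System.out.printf("US reading: %tF%n", us);
    }
}
```

A day-first locale would be expected to read the same text as 6 January, which is why the parser's locale matters for date ranges.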

The date resolution that shall be used for RangeQueries can be set using setDateResolution(DateTools.Resolution) or setDateResolution(String, DateTools.Resolution). The former sets the default date resolution for all fields, whereas the latter can be used to set field specific date resolutions. Field specific date resolutions take, if set, precedence over the default date resolution.

If you use neither DateField nor DateTools in your index, you can create your own query parser that extends QueryParser and overrides getRangeQuery(String, String, String, boolean) to use a different method for date conversion.

Note that QueryParser is not thread-safe.
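
One common way to work with a parser that is not thread-safe is to give each thread its own instance. The sketch below shows that pattern with a ThreadLocal; StatefulParser is a hypothetical stand-in with mutable state, not a Lucene class, and creating a fresh parser per request is an equally valid choice.

```java
/** Hypothetical illustration of the one-parser-per-thread pattern. */
final class PerThreadParserSketch {
    /** Stand-in for any parser that keeps mutable state between calls. */
    static final class StatefulParser {
        private String input;
        private int pos;

        /** Counts whitespace-separated terms, mutating instance fields while scanning. */
        int countTerms(String text) {
            input = text;
            pos = 0;
            int count = 0;
            while (pos < input.length()) {
                while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
                    pos++;
                }
                if (pos < input.length()) {
                    count++;
                    while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
                        pos++;
                    }
                }
            }
            return count;
        }
    }

    // Each thread lazily receives its own parser, so the mutable fields are never shared.
    private static final ThreadLocal<StatefulParser> PARSER =
            ThreadLocal.withInitial(StatefulParser::new);

    static int countTerms(String text) {
        return PARSER.get().countTerms(text);
    }

    public static void main(String[] args) {
        System.out.println(countTerms("capital of Hungary"));
    }
}
```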

NOTE: there is a new QueryParser in contrib, which matches the same syntax as this class but is more modular, enabling substantial customization of how a query is created.

NOTE: You must specify the required Version compatibility when creating QueryParser:

  • As of 2.9, setEnablePositionIncrements(boolean) is true by default.
  • As of 3.1, setAutoGeneratePhraseQueries(boolean) is false by default.

Nested Class Summary
static class QueryParser.Operator
The default operator for parsing queries.
Field Summary
static QueryParser.Operator AND_OPERATOR
Alternative form of QueryParser.Operator.AND
 Token jj_nt
Next token.
static QueryParser.Operator OR_OPERATOR
Alternative form of QueryParser.Operator.OR
 Token token
Current token.
 QueryParserTokenManager token_source
Generated Token Manager.
Fields inherited from interface org.apache.lucene.queryParser.QueryParserConstants
_ESCAPED_CHAR, _NUM_CHAR, _QUOTED_CHAR, _TERM_CHAR, _TERM_START_CHAR, _WHITESPACE, AND, Boost, CARAT, COLON, DEFAULT, EOF, FUZZY_SLOP, LPAREN, MINUS, NOT, NUMBER, OR, PLUS, PREFIXTERM, QUOTED, RangeEx, RANGEEX_END, RANGEEX_GOOP, RANGEEX_QUOTED, RANGEEX_START, RANGEEX_TO, RangeIn, RANGEIN_END, RANGEIN_GOOP, RANGEIN_QUOTED, RANGEIN_START, RANGEIN_TO, RPAREN, STAR, TERM, tokenImage, WILDTERM
Constructor Summary
protected QueryParser(CharStream stream)
Constructor with user supplied CharStream.
protected QueryParser(QueryParserTokenManager tm)
Constructor with generated Token Manager.
QueryParser(Version matchVersion, String f, Analyzer a)
Constructs a query parser.
Method Summary
protected  void addClause(List<BooleanClause> clauses, int conj, int mods, Query q)
 Query Clause(String field)
 int Conjunction()
 void disable_tracing()
Disable tracing.
 void enable_tracing()
Enable tracing.
static String escape(String s)
Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding \.
 ParseException generateParseException()
Generate ParseException.
 boolean getAllowLeadingWildcard()
 Analyzer getAnalyzer()
 boolean getAutoGeneratePhraseQueries()
protected  Query getBooleanQuery(List<BooleanClause> clauses)
Factory method for generating query, given a set of clauses.
protected  Query getBooleanQuery(List<BooleanClause> clauses, boolean disableCoord)
Factory method for generating query, given a set of clauses.
 DateTools.Resolution getDateResolution(String fieldName)
Returns the date resolution that is used by RangeQueries for the given field.
 QueryParser.Operator getDefaultOperator()
Gets implicit operator setting, which will be either AND_OPERATOR or OR_OPERATOR.
 boolean getEnablePositionIncrements()
 String getField()
protected  Query getFieldQuery(String field, String queryText)
Deprecated. Use getFieldQuery(String,String,boolean) instead.
protected  Query getFieldQuery(String field, String queryText, boolean quoted)
protected  Query getFieldQuery(String field, String queryText, int slop)
Base implementation delegates to getFieldQuery(String,String,boolean).
 float getFuzzyMinSim()
Get the minimal similarity for fuzzy queries.
 int getFuzzyPrefixLength()
Get the prefix length for fuzzy queries.
protected  Query getFuzzyQuery(String field, String termStr, float minSimilarity)
Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)).
 Locale getLocale()
Returns current locale, allowing access by subclasses.
 boolean getLowercaseExpandedTerms()
 MultiTermQuery.RewriteMethod getMultiTermRewriteMethod()
 Token getNextToken()
Get the next Token.
 int getPhraseSlop()
Gets the default slop for phrases.
protected  Query getPrefixQuery(String field, String termStr)
Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)).
 Collator getRangeCollator()
protected  Query getRangeQuery(String field, String part1, String part2, boolean inclusive)
 Token getToken(int index)
Get the specific Token.
protected  Query getWildcardQuery(String field, String termStr)
Factory method for generating a query.
static void main(String[] args)
Command line tool to test QueryParser, using SimpleAnalyzer.
 int Modifiers()
protected  BooleanClause newBooleanClause(Query q, BooleanClause.Occur occur)
Builds a new BooleanClause instance
protected  BooleanQuery newBooleanQuery(boolean disableCoord)
Builds a new BooleanQuery instance
protected  Query newFuzzyQuery(Term term, float minimumSimilarity, int prefixLength)
Builds a new FuzzyQuery instance
protected  Query newMatchAllDocsQuery()
Builds a new MatchAllDocsQuery instance
protected  MultiPhraseQuery newMultiPhraseQuery()
Builds a new MultiPhraseQuery instance
protected  PhraseQuery newPhraseQuery()
Builds a new PhraseQuery instance
protected  Query newPrefixQuery(Term prefix)
Builds a new PrefixQuery instance
protected  Query newRangeQuery(String field, String part1, String part2, boolean inclusive)
Builds a new TermRangeQuery instance
protected  Query newTermQuery(Term term)
Builds a new TermQuery instance
protected  Query newWildcardQuery(Term t)
Builds a new WildcardQuery instance
 Query parse(String query)
Parses a query string, returning a Query.
 Query Query(String field)
 void ReInit(CharStream stream)
Reinitialise.
 void ReInit(QueryParserTokenManager tm)
Reinitialise.
 void setAllowLeadingWildcard(boolean allowLeadingWildcard)
Set to true to allow leading wildcard characters.
 void setAutoGeneratePhraseQueries(boolean value)
Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace delimited text.
 void setDateResolution(DateTools.Resolution dateResolution)
Sets the default date resolution used by RangeQueries for fields for which no specific date resolution has been set.
 void setDateResolution(String fieldName, DateTools.Resolution dateResolution)
Sets the date resolution used by RangeQueries for a specific field.
 void setDefaultOperator(QueryParser.Operator op)
Sets the boolean operator of the QueryParser.
 void setEnablePositionIncrements(boolean enable)
Set to true to enable position increments in result query.
 void setFuzzyMinSim(float fuzzyMinSim)
Set the minimum similarity for fuzzy queries.
 void setFuzzyPrefixLength(int fuzzyPrefixLength)
Set the prefix length for fuzzy queries.
 void setLocale(Locale locale)
Set locale used by date range parsing.
 void setLowercaseExpandedTerms(boolean lowercaseExpandedTerms)
Whether terms of wildcard, prefix, fuzzy and range queries are to be automatically lower-cased or not.
 void setMultiTermRewriteMethod(MultiTermQuery.RewriteMethod method)
By default QueryParser uses MultiTermQuery.CONSTANT_SCORE_AUTO_REWRITE_DEFAULT when creating a PrefixQuery, WildcardQuery or RangeQuery.
 void setPhraseSlop(int phraseSlop)
Sets the default slop for phrases.
 void setRangeCollator(Collator rc)
Sets the collator used to determine index term inclusion in ranges for RangeQuerys.
 Query Term(String field)
 Query TopLevelQuery(String field)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail

AND_OPERATOR

public static final QueryParser.Operator AND_OPERATOR
Alternative form of QueryParser.Operator.AND

OR_OPERATOR

public static final QueryParser.Operator OR_OPERATOR
Alternative form of QueryParser.Operator.OR

token_source

public QueryParserTokenManager token_source
Generated Token Manager.

token

public Token token
Current token.

jj_nt

public Token jj_nt
Next token.
Constructor Detail

QueryParser

public QueryParser(Version matchVersion, String f, Analyzer a)
Constructs a query parser.
Parameters:
matchVersion - Lucene version to match. See above.
f - the default field for query terms.
a - used to find terms in the query text.

QueryParser

protected QueryParser(CharStream stream)
Constructor with user supplied CharStream.

QueryParser

protected QueryParser(QueryParserTokenManager tm)
Constructor with generated Token Manager.
Method Detail

parse

public Query parse(String query) throws ParseException
Parses a query string, returning a Query.
Parameters:
query - the query string to be parsed.
Throws:
ParseException - if the parsing fails

getAnalyzer

public Analyzer getAnalyzer()
Returns:
Returns the analyzer.

getField

public String getField()
Returns:
Returns the field.

getAutoGeneratePhraseQueries

public final boolean getAutoGeneratePhraseQueries()
See Also:
setAutoGeneratePhraseQueries(boolean)

setAutoGeneratePhraseQueries

public final void setAutoGeneratePhraseQueries(boolean value)
Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace delimited text. NOTE: this behavior may not be suitable for all languages.

Set to false if phrase queries should only be generated when surrounded by double quotes.


getFuzzyMinSim

public float getFuzzyMinSim()
Get the minimal similarity for fuzzy queries.

setFuzzyMinSim

public void setFuzzyMinSim(float fuzzyMinSim)
Set the minimum similarity for fuzzy queries. Default is 0.5f.

getFuzzyPrefixLength

public int getFuzzyPrefixLength()
Get the prefix length for fuzzy queries.
Returns:
Returns the fuzzyPrefixLength.

setFuzzyPrefixLength

public void setFuzzyPrefixLength(int fuzzyPrefixLength)
Set the prefix length for fuzzy queries. Default is 0.
Parameters:
fuzzyPrefixLength - The fuzzyPrefixLength to set.

setPhraseSlop

public void setPhraseSlop(int phraseSlop)
Sets the default slop for phrases. If zero, then exact phrase matches are required. Default value is zero.

getPhraseSlop

public int getPhraseSlop()
Gets the default slop for phrases.

setAllowLeadingWildcard

public void setAllowLeadingWildcard(boolean allowLeadingWildcard)
Set to true to allow leading wildcard characters.

When set, * or ? are allowed as the first character of a PrefixQuery and WildcardQuery. Note that this can produce very slow queries on big indexes.

Default: false.


getAllowLeadingWildcard

public boolean getAllowLeadingWildcard()
See Also:
setAllowLeadingWildcard(boolean)

setEnablePositionIncrements

public void setEnablePositionIncrements(boolean enable)
Set to true to enable position increments in result query.

When set, result phrase and multi-phrase queries will be aware of position increments. Useful when e.g. a StopFilter increases the position increment of the token that follows an omitted token.

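Default: true when the parser is created with a matchVersion of 2.9 or later (see the Version note in the class description), false otherwise.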


getEnablePositionIncrements

public boolean getEnablePositionIncrements()
See Also:
setEnablePositionIncrements(boolean)

setDefaultOperator

public void setDefaultOperator(QueryParser.Operator op)
Sets the boolean operator of the QueryParser. In default mode (OR_OPERATOR), terms without any modifiers are considered optional: for example, capital of Hungary is equivalent to capital OR of OR Hungary.
In AND_OPERATOR mode, terms are considered to be in conjunction: the above query is parsed as capital AND of AND Hungary.
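
The effect of the default operator on unmodified terms can be sketched without Lucene. The class below is a hypothetical illustration that only rewrites whitespace-separated bare terms into their explicit form; it does not handle +/- prefixes, fields, phrases or nesting, and it is not how QueryParser builds its BooleanQuery internally.

```java
/** Hypothetical illustration of how the default operator joins unmodified terms. */
final class DefaultOperatorSketch {
    /** Local stand-in for the two operator choices described above. */
    enum Operator { OR, AND }

    /** Rewrites "a b c" as "a OP b OP c" using the given default operator. */
    static String makeExplicit(String query, Operator defaultOperator) {
        String[] terms = query.trim().split("\\s+");
        return String.join(" " + defaultOperator.name() + " ", terms);
    }

    public static void main(String[] args) {
        // prints capital OR of OR Hungary
        System.out.println(makeExplicit("capital of Hungary", Operator.OR));
        // prints capital AND of AND Hungary
        System.out.println(makeExplicit("capital of Hungary", Operator.AND));
    }
}
```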

getDefaultOperator

public QueryParser.Operator getDefaultOperator()
Gets implicit operator setting, which will be either AND_OPERATOR or OR_OPERATOR.

setLowercaseExpandedTerms

public void setLowercaseExpandedTerms(boolean lowercaseExpandedTerms)
Whether terms of wildcard, prefix, fuzzy and range queries are to be automatically lower-cased or not. Default is true.

getLowercaseExpandedTerms

public boolean getLowercaseExpandedTerms()
See Also:
setLowercaseExpandedTerms(boolean)

setMultiTermRewriteMethod

public void setMultiTermRewriteMethod(MultiTermQuery.RewriteMethod method)
By default QueryParser uses MultiTermQuery.CONSTANT_SCORE_AUTO_REWRITE_DEFAULT when creating a PrefixQuery, WildcardQuery or RangeQuery. This implementation is generally preferable because it (a) runs faster, (b) does not let the scarcity of terms unduly influence the score, and (c) avoids any "TooManyBooleanClauses" exception. However, if your application really needs the old-fashioned BooleanQuery expansion rewriting and the above points are not relevant, use this method to change the rewrite method.

getMultiTermRewriteMethod

public MultiTermQuery.RewriteMethod getMultiTermRewriteMethod()
See Also:
setMultiTermRewriteMethod(org.apache.lucene.search.MultiTermQuery.RewriteMethod)

setLocale

public void setLocale(Locale locale)
Set locale used by date range parsing.

getLocale

public Locale getLocale()
Returns current locale, allowing access by subclasses.

setDateResolution

public void setDateResolution(DateTools.Resolution dateResolution)
Sets the default date resolution used by RangeQueries for fields for which no specific date resolution has been set. Field-specific resolutions can be set with setDateResolution(String, DateTools.Resolution).
Parameters:
dateResolution - the default date resolution to set

setDateResolution

public void setDateResolution(String fieldName, DateTools.Resolution dateResolution)
Sets the date resolution used by RangeQueries for a specific field.
Parameters:
fieldName - field for which the date resolution is to be set
dateResolution - date resolution to set

getDateResolution

public DateTools.Resolution getDateResolution(String fieldName)
Returns the date resolution that is used by RangeQueries for the given field. Returns null, if no default or field specific date resolution has been set for the given field.
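
The lookup rule described for the three date resolution methods (a field-specific setting wins, otherwise the default applies, otherwise null) can be sketched as follows. The class and its Resolution enum are hypothetical stand-ins for illustration, not the QueryParser or DateTools.Resolution implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the default versus field-specific date resolution lookup. */
final class DateResolutionSketch {
    /** Local stand-in for DateTools.Resolution. */
    enum Resolution { YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND }

    private Resolution defaultResolution; // null until a default is set
    private final Map<String, Resolution> perField = new HashMap<>();

    /** Sets the default used for fields without their own setting. */
    void setDateResolution(Resolution resolution) {
        defaultResolution = resolution;
    }

    /** Sets the resolution for one field only. */
    void setDateResolution(String fieldName, Resolution resolution) {
        perField.put(fieldName, resolution);
    }

    /** Field-specific value if present, else the default, else null. */
    Resolution getDateResolution(String fieldName) {
        Resolution specific = perField.get(fieldName);
        return specific != null ? specific : defaultResolution;
    }

    public static void main(String[] args) {
        DateResolutionSketch settings = new DateResolutionSketch();
        settings.setDateResolution(Resolution.DAY);
        settings.setDateResolution("modified", Resolution.MINUTE);
        System.out.println(settings.getDateResolution("modified")); // prints MINUTE
        System.out.println(settings.getDateResolution("created"));  // prints DAY
    }
}
```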

setRangeCollator

public void setRangeCollator(Collator rc)
Sets the collator used to determine index term inclusion in ranges for RangeQuerys.

WARNING: Setting the rangeCollator to a non-null collator using this method will cause every single index Term in the Field referenced by lowerTerm and/or upperTerm to be examined. Depending on the number of index Terms in this Field, the operation could be very slow.

Parameters:
rc - the collator to use when constructing RangeQuerys

getRangeCollator

public Collator getRangeCollator()
Returns:
the collator used to determine index term inclusion in ranges for RangeQuerys.

addClause

protected void addClause(List<BooleanClause> clauses, int conj, int mods, Query q)

getFieldQuery

@Deprecated protected Query getFieldQuery(String field, String queryText) throws ParseException
Deprecated. Use getFieldQuery(String,String,boolean) instead.
Throws:
ParseException

getFieldQuery

protected Query getFieldQuery(String field, String queryText, boolean quoted) throws ParseException
Throws:
ParseException - throw in overridden method to disallow

getFieldQuery

protected Query getFieldQuery(String field, String queryText, int slop) throws ParseException
Base implementation delegates to getFieldQuery(String,String,boolean). This method may be overridden, for example, to return a SpanNearQuery instead of a PhraseQuery.
Throws:
ParseException - throw in overridden method to disallow

getRangeQuery

protected Query getRangeQuery(String field, String part1, String part2, boolean inclusive) throws ParseException
Throws:
ParseException - throw in overridden method to disallow

newBooleanQuery

protected BooleanQuery newBooleanQuery(boolean disableCoord)
Builds a new BooleanQuery instance
Parameters:
disableCoord - disable coord
Returns:
new BooleanQuery instance

newBooleanClause

protected BooleanClause newBooleanClause(Query q, BooleanClause.Occur occur)
Builds a new BooleanClause instance
Parameters:
q - sub query
occur - how this clause should occur when matching documents
Returns:
new BooleanClause instance

newTermQuery

protected Query newTermQuery(Term term)
Builds a new TermQuery instance
Parameters:
term - term
Returns:
new TermQuery instance

newPhraseQuery

protected PhraseQuery newPhraseQuery()
Builds a new PhraseQuery instance
Returns:
new PhraseQuery instance

newMultiPhraseQuery

protected MultiPhraseQuery newMultiPhraseQuery()
Builds a new MultiPhraseQuery instance
Returns:
new MultiPhraseQuery instance

newPrefixQuery

protected Query newPrefixQuery(Term prefix)
Builds a new PrefixQuery instance
Parameters:
prefix - Prefix term
Returns:
new PrefixQuery instance

newFuzzyQuery

protected Query newFuzzyQuery(Term term, float minimumSimilarity, int prefixLength)
Builds a new FuzzyQuery instance
Parameters:
term - Term
minimumSimilarity - minimum similarity
prefixLength - prefix length
Returns:
new FuzzyQuery Instance

newRangeQuery

protected Query newRangeQuery(String field, String part1, String part2, boolean inclusive)
Builds a new TermRangeQuery instance
Parameters:
field - Field
part1 - min
part2 - max
inclusive - true if range is inclusive
Returns:
new TermRangeQuery instance

newMatchAllDocsQuery

protected Query newMatchAllDocsQuery()
Builds a new MatchAllDocsQuery instance
Returns:
new MatchAllDocsQuery instance

newWildcardQuery

protected Query newWildcardQuery(Term t)
Builds a new WildcardQuery instance
Parameters:
t - wildcard term
Returns:
new WildcardQuery instance

getBooleanQuery

protected Query getBooleanQuery(List<BooleanClause> clauses) throws ParseException
Factory method for generating query, given a set of clauses. By default creates a boolean query composed of clauses passed in. Can be overridden by extending classes, to modify query being returned.
Parameters:
clauses - List that contains BooleanClause instances to join.
Returns:
Resulting Query object.
Throws:
ParseException - throw in overridden method to disallow

getBooleanQuery

protected Query getBooleanQuery(List<BooleanClause> clauses, boolean disableCoord) throws ParseException
Factory method for generating query, given a set of clauses. By default creates a boolean query composed of clauses passed in. Can be overridden by extending classes, to modify query being returned.
Parameters:
clauses - List that contains BooleanClause instances to join.
disableCoord - true if coord scoring should be disabled.
Returns:
Resulting Query object.
Throws:
ParseException - throw in overridden method to disallow

getWildcardQuery

protected Query getWildcardQuery(String field, String termStr) throws ParseException
Factory method for generating a query. Called when the parser parses an input term token that contains one or more wildcard characters (? and *), but is not a prefix term token (one that has just a single * character at the end).

Depending on settings, the wildcard term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with wildcard templates.

Can be overridden by extending classes, to provide custom handling for wildcard queries, which may be necessary due to missing analyzer calls.

Parameters:
field - Name of the field query will use.
termStr - Term token that contains one or more wildcard characters (? or *), but is not a simple prefix term
Returns:
Resulting Query built for the term
Throws:
ParseException - throw in overridden method to disallow

getPrefixQuery

protected Query getPrefixQuery(String field, String termStr) throws ParseException
Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)). Called when parser parses an input term token that uses prefix notation; that is, contains a single '*' wildcard character as its last character. Since this is a special case of generic wildcard term, and such a query can be optimized easily, this usually results in a different query object.

Depending on settings, a prefix term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with wildcard templates.

Can be overridden by extending classes, to provide custom handling for wildcard queries, which may be necessary due to missing analyzer calls.

Parameters:
field - Name of the field query will use.
termStr - Term token to use for building term for the query (without trailing '*' character!)
Returns:
Resulting Query built for the term
Throws:
ParseException - throw in overridden method to disallow

getFuzzyQuery

protected Query getFuzzyQuery(String field, String termStr, float minSimilarity) throws ParseException
Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)). Called when parser parses an input term token that has the fuzzy suffix (~) appended.
Parameters:
field - Name of the field query will use.
termStr - Term token to use for building term for the query
Returns:
Resulting Query built for the term
Throws:
ParseException - throw in overridden method to disallow
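
The descriptions of getWildcardQuery, getPrefixQuery and getFuzzyQuery imply a dispatch on the shape of the term token. A rough sketch of that classification, under the assumption that a token is checked for wildcard characters before the fuzzy suffix, might look like this. The class, enum and method names are hypothetical, and escaped characters and fuzzy similarity values such as ~0.7 are not handled.

```java
/** Hypothetical classifier for term tokens, following the factory method descriptions above. */
final class TermKindSketch {
    enum Kind { PREFIX, WILDCARD, FUZZY, PLAIN }

    /** Decides which kind of query a raw term token would lead to. */
    static Kind classify(String token) {
        boolean hasWildcard = token.indexOf('*') >= 0 || token.indexOf('?') >= 0;
        // Prefix notation: exactly one '*', placed last, and no '?' anywhere.
        boolean isPrefix = token.length() > 1
                && token.indexOf('*') == token.length() - 1
                && token.indexOf('?') < 0;
        if (isPrefix) {
            return Kind.PREFIX;
        }
        if (hasWildcard) {
            return Kind.WILDCARD;
        }
        if (token.length() > 1 && token.endsWith("~")) {
            return Kind.FUZZY;
        }
        return Kind.PLAIN;
    }

    /** Returns the text a prefix factory method would receive: the token without its trailing '*'. */
    static String prefixText(String token) {
        if (classify(token) != Kind.PREFIX) {
            throw new IllegalArgumentException("Not a prefix token: " + token);
        }
        return token.substring(0, token.length() - 1);
    }

    public static void main(String[] args) {
        for (String token : new String[] {"luc*", "l?c*ne", "lucene~", "lucene"}) {
            System.out.println(token + " -> " + classify(token));
        }
    }
}
```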

escape

public static String escape(String s)
Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding \.
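
As an illustration of what such escaping involves, the sketch below prefixes each special character with a backslash. The set of characters in SPECIAL is an assumption based on the operators of the query syntax; consult the query syntax documentation for the authoritative list, and prefer QueryParser.escape(String) itself in real code.

```java
/** Hypothetical re-implementation of backslash escaping for query text. */
final class EscapeSketch {
    // Assumed set of characters that carry meaning in the query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    /** Returns s with a backslash inserted before every character listed in SPECIAL. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints \(1\+1\)\:2
        System.out.println(escape("(1+1):2"));
    }
}
```

Escaping user input this way lets text such as (1+1):2 be searched literally instead of being read as a grouped, fielded clause.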

main

public static void main(String[] args) throws Exception
Command line tool to test QueryParser, using SimpleAnalyzer. Usage:
java org.apache.lucene.queryParser.QueryParser <input>
Throws:
Exception

Conjunction

public final int Conjunction() throws ParseException
Throws:
ParseException

Modifiers

public final int Modifiers() throws ParseException
Throws:
ParseException

TopLevelQuery

public final Query TopLevelQuery(String field) throws ParseException
Throws:
ParseException

Query

public final Query Query(String field) throws ParseException
Throws:
ParseException

Clause

public final Query Clause(String field) throws ParseException
Throws:
ParseException

Term

public final Query Term(String field) throws ParseException
Throws:
ParseException

ReInit

public void ReInit(CharStream stream)
Reinitialise.

ReInit

public void ReInit(QueryParserTokenManager tm)
Reinitialise.

getNextToken

public final Token getNextToken()
Get the next Token.

getToken

public final Token getToken(int index)
Get the specific Token.

generateParseException

public ParseException generateParseException()
Generate ParseException.

enable_tracing

public final void enable_tracing()
Enable tracing.

disable_tracing

public final void disable_tracing()
Disable tracing.


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.


Source: https://lucene.apache.org/core/3_1_0/api/core/org/apache/lucene/queryParser/QueryParser.html
