How to Parse Freetext String to Query Java
org.apache.lucene.queryParser
Class QueryParser
java.lang.Object org.apache.lucene.queryParser.QueryParser
- All Implemented Interfaces:
  - QueryParserConstants
- Direct Known Subclasses:
  - MultiFieldQueryParser

public class QueryParser extends Object implements QueryParserConstants
This class is generated by JavaCC. The most important method is parse(String).
The syntax for query strings is as follows: A Query is a series of clauses. A clause may be prefixed by:
- a plus (+) or a minus (-) sign, indicating that the clause is required or prohibited respectively; or
- a term followed by a colon, indicating the field to be searched. This enables one to construct queries which search multiple fields.
A clause may be either:
- a term, indicating all the documents that contain this term; or
- a nested query, enclosed in parentheses. Note that this may be used with a +/- prefix to require any of a set of terms.
Thus, in BNF, the query grammar is:
Query ::= ( Clause )*
Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" )
Examples of appropriately formatted queries can be found in the query syntax documentation.
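To make the two grammar rules concrete, here is a minimal recursive-descent sketch that reads a query string according to them and prints its clause structure. It is an illustration only, not the JavaCC-generated parser: the class name ClauseGrammarSketch, the tokenizer (terms are plain runs of letters and digits) and the bracketed output format are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reader for the two-rule grammar above; NOT Lucene's JavaCC parser. */
class ClauseGrammarSketch {
    private final List<String> tokens;
    private int pos = 0;

    private ClauseGrammarSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Splits input into "+", "-", ":", "(", ")" and TERM tokens (assumed: letters and digits only). */
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if ("+-:()".indexOf(c) >= 0) {
                out.add(String.valueOf(c));
                i++;
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                out.add(input.substring(start, i));
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return out;
    }

    /** Parses the input and returns a bracketed rendering of its clauses. */
    static String describe(String input) {
        ClauseGrammarSketch p = new ClauseGrammarSketch(tokenize(input));
        String result = p.query();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.peek());
        }
        return result;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    private boolean isTerm(String t) {
        return t != null && Character.isLetterOrDigit(t.charAt(0));
    }

    // Query ::= ( Clause )*
    private String query() {
        List<String> clauses = new ArrayList<>();
        while (peek() != null && !peek().equals(")")) {
            clauses.add(clause());
        }
        return "[" + String.join(", ", clauses) + "]";
    }

    // Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" )
    private String clause() {
        String prefix = "";
        if ("+".equals(peek()) || "-".equals(peek())) {
            prefix = tokens.get(pos++);
        }
        String field = "";
        if (isTerm(peek()) && pos + 1 < tokens.size() && tokens.get(pos + 1).equals(":")) {
            field = tokens.get(pos) + ":";
            pos += 2;
        }
        String body;
        if ("(".equals(peek())) {
            pos++;
            body = query();
            if (!")".equals(peek())) {
                throw new IllegalArgumentException("Missing closing parenthesis");
            }
            pos++;
        } else if (isTerm(peek())) {
            body = tokens.get(pos++);
        } else {
            throw new IllegalArgumentException("Expected a term or ( but found " + peek());
        }
        return prefix + field + body;
    }

    public static void main(String[] args) {
        System.out.println(describe("+title:lucene -(draft old) body:search"));
    }
}
```

The nested query after the minus sign is read by the same query() rule as the top level, which is what lets a +/- prefix apply to a whole set of terms.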
In TermRangeQuery instances, QueryParser tries to detect date values, e.g. date:[6/1/2005 TO 6/4/2005] produces a range query that searches for "date" fields between 2005-06-01 and 2005-06-04. Note that the format of the accepted input depends on the locale. By default a date is converted into a search term using the deprecated DateField for compatibility reasons. To use the new DateTools to convert dates, a DateTools.Resolution has to be set.
The date resolution that shall be used for RangeQueries can be set using setDateResolution(DateTools.Resolution) or setDateResolution(String, DateTools.Resolution). The former sets the default date resolution for all fields, whereas the latter can be used to set field specific date resolutions. Field specific date resolutions take, if set, precedence over the default date resolution.
If you use neither DateField nor DateTools in your index, you can create your own query parser that inherits QueryParser and overrides getRangeQuery(String, String, String, boolean) to use a different method for date conversion.
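The locale dependence of date input can be seen with the JDK alone. The sketch below assumes that date detection behaves like a lenient java.text.DateFormat in SHORT style for the configured locale; this page only states that the accepted format depends on the locale, so treat that as an assumption. The class name LocaleDateSketch is hypothetical.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

/** Shows how one short-date string yields different dates under different locales. */
class LocaleDateSketch {
    /** Parses text with the locale's SHORT date style, leniently (assumed parsing style). */
    static Calendar parseShortDate(String text, Locale locale) {
        DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT, locale);
        df.setLenient(true);
        try {
            Date date = df.parse(text);
            Calendar cal = Calendar.getInstance(); // uses the same default time zone as df
            cal.setTime(date);
            return cal;
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date in locale " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        Calendar us = parseShortDate("6/1/2005", Locale.US);
        Calendar uk = parseShortDate("6/1/2005", Locale.UK);
        // Calendar.MONTH is zero-based, so add 1 for display.
        System.out.println("US: month " + (us.get(Calendar.MONTH) + 1) + ", day " + us.get(Calendar.DAY_OF_MONTH));
        System.out.println("UK: month " + (uk.get(Calendar.MONTH) + 1) + ", day " + uk.get(Calendar.DAY_OF_MONTH));
    }
}
```

Under Locale.US the string is read as month/day (1 June), while under Locale.UK it is read as day/month (6 January), so the same range expression can select different documents depending on the value passed to setLocale(Locale).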
Note that QueryParser is not thread-safe.
NOTE: there is a new QueryParser in contrib, which matches the same syntax as this class, but is more modular, enabling substantial customization to how a query is created.
NOTE: You must specify the required Version compatibility when creating QueryParser:
- As of 2.9, setEnablePositionIncrements(boolean) is true by default.
- As of 3.1, setAutoGeneratePhraseQueries(boolean) is false by default.
Nested Class Summary | |
---|---|
static class | QueryParser.Operator The default operator for parsing queries. |
Field Summary | |
---|---|
static QueryParser.Operator | AND_OPERATOR Alternative form of QueryParser.Operator.AND |
Token | jj_nt Next token. |
static QueryParser.Operator | OR_OPERATOR Alternative form of QueryParser.Operator.OR |
Token | token Current token. |
QueryParserTokenManager | token_source Generated Token Manager. |
Fields inherited from interface org.apache.lucene.queryParser.QueryParserConstants |
---|
_ESCAPED_CHAR, _NUM_CHAR, _QUOTED_CHAR, _TERM_CHAR, _TERM_START_CHAR, _WHITESPACE, AND, Boost, CARAT, COLON, DEFAULT, EOF, FUZZY_SLOP, LPAREN, MINUS, NOT, NUMBER, OR, PLUS, PREFIXTERM, QUOTED, RangeEx, RANGEEX_END, RANGEEX_GOOP, RANGEEX_QUOTED, RANGEEX_START, RANGEEX_TO, RangeIn, RANGEIN_END, RANGEIN_GOOP, RANGEIN_QUOTED, RANGEIN_START, RANGEIN_TO, RPAREN, STAR, TERM, tokenImage, WILDTERM |
Constructor Summary | |
---|---|
protected | QueryParser(CharStream stream) Constructor with user supplied CharStream. |
protected | QueryParser(QueryParserTokenManager tm) Constructor with generated Token Manager. |
| QueryParser(Version matchVersion, String f, Analyzer a) Constructs a query parser. |
Method Summary | |
---|---|
protected void | addClause(List<BooleanClause> clauses, int conj, int mods, Query q) |
Query | Clause(String field) |
int | Conjunction() |
void | disable_tracing() Disable tracing. |
void | enable_tracing() Enable tracing. |
static String | escape(String s) Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding \ . |
ParseException | generateParseException() Generate ParseException. |
boolean | getAllowLeadingWildcard() |
Analyzer | getAnalyzer() |
boolean | getAutoGeneratePhraseQueries() |
protected Query | getBooleanQuery(List<BooleanClause> clauses) Factory method for generating query, given a set of clauses. |
protected Query | getBooleanQuery(List<BooleanClause> clauses, boolean disableCoord) Factory method for generating query, given a set of clauses. |
DateTools.Resolution | getDateResolution(String fieldName) Returns the date resolution that is used by RangeQueries for the given field. |
QueryParser.Operator | getDefaultOperator() Gets implicit operator setting, which will be either AND_OPERATOR or OR_OPERATOR. |
boolean | getEnablePositionIncrements() |
String | getField() |
protected Query | getFieldQuery(String field, String queryText) Deprecated. Use getFieldQuery(String,String,boolean) instead. |
protected Query | getFieldQuery(String field, String queryText, boolean quoted) |
protected Query | getFieldQuery(String field, String queryText, int slop) Base implementation delegates to getFieldQuery(String,String,boolean) . |
float | getFuzzyMinSim() Get the minimal similarity for fuzzy queries. |
int | getFuzzyPrefixLength() Get the prefix length for fuzzy queries. |
protected Query | getFuzzyQuery(String field, String termStr, float minSimilarity) Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String) ). |
Locale | getLocale() Returns current locale, allowing access by subclasses. |
boolean | getLowercaseExpandedTerms() |
MultiTermQuery.RewriteMethod | getMultiTermRewriteMethod() |
Token | getNextToken() Get the next Token. |
int | getPhraseSlop() Gets the default slop for phrases. |
protected Query | getPrefixQuery(String field, String termStr) Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String) ). |
Collator | getRangeCollator() |
protected Query | getRangeQuery(String field, String part1, String part2, boolean inclusive) |
Token | getToken(int index) Get the specific Token. |
protected Query | getWildcardQuery(String field, String termStr) Factory method for generating a query. |
static void | main(String[] args) Command line tool to test QueryParser, using SimpleAnalyzer . |
int | Modifiers() |
protected BooleanClause | newBooleanClause(Query q, BooleanClause.Occur occur) Builds a new BooleanClause instance |
protected BooleanQuery | newBooleanQuery(boolean disableCoord) Builds a new BooleanQuery instance |
protected Query | newFuzzyQuery(Term term, float minimumSimilarity, int prefixLength) Builds a new FuzzyQuery instance |
protected Query | newMatchAllDocsQuery() Builds a new MatchAllDocsQuery instance |
protected MultiPhraseQuery | newMultiPhraseQuery() Builds a new MultiPhraseQuery instance |
protected PhraseQuery | newPhraseQuery() Builds a new PhraseQuery instance |
protected Query | newPrefixQuery(Term prefix) Builds a new PrefixQuery instance |
protected Query | newRangeQuery(String field, String part1, String part2, boolean inclusive) Builds a new TermRangeQuery instance |
protected Query | newTermQuery(Term term) Builds a new TermQuery instance |
protected Query | newWildcardQuery(Term t) Builds a new WildcardQuery instance |
Query | parse(String query) Parses a query string, returning a Query . |
Query | Query(String field) |
void | ReInit(CharStream stream) Reinitialise. |
void | ReInit(QueryParserTokenManager tm) Reinitialise. |
void | setAllowLeadingWildcard(boolean allowLeadingWildcard) Set to true to allow leading wildcard characters. |
void | setAutoGeneratePhraseQueries(boolean value) Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace delimited text. |
void | setDateResolution(DateTools.Resolution dateResolution) Sets the default date resolution used by RangeQueries for fields for which no specific date resolutions has been set. |
void | setDateResolution(String fieldName, DateTools.Resolution dateResolution) Sets the date resolution used by RangeQueries for a specific field. |
void | setDefaultOperator(QueryParser.Operator op) Sets the boolean operator of the QueryParser. |
void | setEnablePositionIncrements(boolean enable) Set to true to enable position increments in result query. |
void | setFuzzyMinSim(float fuzzyMinSim) Set the minimum similarity for fuzzy queries. |
void | setFuzzyPrefixLength(int fuzzyPrefixLength) Set the prefix length for fuzzy queries. |
void | setLocale(Locale locale) Set locale used by date range parsing. |
void | setLowercaseExpandedTerms(boolean lowercaseExpandedTerms) Whether terms of wildcard, prefix, fuzzy and range queries are to be automatically lower-cased or not. |
void | setMultiTermRewriteMethod(MultiTermQuery.RewriteMethod method) By default QueryParser uses MultiTermQuery.CONSTANT_SCORE_AUTO_REWRITE_DEFAULT when creating a PrefixQuery, WildcardQuery or RangeQuery. |
void | setPhraseSlop(int phraseSlop) Sets the default slop for phrases. |
void | setRangeCollator(Collator rc) Sets the collator used to determine index term inclusion in ranges for RangeQuerys. |
Query | Term(String field) |
Query | TopLevelQuery(String field) |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
AND_OPERATOR
public static final QueryParser.Operator AND_OPERATOR
- Alternative form of QueryParser.Operator.AND
OR_OPERATOR
public static final QueryParser.Operator OR_OPERATOR
- Alternative form of QueryParser.Operator.OR
token_source
public QueryParserTokenManager token_source
- Generated Token Manager.
token
public Token token
- Current token.
jj_nt
public Token jj_nt
- Next token.
Constructor Detail |
---|
QueryParser
public QueryParser(Version matchVersion, String f, Analyzer a)
- Constructs a query parser.
- Parameters:
  - matchVersion - Lucene version to match. See above.
  - f - the default field for query terms.
  - a - used to find terms in the query text.
QueryParser
protected QueryParser(CharStream stream)
- Constructor with user supplied CharStream.
QueryParser
protected QueryParser(QueryParserTokenManager tm)
- Constructor with generated Token Manager.
Method Detail |
---|
parse
public Query parse(String query) throws ParseException
- Parses a query string, returning a Query.
- Parameters:
  - query - the query string to be parsed.
- Throws:
  - ParseException - if the parsing fails
getAnalyzer
public Analyzer getAnalyzer()
- Returns:
  - Returns the analyzer.
getField
public String getField()
- Returns:
  - Returns the field.
getAutoGeneratePhraseQueries
public final boolean getAutoGeneratePhraseQueries()
- See Also:
  - setAutoGeneratePhraseQueries(boolean)
setAutoGeneratePhraseQueries
public final void setAutoGeneratePhraseQueries(boolean value)
- Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace delimited text. NOTE: this behavior may not be suitable for all languages.
Set to false if phrase queries should only be generated when surrounded by double quotes.
getFuzzyMinSim
public float getFuzzyMinSim()
- Get the minimal similarity for fuzzy queries.
setFuzzyMinSim
public void setFuzzyMinSim(float fuzzyMinSim)
- Set the minimum similarity for fuzzy queries. Default is 0.5f.
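As a rough illustration of what a minimum similarity threshold means, the sketch below scores two terms from their Levenshtein edit distance. The formula used (1 minus the distance divided by the length of the shorter term) and the strict greater-than comparison are assumptions made for this example; this page does not define how similarity is computed. The class name FuzzySimilaritySketch is hypothetical.

```java
/** Edit-distance similarity in the spirit of a fuzzy term match; the formula is an assumption. */
class FuzzySimilaritySketch {
    /** Classic Levenshtein distance computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed similarity: 1 - distance / length of the shorter string. May be negative. */
    static float similarity(String a, String b) {
        int shorter = Math.min(a.length(), b.length());
        if (shorter == 0) {
            return a.length() == b.length() ? 1.0f : 0.0f;
        }
        return 1.0f - (float) editDistance(a, b) / shorter;
    }

    /** A candidate term is kept only if its similarity exceeds the threshold (assumed strict). */
    static boolean matches(String queryTerm, String indexTerm, float minSim) {
        return similarity(queryTerm, indexTerm) > minSim;
    }

    public static void main(String[] args) {
        System.out.println(similarity("lucene", "lucane"));
        System.out.println(matches("lucene", "lucane", 0.5f));
    }
}
```

With the default threshold of 0.5f, a one-character difference in a six-letter term passes easily, while raising the value with setFuzzyMinSim(float) narrows the set of accepted terms.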
getFuzzyPrefixLength
public int getFuzzyPrefixLength()
- Get the prefix length for fuzzy queries.
- Returns:
  - Returns the fuzzyPrefixLength.
setFuzzyPrefixLength
public void setFuzzyPrefixLength(int fuzzyPrefixLength)
- Set the prefix length for fuzzy queries. Default is 0.
- Parameters:
  - fuzzyPrefixLength - The fuzzyPrefixLength to set.
setPhraseSlop
public void setPhraseSlop(int phraseSlop)
- Sets the default slop for phrases. If zero, then exact phrase matches are required. Default value is zero.
getPhraseSlop
public int getPhraseSlop()
- Gets the default slop for phrases.
setAllowLeadingWildcard
public void setAllowLeadingWildcard(boolean allowLeadingWildcard)
- Set to true to allow leading wildcard characters.
  When set, * or ? are allowed as the first character of a PrefixQuery and WildcardQuery. Note that this can produce very slow queries on big indexes.
  Default: false.
getAllowLeadingWildcard
public boolean getAllowLeadingWildcard()
- See Also:
  - setAllowLeadingWildcard(boolean)
setEnablePositionIncrements
public void setEnablePositionIncrements(boolean enable)
- Set to true to enable position increments in result query.
  When set, result phrase and multi-phrase queries will be aware of position increments. Useful when e.g. a StopFilter increases the position increment of the token that follows an omitted token.
  Default: false when matchVersion is earlier than 2.9; true as of 2.9 (see the Version note in the class description).
getEnablePositionIncrements
public boolean getEnablePositionIncrements()
- See Also:
  - setEnablePositionIncrements(boolean)
setDefaultOperator
public void setDefaultOperator(QueryParser.Operator op)
- Sets the boolean operator of the QueryParser. In default mode (OR_OPERATOR) terms without any modifiers are considered optional: for example capital of Hungary is equal to capital OR of OR Hungary.
  In AND_OPERATOR mode terms are considered to be in conjunction: the above mentioned query is parsed as capital AND of AND Hungary.
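The difference between the two modes can be sketched without Lucene. The example below makes the implicit operator explicit and checks a set of document terms against it. The Operator enum here is a local stand-in rather than QueryParser.Operator, and the matching logic is a simplification that ignores analysis, scoring and modifiers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrates OR versus AND as the implicit operator between unmodified terms. */
class DefaultOperatorSketch {
    enum Operator { OR, AND } // local stand-in, not QueryParser.Operator

    /** Rewrites whitespace-separated terms with the implicit operator spelled out. */
    static String makeExplicit(String query, Operator op) {
        return String.join(" " + op + " ", query.trim().split("\\s+"));
    }

    /** True if the document's terms satisfy the query under the given operator. */
    static boolean matches(Set<String> docTerms, String query, Operator op) {
        boolean any = false;
        boolean all = true;
        for (String term : query.trim().split("\\s+")) {
            boolean hit = docTerms.contains(term);
            any |= hit;
            all &= hit;
        }
        return op == Operator.OR ? any : all;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("capital", "of", "France"));
        String query = "capital of Hungary";
        System.out.println(makeExplicit(query, Operator.OR) + " -> " + matches(doc, query, Operator.OR));
        System.out.println(makeExplicit(query, Operator.AND) + " -> " + matches(doc, query, Operator.AND));
    }
}
```

A document about the capital of France satisfies the OR form because two of the three terms are present, but not the AND form, which requires all three.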
getDefaultOperator
public QueryParser.Operator getDefaultOperator()
- Gets implicit operator setting, which will be either AND_OPERATOR or OR_OPERATOR.
setLowercaseExpandedTerms
public void setLowercaseExpandedTerms(boolean lowercaseExpandedTerms)
- Whether terms of wildcard, prefix, fuzzy and range queries are to be automatically lower-cased or not. Default is true.
getLowercaseExpandedTerms
public boolean getLowercaseExpandedTerms()
- See Also:
  - setLowercaseExpandedTerms(boolean)
setMultiTermRewriteMethod
public void setMultiTermRewriteMethod(MultiTermQuery.RewriteMethod method)
- By default QueryParser uses MultiTermQuery.CONSTANT_SCORE_AUTO_REWRITE_DEFAULT when creating a PrefixQuery, WildcardQuery or RangeQuery. This implementation is generally preferable because it a) runs faster, b) does not have the scarcity of terms unduly influence score, and c) avoids any "TooManyBooleanClauses" exception. However, if your application really needs to use the old-fashioned BooleanQuery expansion rewriting and the above points are not relevant, then use this to change the rewrite method.
getMultiTermRewriteMethod
public MultiTermQuery.RewriteMethod getMultiTermRewriteMethod()
- See Also:
  - setMultiTermRewriteMethod(org.apache.lucene.search.MultiTermQuery.RewriteMethod)
setLocale
public void setLocale(Locale locale)
- Set locale used by date range parsing.
getLocale
public Locale getLocale()
- Returns current locale, allowing access by subclasses.
setDateResolution
public void setDateResolution(DateTools.Resolution dateResolution)
- Sets the default date resolution used by RangeQueries for fields for which no specific date resolution has been set. Field specific resolutions can be set with setDateResolution(String, DateTools.Resolution).
- Parameters:
  - dateResolution - the default date resolution to set
setDateResolution
public void setDateResolution(String fieldName, DateTools.Resolution dateResolution)
- Sets the date resolution used by RangeQueries for a specific field.
- Parameters:
  - fieldName - field for which the date resolution is to be set
  - dateResolution - date resolution to set
getDateResolution
public DateTools.Resolution getDateResolution(String fieldName)
- Returns the date resolution that is used by RangeQueries for the given field. Returns null, if no default or field specific date resolution has been set for the given field.
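The lookup rule described for the date resolution setters (field specific value first, then the default, otherwise null) can be sketched as a small map-backed holder. The class DateResolutionLookupSketch and its nested Resolution enum are stand-ins invented for this example; they mirror the documented behaviour, not the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Map-backed sketch of "field specific resolution wins over the default". */
class DateResolutionLookupSketch {
    // Stand-in for DateTools.Resolution, declared locally so the example compiles on its own.
    enum Resolution { YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND }

    private Resolution defaultResolution; // stays null until a default is set
    private final Map<String, Resolution> perField = new HashMap<>();

    /** Sets the default used for fields without their own setting. */
    void setDateResolution(Resolution resolution) {
        defaultResolution = resolution;
    }

    /** Sets the resolution for one field only. */
    void setDateResolution(String fieldName, Resolution resolution) {
        perField.put(fieldName, resolution);
    }

    /** Field specific value if present, else the default, else null. */
    Resolution getDateResolution(String fieldName) {
        Resolution specific = perField.get(fieldName);
        return specific != null ? specific : defaultResolution;
    }

    public static void main(String[] args) {
        DateResolutionLookupSketch s = new DateResolutionLookupSketch();
        System.out.println(s.getDateResolution("date"));     // nothing set yet
        s.setDateResolution(Resolution.DAY);
        s.setDateResolution("modified", Resolution.MINUTE);
        System.out.println(s.getDateResolution("date"));     // falls back to the default
        System.out.println(s.getDateResolution("modified")); // field specific value
    }
}
```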
setRangeCollator
public void setRangeCollator(Collator rc)
- Sets the collator used to determine index term inclusion in ranges for range queries.
  WARNING: Setting the rangeCollator to a non-null collator using this method will cause every single index Term in the Field referenced by lowerTerm and/or upperTerm to be examined. Depending on the number of index Terms in this Field, the operation could be very slow.
- Parameters:
  - rc - the collator to use when constructing range queries
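To see why a collator changes which terms fall inside a range, compare plain String ordering with java.text.Collator ordering. The sketch below is JDK-only and does not reproduce the range query code; the class and method names are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

/** Compares range membership under raw String order and under a Collator. */
class RangeCollatorSketch {
    /** Inclusive range test using String.compareTo (UTF-16 code unit order). */
    static boolean inRangeBinary(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    /** Inclusive range test using a locale-sensitive Collator. */
    static boolean inRangeCollated(Collator collator, String term, String lower, String upper) {
        return collator.compare(term, lower) >= 0 && collator.compare(term, upper) <= 0;
    }

    public static void main(String[] args) {
        Collator english = Collator.getInstance(Locale.ENGLISH);
        // Lowercase letters sort after all uppercase letters in code unit order,
        // so "banana" falls outside [Apple TO Cherry] without a collator.
        System.out.println(inRangeBinary("banana", "Apple", "Cherry"));
        System.out.println(inRangeCollated(english, "banana", "Apple", "Cherry"));
    }
}
```

Because collated order need not agree with the order in which terms are stored, a collated range cannot simply be read as one contiguous run of terms, which is consistent with the warning above that every term in the field has to be examined.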
getRangeCollator
public Collator getRangeCollator()
- Returns:
  - the collator used to determine index term inclusion in ranges for range queries.
addClause
protected void addClause(List<BooleanClause> clauses, int conj, int mods, Query q)
getFieldQuery
@Deprecated protected Query getFieldQuery(String field, String queryText) throws ParseException
- Deprecated. Use getFieldQuery(String,String,boolean) instead.
- Throws:
  - ParseException
getFieldQuery
protected Query getFieldQuery(String field, String queryText, boolean quoted) throws ParseException
- Throws:
  - ParseException - throw in overridden method to disallow
getFieldQuery
protected Query getFieldQuery(String field, String queryText, int slop) throws ParseException
- Base implementation delegates to getFieldQuery(String,String,boolean). This method may be overridden, for example, to return a SpanNearQuery instead of a PhraseQuery.
- Throws:
  - ParseException - throw in overridden method to disallow
getRangeQuery
protected Query getRangeQuery(String field, String part1, String part2, boolean inclusive) throws ParseException
- Throws:
  - ParseException - throw in overridden method to disallow
newBooleanQuery
protected BooleanQuery newBooleanQuery(boolean disableCoord)
- Builds a new BooleanQuery instance
- Parameters:
  - disableCoord - disable coord
- Returns:
  - new BooleanQuery instance
newBooleanClause
protected BooleanClause newBooleanClause(Query q, BooleanClause.Occur occur)
- Builds a new BooleanClause instance
- Parameters:
  - q - sub query
  - occur - how this clause should occur when matching documents
- Returns:
  - new BooleanClause instance
newTermQuery
protected Query newTermQuery(Term term)
- Builds a new TermQuery instance
- Parameters:
  - term - term
- Returns:
  - new TermQuery instance
newPhraseQuery
protected PhraseQuery newPhraseQuery()
- Builds a new PhraseQuery instance
- Returns:
  - new PhraseQuery instance
newMultiPhraseQuery
protected MultiPhraseQuery newMultiPhraseQuery()
- Builds a new MultiPhraseQuery instance
- Returns:
  - new MultiPhraseQuery instance
newPrefixQuery
protected Query newPrefixQuery(Term prefix)
- Builds a new PrefixQuery instance
- Parameters:
  - prefix - Prefix term
- Returns:
  - new PrefixQuery instance
newFuzzyQuery
protected Query newFuzzyQuery(Term term, float minimumSimilarity, int prefixLength)
- Builds a new FuzzyQuery instance
- Parameters:
  - term - Term
  - minimumSimilarity - minimum similarity
  - prefixLength - prefix length
- Returns:
  - new FuzzyQuery Instance
newRangeQuery
protected Query newRangeQuery(String field, String part1, String part2, boolean inclusive)
- Builds a new TermRangeQuery instance
- Parameters:
  - field - Field
  - part1 - min
  - part2 - max
  - inclusive - true if range is inclusive
- Returns:
  - new TermRangeQuery instance
newMatchAllDocsQuery
protected Query newMatchAllDocsQuery()
- Builds a new MatchAllDocsQuery instance
- Returns:
  - new MatchAllDocsQuery instance
newWildcardQuery
protected Query newWildcardQuery(Term t)
- Builds a new WildcardQuery instance
- Parameters:
  - t - wildcard term
- Returns:
  - new WildcardQuery instance
getBooleanQuery
protected Query getBooleanQuery(List<BooleanClause> clauses) throws ParseException
- Factory method for generating query, given a set of clauses. By default creates a boolean query composed of clauses passed in. Can be overridden by extending classes, to modify query being returned.
- Parameters:
  - clauses - List that contains BooleanClause instances to join.
- Returns:
  - Resulting Query object.
- Throws:
  - ParseException - throw in overridden method to disallow
getBooleanQuery
protected Query getBooleanQuery(List<BooleanClause> clauses, boolean disableCoord) throws ParseException
- Factory method for generating query, given a set of clauses. By default creates a boolean query composed of clauses passed in. Can be overridden by extending classes, to modify query being returned.
- Parameters:
  - clauses - List that contains BooleanClause instances to join.
  - disableCoord - true if coord scoring should be disabled.
- Returns:
  - Resulting Query object.
- Throws:
  - ParseException - throw in overridden method to disallow
getWildcardQuery
protected Query getWildcardQuery(String field, String termStr) throws ParseException
- Factory method for generating a query. Called when parser parses an input term token that contains one or more wildcard characters (? and *), but is not a prefix term token (one that has just a single * character at the end).
  Depending on settings, prefix term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with wildcard templates.
  Can be overridden by extending classes, to provide custom handling for wildcard queries, which may be necessary due to missing analyzer calls.
- Parameters:
  - field - Name of the field query will use.
  - termStr - Term token that contains one or more wild card characters (? or *), but is not simple prefix term
- Returns:
  - Resulting Query built for the term
- Throws:
  - ParseException - throw in overridden method to disallow
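The distinction drawn above between a wildcard term and a prefix term can be expressed as a small classifier. This is a sketch of the rule as worded on this page, not the parser's tokenizer: it ignores escaped characters, and treating a lone * as a wildcard rather than a prefix is an assumption. All names are hypothetical.

```java
/** Classifies a term token as plain, prefix (single trailing *) or general wildcard. */
class TermKindSketch {
    enum Kind { TERM, PREFIX, WILDCARD }

    static Kind classify(String token) {
        int firstStar = token.indexOf('*');
        boolean hasQuestionMark = token.indexOf('?') >= 0;
        if (firstStar < 0 && !hasQuestionMark) {
            return Kind.TERM;
        }
        // Prefix notation: exactly one '*', placed last, with at least one character before it.
        boolean singleTrailingStar = !hasQuestionMark
                && token.length() > 1
                && firstStar == token.length() - 1;
        return singleTrailingStar ? Kind.PREFIX : Kind.WILDCARD;
    }

    /** For a PREFIX token, the text without its trailing '*'. */
    static String prefixText(String token) {
        if (classify(token) != Kind.PREFIX) {
            throw new IllegalArgumentException("Not a prefix token: " + token);
        }
        return token.substring(0, token.length() - 1);
    }

    public static void main(String[] args) {
        for (String t : new String[] {"lucene", "luc*", "l?cene", "lu*ne", "lu*c*"}) {
            System.out.println(t + " -> " + classify(t));
        }
    }
}
```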
getPrefixQuery
protected Query getPrefixQuery(String field, String termStr) throws ParseException
- Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)). Called when parser parses an input term token that uses prefix notation; that is, contains a single '*' wildcard character as its last character. Since this is a special case of generic wildcard term, and such a query can be optimized easily, this usually results in a different query object.
  Depending on settings, a prefix term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with wildcard templates.
  Can be overridden by extending classes, to provide custom handling for wild card queries, which may be necessary due to missing analyzer calls.
- Parameters:
  - field - Name of the field query will use.
  - termStr - Term token to use for building term for the query (without trailing '*' character!)
- Returns:
  - Resulting Query built for the term
- Throws:
  - ParseException - throw in overridden method to disallow
getFuzzyQuery
protected Query getFuzzyQuery(String field, String termStr, float minSimilarity) throws ParseException
- Factory method for generating a query (similar to getWildcardQuery(java.lang.String, java.lang.String)). Called when parser parses an input term token that has the fuzzy suffix (~) appended.
- Parameters:
  - field - Name of the field query will use.
  - termStr - Term token to use for building term for the query
- Returns:
  - Resulting Query built for the term
- Throws:
  - ParseException - throw in overridden method to disallow
escape
public static String escape(String s)
- Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding \.
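A stand-alone sketch of this kind of escaping is shown below. The set of characters treated as special is an assumption drawn from the query syntax operators (boolean prefixes, grouping, field separator, boost, ranges, quotes, fuzzy and wildcard marks); check the query syntax documentation for your Lucene version before relying on it. The class name EscapeSketch is hypothetical.

```java
/** Backslash-escapes characters that have a meaning in the query syntax. */
class EscapeSketch {
    // Assumed special characters: \ + - ! ( ) : ^ [ ] " { } ~ * ? | &
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // put a backslash in front of the special character
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("(1+1):2"));
    }
}
```

Escaping is useful when user input should be searched literally, for example a product code containing a colon, instead of being interpreted as query operators.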
main
public static void main(String[] args) throws Exception
- Command line tool to test QueryParser, using SimpleAnalyzer. Usage:
  java org.apache.lucene.queryParser.QueryParser <input>
- Throws:
  - Exception
Conjunction
public final int Conjunction() throws ParseException
- Throws:
  - ParseException
Modifiers
public final int Modifiers() throws ParseException
- Throws:
  - ParseException
TopLevelQuery
public final Query TopLevelQuery(String field) throws ParseException
- Throws:
  - ParseException
Query
public final Query Query(String field) throws ParseException
- Throws:
  - ParseException
Clause
public final Query Clause(String field) throws ParseException
- Throws:
  - ParseException
Term
public final Query Term(String field) throws ParseException
- Throws:
  - ParseException
ReInit
public void ReInit(CharStream stream)
- Reinitialise.
ReInit
public void ReInit(QueryParserTokenManager tm)
- Reinitialise.
getNextToken
public final Token getNextToken()
- Get the next Token.
getToken
public final Token getToken(int index)
- Get the specific Token.
generateParseException
public ParseException generateParseException()
- Generate ParseException.
enable_tracing
public final void enable_tracing()
- Enable tracing.
disable_tracing
public final void disable_tracing()
- Disable tracing.
Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.
Source: https://lucene.apache.org/core/3_1_0/api/core/org/apache/lucene/queryParser/QueryParser.html