Whoosh full-text search

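I'm building a full-text search tool with whoosh, but I don't really understand how to do phrase searches. Could someone more experienced point me in the right direction?
The example I've been looking at is: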
http://whoosh.readthedocs.org/en/latest/quickstart.html

>>> from whoosh.index import create_in
>>> from whoosh.fields import *
>>> schema = Schema(title=TEXT(stored=True), path=ID(stored=True), content=TEXT)
>>> ix = create_in("indexdir", schema)
>>> writer = ix.writer()
>>> writer.add_document(title=u"First document", path=u"/a",
...                     content=u"This is the first document we've added!")
>>> writer.add_document(title=u"Second document", path=u"/b",
...                     content=u"The second one is even more interesting!")
>>> writer.commit()
>>> from whoosh.qparser import QueryParser
>>> with ix.searcher() as searcher:
...     query = QueryParser("content", ix.schema).parse("first")
...     results = searcher.search(query)
...     results[0]

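I used code like this, but then found that the string passed to parse("first") gets split into several independent words. So if I call parse("first man to buy a apple"), it matches documents on the individual words rather than on the phrase as a whole.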

Then I noticed that QueryParser has the following parameter. I'm not sure whether it would do what I want, and I don't know how to use it:
phraseclass – the query class to use for phrases. The default is whoosh.query.Phrase.

class whoosh.query.Phrase(fieldname, words, slop=1, boost=1.0, char_ranges=None)

Matches documents containing a given phrase.


1 reply • 2015-10-22 16:03:05 +08:00

fire5

2015-10-22 16:03:05 +08:00

Try this instead: http://es.xiaoleilu.com/
