java - Lucene migration: text field behaves differently in 3.0.3 and 5


I have a problem migrating a Lucene field from version 3.0.3 to 5.x. I prepared two JUnit test programs (one for 3.0.3, the other for 5.x) to compare the behavior.

Lucene 3:

    analyzer = new StandardAnalyzer(Version.LUCENE_30);
    indexWriter = new IndexWriter(dir, analyzer, true, MaxFieldLength.UNLIMITED);
    ....
    Document doc = new Document();
    doc.add(new Field("keyword", "another test@foo-bar", Field.Store.YES,
            Field.Index.ANALYZED));
    indexWriter.addDocument(doc);
    indexWriter.commit();
    ....
    indexReader = IndexReader.open(FSDirectory.open(path.toFile()), false);
    searcher = new IndexSearcher(indexReader);
    QueryParser parser = new QueryParser(Version.LUCENE_30, "keyword", analyzer);
    Query query = parser.parse("test");
    searcher.search(query, searcher.maxDoc());
    TopDocs topDocs = searcher.search(query, searcher.maxDoc());
    ScoreDoc[] hits = topDocs.scoreDocs;
    doc = indexReader.document(hits[0].doc);
    // doc is null <- expected
    assertNull(result);

A similar test for Lucene 5.x (only the changed lines are shown):

    analyzer = new StandardAnalyzer();
    IndexWriterConfig indexConfig = new IndexWriterConfig(analyzer)
            .setCommitOnClose(true).setOpenMode(openMode);
    // create index writer
    indexWriter = new IndexWriter(dir, indexConfig);
    ...
    // line in the old style (Lucene 3)
    doc.add(new Field("keyword", "another test@foo-bar", Field.Store.YES,
            Field.Index.ANALYZED));
    // or the new field types (enable only one of the two lines)
    doc.add(new TextField("keyword", "another test@foo-bar", Field.Store.YES));
    ...
    Query query = new QueryParser(field, analyzer).parse(field + ":"
            + value);
    doc = indexReader.document(hits[0].doc);
    // returns a document each time
    assertNull(doc); // fails!

I followed the migration document at https://lucene.apache.org/core/4_8_0/migrate.html and replaced the Field class with the TextField class, but the search now behaves differently.

Question: how can I get the same result with Lucene 5.x as I got before with Lucene 3?

The Lucene 3 analyzer seems to split the input string on spaces only, whereas the Lucene 5 version of the analyzer seems to split on spaces, '@', and '-'. :/
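
To make the observation above concrete, here is a minimal plain-Java sketch of the two splitting behaviors. It does not use Lucene and is not the real tokenizer of either version; the class `TokenSplitDemo` and its methods are hypothetical names introduced only for illustration. It only shows why a query for "test" would match under one splitting rule and not under the other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: mimics the two splitting behaviors described in the
// question. This is NOT Lucene's StandardTokenizer.
final class TokenSplitDemo {

    // Assumed "Lucene 3 like" behavior from the question: split on whitespace only.
    static List<String> splitOnWhitespace(String text) {
        return splitOn(text, "\\s+");
    }

    // Assumed "Lucene 5 like" behavior from the question: split on whitespace, '@' and '-'.
    static List<String> splitOnWhitespaceAtAndHyphen(String text) {
        return splitOn(text, "[\\s@-]+");
    }

    // Splits on the given regex, lowercases each token and drops empty ones.
    private static List<String> splitOn(String text, String regex) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split(regex)) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String value = "another test@foo-bar";
        // Whitespace-only splitting keeps "test@foo-bar" as one token,
        // so there is no token equal to "test".
        System.out.println(splitOnWhitespace(value).contains("test"));
        // Splitting on '@' and '-' as well yields a separate "test" token.
        System.out.println(splitOnWhitespaceAtAndHyphen(value).contains("test"));
    }
}
```

With the first rule the indexed terms contain no standalone "test", which matches the empty result in the Lucene 3 test; with the second rule "test" becomes its own term, which matches the hit in the Lucene 5 test. As far as I know, Lucene 3.1 and later ship a ClassicAnalyzer that retains the older StandardAnalyzer grammar, so it may be worth checking whether it reproduces the Lucene 3 result here.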

