solr4 - Solr CollapsingQParserPlugin with group.facet=on style facet counts


I have a Solr index of 5 million documents at 8 GB, using Solr 4.7.0. I require grouping in Solr, but I find it slow. Here is the group configuration:

group=on group.facet=on group.field=workid group.ngroups=on 

The machine has ample memory at 24 GB, with 4 GB allocated to Solr itself. Queries are taking 1200 ms, compared to 90 ms when grouping is turned off.

I ran across a plugin called CollapsingQParserPlugin, which uses a filter query to remove all but one member of each group.

fq={!collapse field=workid}

It's designed for indexes that have a lot of unique groups. I have 3.8 million. This approach is much faster at 120 ms. It's a beautiful solution for me except for one thing. Because it filters out the other members of the group, only the facets from the representative document are counted. For instance, if I have the following 3 documents:

"docs": [
  {
    "id": "1",
    "workid": "abc",
    "type": "book"
  },
  {
    "id": "2",
    "workid": "abc",
    "type": "ebook"
  },
  {
    "id": "3",
    "workid": "abc",
    "type": "ebook"
  }
]

Once collapsed, only the top one shows up in the results. Because the other two get filtered out, the facet counts look like

"type": ["book":1] 

instead of

"type": ["book":1, "ebook":1] 
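To make the difference concrete, here is a small Python sketch that simulates both counting behaviors on the three documents above. It is not Solr code: the helper names `collapsed_facets` and `group_facets` are made up for illustration, and it assumes the first document seen in each group is the one kept after collapsing.

```python
from collections import Counter

docs = [
    {"id": "1", "workid": "abc", "type": "book"},
    {"id": "2", "workid": "abc", "type": "ebook"},
    {"id": "3", "workid": "abc", "type": "ebook"},
]


def collapsed_facets(docs, group_field, facet_field):
    """Count facet values over one representative per group.

    Assumption: the representative is simply the first document seen.
    """
    representatives = {}
    for doc in docs:
        representatives.setdefault(doc[group_field], doc)
    return Counter(doc[facet_field] for doc in representatives.values())


def group_facets(docs, group_field, facet_field):
    """Count each facet value once per group that contains it."""
    seen = {(doc[group_field], doc[facet_field]) for doc in docs}
    return Counter(value for _, value in seen)


print(sorted(collapsed_facets(docs, "workid", "type").items()))  # [('book', 1)]
print(sorted(group_facets(docs, "workid", "type").items()))  # [('book', 1), ('ebook', 1)]
```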

Is there a way to get group.facet counts using the collapse filter query?

I was unable to find a way to do this with Solr or plugin configurations, so I developed a workaround that effectively creates group facet counts while still using the CollapsingQParserPlugin.

I do this by making a duplicate of the fields I'll be faceting on and making sure that all facet values for the entire group are in each document, like so:

"docs": [
  {
    "id": "1",
    "workid": "abc",
    "type": "book",
    "facettype": [
      "book",
      "ebook"
    ]
  },
  {
    "id": "2",
    "workid": "abc",
    "type": "ebook",
    "facettype": [
      "book",
      "ebook"
    ]
  },
  {
    "id": "3",
    "workid": "abc",
    "type": "ebook",
    "facettype": [
      "book",
      "ebook"
    ]
  }
]
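The pre-processing step could look roughly like the following Python sketch, run over the documents before they are sent to Solr. The function name `add_group_facet_field` and its parameters are hypothetical, and it assumes every document carries both `workid` and `type`. On the Solr side, `facettype` would need to be declared as a multi-valued field.

```python
from collections import defaultdict


def add_group_facet_field(docs, group_field="workid",
                          source_field="type", target_field="facettype"):
    """Return copies of docs where target_field lists every source_field
    value that occurs anywhere in the document's group."""
    values_by_group = defaultdict(set)
    for doc in docs:
        values_by_group[doc[group_field]].add(doc[source_field])
    # Sort the values so the output is stable from run to run.
    return [
        {**doc, target_field: sorted(values_by_group[doc[group_field]])}
        for doc in docs
    ]


raw_docs = [
    {"id": "1", "workid": "abc", "type": "book"},
    {"id": "2", "workid": "abc", "type": "ebook"},
    {"id": "3", "workid": "abc", "type": "ebook"},
]

for doc in add_group_facet_field(raw_docs):
    print(doc["id"], doc["type"], doc["facettype"])
```

Note that whenever a document is added to or removed from a group, the other members of that group have to be re-indexed too so their `facettype` values stay in sync.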

When I ask Solr to generate facet counts, I use the new field:

facet.field=facettype 

This ensures that all facet values are accounted for and that the counts represent groups. But when I use a filter query, I revert to using the old field:

fq=type:book 

This way the correct document is chosen to represent the group.

I know this is a dirty, complex way to make it work, but it does work and that's what I needed. It also requires the ability to query your documents before insertion into Solr, which calls for some development. If anyone has a simpler solution I would still love to hear it.
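Putting the pieces together, a request might be assembled as in this Python sketch. The function name `build_query_params` is made up for illustration, `q=*:*` is an assumed match-all query, and the filter value is not escaped for Solr's special characters. The point is only that faceting goes against `facettype` while filtering goes against `type`.

```python
from urllib.parse import urlencode


def build_query_params(type_filter=None):
    """Build Solr request parameters as (name, value) pairs.

    A list of pairs is used instead of a dict because fq can repeat.
    """
    params = [
        ("q", "*:*"),                        # assumed match-all query
        ("fq", "{!collapse field=workid}"),  # keep one document per workid
        ("facet", "true"),
        ("facet.field", "facettype"),        # facet on the group-wide copy
    ]
    if type_filter is not None:
        # Filter on the original per-document field, not the copy.
        params.append(("fq", f"type:{type_filter}"))
    return params


query_string = urlencode(build_query_params("book"))
print(query_string)
```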

