Solr 4: CollapsingQParserPlugin with group.facet=on style facet counts
I have a Solr index of 5 million documents at 8 GB, using Solr 4.7.0. I require grouping in Solr, but I find it too slow. Here is my group configuration:
group=on group.facet=on group.field=workid group.ngroups=on
The machine has ample memory at 24 GB, with 4 GB allocated to Solr itself. Queries are taking around 1200 ms, compared to 90 ms when grouping is turned off.
I ran across a plugin called CollapsingQParserPlugin, which uses a filter query to remove all but one member of each group.
fq={!collapse field=workid}
It's designed for indexes that have a lot of unique groups. I have 3.8 million. This approach is much faster, at 120 ms. It's a beautiful solution for me except for one thing. Because it filters out the other members of the group, only the facets from the representative document get counted. For instance, if I have the following 3 documents:
"docs": [ { "id": "1", "workid": "abc", "type": "book" }, { "id": "2", "workid": "abc", "type": "ebook" }, { "id": "3", "workid": "abc", "type": "ebook" } ]
Once collapsed, only the top one shows up in the results. Because the other two get filtered out, the facet counts look like
"type": ["book":1]
instead of
"type": ["book":1, "ebook":1]
Is there a way to get group.facet counts while using the collapse filter query?
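To make the difference concrete, here is a small Python simulation of the two counting behaviours on the three documents above. It does not talk to Solr at all. The function names are made up for illustration, and picking the first document of each group as the representative is a simplifying assumption (Solr chooses the representative by score or by a configured sort).

```python
from collections import Counter

docs = [
    {"id": "1", "workid": "abc", "type": "book"},
    {"id": "2", "workid": "abc", "type": "ebook"},
    {"id": "3", "workid": "abc", "type": "ebook"},
]


def collapsed_facet_counts(docs, group_field, facet_field):
    """Keep one document per group, then facet only on the survivors."""
    representatives = {}
    for doc in docs:
        # Assumption: the first document seen stands in for the group.
        representatives.setdefault(doc[group_field], doc)
    return Counter(doc[facet_field] for doc in representatives.values())


def group_facet_counts(docs, group_field, facet_field):
    """Count each facet value once per group that contains it."""
    seen = {(doc[group_field], doc[facet_field]) for doc in docs}
    return Counter(value for _, value in seen)


print(sorted(collapsed_facet_counts(docs, "workid", "type").items()))
# [('book', 1)]
print(sorted(group_facet_counts(docs, "workid", "type").items()))
# [('book', 1), ('ebook', 1)]
```

The first result is what the collapse filter gives me today; the second is what group.facet=on would report.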
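I was unable to find a way to do this with Solr or plugin configurations, so I developed a workaround that effectively creates group facet counts while still using the CollapsingQParserPlugin.
I do this by making a duplicate of each field I'll be faceting on and making sure that all facet values for the entire group are present in every document of that group, like so: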
"docs": [ { "id": "1", "workid": "abc", "type": "book", "facettype": [ "book", "ebook" ] }, { "id": "2", "workid": "abc", "type": "ebook", "facettype": [ "book", "ebook" ] }, { "id": "3", "workid": "abc", "type": "ebook", "facettype": [ "book", "ebook" ] } ]
When I ask Solr to generate facet counts, I use the new field:
facet.field=facettype
This ensures that all facet values are accounted for and that the counts represent groups. But when I use a filter query, I revert to the old field:
fq=type:book
This way the correct document will still be chosen to represent the group.
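Putting the pieces together, a request could be assembled roughly like this. This is only a sketch: `build_params` is a hypothetical helper, the `q` value is a placeholder, and the filter value is passed through without any Solr query escaping. My understanding is that the collapse filter runs as a post filter, so the ordinary type:book filter is applied before the representative is picked, but treat that as an assumption worth checking against your own results.

```python
from urllib.parse import urlencode


def build_params(user_query, type_filter=None):
    """Return Solr request parameters as (name, value) pairs."""
    params = [
        ("q", user_query),
        ("fq", "{!collapse field=workid}"),  # collapse each group to one document
        ("facet", "on"),
        ("facet.field", "facettype"),        # facet on the duplicated group-wide field
    ]
    if type_filter is not None:
        # Filter on the original per-document field, not the duplicate.
        params.append(("fq", f"type:{type_filter}"))
    return params


params = build_params("*:*", type_filter="book")
query_string = urlencode(params)  # repeated fq keys are kept as separate pairs
print(query_string)
```

Using a list of pairs instead of a dict matters here, because fq appears more than once.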
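I know this is a dirty, complex way to make it work, but it does work, and that's what I needed. It also requires the ability to query your documents before inserting them into Solr, which calls for some development. If anyone has a simpler solution I would still love to hear it.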
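For what it's worth, the pre-indexing step can be fairly small. The sketch below assumes the whole batch of documents is available in memory as dictionaries before it is sent to Solr; `add_group_facet_field` and its parameter names are illustrative and not part of Solr or any client library.

```python
from collections import defaultdict


def add_group_facet_field(docs, group_field="workid",
                          source_field="type", facet_field="facettype"):
    """Return copies of docs carrying every facet value found in their group."""
    values_by_group = defaultdict(set)
    for doc in docs:
        values_by_group[doc[group_field]].add(doc[source_field])
    # Sort so the multi-valued field is stable from one run to the next.
    return [
        {**doc, facet_field: sorted(values_by_group[doc[group_field]])}
        for doc in docs
    ]


docs = [
    {"id": "1", "workid": "abc", "type": "book"},
    {"id": "2", "workid": "abc", "type": "ebook"},
    {"id": "3", "workid": "abc", "type": "ebook"},
]
enriched = add_group_facet_field(docs)
print(enriched[0]["facettype"])  # ['book', 'ebook']
```

The catch is that adding a document with a new type to an existing group means every other document in that group has to be re-indexed as well.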