Multi-Select Facet with Solr, Vue and Go
Contents
- Introduction
- Indexing and querying
- API Implementation
- Front-end implementation
- Conclusion
- References
Changelog
- 02/10/2021: Update examples to use the latest version of solr-go.
Introduction
A multi-select facet lets users quickly filter a product catalog and zero in on the product they need. Most e-commerce websites support this feature, and if you have shopped online before, you have probably used it.
I needed to support exactly the same feature for one of my projects, but I didn’t know where to start. I didn’t even know what this feature was called. I scoured the internet until, luckily, I found this amazing blog post. It’s a very detailed explanation of how to implement a multi-select facet query in Solr.
In this blog post, we’re going to take it further and implement a web API and a single-page application.
Indexing and querying
The data
We’re going to use smartphones and e-readers as our product data since they both have similar attributes. Each record will have a set of SKUs. For example, the first record in our data contains an Apple iPhone 11 Pro Max, which has 2 variants. The first one has a color of Black with a storage capacity of 256GB and the other one has a color of Space Grey with a storage capacity of 64GB.
[
{
"id": "1",
"name": "Apple iPhone 11 Pro Max",
"brand": "Apple",
"category": "Electronic Devices",
"productType": "Smartphones",
"docType": "product",
"skus": [
{
"id": "10",
"docType": "sku",
"colorFamily_s": "Black",
"operatingSystem_s": "iOS",
"storageCapacity_s": "256GB"
},
{
"id": "11",
"docType": "sku",
"colorFamily_s": "Space Grey",
"operatingSystem_s": "iOS",
"storageCapacity_s": "64GB"
}
]
}
...
]
Defining our schema
From the above, we have the following list of fields.
- name - e.g. Apple iPhone 11 Pro Max
- brand - e.g. Apple
- category - e.g. Electronic Devices
- productType - e.g. Smartphones, E-readers
- docType - indicates the document type e.g. product, sku. The use of this field will become more apparent in the next sections.
You may be asking: why are we only defining those fields? Great question.
The sku fields colorFamily_s, operatingSystem_s and storageCapacity_s are dynamic fields, which means Solr can infer the type of the field from its suffix (in the default configset, *_s maps to a string field). This is really handy when you’re dealing with data that might have dynamic attributes, like an sku. Also, the skus field itself does not need to be defined in the schema. More details on nested child documents.
...
fields := []solr.Field{
{
Name: "name",
Type: "text_general",
Indexed: true,
Stored: true,
},
{
Name: "category",
Type: "text_gen_sort",
Indexed: true,
Stored: true,
},
{
Name: "brand",
Type: "text_gen_sort",
Indexed: true,
Stored: true,
},
{
Name: "productType",
Type: "string",
Indexed: true,
Stored: true,
},
{
Name: "docType",
Type: "string",
Indexed: true,
Stored: true,
},
}
err := solrClient.AddFields(ctx, collection, fields...)
...
Indexing the data
Indexing is pretty straightforward: open the data file, feed it to Solr, and commit. That’s it!
// open the JSON file containing the data (read-only access is enough)
f, err := os.Open(dataPath)
...
// send the json to solr
err = solrClient.Update(ctx, collection, solr.JSON, f)
...
// commit
err = solrClient.Commit(ctx, collection)
...
Now, we should be able to see that we have successfully indexed our data. Notice the fq param is set to docType:product. This filters out all the non-product documents.
But where are the product skus? Another great question.
Nested child documents are indexed like regular documents. Internally, Solr knows which sku documents are related to which products.
To help us identify which document is a product or sku, we are using the docType field. This is the suggested way to handle nested child documents in Solr.
We can include the skus in each product by specifying the value [child] for the fl parameter.
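As a rough sketch of the request described above, the snippet below builds a select URL that keeps only product documents and attaches their nested skus. It uses only the Go standard library; the base URL (Solr's default port 8983) and the collection name are placeholders chosen for illustration, not values from this project.

```go
package main

import (
	"fmt"
	"net/url"
)

// productsURL builds a Solr /select URL that returns only product documents
// with their nested skus attached. baseURL and collection are supplied by
// the caller; the values used in main are hypothetical.
func productsURL(baseURL, collection string) string {
	params := url.Values{}
	params.Set("q", "*:*")
	params.Set("fq", "docType:product") // filter out the sku documents
	params.Set("fl", "*,[child]")       // all stored fields plus nested children
	return fmt.Sprintf("%s/solr/%s/select?%s", baseURL, collection, params.Encode())
}

func main() {
	// Assumed local Solr instance and collection name.
	fmt.Println(productsURL("http://localhost:8983", "multi-select-demo"))
}
```

Building the query string with url.Values keeps characters such as `:` and `[` properly escaped, which is easy to get wrong when concatenating strings by hand.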
Query with facet
The query in this example is heavily inspired by the blog post mentioned earlier. I suggest reading it thoroughly before proceeding to the next section.
{
"query": "{!parent tag=top filters=$skuFilters which=docType:product v=docType:sku}",
"queries": {
"skuFilters": ["{!tag=colorFamily_s}colorFamily_s:Black"]
},
"filter": ["{!tag=top}brand:Amazon"],
"facet": {
"Brand": {
"facet": { "productCount": "uniqueBlock(_root_)" },
"field": "brand",
"limit": -1,
"type": "terms"
},
"Color Family": {
"domain": {
"excludeTags": "top",
"filter": [
"{!filters param=$skuFilters excludeTags=colorFamily_s v=docType:sku}",
"{!child of=docType:product filters=$filter v=docType:product}"
]
},
"facet": { "productCount": "uniqueBlock(_root_)" },
"field": "colorFamily_s",
"limit": -1,
"type": "terms"
}
}
}
Let’s break it down a little so that we can get a better understanding of each part of the query.
Here, we’re using the block join parent query parser. This parser takes a query value (v=docType:sku) and a filter (filters=$skuFilters) that match the child documents (skus) and returns their parents (which=docType:product).
Note the $ (dollar symbol) in filters=$skuFilters. We use this syntax to reference other values in the query body.
Also, we’re specifying tag=top, which will be useful when we want to exclude the product filters in the child facet settings. More details on tagging and excluding filters.
{
"query": "{!parent tag=top filters=$skuFilters which=docType:product v=docType:sku}",
"queries": {
"skuFilters": ["{!tag=colorFamily_s}colorFamily_s:Black"]
},
"filter": ["{!tag=top}brand:Amazon"],
...
}
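To make the local-params syntax a bit more concrete, here is a minimal sketch of two helpers that render the strings used in the request above. The function names parentQuery and taggedFilter are made up for this illustration and are not part of solr-go. Values are inserted verbatim, so a value containing spaces (e.g. Space Grey) would need to be quoted by the caller.

```go
package main

import "fmt"

// parentQuery renders a block join parent query in local-params syntax.
// filtersParam is the name of the entry under "queries" that holds the
// child (sku) filters; it is referenced with a leading $.
func parentQuery(tag, filtersParam, which, childQuery string) string {
	return fmt.Sprintf("{!parent tag=%s filters=$%s which=%s v=%s}",
		tag, filtersParam, which, childQuery)
}

// taggedFilter renders a field filter carrying a tag, so that a facet can
// later exclude it by that tag. The value is inserted as-is (no quoting).
func taggedFilter(tag, field, value string) string {
	return fmt.Sprintf("{!tag=%s}%s:%s", tag, field, value)
}

func main() {
	fmt.Println(parentQuery("top", "skuFilters", "docType:product", "docType:sku"))
	fmt.Println(taggedFilter("colorFamily_s", "colorFamily_s", "Black"))
	fmt.Println(taggedFilter("top", "brand", "Amazon"))
}
```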
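Next, we’re specifying a terms facet. This will produce a list of values based on the field colorFamily_s, which only applies to sku documents. We’re excluding the top filter so it doesn’t get applied to this facet setting. Inside the domain, the first filter re-applies the sku filters except the one tagged colorFamily_s, so the other colors stay selectable, and the second one limits the skus to children of products that pass the product filters. We’ve also included "productCount": "uniqueBlock(_root_)" to count the unique products that match the child skus.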
{
...
"facet": {
...
"Color Family": {
"field": "colorFamily_s",
"type": "terms",
"limit": -1,
"facet": {
"productCount": "uniqueBlock(_root_)"
},
"domain": {
"excludeTags": "top",
"filter": [
"{!filters param=$skuFilters excludeTags=colorFamily_s v=docType:sku}",
"{!child of=docType:product filters=$filter v=docType:product}"
]
}
}
}
}
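Since every sku facet follows the same pattern, the facet definition can be generated per field. The sketch below assumes, as in the query above, that each sku filter is tagged with the name of its field. skuFacet is a hypothetical helper, not a solr-go function.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// skuFacet builds a terms facet over a sku field. The facet excludes the
// "top" tag and its own filter (tagged with the field name, an assumption
// carried over from the query above) so that unselected values still show.
func skuFacet(field string) map[string]any {
	return map[string]any{
		"type":  "terms",
		"field": field,
		"limit": -1,
		"facet": map[string]any{"productCount": "uniqueBlock(_root_)"},
		"domain": map[string]any{
			"excludeTags": "top",
			"filter": []string{
				fmt.Sprintf("{!filters param=$skuFilters excludeTags=%s v=docType:sku}", field),
				"{!child of=docType:product filters=$filter v=docType:product}",
			},
		},
	}
}

func main() {
	facets := map[string]any{
		"Color Family":     skuFacet("colorFamily_s"),
		"Storage Capacity": skuFacet("storageCapacity_s"),
	}
	out, err := json.MarshalIndent(facets, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The resulting map can be placed under the "facet" key of the request body, next to the product-level facets.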
The next facet setting is very similar to the one above but a lot simpler, as it only applies to top-level documents (products).
{
...
"facet": {
"Brand": {
"field": "brand",
"limit": -1,
"type": "terms",
"facet": {
"productCount": "uniqueBlock(_root_)"
}
}
...
}
}
Let’s test our query and see the results.
{
"Brand": {
"buckets": [
{
"val": "Amazon",
"count": 1,
"productCount": 1
}
]
},
"Color Family": {
"buckets": [
{
"val": "Black",
"count": 6,
"productCount": 5
},
{
"val": "Space Grey",
"count": 2,
"productCount": 2
},
{
"val": "Blue",
"count": 1,
"productCount": 1
},
...
]
}
}
For each bucket, we have val, which is the unique value for the field; count, which indicates the number of skus that matched; and productCount, which indicates the number of unique products that matched. For example, if 2 skus match the query but both belong to the same parent product, productCount is 1.
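The bucket structure above maps naturally onto Go structs. The following sketch decodes a facet object with encoding/json; the type and function names are illustrative. It assumes val is always a string (true for the string fields used here) and skips the top-level count entry that Solr places next to the named facets.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Bucket is one value of a terms facet.
type Bucket struct {
	Val          string `json:"val"`
	Count        int    `json:"count"`        // number of matching skus
	ProductCount int    `json:"productCount"` // number of unique parent products
}

// Facet holds the buckets of a single terms facet.
type Facet struct {
	Buckets []Bucket `json:"buckets"`
}

// parseFacets decodes the "facets" object of a Solr JSON facet response.
func parseFacets(data []byte) (map[string]Facet, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	facets := make(map[string]Facet, len(raw))
	for name, msg := range raw {
		if name == "count" {
			continue // overall document count, not a facet
		}
		var f Facet
		if err := json.Unmarshal(msg, &f); err != nil {
			return nil, fmt.Errorf("facet %q: %w", name, err)
		}
		facets[name] = f
	}
	return facets, nil
}

func main() {
	sample := []byte(`{"count": 1, "Brand": {"buckets": [{"val": "Amazon", "count": 1, "productCount": 1}]}}`)
	facets, err := parseFacets(sample)
	if err != nil {
		panic(err)
	}
	for name, f := range facets {
		for _, b := range f.Buckets {
			fmt.Printf("%s: %s (skus=%d, products=%d)\n", name, b.Val, b.Count, b.ProductCount)
		}
	}
}
```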
API implementation
Our API will have a /search endpoint, which will simplify the querying and processing of results for us. Optionally, we can have a /suggest endpoint for an autosuggest feature.
$ curl 'http://localhost:8081/search?q=Apple&colorFamilies=Black%2CGold&operatingSystems=iOS'
Response:
{
"facets": [
{
"buckets": [
{
"productCount": 1,
"skuCount": 1,
"val": "Apple"
}
],
"name": "Brand",
"param": "brands"
},
{
"buckets": [
{
"productCount": 2,
"skuCount": 2,
"val": "Space Grey"
},
{
"productCount": 1,
"skuCount": 1,
"val": "Black"
},
{
"productCount": 1,
"skuCount": 1,
"val": "Gold"
},
{
"productCount": 1,
"skuCount": 1,
"val": "Midnight Green"
}
],
"name": "Color Family",
"param": "colorFamilies"
},
...
],
"products": [
{
"brand": "Apple",
"category": "Electronic Devices",
"id": "1",
"name": "Apple iPhone 11 Pro Max",
"productType": "Smartphones"
}
]
}
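One way the /search handler could translate its query parameters into sku filters is sketched below. This is an assumption about the approach, not the code from the repository: each comma-separated parameter becomes a single tagged filter whose values are combined with OR, which is what lets several values of the same facet be selected at once. The parameter-to-field mapping is also assumed, based on the parameter names shown in the request and response above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// skuParams maps API query parameters to sku fields (assumed mapping).
var skuParams = []struct{ param, field string }{
	{"colorFamilies", "colorFamily_s"},
	{"operatingSystems", "operatingSystem_s"},
}

// phraseEscaper escapes backslashes and double quotes inside a quoted phrase.
var phraseEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`)

// skuFilters turns comma-separated parameter values into tagged filters,
// one per field, with the selected values ORed together.
func skuFilters(params map[string][]string) []string {
	var filters []string
	for _, p := range skuParams {
		var terms []string
		for _, raw := range params[p.param] {
			for _, v := range strings.Split(raw, ",") {
				if v = strings.TrimSpace(v); v != "" {
					terms = append(terms, `"`+phraseEscaper.Replace(v)+`"`)
				}
			}
		}
		if len(terms) == 0 {
			continue // nothing selected for this facet
		}
		filters = append(filters, fmt.Sprintf("{!tag=%s}%s:(%s)",
			p.field, p.field, strings.Join(terms, " OR ")))
	}
	return filters
}

func main() {
	q, err := url.ParseQuery("q=Apple&colorFamilies=Black%2CGold&operatingSystems=iOS")
	if err != nil {
		panic(err)
	}
	for _, f := range skuFilters(q) {
		fmt.Println(f)
	}
}
```

The returned slice would be used as the skuFilters entry under "queries" in the request body. Quoting each value keeps multi-word values such as Space Grey intact.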
Front-end implementation
Our web app will display the filters and the list of products that matched. We should be able to change the filters by checking/unchecking the filters on the left side of the screen.
Conclusion
For beginners like me, implementing a multi-select facet in Solr is a very intimidating task when you don’t know where to start. Thankfully, Solr has very good documentation, and there are blog posts that explain how to implement it, which guided me in my implementation.
The complete implementation can be found in this repository. If you have any suggestions, you can open an issue in the repository.