Gravatar for timsodhi@health.usf.edu

Question by timothevs, Sep 2, 2015 12:06 PM

Best way to index multiple values for an entity from Database source

I am looking for some pointers here on how to index and facet based on our goal here.

We have a list of people, who can select multiple characteristics for themselves which then should show up as facets while searching for those people. For example, person 1 can choose multiple values from Happy, Sad, Angry, Frustrated etc. All of this goes to our DB in a special table indexed by userid and moodtype.

Now we would like to bring these mood types into Coveo for searching as facets. A user should be able to select Happy, Sad, Frustrated from the sidebar and see Person 1.

I am trying to understand how this would work with custom fields while indexing.

  • Do you create a separate field for each mood type and mark Person 1 as having that mood type? For example a custom field called happyMood and set that to Happy or Y for Person 1?

  • Or do you create a custom field called moodType and set it to a long string containing Happy; Frustrated; etc.?

How does one create a facet in that case? Any help or guidance would be great!

1 Reply
Gravatar for jflheureux@coveo.com

Answer by Jean-François L'Heureux, Sep 2, 2015 12:46 PM

Hi Tim,

Coveo Facets can only list values of a single field. Thus, you should index all the person's moods in a single CES moodType custom string field and separate the values by a semi-colon like this: "Happy;Sad;Angry;Frustrated".

When creating your custom field in CES, make sure to check the "Allow faceted search on a field containing multiple values" option to tell CES that the content of the field will be semi-colon separated values.

Once the external source is rebuilt and you see the desired values for the field in the CES Index Browser, you will be ready to add your facet. I suppose you want to display that facet in a Coveo for Sitecore search page.

First, you will have to add the moodType field as an external field in Coveo for Sitecore to avoid the field translation (@fmoodtypeXXXXX) (see FieldMap).

<fieldMap type="...">
  ...
  <externalFields hint="raw:AddExternalField">
    ...
    <field fieldName="moodtype"/>
  </externalFields>
</fieldMap>

Then, create a facet on that field in your page/experience editor and you will have all the values as distinct facet values. You may have to check the "Is multi-value field" option in your Coveo for Sitecore facet component properties to have the values split on the semi-colon character.

If the moodtype field is not listed in the "Field" field of the facet component options, you need to add a new facet field definition in Sitecore: /sitecore/System/Settings/Buckets/Facets. Make sure to use the exact same CES field name in the "Field Name" field of that facet field definition item.

Hope this helps.

Jeff

Gravatar for timsodhi@health.usf.edu

Comment by timothevs, Sep 2, 2015 1:22 PM

Perfect! That is exactly what I was hoping for.

The part I was unsure of was if I did put all the values in a single CES custom field, would it show them as separate values in the facet in Sitecore? It seems like it would.

Thank you!

Just curious, how does one aggregate multiple rows into one field in the ODBC file? Would one need a subquery?

Gravatar for sdesilets@coveo.com

Comment by Sebastien Desilets, Sep 2, 2015 2:04 PM

Yes, you will need a subquery.

https://onlinehelp.coveo.com/en/ces/7.0/administrator/complementinformationretrievalwithsubqueriesforthedatabaseconnector.htm

You will join all values from the subquery with the ";" separator, allowing you to use those values in a multi values field.

Gravatar for sdesilets@coveo.com

Comment by Sebastien Desilets, Sep 2, 2015 2:36 PM

For simplicity's sake I recommend using a subquery in the source configuration file.

But for the sake of completeness, it's worth noting the database you are using probably has some sort of concatenation function such as MySQL GROUP_CONCAT, Oracle LISTAGG or the awkward FOR XML PATH in SQL Server. Any of those should outperform using a subquery in the configuration file.

Gravatar for timsodhi@health.usf.edu

Comment by timothevs, Sep 3, 2015 12:21 PM

Thanks, I did end up using LISTAGG to concatenate, although Sub Queries worked as well. But I noticed that the returned index had titles all messed up for the last 100 or so results. I had the title field mapped to %[LASTNAME], %[FIRSTNAME], %[TITLE].

For the final 100 results, the index browser shows many with just their title, eg instead of Horner, James PhD, it comes back as PhD. Does that make sense? What could be going wrong? The fields are populated for all the records in question.

Gravatar for jflheureux@coveo.com

Comment by Jean-François L'Heureux, Sep 3, 2015 9:17 PM

If your database source mapping file defines a body for your indexed documents, it may be caused by the title selection sequence of your source (see Modifying General Source Parameters)

If you set the "Use the filename" option in the first position, I think the title in the mapping file, if any, will be used.

However, if you set the "Use the title extracted by the converter" option before the "Use the filename" option, the title extracted by the converter, if any, will be used instead.

Ask a question