Unify Enterprise Search Remote indexing template

module: Content module
supplier: VYRE Ltd.


The remote indexing template dictates what information from a content store is stored and indexed, and how.

Creating

To create a remote indexing template (RIT) for a content store, go to the store (see Datastore: Content and Metadata), click 'Edit...' and then 'Remote Indexing Template'. Make sure the option 'Use remote index for this store' is checked.

To submit all items for that store to the remote index, click 'Edit...' and then 'Send all items to remote index'.

Template Code

The template uses FreeMarker to represent a content store item as XML.

Here is a sample template that stores two attributes from a content store:

  <SearchableItem>
    <uuid>${item.id}</uuid>
    <documentBoost>1.0</documentBoost>
    <properties>
      <entry>
        <string>title</string>
        <SearchableProperty>
          <name>title</name>
          <value><![CDATA[${item.name}]]></value>
          <boost>1.0</boost>
          <stored>true</stored>
          <indexed>true</indexed>
          <tokenized>true</tokenized>
        </SearchableProperty>
      </entry>
      <entry>
        <string>description</string>
        <SearchableProperty>
          <name>description</name>
          <value><![CDATA[${item.description}]]></value>
          <boost>1.0</boost>
          <stored>true</stored>
          <indexed>true</indexed>
          <tokenized>true</tokenized>
        </SearchableProperty>
      </entry>
    </properties>
  </SearchableItem>

The template represents the overall structure of the item as XML, with content store attributes inserted or calculated using FreeMarker expressions and interpolations.
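As a rough illustration of a "calculated" value, the fragment below applies a FreeMarker built-in and the default value operator inside the interpolations. It is a sketch only: it assumes a FreeMarker version that supports the `!` operator (2.3.7 or later), and the fallback text 'No description' is a made-up placeholder, not something the product defines.

```xml
<entry>
  <string>title</string>
  <SearchableProperty>
    <name>title</name>
    <#-- ?trim strips leading and trailing whitespace before indexing -->
    <value><![CDATA[${item.name?trim}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
<entry>
  <string>description</string>
  <SearchableProperty>
    <name>description</name>
    <#-- hypothetical fallback text, used when item.description is missing -->
    <value><![CDATA[${item.description!"No description"}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
```

FreeMarker comments (`<#-- ... -->`) are dropped when the template is rendered, so they do not appear in the XML sent to the index. On older FreeMarker versions the `?default(...)` built-in plays the same role as `!`.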

Configuration

SearchableItem

This is the item indexed by the search engine. It requires three child nodes: uuid, documentBoost and properties, a map of SearchableProperty entries describing each attribute.

uuid

A unique identifier for this document.

documentBoost

A float value representing the overall score multiplier given to this document at index time. For example, a documentBoost value of 1.5 means that this document is given an inflated score that is 50% higher than its natural rank. This allows you to specify that documents from one datastore are to be considered more relevant than others.
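A minimal sketch of a template for a store whose items should rank above others, using the 1.5 value from the description. Taking the multiplier description at face value, a document whose natural score is 2.0 would be ranked as if it scored 3.0; how the search server normalizes scores in practice is not covered here.

```xml
<SearchableItem>
  <uuid>${item.id}</uuid>
  <#-- every item from this store is boosted by 50% -->
  <documentBoost>1.5</documentBoost>
  <properties>
    <entry>
      <string>title</string>
      <SearchableProperty>
        <name>title</name>
        <value><![CDATA[${item.name}]]></value>
        <boost>1.0</boost>
        <stored>true</stored>
        <indexed>true</indexed>
        <tokenized>true</tokenized>
      </SearchableProperty>
    </entry>
  </properties>
</SearchableItem>
```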

properties

A map of searchable properties that represent each stored value.

SearchableProperty

A searchable property encapsulates the value of a single attribute from a content store item and dictates how the search server should treat it at index time. Note that this is an XML serialization of a Java object, which is why (at present) there is a redundant 'string' element in the XML representation. TODO: investigate whether name could be used to represent a 'display value' rather than act purely as an identifier.

name

The identifier or name of the indexed attribute, e.g. 'title'

value

The value of the indexed attribute.

boost

A float value representing the overall score multiplier given to keyword hits in this attribute at query time. This means that a hit in a given attribute can be given a higher or lower weighting than hits in other attributes. For example, a hit in a 'title' field could be considered more valuable to natural rank than a hit in a 'description' or 'body' field.
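The title-versus-description case could be expressed roughly as follows. The boost values 2.0 and 1.0 are illustrative choices, not recommended settings.

```xml
<entry>
  <string>title</string>
  <SearchableProperty>
    <name>title</name>
    <value><![CDATA[${item.name}]]></value>
    <#-- hits in the title count twice as much as hits in the description -->
    <boost>2.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
<entry>
  <string>description</string>
  <SearchableProperty>
    <name>description</name>
    <value><![CDATA[${item.description}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
```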

stored

'true' or 'false' indicating whether the value should be stored verbatim in the index for displaying in search results.

indexed

'true' or 'false' indicating whether the value should be indexed, i.e. whether it is searchable rather than kept for display purposes only.

tokenized

'true' or 'false' indicating whether or not to tokenize the value of the property. The tokenization process splits the value according to predefined rules, so that individual tokens or words become searchable. In effect this indicates whether the individual constituent parts of the value are searchable or whether the attribute as a whole represents a single value.

Example:

If tokenized, the sentence "The quick brown fox jumps over the lazy dog" would yield a result for the keyword 'fox'. If not tokenized, it would only be considered a valid hit if the query was for the entire string.
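The stored, indexed and tokenized flags can be combined per property. The sketch below shows two combinations; the property names 'titleExact' and 'descriptionDisplay' are hypothetical and chosen only for illustration.

```xml
<#-- exact-match copy of the title: searchable only as a whole value -->
<entry>
  <string>titleExact</string>
  <SearchableProperty>
    <name>titleExact</name>
    <value><![CDATA[${item.name}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>false</tokenized>
  </SearchableProperty>
</entry>
<#-- display-only copy of the description: kept for result pages, never matched by a query -->
<entry>
  <string>descriptionDisplay</string>
  <SearchableProperty>
    <name>descriptionDisplay</name>
    <value><![CDATA[${item.description}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>false</indexed>
    <tokenized>false</tokenized>
  </SearchableProperty>
</entry>
```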

Advanced

TODO: Solr schema xml, Solr config files and rabida configuration files which override settings applied in the remote indexing template.


Helper Object

To help with retrieving values in the RIT, a helper object is added. Methods can be called on this object to retrieve different information.

For example ${helper.getAttributeValue( '18' )} will return the String value for attribute 18 of the current item (or the default value, see below).
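Inside a template, a helper call takes the place of a plain item expression in the value element. The fragment below is a sketch that uses the two-argument overload listed in the method table; the property name 'summary' and the fallback 'n/a' are illustrative assumptions, and '18' is simply the attribute id from the example above.

```xml
<entry>
  <string>summary</string>
  <SearchableProperty>
    <name>summary</name>
    <#-- 'n/a' is indexed when attribute 18 is null for the current item -->
    <value><![CDATA[${helper.getAttributeValue('18', 'n/a')}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
```

If attribute names are easier to maintain than ids, getAttributeValueByName accepts the same arguments with a name in place of the id.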

| Method Call | Return Object | Description |
| --- | --- | --- |
| getAttributeValue( String attributeId ) | String | Returns the attribute value for the current item, falling back to the helper's default value. |
| getAttributeValue( String attributeId, String defaultValue ) | String | Returns the attribute value for the current item, or defaultValue if the attribute is null. |
| getAttributeValue( Item item, String attributeId, String defaultValue ) | String | Returns the string value for the given attribute id of the passed item. Assumes the item's content and metadata are already loaded. |
| getAttributeValue( Item item, String attributeId, String defaultValue, boolean loadContentMetadata ) | String | Returns the value for the given attribute id of the passed item, or defaultValue if the value is not present. |
| getAttributeValueByName( String attributeName ) | String | Returns the attribute value for the current item, falling back to the helper's default value. |
| getAttributeValueByName( String attributeName, String defaultValue ) | String | Returns the attribute value for the current item, or defaultValue if the attribute is null. |
| getAttributeValueByName( Item item, String attributeName, String defaultValue ) | String | Returns the string value for the given attribute name of the passed item. Assumes the item's content and metadata are already loaded. |
| getAttributeValueByName( Item item, String attributeName, String defaultValue, boolean loadContentMetadata ) | String | Returns the value for the given attribute name of the passed item, or defaultValue if the value is not present. |
| getLinkedItemInfos( String linkDefId ) | List<ItemInfo> | Gets the linked ItemInfos for the current item (that have not been deleted). |
| getLinkedItemInfos( Item item, String linkDefId ) | List<ItemInfo> | Gets the linked ItemInfos for the passed item (that have not been deleted). |
| getLinkedItems( String linkDefId, boolean loadContentMetadata ) | List<Item> | Gets the linked items for the current item. The content and metadata should not be loaded unless they are needed for all the items. |
| getLinkedItems( Item item, String linkDefId, boolean loadContentMetadata ) | List<Item> | Gets the linked items for the passed item (that have not been deleted). The content and metadata should not be loaded unless they are needed for all the items. |
| getItemFromItemInfo( ItemInfo itemInfo, boolean loadContentMetadata ) | Item | Returns the item for the passed ItemInfo. If loadContentMetadata is true, the content and metadata are loaded. |
| getDefaultValue() | String | Gets the default value that the helper will use if an attribute is null. |
| setDefaultValue( String defaultValue ) | void | Sets the default value that the helper will use if an attribute is null. |

Notes on the four-argument forms of getAttributeValue and getAttributeValueByName:

If true is passed for loadContentMetadata, the content and metadata are loaded. This should be used sparingly. For example, if only one attribute is needed for a linked item then this can be used, but if many attributes are needed then .loadContentAndMetadata() should be called on the item once and false passed to the subsequent attribute calls.

N.B. If null is passed as the defaultValue, the default value for the helper is used.
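The linked-item methods can be combined with the attribute methods to index values from related items. The following sketch joins one attribute from every linked item into a single property. The link definition id '5', the attribute id '18' and the property name 'relatedTitles' are hypothetical. Content and metadata are loaded once by getLinkedItems, because the attribute is read from every linked item, so false is passed to each attribute call.

```xml
<#-- '5' is a hypothetical link definition id -->
<#assign linkedItems = helper.getLinkedItems('5', true)>
<entry>
  <string>relatedTitles</string>
  <SearchableProperty>
    <name>relatedTitles</name>
    <#-- comma-separated values of attribute 18 from each linked item; '' is used when a value is missing -->
    <value><![CDATA[<#list linkedItems as linked>${helper.getAttributeValue(linked, '18', '', false)}<#if linked_has_next>, </#if></#list>]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
```

The `linked_has_next` loop variable is standard FreeMarker 2.3 syntax and is used here so that no trailing separator is written after the last item.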

Default fields indexed

The fields below are all stored automatically when sending an item to the remote index.

| Variable Name | Stored Value |
| --- | --- |
| uuid | id of the item |
| active | whether this item is active, will be "true" or "false" |
| title | name field of the item |
| description | description field of the item |
| keywords | keyword field of the item |
| collection | id of the collection |
| creationDate | date the item was created (in Solr format, e.g. 2007-12-24T23:59:59Z) |
| lastModifiedDate | date the item was modified (in Solr format, e.g. 2007-12-24T23:59:59Z) |
| creator | profile id of the item creator |
| lastModifier | profile id of the last modifier of the item |
| category | stores the string values of the taxonomy categories |
| categoryIds | stores the id values of the taxonomy categories |
| leafCategoryIds | the leaf categories (see the note below the table) |
| locale | locale of the item |
| secondary | "true" if this is a secondary item |
| primary | "false" if this is a secondary item |
| primaryItem | id of the primary item ("none" if this is a primary item) |
| secondaryItem_{locale} | e.g. secondaryItem_ca_ES is the id of the Catalan version of this primary item (or of this secondary item's primary item) |

N.B. If an item belongs to category X and to none of the descendants of X, then we say that X is a "leaf category" of that item.
