Unify Enterprise Search Remote indexing template
| Unify Enterprise Search Remote indexing template | |
|---|---|
| module: | Content module |
| supplier: | VYRE Ltd. |
The remote indexing template dictates what information from a content store is stored and indexed, and how.
Creating
To create a remote indexing template (RIT) for a content store, go to the store (see Datastore: Content and Metadata), click 'Edit...' and then 'Remote Indexing Template'. Make sure the option 'Use remote index for this store' is checked.
To submit all items for that store to the remote index, click 'Edit...' and then 'Send all items to remote index'.
Template Code
The template is created to represent the content store item as XML using FreeMarker.
Here is a sample template that stores two attributes from a content store:

    <SearchableItem>
      <uuid>${item.id}</uuid>
      <documentBoost>1.0</documentBoost>
      <properties>
        <entry>
          <string>title</string>
          <SearchableProperty>
            <name>title</name>
            <value><![CDATA[${item.name}]]></value>
            <boost>1.0</boost>
            <stored>true</stored>
            <indexed>true</indexed>
            <tokenized>true</tokenized>
          </SearchableProperty>
        </entry>
        <entry>
          <string>description</string>
          <SearchableProperty>
            <name>description</name>
            <value><![CDATA[${item.description}]]></value>
            <boost>1.0</boost>
            <stored>true</stored>
            <indexed>true</indexed>
            <tokenized>true</tokenized>
          </SearchableProperty>
        </entry>
      </properties>
    </SearchableItem>

As you can see, the template represents the overall structure of the item as XML, with content store attributes inserted or calculated using FreeMarker expressions or interpolations.
Configuration
SearchableItem
This is the item indexed by the search engine. It requires three child nodes: uuid, documentBoost and properties, the last of which is a map of SearchableProperty entries describing each attribute.
uuid
A unique identifier for this document.
documentBoost
A float value representing the overall score multiplier given to this document at index time. For example, a documentBoost value of 1.5 gives the document a score 50% higher than its natural rank. This allows you to specify that documents from one datastore are to be considered more relevant than those from others.
properties
A map of searchable properties that represent each stored value.
SearchableProperty
A searchable property encapsulates the value of a single attribute from a content store item and dictates how the search server should treat it at index time. Note that this is an XML serialization of the Java object, which is why (at present) there is a redundant 'string' element in the XML representation. TODO: investigate whether name could be used to represent a 'display value' rather than act purely as an identifier.
name
The identifier or name of the indexed attribute, e.g. 'title'
value
The value of the indexed attribute.
boost
A float value representing the overall score multiplier given to keyword hits in this attribute at query time. This means that a hit in a given attribute can be given more or less weight than a hit in others. For example, a hit in a 'title' field could be considered more valuable to natural rank than a hit in a 'description' or 'body' field.
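A minimal sketch of how a title hit could be weighted above other fields, reusing the structure of the sample template. The boost of 2.0 is an arbitrary illustrative value, not a recommended setting.

```xml
<entry>
  <string>title</string>
  <SearchableProperty>
    <name>title</name>
    <value><![CDATA[${item.name}]]></value>
    <#-- Illustrative value only: a FreeMarker comment, so it is not emitted into the XML -->
    <boost>2.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
```

With the description property left at 1.0, a keyword hit in the title would carry twice the weight of the same hit in the description.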
stored
'true' or 'false' indicating whether the value should be stored verbatim in the index for displaying in search results.
indexed
'true' or 'false' indicating whether the value should be indexed, i.e. whether it is searchable rather than stored for display purposes only.
tokenized
'true' or 'false' indicating whether or not to tokenize the value of the property. The tokenization process splits the value according to predefined rules meaning that individual tokens or words will be searchable. In effect this indicates whether the individual constituent parts of the value are searchable or whether the attribute as a whole represents a single value.
Example:
If tokenized, the sentence "The quick brown fox jumps over the lazy dog" would yield a result for the keyword 'fox'. If non-tokenized, it would only be considered a valid hit if the query was for the entire string.
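The stored, indexed and tokenized flags can be combined per property. The sketch below assumes a hypothetical 'reference' property whose value should only match as a whole, for instance a code or identifier. It reads attribute '18' through the helper object described under Helper Object; both the property name and the attribute id are placeholders.

```xml
<entry>
  <string>reference</string>
  <SearchableProperty>
    <name>reference</name>
    <#-- Placeholder attribute id; replace with an id from your own content store -->
    <value><![CDATA[${helper.getAttributeValue('18')}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <#-- Not tokenized: only a query for the entire value counts as a hit -->
    <tokenized>false</tokenized>
  </SearchableProperty>
</entry>
```

Setting indexed to false instead, with stored left at true, would keep the value available for display in search results without making it searchable.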
Advanced
TODO: Solr schema xml, Solr config files and rabida configuration files which override settings applied in the remote indexing template.
Helper Object
To help with retrieving values in the RIT, a helper object is added. Methods can be called on this object to retrieve different information.
For example ${helper.getAttributeValue( '18' )} will return the String value for attribute 18 of the current item (or the default value, see below).
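As a rough illustration, a helper call can be placed wherever the sample template uses an item field. The property name 'summary' and the fallback 'n/a' below are assumptions made for the example; attribute '18' is the id used in the text above.

```xml
<entry>
  <string>summary</string>
  <SearchableProperty>
    <name>summary</name>
    <#-- Two-argument form: returns 'n/a' when attribute 18 is null -->
    <value><![CDATA[${helper.getAttributeValue('18', 'n/a')}]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
```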
| Method Call | Return Object | Description |
|---|---|---|
| getAttributeValue( String attributeId ) | String | Returns the attribute value for the current item, or the helper's default value if the attribute is null. |
| getAttributeValue( String attributeId, String defaultValue ) | String | Returns the attribute value for the current item, or defaultValue if the attribute is null. |
| getAttributeValue(Item item, String attributeId, String defaultValue ) | String | Returns the string value for the given attribute id; assumes the item's content and metadata have already been loaded. |
| getAttributeValue(Item item, String attributeId, String defaultValue, boolean loadContentMetadata ) | String | Returns the value of the given attribute id for the item passed, or defaultValue if the value is not present. If true is passed for loadContentMetadata, the content and metadata are loaded; this should be used sparingly. For example, if only one attribute is needed from a linked item then true can be passed, but if many attributes are needed then .loadContentAndMetadata() should be called on the item once and false passed to each attribute call. N.B. If null is passed as the defaultValue, the helper's default value is used. |
| getAttributeValueByName( String attributeName ) | String | Returns the attribute value for the current item, or the helper's default value if the attribute is null. |
| getAttributeValueByName( String attributeName, String defaultValue ) | String | Returns the attribute value for the current item, or defaultValue if the attribute is null. |
| getAttributeValueByName(Item item, String attributeName, String defaultValue ) | String | Returns the string value for the given attribute name; assumes the item's content and metadata have already been loaded. |
| getAttributeValueByName(Item item, String attributeName, String defaultValue, boolean loadContentMetadata ) | String | Returns the value of the given attribute name for the item passed, or defaultValue if the value is not present. If true is passed for loadContentMetadata, the content and metadata are loaded; this should be used sparingly. For example, if only one attribute is needed from a linked item then true can be passed, but if many attributes are needed then .loadContentAndMetadata() should be called on the item once and false passed to each attribute call. N.B. If null is passed as the defaultValue, the helper's default value is used. |
| getLinkedItemInfos( String linkDefId ) | List<ItemInfo> | Gets the linked itemInfos for the current item (that have not been deleted). |
| getLinkedItemInfos( Item item, String linkDefId ) | List<ItemInfo> | Gets the linked iteminfos for the passed item (that have not been deleted). |
| getLinkedItems( String linkDefId, boolean loadContentMetadata ) | List<Item> | Gets the linked items for the current item. Note that the content and metadata should not be loaded unless they are needed for all the items. |
| getLinkedItems( Item item, String linkDefId, boolean loadContentMetadata ) | List<Item> | Gets the linked items for the passed item (that have not been deleted). Note that the content and metadata should not be loaded unless they are needed for all the items. |
| getItemFromItemInfo(ItemInfo itemInfo, boolean loadContentMetadata ) | Item | Returns the item for the passed ItemInfo. If loadContentMetadata is true, the content and metadata are loaded as well. |
| getDefaultValue() | String | Gets the default value that the helper will use if an attribute is null. |
| setDefaultValue(String defaultValue) | void | Sets the default value that the helper will use if an attribute is null. |
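The linked-item methods can be combined with a FreeMarker list to index values from related items. The following is a sketch under stated assumptions: '42' is a hypothetical link definition id, 'relatedTitles' is a made-up property name, and attribute '18' is reused from the earlier example. Content and metadata are loaded once in getLinkedItems because an attribute is read from every linked item, so false is passed to the per-item attribute calls.

```xml
<#-- Hypothetical link definition id; true loads content and metadata for all linked items -->
<#assign linkedItems = helper.getLinkedItems('42', true)>
<entry>
  <string>relatedTitles</string>
  <SearchableProperty>
    <name>relatedTitles</name>
    <#-- Comma-separated attribute values; '' is used when a linked item has no value -->
    <value><![CDATA[<#list linkedItems as linked>${helper.getAttributeValue(linked, '18', '', false)}<#if linked_has_next>, </#if></#list>]]></value>
    <boost>1.0</boost>
    <stored>true</stored>
    <indexed>true</indexed>
    <tokenized>true</tokenized>
  </SearchableProperty>
</entry>
```

FreeMarker processes directives as plain text, so the list inside the CDATA section is expanded before the XML is produced. If only the ids of linked items were needed, getLinkedItemInfos would avoid loading the items at all.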
Default fields indexed
The fields below are all stored automatically when sending an item to the remote index.
| Variable Name | Stored Value |
|---|---|
| uuid | id of the item |
| active | whether this item is active, either "true" or "false" |
| title | name field of the item |
| description | description field of the item |
| keywords | keyword field of the item |
| collection | id of the collection |
| creationDate | date the item was created (in SOLR format 2007-12-24T23:59:59Z) |
| lastModifiedDate | date the item was modified (in SOLR format 2007-12-24T23:59:59Z) |
| creator | profile id of the item creator |
| lastModifier | profile id of the last modifier of the item |
| category | stores the string values of the taxonomy categories |
| categoryIds | stores the id values of the taxonomy categories |
| leafCategoryIds | the leaf categories. N.B. If an item belongs to category X and to none of the descendants of X, then X is a "leaf category" of that item. |
| locale | locale of the item |
| secondary | "true" if this is a secondary item |
| primary | "false" if this is a secondary item |
| primaryItem | id of the primary item ("none" if this is a primary item) |
| secondaryItem_{locale} | e.g. secondaryItem_ca_ES is the id of the Catalan version of this primary item (or of this secondary item's primary item) |
