How do I train the search engine? (Import synonym, acronym, or analogue search terms as well as spelling variations)

Modified on Mon, 29 Jan 2024 at 10:55 AM

You can import acronyms, synonyms, and other analogue search terms so that attendees find content even when they search for an alternative term.



Examples: 

  • Spelling Variations: A user searches for coeliac and finds all content where the term is spelled celiac
  • Drug Names: A user searches for the generic Ibuprofen and finds all content with the brand names Motrin or Advil
  • Synonyms/Labels: A user searches for the Whipple procedure and finds all content with pancreatoduodenectomy, Whipple, or Kausch-Whipple
  • Abbreviations: A user searches for afib and finds all content with atrial fibrillation


To train your search engine, prepare a file that lists all analogue terms, spelling variations, alternative drug names, and acronyms with their expansions in one column, and a group ID for each term in a second column. Terms that share an ID are linked as a group.


You can build on the same list from year to year.


Tip: You can review your search analytics (more on analytics) to identify terms without results from your last event, or as your event is ongoing.


Test it out!

You can see this in action in our demo planner.

  1. Visit ativ.me/planner
  2. Click Search Everything on the Home tab
  3. Search for Covid
  4. View the results -- note that not all results use the term "covid." Some instead use only the term "coronavirus," "SARS-CoV2," etc.


In the project, these terms are set up as an analogue group. This means the search "knows" these terms are analogues for one another and returns them all as results.


Requirements

Ensure you are meeting the following requirements:

  1. Terms cannot contain commas

  2. A group of terms shares the same ID 

    1. The ID itself can be anything, so long as it is shared by each term in your group. You can choose a simple numerical designation (e.g. 01 for the first group, 02 for the second, and so on) or something that helps you remember the group (such as using one of the terms in the group as the ID for all of the terms).

  3. A term must be at least 2 characters long

  4. Terms must be unique (duplicate terms will cause import to fail)

    • This means one term cannot be assigned to multiple groups

  5. Stopwords are displayed as orphaned but do not prevent publishing. Stopwords are common words, such as prepositions, that the search ignores; the stopword list can be adjusted if necessary.

    • Orphans will appear on the side collapsible bar.

    • If an analogue term is also a stopword, the analogue term will not be searchable in the app until removed from the stopword list.

  6. Symbols that can be included:
    • .  period
    • ( )  parentheses
    • -  dashes
    • _  underscores
    • /   forward slash
    • α  alpha
    • β  beta
    • :   colon
    • ;   semi-colon
    • +  plus
    • *   asterisk
    • ' apostrophe
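
Before importing, it can help to check a term list against the requirements above. The following is a minimal sketch, not the importer's actual validation: the function name `find_term_problems` is hypothetical, duplicates are compared by exact match, and it assumes that letters, digits, and spaces are valid in addition to the listed symbols.

```python
# Symbols listed in the requirements above.
ALLOWED_SYMBOLS = set(".()-_/αβ:;+*'")


def find_term_problems(rows):
    """Return a list of readable problems for (analogue_id, term) rows."""
    problems = []
    seen = {}  # term -> ID of the group where it first appeared
    for analogue_id, term in rows:
        if "," in term:
            problems.append(f"{term!r}: terms cannot contain commas")
        if len(term) < 2:
            problems.append(f"{term!r}: terms must be at least 2 characters long")
        # Assumption: letters, digits, and spaces are valid as well.
        # Commas are excluded here because they are reported above.
        bad = sorted(
            {ch for ch in term
             if not (ch.isalnum() or ch == " " or ch in ALLOWED_SYMBOLS)} - {","}
        )
        if bad:
            problems.append(f"{term!r}: unsupported characters {bad}")
        if term in seen:
            problems.append(
                f"{term!r}: duplicate term (already in group {seen[term]!r})"
            )
        else:
            seen[term] = analogue_id
    return problems


if __name__ == "__main__":
    rows = [("1", "ibuprofen"), ("1", "motrin"), ("2", "motrin"), ("3", "x,y")]
    for problem in find_term_problems(rows):
        print(problem)
```

An empty result means the list passed these particular checks; the CMS import remains the authority on what is accepted.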


In-App searches

  1. All analogue terms are displayed as part of the autocomplete suggestions in the search field
  2. Searching for an analogue term will render all results assigned to the same analogue ID (you make up the ID - it's just there for reference)
  3. Searching for a term that does not exist in the data, but is a term in your Analogue CSV file, will display search results that match the analogues (for example, if a user searches for coeliac but this spelling does not exist in your content, only in your analogue search file, then users will still find all results for celiac, even though they searched for a term that does not exist in your data).
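
The behavior in point 3 can be pictured with a small toy model. This is only an illustration of the idea of analogue groups, not how EventPilot's search is implemented; the function names, the sample records, and the simple substring matching are all assumptions.

```python
from collections import defaultdict


def build_analogue_index(rows):
    """Map every term to the full set of terms that share its analogue ID."""
    groups = defaultdict(set)
    for analogue_id, term in rows:
        groups[analogue_id].add(term.lower())
    index = {}
    for terms in groups.values():
        for term in terms:
            index[term] = terms
    return index


def search(query, records, index):
    """Return records that contain the query or any of its analogues."""
    query = query.lower()
    # A query without analogues is searched for on its own.
    terms = index.get(query, {query})
    return [r for r in records if any(t in r.lower() for t in terms)]


if __name__ == "__main__":
    rows = [("1", "ibuprofen"), ("1", "motrin"), ("1", "advil"),
            ("R1", "coeliac"), ("R1", "celiac")]
    records = ["Managing celiac disease in adults", "Ibuprofen dosing update"]
    index = build_analogue_index(rows)
    # "coeliac" appears in no record, but its analogue "celiac" does.
    print(search("coeliac", records, index))
```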

Example Spreadsheet and CSV


Analogue ID    Analogue Terms
1              ibuprofen
1              motrin
1              advil
R1             coeliac
R1             celiac




Analogue Limits


  • Limit each term to a maximum of 9 analogues (10 terms in total per analogue ID)
  • Each analogue term must be under 100 characters
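
These two limits are easy to check before an import. The sketch below is an illustration only; the function name `find_limit_problems` and the constant names are hypothetical, and the CMS may report violations differently.

```python
from collections import Counter

MAX_TERMS_PER_ID = 10   # one term plus a maximum of 9 analogues
MAX_TERM_LENGTH = 99    # each term must be under 100 characters


def find_limit_problems(rows):
    """Return readable problems for (analogue_id, term) rows that exceed a limit."""
    problems = []
    counts = Counter(analogue_id for analogue_id, _ in rows)
    for analogue_id, count in counts.items():
        if count > MAX_TERMS_PER_ID:
            problems.append(
                f"ID {analogue_id!r} has {count} terms (maximum {MAX_TERMS_PER_ID})"
            )
    for _, term in rows:
        if len(term) > MAX_TERM_LENGTH:
            problems.append(
                f"{term[:20]!r}... is {len(term)} characters long (must be under 100)"
            )
    return problems


if __name__ == "__main__":
    rows = [("pain1", f"term{i}") for i in range(11)]
    print(find_limit_problems(rows))
```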



The same content in a comma-separated .csv file:


Analogue ID, Analogue Terms

1,ibuprofen

1,motrin

1,advil

R1,coeliac

R1,celiac
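
If you maintain your groups in a script rather than a spreadsheet, the same CSV content can be generated with Python's standard `csv` module. This is a minimal sketch; the function name `groups_to_csv` and the dictionary layout are assumptions made for the example. Note that the generated header has no space after the comma.

```python
import csv
import io


def groups_to_csv(groups):
    """Turn {analogue_id: [terms]} into CSV text with one term per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Analogue ID", "Analogue Terms"])
    for analogue_id, terms in groups.items():
        for term in terms:
            # Every term in a group repeats the group's ID.
            writer.writerow([analogue_id, term])
    return buffer.getvalue()


if __name__ == "__main__":
    groups = {
        "1": ["ibuprofen", "motrin", "advil"],
        "R1": ["coeliac", "celiac"],
    }
    print(groups_to_csv(groups), end="")
```

Saving the returned text to a file with a .csv extension gives you a file in the two-column layout shown above.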


Analogue terms are not case sensitive. Users entering upper case COELIAC in the search will still find results with the lowercase term celiac.


Example CSV content:

Analogue ID, Analogue Terms

pain1, Ibuprofen

pain1, Advil

pain1, Motrin

pain1, Medipren

pain1, Nupren

pain1, painkiller

pain1, pain killer 


"pain killer" is an analogue term that consists of more than one word. It is considered a phrase.


Note on Multi-Term Phrase Searches

Multi-term phrase search is one-directional. Let's use our different pain killer medications above as an example. The user enters the letters pai in the search field. After entering three letters, the search starts displaying possible suggestions, including matching analogues such as "painkiller" and also the analogue phrase "pain killer." Both of these analogues start with the three letters the user entered: pai.


If the user selects "pain killer" from the search suggestions, the user picked the analogue from your search training. Now all records that have a term matching any analogue for ID pain1 display. This means that for example all sessions with the terms Motrin and Advil would also appear, even if they don't have the words "pain killer" in the description or title. 


If the user does not select the suggestion, the search will behave like a normal search. The user didn't pick an analogue but entered two separate words: pain and killer. Without a selected search suggestion (the analogue phrase), the search treats the input as two separate words and looks for anything that has both the word pain and the word killer in the data. Sessions with Advil and Motrin may not appear unless they also have the words pain and killer in the description or title.
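
The difference between the two paths can be sketched as a toy model. It is an illustration of the behavior described above, not the app's actual search code; the function names, the sample records, and the substring matching are assumptions.

```python
PAIN1 = ["ibuprofen", "advil", "motrin", "medipren", "nupren",
         "painkiller", "pain killer"]


def suggest(typed, terms):
    """Suggest analogue terms once at least three letters are typed."""
    typed = typed.lower()
    if len(typed) < 3:
        return []
    return [t for t in terms if t.startswith(typed)]


def search_selected_suggestion(phrase, terms, records):
    """The user picked an analogue: match any term of its group."""
    if phrase.lower() not in terms:
        return []
    return [r for r in records if any(t in r.lower() for t in terms)]


def search_typed_words(text, records):
    """No suggestion picked: every typed word must appear in the record."""
    words = text.lower().split()
    return [r for r in records if all(w in r.lower() for w in words)]


if __name__ == "__main__":
    records = [
        "Advil dosing in children",
        "Motrin versus placebo",
        "Chronic pain: a silent killer of productivity",
    ]
    print(suggest("pai", PAIN1))
    print(search_selected_suggestion("pain killer", PAIN1, records))
    print(search_typed_words("pain killer", records))
```

In this model, selecting the suggestion returns the Advil and Motrin records, while typing the two words without selecting returns only the record that contains both "pain" and "killer".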


Another example: Imagine your analogue file contains the following abbreviation and expansion:

A1, SAGES

A1, Society of American Gastrointestinal and Endoscopic Surgeons

The acronym expansion is a multi-word term or search phrase. 

  • If a user enters such a multi-word term and starts typing "Soc", the full search phrase appears in the search suggestion. When selected, the records with the matching single analogue term are displayed. That means, searching for "Society of American Gastrointestinal and Endoscopic Surgeons" by selecting this phrase from the suggested search list will also find all records with the term "SAGES".
  • If a user does not select the phrase from the suggested list, the search behaves like a standard multi-term search and returns all records that contain the prefix "Soc", such as "Society", "Social", etc.
  • If a user enters the search term SAGES, all records with the single term will be returned. Analogue phrases are not included in those results (e.g. records with "Society of American Gastrointestinal and Endoscopic Surgeons").




Pro users: Please check with your Project Manager to confirm whether we have a Google Sheet prepared for you from a previous event or to get help setting one up for the first time.


How to generate a CSV link from a Google Sheet

The easiest way to manage data like this is by creating it in a Google Sheet and then importing the CSV from the Sheet. This article explains how to create a CSV link for a Google Sheet.


How to link to a CSV URL

  1. In the CMS Menu > Settings > Search Training > click the Plus button in the Imports section
  2. In the Source field, select CSV URL
  3. In the Name field, enter where this data is coming from (e.g. a Google Sheet, your coworker, etc.)
  4. In the URL field, paste the URL you obtained by following the steps in the article linked above
  5. Click the Next button to switch to the Field Map tab
  6. In the Field Map tab, match your column with the appropriate column in EventPilot
  7. Click the Import button
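
Before pasting a CSV URL into the CMS, you may want to confirm that the link returns the two columns you expect. The sketch below uses only the Python standard library. The function names are hypothetical, and the header check assumes the column names used in this article; because the CMS lets you map columns in the Field Map tab, your own headers may differ.

```python
import csv
import io
import urllib.request

EXPECTED_HEADERS = ["Analogue ID", "Analogue Terms"]  # as used in this article


def read_analogue_csv(text):
    """Parse CSV text into (analogue_id, term) pairs, skipping blank lines."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = [row for row in reader if row]
    if not rows:
        raise ValueError("the CSV is empty")
    header, data = rows[0], rows[1:]
    if [h.strip() for h in header] != EXPECTED_HEADERS:
        raise ValueError(f"unexpected headers: {header}")
    pairs = []
    for row in data:
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, found {len(row)}: {row}")
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


def fetch_analogue_csv(url):
    """Download the CSV behind a link and parse it."""
    with urllib.request.urlopen(url) as response:
        # utf-8-sig drops a byte order mark if the file starts with one.
        return read_analogue_csv(response.read().decode("utf-8-sig"))


if __name__ == "__main__":
    sample = "Analogue ID, Analogue Terms\n\npain1, Ibuprofen\n\npain1, pain killer \n"
    print(read_analogue_csv(sample))
```

Calling `fetch_analogue_csv` with your own CSV link runs the same check against the live data.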

How to update your data

  1. If you are using a Google Sheet, simply make updates in the Google Sheet.
  2. When you are done, open the CMS > Settings > Search Training > and click the IMPORT button next to your linked data source. 
    1. Note: if you are using a Google Sheet, there are sometimes delays between you updating the visible sheet and Google adding your updates to its associated CSV file. If you do not see your changes being imported, wait 5-10 minutes and try importing again. 
    2. Choosing to Import All Sources from the Home tab will also bring in your updated analogues. 


How to create a CSV File

  1. You can use Excel, Google Sheets, or a basic text editor such as notes to create your .csv file. The content is broken up into two columns with the following headers:
    1. Column 1: Analogue ID
    2. Column 2: Analogue Terms
  2. A term and all its analogues must share an ID. You choose the ID, and it can consist of numbers and letters. Enter one term per line and reuse the same ID for each analogue or synonym in the group.

  3. If you are using a spreadsheet editor, export the file as .csv

  4. Log into the CMS and access the Settings Menu 

  5. Select Search Training 

  6. Add a new data source using your .csv file

    • Errors should warn you if terms do not meet the requirements listed in the Requirements section

  7. Publish the data once all errors and/or orphans have been addressed


How to upload a CSV file

  1. In the CMS Menu > Settings > Search Training > click the Plus button in the Imports section
  2. In the Source field, select CSV Upload
  3. In the Name field, enter where this file is coming from (e.g. an Excel Sheet, your coworker, etc.)
  4. Click the Select File button to upload your file.
  5. Click the Next button to switch to the Field Map tab
  6. In the Field Map tab, match your column with the appropriate column in EventPilot
  7. Click the Import button

How to update your CSV file

  1. Make changes in the program you are using and follow the steps above to save a new .csv file
  2. In the CMS Menu > Settings > Search Training > click the Pencil icon next to your existing import source
  3. In the File tab, click the Select File button to upload a new version of your CSV
  4. Click the Next button and click Import