This will be a series on Full text Search in MS SQL Server. In this tutorial you wll learn the details of full text search which is one of the fastest to find data using CONTAINS key word rather than LIKE keyword we normally use. I will also discuss about to make ranking on search result depending on the matching criteria and search string.
Lets start the First Post on FULL TEXT SEARCH:
Full-text Search Concepts:
Full-text search allows fast and flexible indexing for keyword-based query of text data. We can gain significant enhancements in the areas of performances, manageability, and functionality deliver for exceptional search capabilities for application of any size. The performance benefit of using FT can be best realizes when we use for unstructured huge text data.
“A LIKE query against millions of rows of text data can take minutes to return, whereas a full-text query can take only one seconds or less against the same data, depending on the numbers of rows to return”: MSDN.
FULL-TEXT SEARCH TERMINOLOGY:
Full-text Index: Stores information about significant words and their location within a given column. Index information are use to search quickly for rows with particular words or their combinations.
Full-text Catalog: A full-text catalog may contain Zero or more full-text indexes. Each catalog may serve indexing needs of one or more tables within a Single DB. And Catalog must reside on a local hard drive (not in removable disk, net drive) associated with instance of SQL Server.
Word Broker: tokenizes text based on the lexical rules of a specific language.
Stemmer: Create inflectional (Synonyms, Antonyms, forms of word etc. Like go>went>>gone).
Filter: Filter a specified file type and also filter extract text from a file stored in varbinary or image column.
Population or Crawl: Is the process of creating and maintaining full-text index.
Noise Words: like “a”, “an”, “the”>> frequently occurring words that do not help the search. These words are ignored when creating index.
Full-text index is a special type of token based functional index built and maintained be MS Full-text for SQL Server (MSFTESQL). The process of creating and maintaining a full-text index is called index POPULATION.
Population Types:• Full population
• Change tracking-based population
• Incremental timestamp-based population
Typically occurs when a full-text index is first populated. Index entries are built for all the rows in all the tables covered by the catalog. If a full population is requested for a table, index entries are built for all the rows in that table.
CREATE FULLTEXT INDEX CHANGE TRACKING AUTO
>> CHANGE TRACKING OFF NO POPULATION if we do not want to populate the index at the time of creation.
CHANGE TRACKING BASED POPULATION
CHANGE TRACKING [AUTO/MANUAL/OFF]
AUTO>> automatically propagate changes to the FT index.
MANUAL>> propagate index on a schedule (e.g. every day, every hour), SQL server agent or programmer by himself by
ALTER FULLTEXT INDEX
Indexed update asynchronously (runs in the Background) when DB activity is low.
INCREMENTAL TIMESTAMP-BASED Population:
Updates the FT index for rows added, deleted or modified after the last population or while last population was in progress
Requirements: Indexed table must have a column of the timestamp data type.
Hope you enjoyed the first series on Full text search. Stay tuned for more.