Getting Started With MySQL's Full-Text Search Capabilities


Need a rock-solid, powerful search solution for your PHP/MySQL driven web site? In this article Mitchell introduces us to MySQL's full-text and Boolean search capabilities.

Author Info:
By: Mitchell Harper
August 26, 2002

Full-Text Rules And The MATCH Command

Remember earlier when I listed a number of bullet points, one of which stated that MySQL removes noise words and words of 3 characters or fewer? Let's test that theory with 2 basic full-text search queries:

select firstName, match(firstName, lastName, details) against('devArticles is on the www') as relevance from testTable;

This query returns the following records:

[Image: the results of our query]

Notice how our last query had only one word longer than 3 characters, "devArticles". If we remove all words of 3 characters or fewer from the search string, the relevance ranking will remain the same:

select firstName, match(firstName, lastName, details) against('devArticles') as relevance from testTable;

Here is the list of records that matches the search:

[Image: the results of our query]

As we can clearly see, the relevance ranking remains the same - MySQL does indeed remove noise words and words of 3 characters or fewer.
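
If you want to check the cut-off on your own server, you can look at the server variable that controls it. This is a minimal sketch, assuming a MyISAM full-text index, where the variable is ft_min_word_len and the default value is 4:

```sql
-- Show the minimum length a word must have to be included in a
-- MyISAM full-text index. With the default value of 4, words of
-- 3 characters or fewer are ignored.
SHOW VARIABLES LIKE 'ft_min_word_len';
```

Changing this value is a server configuration change: the server has to be restarted and existing full-text indexes rebuilt before the new length takes effect.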

MySQL's full-text search weights words by how often they occur across the indexed records: common words rank lower than uncommon words. This makes sense, as a word that exists in many records is less relevant than a word that only appears in 1 or 2 records. Frequency-based word weighting of this kind is used in most popular full-text searching algorithms. Popular search engines and directories also employ this method.
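
Because the relevance value reflects these word weights, sorting on it puts the best matches first. The following is a sketch that reuses testTable and the column list from the queries above; the search string is just an example, and the column list is assumed to match the table's full-text index exactly:

```sql
-- Return only the rows that match the search (relevance greater than 0),
-- with the most relevant row first.
SELECT firstName,
       MATCH(firstName, lastName, details) AGAINST('devArticles') AS relevance
FROM testTable
WHERE MATCH(firstName, lastName, details) AGAINST('devArticles')
ORDER BY relevance DESC;
```

Repeating the MATCH ... AGAINST expression in the WHERE clause filters out the non-matching rows, which would otherwise appear with a relevance of 0.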

The 50% Threshold
MySQL removes noise words and short words, and it also applies what it calls the "50% threshold": if a word is present in 50% or more of the records being searched, MySQL treats that word as a noise word and ignores it, so a search on that word alone returns no records. In a way this makes sense, as a word found in half of the records does little to separate relevant records from irrelevant ones.

Here's a comment from a user on the MySQL site:

"... you should add at least 3 rows to the table before you try to match anything, and what you're searching for should only be contained in one of the three rows. This is because of the 50% threshold. If you insert only one row, then no matter what you search for, it is in 50% or more of the rows in the table, and therefore disregarded."
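
The threshold is easy to see with a three-row table like the one the comment describes. This is only a sketch: thresholdDemo and its rows are hypothetical, and the table is created as MyISAM on the assumption that this is the engine whose full-text indexes apply the 50% threshold:

```sql
-- Hypothetical demo table, not part of the article's testTable.
CREATE TABLE thresholdDemo (
    id INT AUTO_INCREMENT PRIMARY KEY,
    details TEXT,
    FULLTEXT (details)
) ENGINE=MyISAM;

INSERT INTO thresholdDemo (details) VALUES
    ('MySQL tutorial for beginners'),
    ('Optimizing MySQL queries'),
    ('MySQL backup strategies');

-- 'MySQL' appears in 3 of 3 rows (100%), so it should be ignored
-- and this query should return no rows.
SELECT id, details FROM thresholdDemo
WHERE MATCH(details) AGAINST('MySQL');

-- 'tutorial' appears in 1 of 3 rows (about 33%), which is under the
-- threshold, so this query should return the first row.
SELECT id, details FROM thresholdDemo
WHERE MATCH(details) AGAINST('tutorial');
```

If a search on your own test data unexpectedly returns nothing, counting how many rows contain the search word is a good first check.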