Getting Started With MySQL's Full-Text Search Capabilities - Performing A Boolean Search (Page 5 of 6 )
MySQL version 4.0.1 and above can perform complex full-text Boolean searches. I am using MySQL version 4.0.1, which is still in the Alpha stage. If you want to perform Boolean searches then click here to download MySQL version 4.0.
So what exactly is a Boolean search? Put simply, it is a powerful way to include or extract words and phrases from your search criteria. If you've ever searched for something like "+download +games", then you've used a Boolean search engine.
By combining various operators within your search string, you can filter in/out other words, change a words contribution to the relevance value and more. Here's the list of Boolean operators, as listed at MySQL.com:
+ A leading plus sign indicates that this word must be present in every row returned.
- A leading minus sign indicates that this word must not be present in any row returned.
By default (when neither plus nor minus is specified) the word is optional, but the rows that contain it will be rated higher. This mimics the behaviour of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.
< > These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The < operator decreases the contribution and the > operator increases it.
( ) Parentheses are used to group words into sub expressions.
~ A leading tilde acts as a negation operator, causing the word's contribution to the row relevance to be negative. It's useful for marking noise words. A row that contains such a word will be rated lower than others, but will not be excluded altogether, as it would be with the - operator.
* An asterisk is the truncation operator. Unlike the other operators, it should be appended to the word, not prepended.
" The phrase, that is enclosed in double quotes ", matches only rows that contain this phrase literally, as it was typed.
A Boolean search is performed in much the same way as a normal full-text search, however it includes the IN BOOLEAN MODE keywords, as shown in the example below:
select * from testTable where match(firstName, lastName, details) against ('+DevArticles -flash' in boolean mode);
In the example query above I'm specifying that every row returned must contain the word "DevArticles" but must NOT contain the word "flash". Still working with our test table, which record do you think will have the highest relevance ranking if we executed this query?:
select firstName, match(firstName, lastName, details) against('+mysql -devArticles >flash' in boolean mode) as relevance from testTable where match(firstName, lastName, details) against('+mysql -devArticles >flash' in boolean mode);
Looking at the bulleted list above, the results of this query will be ranked higher if:
It contains the word "mysql"
It doesn’t contain the word "devArticles"
It contains one or more instances of the word "flash"
Here is the result from the query:
How about this query:
select firstName, match(firstName, lastName, details) against('+devArticles -sitecubed' in boolean mode) as relevance from testTable where match(firstName, lastName, details) against('+devArticles -sitecubed' in boolean mode);
In this query, records that contain the word DevArticles and NOT the word siteCubed will rank. The results are shown below:
All of the other records returned a 0 relevance ranking, and are therefore not shown in the results. The best way to become familiar with Boolean full-text searches is to experiment with the various operators, including where they are positioned in the search string (including the use of brackets) and how often they are used.