Using the MATCH() ... AGAINST() function in MySQL queries

MATCH (col1, col2, ...) AGAINST (expr [IN BOOLEAN MODE | WITH QUERY EXPANSION])

MySQL supports full-text indexing and searching. A full-text index in MySQL is an index of type FULLTEXT. In the MySQL versions this article describes (before 5.6), FULLTEXT indexes can be used only with MyISAM tables. They can be created on CHAR, VARCHAR, or TEXT columns as part of a CREATE TABLE statement, or added later with ALTER TABLE or CREATE INDEX. For large data sets, it is much faster to load the data into a table that has no FULLTEXT index and create the index afterwards than to load the data into a table that already has a FULLTEXT index.
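
The load-first, index-later approach described above could look roughly like the following sketch. The table name articles_bulk and the index name ft_title_body are made up for illustration, and the two inserted rows merely stand in for a real bulk load.

```sql
-- Hypothetical staging table, created WITHOUT a FULLTEXT index
-- so that the bulk load does not have to maintain one.
CREATE TABLE articles_bulk (
  id    INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200),
  body  TEXT
) ENGINE=MyISAM;

-- Stand-in for the real bulk load (many rows, or LOAD DATA INFILE).
INSERT INTO articles_bulk (title, body) VALUES
  ('First article',  'Some body text ...'),
  ('Second article', 'Some more body text ...');

-- Build the index once, after the data is in place.
ALTER TABLE articles_bulk ADD FULLTEXT INDEX ft_title_body (title, body);

-- Alternative form using CREATE INDEX instead of ALTER TABLE:
-- CREATE FULLTEXT INDEX ft_title_body ON articles_bulk (title, body);
```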

Full-text searching is performed with the MATCH() function.

mysql> CREATE TABLE articles (
-> id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
-> title VARCHAR(200),
-> body TEXT,
-> FULLTEXT (title, body)
-> );
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO articles (title, body) VALUES
-> ('MySQL Tutorial', 'DBMS stands for DataBase ...'),
-> ('How To Use MySQL Well', 'After you went through a ...'),
-> ('Optimizing MySQL', 'In this tutorial we will show ...'),
-> ('1001 MySQL Tricks', '1. Never run mysqld as root. 2. ...'),
-> ('MySQL vs. YourSQL', 'In the following database comparison ...'),
-> ('MySQL Security', 'When configured properly, MySQL ...');
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM articles
-> WHERE MATCH (title, body) AGAINST ('database');
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)

The MATCH() function performs a natural language search for a string against a text collection. A collection is a set of one or more columns included in a FULLTEXT index. The search string is given as the argument to AGAINST(). For each row in the table, MATCH() returns a relevance value, that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.

By default, the search is performed in a case-insensitive fashion. However, you can perform a case-sensitive full-text search by using a binary collation for the indexed columns. For example, a column that uses the latin1 character set can be assigned the latin1_bin collation to make full-text searches on it case sensitive.
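
As a sketch of the binary-collation approach, the following defines a separate table whose indexed columns use latin1_bin. The table name articles_cs is hypothetical. Both columns are given the same character set and collation, because all columns in one FULLTEXT index must agree on them.

```sql
-- Hypothetical case-sensitive variant of the articles table.
CREATE TABLE articles_cs (
  id    INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200) CHARACTER SET latin1 COLLATE latin1_bin,
  body  TEXT         CHARACTER SET latin1 COLLATE latin1_bin,
  FULLTEXT (title, body)
) ENGINE=MyISAM;

-- With a binary collation, 'MySQL' and 'mysql' should be treated as
-- different words by the full-text search.
SELECT * FROM articles_cs
WHERE MATCH (title, body) AGAINST ('MySQL');
```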

When MATCH() is used in a WHERE clause, as in the example above, the rows returned are automatically sorted with the highest relevance first. Relevance values are non-negative floating-point numbers. Zero relevance means no similarity. Relevance is computed from the number of words in the row, the number of unique words in that row, the total number of words in the collection, and the number of documents (rows) that contain a particular word.

For natural language full-text searches, the columns named in the MATCH() function must be the same columns included in some FULLTEXT index on your table. Note that in the query above, the columns named in MATCH() (title and body) are the same as those named in the FULLTEXT index of the articles table. To search title or body separately, you need to create a separate FULLTEXT index for each column.
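
A minimal sketch of the per-column setup, assuming the articles table from the example above. The index names ft_title and ft_body are arbitrary.

```sql
-- One FULLTEXT index per column, in addition to the combined one.
ALTER TABLE articles
  ADD FULLTEXT INDEX ft_title (title),
  ADD FULLTEXT INDEX ft_body (body);

-- Each MATCH() column list now corresponds to exactly one index.
SELECT id, title FROM articles WHERE MATCH (title) AGAINST ('Tutorial');
SELECT id, title FROM articles WHERE MATCH (body)  AGAINST ('tutorial');
```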

The example above is a basic illustration of how to use MATCH() so that rows are returned in order of decreasing relevance. The next example shows how to retrieve the relevance values explicitly. The order of the returned rows is undefined, because the SELECT statement contains neither a WHERE nor an ORDER BY clause:

mysql> SELECT id, MATCH (title, body) AGAINST ('Tutorial')
-> FROM articles;
+----+------------------------------------------+
| id | MATCH (title, body) AGAINST ('Tutorial') |
+----+------------------------------------------+
|  1 |                         0.65545833110809 |
|  2 |                                        0 |
|  3 |                         0.66266459226608 |
|  4 |                                        0 |
|  5 |                                        0 |
|  6 |                                        0 |
+----+------------------------------------------+
6 rows in set (0.00 sec)

The following example is more complex. The query returns the relevance values and also sorts the rows in order of decreasing relevance. To achieve this result, specify MATCH() twice: once in the SELECT list and once in the WHERE clause. This causes no additional overhead, because the MySQL optimizer notices that the two MATCH() calls are identical and invokes the full-text search code only once.

mysql> SELECT id, body, MATCH (title, body) AGAINST
-> ('Security implications of running MySQL as root') AS score
-> FROM articles WHERE MATCH (title, body) AGAINST
-> ('Security implications of running MySQL as root');
+----+-------------------------------------+-----------------+
| id | body                                | score           |
+----+-------------------------------------+-----------------+
|  4 | 1. Never run mysqld as root. 2. ... | 1.5219271183014 |
|  6 | When configured properly, MySQL ... | 1.3114095926285 |
+----+-------------------------------------+-----------------+
2 rows in set (0.00 sec)
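
The natural language search above already returns rows by decreasing relevance. When a query combines the search with other conditions, or uses boolean mode (which does not sort by relevance on its own), the ordering can be made explicit. The following is a sketch; the LIMIT value is an arbitrary choice.

```sql
SELECT id, body,
       MATCH (title, body)
         AGAINST ('Security implications of running MySQL as root') AS score
FROM articles
WHERE MATCH (title, body)
        AGAINST ('Security implications of running MySQL as root')
ORDER BY score DESC   -- state the relevance ordering explicitly
LIMIT 10;             -- arbitrary cap on the number of hits
```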

The MySQL FULLTEXT implementation regards any sequence of true word characters (letters, digits, and underscores) as a word. That sequence may also contain apostrophes ('), but not more than one in a row. This means that aaa'bbb is regarded as one word, while aaa''bbb is regarded as two words. Apostrophes at the beginning or end of a word are stripped by the FULLTEXT parser; 'aaa'bbb' is parsed as aaa'bbb.

The FULLTEXT parser determines where words start and end by looking for certain delimiter characters, such as " " (space), "," (comma), and "." (period). If words are not separated by delimiters (as in Chinese, for example), the FULLTEXT parser cannot determine where a word begins or ends. To be able to add words or other indexed terms in such languages to a FULLTEXT index, you must preprocess them so that they are separated by some arbitrary delimiter, such as the double quotation mark (").

Some words are ignored in full-text searches:

Any word that is too short is ignored. The default minimum length of words that full-text searches can find is four characters.
Words in the stopword list are ignored. A stopword is a word such as "the" or "some" that is so common that it is considered to have no semantic value. There is a built-in stopword list, but it can be overridden by a user-defined list.
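
For MyISAM full-text indexes, these two limits are controlled by the server variables ft_min_word_len and ft_stopword_file, which the text above does not name. The sketch below assumes the articles table from the earlier examples and only inspects the current settings. Changing them requires editing the server option file and restarting the server, after which existing FULLTEXT indexes have to be rebuilt.

```sql
-- Inspect the current minimum word length and stopword file.
SHOW VARIABLES LIKE 'ft_min_word_len';
SHOW VARIABLES LIKE 'ft_stopword_file';

-- After changing either setting and restarting the server,
-- rebuild the FULLTEXT indexes of each affected MyISAM table.
REPAIR TABLE articles QUICK;
```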

Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Consequently, a word that is present in many documents has a lower weight (and may even have a zero weight), because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row.

This technique works best with large collections (in fact, it was carefully tuned that way). For very small tables, word distribution does not adequately reflect semantic value, and the model may sometimes produce bizarre results. For example, although the word "MySQL" is present in every row of the articles table, a search for the word produces no results:

mysql> SELECT * FROM articles
-> WHERE MATCH (title, body) AGAINST ('MySQL');
Empty set (0.00 sec)

The search result is empty because the word "MySQL" is present in at least 50% of the rows. As such, it is effectively treated as a stopword. For large data sets, this is the most desirable behavior: a natural language query should not return every second row from a 1GB table. For small data sets, it may be less desirable.

A word that matches half of the rows in a table is less likely to locate relevant documents. In fact, it most likely finds plenty of irrelevant ones. We all know this happens far too often when we try to find something on the Internet with a search engine. It is with this reasoning that rows containing such a word are assigned a low semantic value for the particular data set in which they occur. A given word may reach the 50% threshold in one data set but not in another.

The 50% threshold has a significant implication when you first try full-text searching to see how it works: if you create a table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results. Be sure to insert at least three rows, and preferably many more. Users who need to bypass the 50% limitation can use the boolean search mode.
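
A short sketch of the boolean-mode workaround, run against the articles table from the earlier examples. Boolean mode does not apply the 50% threshold, so a word such as "MySQL" that appears in every row can still match. The second query additionally uses the + and - operators to require one word and exclude another.

```sql
-- Matches rows containing "MySQL" even though the word is in every row.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('MySQL' IN BOOLEAN MODE);

-- Require "MySQL" and exclude rows that contain "YourSQL".
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
```

Boolean-mode results are not sorted by relevance automatically, so add an ORDER BY clause if the order matters.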

Category: Database  Date: 2011-09-13
Copyright (C) codeweblog.com, All Rights Reserved.
