Lately I have been busy with SEO work for a comparison shopping search site. After restructuring the site, placing keywords, and building external links, Baidu has been indexing the site steadily. A webmaster tool query shows that Baidu recently indexed 1010 pages of the site, with 72400 pages indexed in total. (2011-09-20)

When building a shopping search application, you often run into requirements like the "Did you mean:" prompt on Taobao's search results page, or the "hot searches" list shown under the search box on etao's results page. In other words: given the current search term, find similar query keywords.

Taking SEO and the existing site structure into account, I show the popular keywords on a separate page, with a URL design such as:

The resulting list of similar keywords looks like this:

Now for the implementation details. First you need a keyword database. You can build it from statistics on the keywords users search for, or you can collect popular keywords from shopping sites. I wrote a simple program that collected more than 100,000 popular keywords from Taobao and Taobao Mall to serve as the keyword library.
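The first option, statistics on user searches, can be sketched in a few lines. This is only a minimal illustration, assuming the search log is already available as a list of raw query strings; the class name KeywordStats and both method names are made up for this example and are not the program mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class KeywordStats {

    /** Counts how often each query occurs, after trimming and lowercasing it. */
    public static Map<String, Integer> countQueries(List<String> queries) {
        Map<String, Integer> counts = new HashMap<>();
        for (String q : queries) {
            if (q == null) continue;
            String key = q.trim().toLowerCase(Locale.ROOT);
            if (key.isEmpty()) continue; // ignore empty searches
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    /** Returns up to n keywords, most frequent first; ties are ordered alphabetically. */
    public static List<String> topKeywords(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Hypothetical log entries, for illustration only.
        List<String> log = Arrays.asList(
                "iPhone 4", "iphone 4 ", "nokia n8", "", "ipad 2", "nokia n8", "iphone 4");
        System.out.println(topKeywords(countQueries(log), 2)); // prints [iphone 4, nokia n8]
    }
}
```

The most frequent entries would then be written into the keyword library. A real log would probably also need filtering of spam and one-off queries.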

Any discussion of similar queries has to mention TF-IDF. TF-IDF (term frequency-inverse document frequency) is a weighting technique commonly used in information retrieval and text mining. The main idea: if a word or phrase appears frequently in one document (high TF) but rarely in other documents, it distinguishes well between categories and is suitable for classification. TF-IDF is simply TF * IDF, where TF is the term frequency and IDF is the inverse document frequency.
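To make the TF * IDF idea concrete, here is a small sketch using the textbook definitions: TF as the share of a document's tokens that equal the term, and IDF as ln(N / df). These exact formulas are an assumption for illustration; search libraries typically use smoothed variants. The class name TfIdfSketch and the sample documents are invented.

```java
import java.util.Arrays;
import java.util.List;

public class TfIdfSketch {

    /** TF: occurrences of the term in one document divided by the document length. */
    public static double tf(String term, List<String> doc) {
        if (doc.isEmpty()) return 0.0;
        int hits = 0;
        for (String t : doc) {
            if (t.equals(term)) hits++;
        }
        return (double) hits / doc.size();
    }

    /** IDF: ln(N / df), where df is the number of documents containing the term. */
    public static double idf(String term, List<List<String>> docs) {
        int df = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) df++;
        }
        // A term that appears nowhere gets weight 0 instead of dividing by zero.
        return df == 0 ? 0.0 : Math.log((double) docs.size() / df);
    }

    /** TF-IDF weight of a term in one document of the collection. */
    public static double tfIdf(String term, List<String> doc, List<List<String>> docs) {
        return tf(term, doc) * idf(term, docs);
    }

    public static void main(String[] args) {
        // Three tiny, already tokenized "documents" (hypothetical data).
        List<String> d1 = Arrays.asList("red", "summer", "dress");
        List<String> d2 = Arrays.asList("red", "shoes");
        List<String> d3 = Arrays.asList("red", "long", "dress");
        List<List<String>> docs = Arrays.asList(d1, d2, d3);
        for (String term : d1) {
            System.out.printf("%s -> %.4f%n", term, tfIdf(term, d1, docs));
        }
    }
}
```

In this sample, "red" appears in every document, so its IDF is ln(1) = 0 and it carries no weight, while "summer" appears only in the first document and gets the highest weight there.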

Lucene also provides an interface for this kind of similarity comparison: MoreLikeThis. Enough talk, straight to the code.

Popular keyword recommendation code:

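    // Imports assume Lucene 3.x, where MoreLikeThisQuery lives in the contrib "queries" module.
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.search.similar.MoreLikeThisQuery;

    public List<Hotkey> searchMoreLikeThis(String wd, int top) throws Exception {
        // Build a "more like this" query from the search term against the "wd" field.
        MoreLikeThisQuery query = new MoreLikeThisQuery(wd, new String[] { "wd" }, analyzerUtil.ikAnalyzer);
        TopDocs topDocs = getSearcher().search(query, top);
        ScoreDoc[] scoreDocs = topDocs.scoreDocs;
        // Never read past the hits that were actually returned.
        top = Math.min(top, scoreDocs.length);
        List<Hotkey> list = new ArrayList<Hotkey>();
        for (int i = 0; i < top; i++) {
            Document doc = getSearcher().doc(scoreDocs[i].doc);
            Hotkey hotkey = doc2Object(doc);
            // Number of Taobao items that match this keyword.
            int freq = taobaoItemSearcher.docFreq(hotkey.getWd());
            // ... omitted here for now ^_^
            list.add(doc2Object(doc));
        }
        return list;
    }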
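For readers without a Lucene index at hand, the general idea of ranking library keywords by the terms they share with the query can be sketched in plain Java. This is not Lucene's MoreLikeThis implementation; it is a simplified stand-in that splits keywords on whitespace (instead of using the IK analyzer) and scores each candidate by the summed, smoothed IDF of the shared terms. All names and sample keywords are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class SimilarKeywordSketch {

    /** Splits a keyword into lowercase whitespace-separated terms (stand-in for a real analyzer). */
    static Set<String> terms(String keyword) {
        Set<String> out = new LinkedHashSet<>();
        for (String t : keyword.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    /** Ranks library keywords by the summed IDF-style weight of the terms they share with the query. */
    public static List<String> similar(String query, List<String> library, int top) {
        // Document frequency of every term across the keyword library.
        Map<String, Integer> df = new HashMap<>();
        for (String kw : library) {
            for (String t : terms(kw)) df.merge(t, 1, Integer::sum);
        }
        Set<String> queryTerms = terms(query);
        Map<String, Double> scores = new HashMap<>();
        for (String kw : library) {
            if (kw.trim().equalsIgnoreCase(query.trim())) continue; // skip the query itself
            double score = 0.0;
            for (String t : terms(kw)) {
                if (queryTerms.contains(t)) {
                    // Rare shared terms count more than common ones.
                    score += Math.log(1.0 + (double) library.size() / df.get(t));
                }
            }
            if (score > 0) scores.put(kw, score);
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(top, ranked.size())));
    }

    public static void main(String[] args) {
        List<String> library = Arrays.asList(
                "nokia n8", "nokia e72", "nokia n8 case", "iphone 4 case", "iphone 4");
        System.out.println(similar("nokia n8", library, 5)); // prints [nokia n8 case, nokia e72]
    }
}
```

Here "nokia n8 case" ranks first because it shares both query terms, and "n8" is rarer in the library than "nokia".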

If you are interested, take a look at the site and send me your suggestions. Anyone interested in technology, website operations, SEO, and related topics is welcome to exchange ideas.


You can also add me on QQ: 909546261, and we can work through questions together.

