mnoGoSearch 3.2.33 reference manual: Full-featured search engine software
Chapter 8. Searching documents
mnoGoSearch sorts results first by relevancy and second by popularity rank.
The relevancy of each found document is calculated as 100% multiplied by the cosine of the angle between the weight vector of the query and the weight vector of the document. The number of vector coordinates equals the number of word forms in the search query multiplied by the number of sections defined in indexer.conf. Each coordinate corresponds to a query word found in one of the document's sections. The value of a coordinate depends on the weight of that section, set by the wf parameter (see the Section called Changing different document parts weights at search time), and on whether the word matches the query word exactly or is a word form or synonym of it. One additional coordinate equals the average distance between the searched words in the document; in the query vector this coordinate is 0.
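The cosine calculation can be illustrated with a minimal Python sketch. It is not mnoGoSearch's own code: the function name `relevancy`, the vector layout, and all weight values are assumptions chosen only to show how a larger average-distance coordinate lowers the score.

```python
import math

def relevancy(query_vec, doc_vec):
    """Return 100 * cosine of the angle between two weight vectors.

    Returns 0.0 when either vector is all zeros (an assumption made
    here to avoid division by zero).
    """
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    norm_q = math.sqrt(sum(q * q for q in query_vec))
    norm_d = math.sqrt(sum(d * d for d in doc_vec))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return 100.0 * dot / (norm_q * norm_d)

# Hypothetical layout: 2 query word forms x 2 sections = 4 coordinates,
# plus one trailing coordinate for the average distance between words.
query = [1.0, 1.0, 1.0, 1.0, 0.0]      # distance coordinate is 0 for the query
doc_close = [1.0, 0.0, 1.0, 1.0, 0.5]  # words found close together
doc_far = [1.0, 0.0, 1.0, 1.0, 3.0]    # same matches, words far apart

print(relevancy(query, doc_close))
print(relevancy(query, doc_far))
```

In this sketch both documents match the same words in the same sections, so the only difference in their scores comes from the distance coordinate.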
Because sections are defined only in the indexer.conf file, use the NumSections command in searchd.conf or in search.htm to specify the number of sections used. The default value is 256. Note that NumSections does not affect document ordering, only the relevancy value.
The popularity rank is calculated in two stages. In the first stage, the value of the Weight parameter of every server is divided by the number of links from that server, which gives the weight of one link from that server. In the second stage, the weights of all links pointing to a page are summed. This sum is the popularity rank of the page.
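The two stages can be sketched in Python as follows. This is an illustration under stated assumptions, not the indexer's implementation: the function `popularity_rank`, the `(source_server, target_page)` pair representation, and the example host names are all hypothetical.

```python
from collections import defaultdict

def popularity_rank(links, server_weight, default_weight=1.0):
    """Compute a popularity rank per target page.

    links: list of (source_server, target_page) pairs (hypothetical format).
    server_weight: mapping of server -> Weight; missing servers get default_weight.
    """
    # Stage 1: weight of one link = server Weight / number of links from that server.
    out_count = defaultdict(int)
    for server, _target in links:
        out_count[server] += 1
    link_weight = {
        server: server_weight.get(server, default_weight) / count
        for server, count in out_count.items()
    }

    # Stage 2: a page's rank is the sum of the weights of all links pointing to it.
    rank = defaultdict(float)
    for server, target in links:
        rank[target] += link_weight[server]
    return dict(rank)

links = [
    ("a.example", "http://b.example/x"),
    ("a.example", "http://b.example/y"),
    ("c.example", "http://b.example/x"),
]
ranks = popularity_rank(links, {"c.example": 2.0})
print(ranks)
```

Here a.example keeps the default Weight of 1 and has two outgoing links, so each of its links weighs 0.5, while the single link from c.example carries that server's full Weight of 2.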
By default, the Weight parameter is 1 for all indexed servers. You can change this value with the Weight command in the indexer.conf file, or directly in the server table if you load the server configuration from that table.
If you place the PopRankSkipSameSite yes command in the indexer.conf file, the indexer will take only inter-site links (i.e. links from a page on one site to a page on another site) for popularity rank calculation.
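The effect of restricting the calculation to inter-site links can be sketched as a simple filter. The sketch assumes that a "site" is identified by the host name of the URL, which the text above does not state; the function `inter_site_links` is hypothetical.

```python
from urllib.parse import urlsplit

def inter_site_links(links):
    """Keep only (source_url, target_url) pairs whose host names differ.

    Assumption: two URLs belong to the same site when their host names match.
    """
    return [
        (src, dst)
        for src, dst in links
        if urlsplit(src).hostname != urlsplit(dst).hostname
    ]

all_links = [
    ("http://a.example/index.html", "http://a.example/about.html"),  # same site
    ("http://a.example/index.html", "http://b.example/"),            # inter-site
]
print(inter_site_links(all_links))
```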
If you place the PopRankFeedBack yes command in the indexer.conf file, the indexer will calculate the site weight before calculating the page rank. To do that, the indexer sums the popularity ranks of all pages from the same site. If this sum is greater than 1, the site weight is set to this sum; otherwise, the site weight is set to 1.
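The site weight rule described above can be sketched as follows. The function `feedback_site_weight` and the use of the URL host name to identify a site are assumptions made for illustration only.

```python
from urllib.parse import urlsplit

def feedback_site_weight(page_ranks):
    """Derive a weight per site from the popularity ranks of its pages.

    page_ranks: mapping of page URL -> popularity rank.
    Assumption: a site is identified by the host name of the URL.
    """
    totals = {}
    for url, rank in page_ranks.items():
        site = urlsplit(url).hostname
        totals[site] = totals.get(site, 0.0) + rank
    # The site weight is the sum when it exceeds 1, otherwise 1.
    return {site: (total if total > 1 else 1.0) for site, total in totals.items()}

page_ranks = {
    "http://a.example/1": 0.75,
    "http://a.example/2": 0.75,
    "http://b.example/1": 0.25,
}
site_weights = feedback_site_weight(page_ranks)
print(site_weights)
```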
If you place the PopRankUseTracking yes command in the indexer.conf file, the indexer will calculate the site weight as the number of tracked queries restricted to this site.
If you place the PopRankUseShowCnt yes command in the search.htm (or searchd.conf) file, then for every result shown to the user, the corresponding url.shows value is increased by 1, provided that the relevancy of this result is greater than or equal to the value specified by the PopRankShowCntRatio command (default value is 25.0). If you place PopRankUseShowCnt yes in the indexer.conf file, the indexer adds to the URL's PopularityRank the value of url.shows multiplied by the value specified by the PopRankShowCntWeight command (default value is 0.01).
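The show-count mechanism has a search-time part and an indexing-time part, sketched below in Python. The functions `record_show` and `rank_with_shows` and the in-memory dictionary standing in for the url.shows column are hypothetical; only the default values 25.0 and 0.01 come from the text above.

```python
def record_show(shows, url, relevancy, ratio=25.0):
    """Search time: count a shown result if its relevancy reaches the ratio.

    shows: dict standing in for the url.shows column (an assumption).
    ratio: plays the role of PopRankShowCntRatio.
    """
    if relevancy >= ratio:
        shows[url] = shows.get(url, 0) + 1

def rank_with_shows(base_rank, show_count, show_weight=0.01):
    """Indexing time: add shows * weight to the popularity rank.

    show_weight: plays the role of PopRankShowCntWeight.
    """
    return base_rank + show_count * show_weight

shows = {}
record_show(shows, "http://a.example/", relevancy=80.0)  # counted
record_show(shows, "http://a.example/", relevancy=25.0)  # counted (equal to ratio)
record_show(shows, "http://b.example/", relevancy=10.0)  # below ratio, ignored
print(shows)
print(rank_with_shows(1.0, shows.get("http://a.example/", 0)))
```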
Note that a boolean search for two or more words requires explicit operators (&, |, ~). For example, enter "a & book" rather than "a book" (without the quotation marks).
The Crosswords feature also assigns the words between <a href="xxx"> and </a> to the document that the link points to. It works in SQL database mode and is not supported in built-in database mode or in Cachemode. To enable Crosswords, use the CrossWords yes command in indexer.conf and search.htm.