Question by Lipika Brahma, Dec 15, 2016 2:29 PM

Understanding Term Weights

When I inspect an element on a search page to find ranking information, I see two things: 1) Document Weights and 2) Term Weights.

Under Term Weights I see some words from the query with numbers beside them. What do these numbers signify? Here is an example:

Terms weights:
startup: 100, 41;
Title: 1017; Concept: 84; Summary: 296; URI: 0; Formatted: 141; Casing: 0; Relation: 141; Frequency: 1786;

not: 100, 7;
Title: 422; Concept: 0; Summary: 123; URI: 0; Formatted: 58; Casing: 0; Relation: 58; Frequency: 191;

Total weight: 7321

In this case the query was "not startup". In the line "startup: 100, 41;", what do the numbers 100 and 41 signify?

1 Reply

Answer by Simon, Dec 15, 2016 2:39 PM

The two numbers after the term are its correlation and its IDF (inverse document frequency), in that order.

The correlation is how close a stemming expansion is to the searched word. In your case, startup is the searched word itself, so it gets 100%; a term matched through stemming expansion could get a lower value, 75% for example.
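As a small illustration of how to read that first line of each entry, here is a minimal sketch that splits it into the term, the correlation, and the IDF. The helper name parse_term_header and the regular expression are my own assumptions, written against the sample output in the question; they are not part of the product.

```python
import re

# Hypothetical helper (not a product API): matches a header line such as
# "startup: 100, 41;" and captures the term, the correlation, and the IDF.
HEADER = re.compile(r"^\s*(?P<term>[^:]+):\s*(?P<corr>\d+),\s*(?P<idf>\d+);?\s*$")


def parse_term_header(line):
    """Return the term, correlation (percent), and IDF from a header line."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError(f"not a term header: {line!r}")
    return {
        "term": match.group("term").strip(),
        "correlation": int(match.group("corr")),
        "idf": int(match.group("idf")),
    }


if __name__ == "__main__":
    # The two header lines from the question.
    for line in ("startup: 100, 41;", "not: 100, 7;"):
        print(parse_term_header(line))
```

The detail lines (Title, Concept, Summary, and so on) use a different layout and are rejected by this sketch on purpose.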

The IDF of the term is how important the term is in the index, which depends on the number of times it appears in the index. In your output, the very common word "not" has a much lower IDF (7) than "startup" (41). The IDF is only computed for indexes that contain 20,000 documents or more.
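To make the idea concrete, here is a minimal sketch of the textbook inverse document frequency. This is the generic formula from information retrieval, not necessarily the one the index uses, and the document counts below are made-up values chosen only for illustration.

```python
import math


def textbook_idf(total_docs, docs_with_term):
    """Smoothed textbook IDF: log(N / (1 + df)).

    This is the generic formula, used here as an assumption; the index's
    own computation and scaling may differ.
    """
    return math.log(total_docs / (1 + docs_with_term))


if __name__ == "__main__":
    total = 20_000  # hypothetical index size
    rare = textbook_idf(total, 50)        # term found in few documents
    common = textbook_idf(total, 15_000)  # term found in most documents
    # The rarer term scores higher, so it counts for more in ranking.
    print(rare > common)
```

The point is only the direction of the relationship: the more documents contain a term, the lower its IDF.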

Comment by Simon, Dec 15, 2016 2:40 PM

About stemming:
