Download e-book for iPad: A Theory of Indexing by Gerard Salton

Posted on February 7, 2018 at 4:06 pm

By Gerard Salton

ISBN-10: 0898710154

ISBN-13: 9780898710151

Presents a theory of indexing capable of ranking index terms, or subject identifiers, in decreasing order of importance. This leads to the choice of good document representations, and also accounts for the role of phrases and of thesaurus classes in the indexing process.

This study is typical of theoretical work in automatic information organization and retrieval, in that concepts are used from mathematics, computer science, and linguistics. A complete theory of information retrieval may emerge from an appropriate combination of these three disciplines.

Best probability books

Get Fuzzy Logic and Probability Applications PDF

Probabilists and fuzzy enthusiasts tend to disagree about which philosophy is better, and they rarely interact. As a result, textbooks usually recommend only one of these methods for problem solving, but not both. This book, with contributions from 15 experts in probability and fuzzy logic, is an exception.

Probability: With Applications and R - download pdf or read online

An introduction to probability at the undergraduate level. Chance and randomness are encountered every day. Authored by a highly qualified professor in the field, Probability: With Applications and R delves into the theories and applications essential to obtaining a thorough understanding of probability.

Extra info for A Theory of Indexing

Sample text

D. Information value experiments. The experiments dealing with the use of information values are covered separately, because the methodology must necessarily be different in this case from that used earlier. In particular, since the generation of information values depends on a number of user-system interactions involving the processing of user queries against the available document collections, it is necessary to break the query set into two parts: a set of test queries must first be used for the generation and modification of term weights by means of interactive query processing; a new set of queries, not previously used, can then serve for evaluation purposes.
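The two-part query split described above can be sketched as follows. This is a minimal illustration, not the procedure used in the book: the function name `split_queries`, the 50/50 split, and the random shuffle are all assumptions made for the example.

```python
import random


def split_queries(queries, train_fraction=0.5, seed=0):
    """Split queries into two disjoint sets.

    The first set would be used to generate and modify term weights;
    the second, unseen set is reserved for evaluation.
    (Hypothetical helper; the split ratio is an assumption.)
    """
    shuffled = list(queries)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


queries = [f"q{i}" for i in range(10)]
weighting_set, evaluation_set = split_queries(queries)
```

Keeping the two sets disjoint is the point of the design: weights tuned on a query and then evaluated on the same query would overstate performance.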

Since such a relatively small deletion percentage does not lead to substantial losses in performance for any collection, and may in fact produce considerable improvements, the ten percent deletion percentage may be productive in all environments. It may be useful, as a final exercise, to determine whether a clear-cut policy is available for choosing among various significance rankings for term deletion purposes. In particular, the discrimination value rankings can be compared with the inverse document frequency rankings previously examined.
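As a rough sketch of ranking-based term deletion, the code below ranks terms by inverse document frequency and drops the lowest-ranked ten percent. The excerpt does not give the weighting formula, so the form log2(n / B_k) + 1 is an assumption (one common variant), and the function names and toy collection are hypothetical.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # each document counts a term at most once
    return df


def idf_ranking(docs):
    """Return terms ordered from highest to lowest inverse document frequency."""
    n = len(docs)
    df = document_frequencies(docs)
    # Assumed IDF form: log2(n / B_k) + 1
    idf = {term: math.log2(n / b) + 1 for term, b in df.items()}
    return sorted(idf, key=lambda term: (-idf[term], term))


def delete_worst(ranking, percent=10):
    """Drop the lowest-ranked `percent` of terms from a ranking."""
    n_delete = len(ranking) * percent // 100
    return ranking[: len(ranking) - n_delete]


docs = [
    ["retrieval", "index", "term", "the"],
    ["retrieval", "query", "the"],
    ["index", "weight", "the"],
    ["thesaurus", "phrase", "the"],
    ["recall", "precision", "term", "the"],
]
kept = delete_worst(idf_ranking(docs))
```

In this toy collection "the" occurs in every document, so it has the lowest inverse document frequency and is the one term removed. The same `delete_worst` helper could be applied to any other significance ranking, such as one based on discrimination values.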

Whereas no clear correlation was found to exist between the S/N ratings and the document or collection frequencies of the corresponding terms, a direct relation appears to exist for the discrimination value rankings. As the discrimination values decrease from good to average to poor, the document and collection frequencies of the terms go from average, to low, and finally to quite high. This correspondence is used as a basis for a theory of indexing in the last section of this study. In summary, a study of the frequency distributions of the terms ranked according to a number of different measures of term significance reveals the following characteristics: (a) When the terms are ranked in decreasing order of collection frequency F_k, or document frequency B_k, the best terms are those with universal occurrence characteristics; such terms may help in producing high recall output, but the retrieval results will certainly not be sufficiently precise for most purposes.
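To make the relation between discrimination values and term frequencies concrete, here is a small sketch. The excerpt does not define the discrimination value, so the code assumes a density-based definition: the space density is the average cosine similarity between each document and the collection centroid, and a term's discrimination value is the density with the term removed minus the density with it present. All function names and the toy collection are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def density(vectors):
    """Average similarity of each document to the collection centroid."""
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)  # summed counts; cosine ignores the scale
    return sum(cosine(vec, centroid) for vec in vectors) / len(vectors)


def frequencies(docs):
    """Return (collection frequency F_k, document frequency B_k) per term."""
    f, b = Counter(), Counter()
    for doc in docs:
        f.update(doc)
        b.update(set(doc))
    return f, b


def discrimination_values(docs):
    """Assumed definition: density without the term minus density with it."""
    vectors = [Counter(doc) for doc in docs]
    base = density(vectors)
    values = {}
    for term in set().union(*docs):
        reduced = [
            Counter({t: c for t, c in vec.items() if t != term})
            for vec in vectors
        ]
        values[term] = density(reduced) - base
    return values


docs = [["the", "a"], ["the", "b"], ["the", "c"]]
dv = discrimination_values(docs)
f_k, b_k = frequencies(docs)
```

Under this assumed definition, a positive value means that removing the term packs the documents closer together, so the term was helping to tell them apart. In the toy collection the universally occurring term "the" gets a negative value, while the rarer terms get positive ones, which agrees with the excerpt's observation that poor discriminators have high document and collection frequencies.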
