Thursday, December 19, 2013

Document Indexing


1. Goal: find the important meanings in a document and create an
              internal representation of them

2. Factors to consider:
        Accuracy in representing meanings (semantics)
        Exhaustiveness (covering all of the contents)
        Ease of manipulation by computer

3. What is the best representation of contents?
        Character string (character trigrams): not precise enough
        Word: good coverage, not precise
        Phrase: poor coverage, more precise
        Concept: poor coverage, precise
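
To make the first three representations concrete, here is a minimal sketch that extracts each kind of indexing unit from a piece of text. The function names, the tokenizing rule, and the use of adjacent word pairs as "phrases" are assumptions made for illustration; they are not from the original notes. Concepts are left out because they would need an external resource such as a thesaurus.

```python
import re

def char_trigrams(text):
    """Overlapping 3-character strings from the lowercased text."""
    s = text.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]

def words(text):
    """Lowercased alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrases(text, n=2):
    """Adjacent word n-grams, a crude stand-in for real phrases (assumption)."""
    w = words(text)
    return [" ".join(w[i:i + n]) for i in range(len(w) - n + 1)]

doc = "Information retrieval"
print(char_trigrams(doc)[:3])   # ['inf', 'nfo', 'for']
print(words(doc))               # ['information', 'retrieval']
print(phrases(doc))             # ['information retrieval']

# Trigrams still match related word forms that differ as whole words.
a, b = "retrieval", "retrieve"
shared = set(char_trigrams(a)) & set(char_trigrams(b))
print(len(shared))                      # 5
print(set(words(a)) & set(words(b)))    # set()
```

The last two lines hint at the trade-off in the list above: trigrams match "retrieval" against "retrieve" even though the words differ, but the same looseness lets unrelated strings match too, which is why they are not precise enough.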


[Figure: Coverage (Recall) and Accuracy (Precision) for each representation: String, Word, Phrase, Concept]

 
