Package edu.odu.cs.cs350
Class TFIDFCalculator
java.lang.Object
edu.odu.cs.cs350.TFIDFCalculator
Calculates TF-IDF (Term Frequency–Inverse Document Frequency) for words across multiple documents.
IDF(w) = log(N/df(w))
where:
- N is total number of documents
- df(w) is the number of documents containing word w
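As a quick check of the formula, the following minimal sketch evaluates IDF for a single word. The class `IdfFormulaSketch` and its `idf` method are hypothetical names used only for illustration, and the natural logarithm is an assumption, since the formula above does not state a base.

```java
/** Hypothetical helper, not part of TFIDFCalculator: evaluates IDF(w) = log(N / df(w)). */
final class IdfFormulaSketch {

    // Assumption: natural logarithm; the documentation does not state the base.
    static double idf(int totalDocuments, int documentFrequency) {
        if (documentFrequency <= 0 || documentFrequency > totalDocuments) {
            throw new IllegalArgumentException("df(w) must be between 1 and N");
        }
        return Math.log((double) totalDocuments / documentFrequency);
    }

    public static void main(String[] args) {
        // A word found in 2 of 4 documents: log(4 / 2) = log(2).
        System.out.println(idf(4, 2));
        // A word found in every document: log(4 / 4) = 0.
        System.out.println(idf(4, 4));
    }
}
```

A word that appears in every document gets an IDF of 0, so under this formula it contributes nothing to a TF-IDF score.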
Constructor Summary
Constructor
TFIDFCalculator()
Method Summary
Modifier and Type   Method   Description

computeIDF(List<Document> documents)
    Computes the Inverse Document Frequency (IDF) for all words in a collection of documents.
static Map<String,Double> computeTFIDF(Document document, Map<String, Double> inverseDocumentFrequencyMap)
    Computes the TF-IDF score for each word in a document.
Constructor Details
TFIDFCalculator
public TFIDFCalculator()
Method Details
computeIDF
Computes the Inverse Document Frequency (IDF) for all words in a collection of documents.
Parameters:
documents - The list of documents.
Returns:
A map of words to their IDF values.
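The computation described here can be sketched roughly as follows. This is not the class's actual implementation: the `Document` API is not shown on this page, so each document is modeled as a plain `List<String>` of words, and `IdfMapSketch` and `computeIdf` are hypothetical names. The natural logarithm is assumed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative stand-in for computeIDF; each document is modeled as a list of words. */
final class IdfMapSketch {

    static Map<String, Double> computeIdf(List<List<String>> documents) {
        // df(w): count each word at most once per document.
        Map<String, Integer> documentFrequency = new HashMap<>();
        for (List<String> words : documents) {
            Set<String> distinctWords = new HashSet<>(words);
            for (String word : distinctWords) {
                documentFrequency.merge(word, 1, Integer::sum);
            }
        }

        // IDF(w) = log(N / df(w)); natural logarithm assumed.
        int totalDocuments = documents.size();
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> entry : documentFrequency.entrySet()) {
            idf.put(entry.getKey(), Math.log((double) totalDocuments / entry.getValue()));
        }
        return idf;
    }

    public static void main(String[] args) {
        List<List<String>> documents = List.of(
                List.of("cat", "sat", "cat"),
                List.of("dog", "sat"));
        // "sat" occurs in both documents, so its IDF is log(2 / 2) = 0.
        System.out.println(computeIdf(documents));
    }
}
```

Deduplicating each document's words before counting matters: df(w) counts documents, not occurrences, so the repeated "cat" in the first document is counted once.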
computeTFIDF
public static Map<String,Double> computeTFIDF(Document document, Map<String, Double> inverseDocumentFrequencyMap)
Computes the TF-IDF score for each word in a document.
Parameters:
document - The document to compute TF-IDF for.
inverseDocumentFrequencyMap - The precomputed IDF values for all words.
Returns:
A map of words to their TF-IDF values.
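One way such a score could be computed is sketched below. This page does not define term frequency, so the sketch assumes TF(w) = (occurrences of w in the document) / (total words in the document), which is a common convention but not necessarily the one this class uses. The document is modeled as a `List<String>`, and `TfIdfSketch` and `computeTfIdf` are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in for computeTFIDF; the document is modeled as a list of words. */
final class TfIdfSketch {

    static Map<String, Double> computeTfIdf(List<String> words, Map<String, Double> idf) {
        // Count the occurrences of each word in the document.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }

        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Assumption: TF is the word count normalized by document length.
            double tf = (double) entry.getValue() / words.size();
            // Assumption: a word missing from the IDF map scores 0.
            double idfValue = idf.getOrDefault(entry.getKey(), 0.0);
            scores.put(entry.getKey(), tf * idfValue);
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Double> idf = Map.of("cat", Math.log(2.0), "sat", 0.0);
        // "cat" is 2 of 3 words, so its score is (2 / 3) * log(2).
        System.out.println(computeTfIdf(List.of("cat", "sat", "cat"), idf));
    }
}
```

Treating unknown words as 0 is a design choice made for this sketch; an implementation could equally throw an exception or skip the word.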