Class TFIDFCalculator

java.lang.Object
edu.odu.cs.cs350.TFIDFCalculator

public class TFIDFCalculator extends Object
Calculates TF-IDF (Term Frequency–Inverse Document Frequency) for words across multiple documents.
IDF(w) = log(N/df(w)), where:
- N is the total number of documents
- df(w) is the number of documents containing word w
  • Constructor Details

    • TFIDFCalculator

      public TFIDFCalculator()
  • Method Details

    • computeIDF

      public static Map<String,Double> computeIDF(List<Document> documents)
      Computes the Inverse Document Frequency (IDF) for all words in a collection of documents.
      Parameters:
      documents - The list of documents over which IDF is computed; its size is N in the IDF formula.
      Returns:
      A map of words to their IDF values.
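      The IDF formula given in the class description can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the Document API is not shown on this page, so each document is modeled here as a Set<String> of its distinct words, the class name IdfSketch and method name computeIdf are hypothetical, and the natural logarithm is assumed because the page does not state the base.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IdfSketch {

    /**
     * Computes IDF(w) = log(N / df(w)) for every word in the collection.
     * Assumption: each document is represented by the set of distinct words
     * it contains (a stand-in for the real Document class).
     */
    public static Map<String, Double> computeIdf(List<Set<String>> documents) {
        // df(w): number of documents that contain word w.
        Map<String, Integer> documentFrequency = new HashMap<>();
        for (Set<String> words : documents) {
            for (String word : words) {
                documentFrequency.merge(word, 1, Integer::sum);
            }
        }

        // N: total number of documents.
        int n = documents.size();

        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> entry : documentFrequency.entrySet()) {
            // Natural log is an assumption; the base is not specified here.
            idf.put(entry.getKey(), Math.log((double) n / entry.getValue()));
        }
        return idf;
    }
}
```

      Under these assumptions, a word that appears in every document gets an IDF of log(1) = 0, so very common words contribute nothing to later TF-IDF scores.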
    • computeTFIDF

      public static Map<String,Double> computeTFIDF(Document document, Map<String,Double> inverseDocumentFrequencyMap)
      Computes the TF-IDF score for each word in a document.
      Parameters:
      document - The document to compute TF-IDF for.
      inverseDocumentFrequencyMap - The precomputed IDF values for all words, typically the map returned by computeIDF.
      Returns:
      A map of words to their TF-IDF values.
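      One way to combine term frequency with a precomputed IDF map is sketched below. Several details are assumptions, since this page does not specify them: the document is modeled as a List<String> of its word tokens rather than the real Document class, TF is taken to be the word's count divided by the total number of words in the document, words missing from the IDF map are given an IDF of 0.0, and the names TfIdfSketch and computeTfIdf are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TfIdfSketch {

    /**
     * Computes TF-IDF(w) = TF(w) * IDF(w) for each distinct word in a document.
     * Assumption: TF(w) = (occurrences of w) / (total words in the document).
     */
    public static Map<String, Double> computeTfIdf(List<String> words,
                                                   Map<String, Double> idf) {
        // Count how many times each word occurs in this document.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }

        Map<String, Double> tfIdf = new HashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            double tf = (double) entry.getValue() / words.size();
            // Assumption: a word absent from the IDF map scores 0.0.
            double idfValue = idf.getOrDefault(entry.getKey(), 0.0);
            tfIdf.put(entry.getKey(), tf * idfValue);
        }
        return tfIdf;
    }
}
```

      With this definition, a word scores highly when it is frequent in the document but rare across the collection. Other TF variants (raw counts, log-scaled counts) are also common, so the real method may differ.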