Our COM SCI 267 project. Credit to Nikki Woo for the fantastic project name.

jho44/GetHuluJob

Dataset

ml_netflix.csv: a join of the MovieLens ml-latest-small dataset with Netflix data

  • code for generating the dataset in utils/gen_data.ipynb

Experimented on

  1. LDA
    • pure Python
      • can run the notebook straight through
    • Stan
      • warning: takes ~1.67 hours to finish sampling on all 1056 movies
      • run python3 stan-lda.py in a terminal with the following available flags:
        • regen_words_df (bool): True if you'd like to regenerate the dataframe mapping each word (ID) to a document/movie (ID)
          • saved to cache/words_df.csv
        • regen_data_lemmatized (bool): True if you'd like to regenerate the lemmatized movie descriptions
          • saved to cache/data_lemmatized.txt
        • num_movies (int): the first num_movies movies from the data set that you'd like to train on
          • by default, it's the number of movies in the data set (1056)
        • just_eval (bool): True if you'd like to just calculate the evaluation metrics. Assumes you already have the trained posterior values in results/theta.npy.
    • Pyro
      • can run the notebook straight through
      • modify number of topics, number of epochs run, etc. in cell 4
    • Turing
      • can run the notebook using a Julia runtime
      • results are output to CSV (cache/julia_out.csv) for evaluation in Python using eval_julia.ipynb
  2. PMF
    • pure Python
      • can run the notebook straight through
    • Stan
      • run python3 stan-pmf.py in a terminal with the following available flags:
        • just_eval (bool): True if you'd like to just calculate the evaluation metrics. Assumes you already have the trained posterior values in results/Z.npy and results/W.npy.
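
For readers unfamiliar with PMF, here is a minimal sketch of how saved factor matrices are typically turned into predicted ratings and scored. It is an illustration only, not the code in stan-pmf.py. The function names, the assumed shapes (Z as users × latent dimensions, W as movies × latent dimensions), and the convention that a rating of 0 means "unobserved" are all assumptions.

```python
import numpy as np

def predict_ratings(Z, W):
    """Reconstruct the rating matrix from latent factors.

    Assumed shapes (not confirmed by the repository):
    Z is (n_users, k) and W is (n_movies, k), so the result is
    (n_users, n_movies).
    """
    return Z @ W.T

def rmse_on_observed(R, R_hat, observed):
    """RMSE computed only over entries where `observed` is True."""
    diff = (R - R_hat)[observed]
    return float(np.sqrt(np.mean(diff ** 2)))

# Tiny synthetic example: 2 users, 3 movies, 2 latent dimensions.
Z = np.array([[1.0, 0.0],
              [0.0, 1.0]])
W = np.array([[2.0, 0.0],
              [0.0, 3.0],
              [1.0, 1.0]])
R = np.array([[2.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])

R_hat = predict_ratings(Z, W)
observed = R > 0  # assumption: 0 marks a missing rating
score = rmse_on_observed(R, R_hat, observed)
```

With the trained posteriors, `Z` and `W` would instead come from `np.load("results/Z.npy")` and `np.load("results/W.npy")`. If the saved arrays hold posterior samples rather than point estimates, they would need to be averaged over the sample axis first.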

Eval Metrics

  1. Personalization
  2. MAP@K
  3. Mean Precision
  4. RMSE
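
As a rough guide to what the ranking metrics measure, the sketch below implements one common definition of MAP@K and of personalization (1 minus the mean pairwise cosine similarity between users' recommendation lists). The definitions used in this project's notebooks may differ. All function names here are hypothetical, and each recommendation list is assumed to contain no duplicate items.

```python
from itertools import combinations

def average_precision_at_k(recommended, relevant, k):
    """AP@K for one user: mean of precision at each rank where a hit occurs,
    normalized by min(k, number of relevant items)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(k, len(relevant))

def map_at_k(all_recommended, all_relevant, k):
    """MAP@K: AP@K averaged over users."""
    scores = [
        average_precision_at_k(rec, rel, k)
        for rec, rel in zip(all_recommended, all_relevant)
    ]
    return sum(scores) / len(scores)

def personalization(all_recommended):
    """1 - mean pairwise cosine similarity between users' recommendation sets.

    Each list is treated as a binary indicator vector over items, so the
    cosine similarity of two lists is |A & B| / sqrt(|A| * |B|).
    """
    sets = [set(rec) for rec in all_recommended]
    if len(sets) < 2:
        raise ValueError("personalization needs at least two users")
    sims = []
    for a, b in combinations(sets, 2):
        if not a or not b:
            sims.append(0.0)
        else:
            sims.append(len(a & b) / (len(a) ** 0.5 * len(b) ** 0.5))
    return 1.0 - sum(sims) / len(sims)

# Toy data: two users with movie IDs as strings.
recs = [["a", "b", "c"], ["x"]]
truth = [["a", "c"], ["y"]]
example_map = map_at_k(recs, truth, k=3)
```

A personalization score near 1 means users receive largely different recommendations. A score near 0 means everyone is shown the same items.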
