- AnimeList.csv contains the list of anime, with title, title synonyms, genre, studio, licensor, producer, duration, rating, score, airing date, episodes, source (manga, light novel, etc.) and many other fields for each anime, giving enough information to study how the main aspects of anime change over time. Rank is stored as a float in the CSV even though it holds only integer values. This is because the column contains NaN values, which pandas cannot represent in a plain integer column.
- UserList.csv contains information about users who watch anime: username, registration date (join_date), last online date, birth date, gender, location, and many aggregated values from their anime lists.
- UserAnimeList.csv contains the anime lists of all users. Each record holds the username, anime ID, score, status, and the timestamp of when the record was last updated.
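
The float Rank column described for AnimeList.csv can be converted to integers without losing the missing values. The snippet below is a minimal sketch on a hypothetical three-row excerpt; the column name `rank` and the sample rows are assumptions, not taken from the real file.

```python
import io

import pandas as pd

# Hypothetical three-row excerpt; the real AnimeList.csv has many more columns.
csv_text = "anime_id,title,rank\n1,Anime A,28.0\n2,Anime B,\n3,Anime C,5.0\n"

anime = pd.read_csv(io.StringIO(csv_text))
print(anime["rank"].dtype)  # float64, because the missing rank is read as NaN

# The nullable "Int64" dtype stores integers and marks missing values as <NA>.
anime["rank"] = anime["rank"].astype("Int64")
print(anime["rank"].tolist())
```

The nullable dtype is one option; keeping the float column and filtering NaN rows before ranking-based analysis would work as well.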
- Users and their locations -> distribution of users over the world (normalized by population)
- Compare watching trends for each country (or each genre)
- Which country loves anime the most (based on number of anime completed and other measures, possibly weighted)
- Average number of episodes watched per user
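
For the user distribution normalized by population, the idea is to divide each country's user count by its population before comparing countries. A minimal sketch, assuming the location field has already been mapped to a country label; the country names and population figures here are made up for illustration.

```python
from collections import Counter

# Hypothetical input: one country label per user, already parsed from `location`.
user_countries = ["A", "A", "B", "C", "C", "C"]

# Hypothetical population figures (in millions), for illustration only.
population_millions = {"A": 100.0, "B": 40.0, "C": 200.0}

user_counts = Counter(user_countries)

# Users per million inhabitants, so that large countries do not dominate.
users_per_million = {
    country: user_counts[country] / population_millions[country]
    for country in user_counts
}

ranking = sorted(users_per_million, key=users_per_million.get, reverse=True)
print(ranking)
```

In this toy data country C has the most users in absolute terms but ranks last once population is taken into account.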
- Graph (nodes: anime; edges: weight +1 for each user who watched both anime). Note: apply a threshold to the edge weights -> eliminate low-weight edges and keep only the edges whose weight is larger than the threshold
- Community detection algorithm.
- Centralities
-> Relationship between anime
- Is the threshold reasonable (distribution plot for the threshold)?
- How many communities are enough?
- Map communities to the properties of their anime -> features shared by the anime in the same community
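
The graph construction and thresholding steps above can be sketched with the standard library alone. This is an illustration under stated assumptions: `user_anime`, `THRESHOLD`, and the toy lists are hypothetical, and weighted degree stands in for the simplest possible centrality. Community detection is not implemented here; the filtered edge list would be handed to a graph library for that step.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: username -> set of anime IDs on that user's list.
user_anime = {
    "u1": {1, 2, 3},
    "u2": {1, 2},
    "u3": {1, 2, 4},
    "u4": {3, 4},
}

# Edge weight = number of users who have both anime on their list.
edge_weights = Counter()
for anime_ids in user_anime.values():
    for a, b in combinations(sorted(anime_ids), 2):
        edge_weights[(a, b)] += 1

# Distribution of edge weights, useful for judging whether a threshold is reasonable.
weight_hist = Counter(edge_weights.values())

THRESHOLD = 1  # assumed cut-off; to be tuned from the weight distribution
strong_edges = {edge: w for edge, w in edge_weights.items() if w > THRESHOLD}

# Weighted degree of each anime in the filtered graph.
degree = Counter()
for (a, b), w in strong_edges.items():
    degree[a] += w
    degree[b] += w

print(weight_hist, strong_edges, degree)
```

On the full dataset the pair loop grows quadratically with list length, so filtering out rarely watched anime before counting is likely to be necessary.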
- User graph -> the anime-user dataset also needs to be filtered
- (2-3 anime) => Recommend the next anime to watch based on the first one (frequent pattern mining)
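
One simple reading of the frequent-pattern idea is a pairwise association rule: recommend anime B to someone who watched A when the pair (A, B) is frequent, scored by confidence = count(A and B) / count(A). The sketch below assumes this reading; `recommend`, `min_support`, and the toy lists are hypothetical names and data.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy data: username -> set of anime IDs on that user's list.
user_anime = {
    "u1": {1, 2, 3},
    "u2": {1, 2},
    "u3": {1, 2, 4},
    "u4": {3, 4},
}

item_count = Counter()  # number of users per anime
pair_count = Counter()  # number of users per ordered pair (seen, candidate)
for anime_ids in user_anime.values():
    item_count.update(anime_ids)
    pair_count.update(permutations(anime_ids, 2))


def recommend(seen, min_support=2):
    """Return (candidate, confidence) pairs for users who watched `seen`."""
    scores = {
        b: n / item_count[seen]
        for (a, b), n in pair_count.items()
        if a == seen and n >= min_support
    }
    # Highest confidence first; ties broken by anime ID for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


print(recommend(1))
```

Extending this to rules with two or three anime on the left-hand side would need a proper frequent-itemset algorithm rather than plain pair counting.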
- Spearman correlation between two columns in the user/anime data
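
Spearman correlation is the Pearson correlation computed on the ranks of the values, with tied values sharing their average rank. A minimal standard-library sketch follows; the helper names and the sample numbers are hypothetical, and in practice a statistics library would normally be used instead.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical sample values for two user columns.
days_spent_watching = [1.5, 20.0, 3.2, 45.0]
completed = [4, 120, 10, 900]
print(spearman(days_spent_watching, completed))
```

Because only the ranks matter, the measure captures any monotonic relationship and is not distorted by a few users with extreme values.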
- Clustering of users (user_watching, user_completed, user_onhold, user_dropped, user_plantowatch, user_days_spent_watching) => compare the clusters with age (age ranges)?
- Why is One Piece not a top anime here? (Technique not known)
https://www.kaggle.com/code/vietanhnguyen1010/clean-data-myanimelist
https://github.com/google/dspl/blob/master/samples/google/canonical/countries.csv