Centroid visualization using K-means algorithm on MNIST dataset

Nweaver412/Centroid-Visualization-MNIST

Centroids as Internal Representations of Image Data

When you picture objects and concepts in your head, you may notice that you hold internal "representations" of them. Ever wonder what a computer thinks the digits 0-9 look like?

plts

Pretty cool, huh? It turns out that it's quite simple to peek inside statistical models! In this work we find that, with sufficient MNIST image data, the centroids of simple clustering algorithms (like k-means) are visually identifiable representations of the classes present in the dataset.
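To make the idea concrete, here is a minimal sketch of how a k-means centroid can be read back as an image. It is not the notebook's code: it uses a small NumPy implementation of Lloyd's algorithm, and it swaps MNIST for synthetic 28x28 "bar" images so that it runs without downloading anything. The names `kmeans`, `prototypes`, and `centroid_images` are hypothetical and chosen only for illustration.

```python
import numpy as np


def kmeans(X, k, n_iters=20, seed=0):
    """Plain Lloyd's k-means. X has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct, randomly chosen samples.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Squared Euclidean distance from every sample to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    # labels come from the last assignment step of the loop.
    return centroids, labels


# Synthetic stand-in for MNIST (an assumption made for this sketch):
# two 28x28 "classes", a vertical bar and a horizontal bar, plus noise.
side = 28
rng = np.random.default_rng(42)
vertical = np.zeros((side, side))
vertical[:, 12:16] = 1.0
horizontal = np.zeros((side, side))
horizontal[12:16, :] = 1.0
prototypes = np.stack([vertical, horizontal]).reshape(2, -1)

true_labels = rng.integers(0, 2, size=200)
X = prototypes[true_labels] + 0.1 * rng.normal(size=(200, side * side))

centroids, labels = kmeans(X, k=2)

# Each centroid lives in pixel space, so reshaping it gives an image.
centroid_images = centroids.reshape(-1, side, side)
```

Because a centroid is just the mean of the flattened images assigned to it, it has one value per pixel and can be displayed like any other image, for example with matplotlib's `imshow`. With real data, `X` would be the flattened MNIST images and `k` would be at least the number of classes.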

plts

To test our method further, we also tried Fashion-MNIST. Notice how there are clearly identifiable and distinct representations for boots, sneakers, and shoes. These initial findings show that clustering algorithms can hold visual concepts in their centroids, and they suggest further research into unsupervised clustering methods for data representations.

Paper

The complete project write-up is available as a PDF and as a Google Doc. You can access either one through the links below:

Write-up PDF | Write-up Link.

Don't want to read long papers? Check out a summarized version from our presentation.

Code

Interested in the code? We have a Jupyter notebook with our findings at research/main.ipynb.

If you want to experiment with the notebook, first run the following commands to create a virtual environment and install the dependencies:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
