Home
Intro
Welcome to the wiki for the course Social data analysis and visualization (02806) offered by the Technical University of Denmark. This is the main page, where you can access the weekly exercises. If you take a look in the sidebar, you can read about the administrative details (including a very useful course overview), assignments, books, and more.
The class is taught flipped-classroom style, where the lecture and homework elements of a course are reversed. You'll be able to view short video lectures before (or during) the class session, so in-class time can be devoted to exercises, projects, or discussions. Check out the first lecture to learn more.
Assignments
- Assignment 1 is available here. For the due date, etc., see important dates or the assignment itself.
- Assignment 2 is available here. For the due date, etc., see important dates or the assignment itself.
Exercises
- Week 1: Introduction. This week is all about getting started: installing Python, learning about Jupyter notebooks, and making sure that you're a relatively skilled Python programmer. You can also see the file here on GitHub, but the videos won't display properly.
  - Reading I: This pandas tutorial. Just work through the examples in your own Jupyter Notebook.
  - Reading II: We'll be looking into crime patterns. Take a look at this article from Science Magazine to get a bit deeper sense of the topic.
- Week 2: Let the data science begin. Ok. So now everyone's up to speed with Python and Pandas. This week, we'll start doing more analyses of the data that we downloaded last week. You'll learn that just calculating simple distributions (and conditional distributions) can teach you A LOT about a dataset. But that's not it. We'll also get creative with plotting GPS data. (And as a little bonus, we'll also play with some traditional dataviz examples.) So LOTS to do today. No time for reading :)
  - Reading: No reading this week. Just fun with coding.
- Week 3: Plotting single variable data. This week we start with dataviz lectures. We'll also start reading independently and learn about the many different ways you can visualize just a single variable.
  - Reading: Data Analysis with Open Source Tools, Chapter 2 (you can get the text here).
- Week 4: Heatmaps and data errors. Geospatial data is a very important category, so this week we dig deeper into options for visualizing that data type, including strategies for making little movies. We also have a small exercise on errors in the data, which draws on some of the work we've done in previous weeks. I hope you enjoy today's relatively light load.
  - Reading I: Read through the following tutorial: How to: Folium for maps, heatmaps & time data. Get it here: https://www.kaggle.com/daveianhickey/how-to-folium-for-maps-heatmaps-time-data
  - Reading II (Optional): There are also some nice tricks in Spatial Visualizations and Analysis in Python with Folium. Read it here if you'd like, otherwise it should be safe to skip: https://towardsdatascience.com/data-101s-spatial-visualizations-and-analysis-in-python-with-folium-39730da2adf
- Week 5: More plotting, linear regression. This lecture features more lecturing (and a cool bonus video). Then we get into exploring data with two variables, something which we'll read about (see below). Then we do logarithmic plots and have lots of fun with linear regression and the associated math.
  - Reading: DAOST Chapter 3
- Week 6: Quick intro to machine learning. Today we catch everyone up on machine learning. That means a lot of material to get through: lots of videos, lots of reading, lots of exercises. But it will pay off in terms of skills to analyze data and create amazing data visualizations. Here's the structure: We will learn about machine learning in general. Then we will play with sklearn, then we will learn about KNN and solve a KNN exercise. Finally we will work with decision trees and solve a decision tree exercise. Then we will rest. (And if the link to the nbviewer doesn't work, you can find the exercises here.)
  - Reading: Data Science from Scratch, chapters 11, 12, and 17 (links to text in the notebook).
  - Reading: A fantastic visual explanation of decision trees.
- Week 7: More machine learning ... and movies. Today we continue working with machine learning. The purpose of our exercises today is to show you that amazing things can happen when we combine data sources. So we'll add weather data to our crime dataset for new insights. On the reading and video front, we'll prep for next week, when we finish off by looking into explanatory data visualization.
  - Reading: Edward Segel and Jeffrey Heer. Narrative Visualization: Telling Stories with Data, sections 1-3.
- Week 8: Interactive visualizations with Bokeh. Today, more tools for explanatory data visualization. We read more about narrative data visualization and create two interactive data visualizations using Bokeh. If the nbviewer link isn't working, you can find the exercises right here on GitHub (and view the lecture based on the link or check it out on Slack).
  - Reading: Edward Segel and Jeffrey Heer. Narrative Visualization: Telling Stories with Data, sections 4-5.
This class has been handcrafted for you by Sune Lehmann in Copenhagen.
This work is licensed under a Creative Commons Attribution 4.0 International License.