Final Project

Sune Lehmann edited this page Mar 5, 2024 · 6 revisions

This page contains information about the final project.

Project Assignments Overview

The point of the Project Assignments is to try out the skills you've learned in the course on your own dataset.

Watch a video overview of the project here

Now let's get down to business with more details in writing. This year there are two options.

Option 1: City Data

Previously, I've limited the projects to be about city data, and I still think that's a great choice of data source to work on. Check out some city datasets below.

For inspiration for what to do with data like this, I recommend you sit down and listen to a podcast to get in the right mindset. It's an interview with Ben Wellington, the author of the site http://iquantny.tumblr.com/ (it's a super cool blog, so remember to scroll through that page to check out some of the many projects Ben has worked on).

Check out this podcast for inspiration. It's a 40-minute listen, but well worth your time. After listening to this I predict that you'll be brimming with ideas for what to start working on (or at least have an idea of where to get started).

Summary of Overview: The overall idea is to take a deep dive into some aspect of a city dataset and try to understand that data using the tools you've learned in this class. Once you understand the data, you should tell the story of what you've found on your own website using D3 visualizations (and whichever other tools you need). So it's not just cool data viz, it's:

  • Data analysis and understanding, then
  • Using narrative data viz to communicate what you've learned.

Option 2: COVID-19 data

We're all thinking about it. Coronavirus. There is lots of open data about it. Your visualizations might help the world. So of course you can also work on any aspect of data associated with COVID-19. In many cases it's natural to combine this with city data.

Project Assignment A

The first part of the final project is a 2-minute movie, which should explain the central idea/concept that you will investigate in your final project. You're making the movie so that the TAs and I can give you feedback, and so that other groups can steal your ideas (and you can steal ideas from them). The movie must contain the following:

  • An explanation of the central idea behind your final project. (What is the idea? Which datasets do you need to explore the idea? Why is it interesting?)
  • A mock-up of the visualization that you wish to build. (Anything is fine here: pen and paper, MS Paint, Inkscape, D3, anything.)
  • Make sure you answer the questions:
    • What genre is it? (For genres, see section 4.3 of the Segel and Heer paper.)
    • Why is that genre right for telling the story you want to communicate with the data?
  • An outline of the elements you'll need to get to your goal.
  • The implementation plan.
  • A walk-through of your preliminary data-analysis, addressing
    • What is the total size of your data? (MB, number of rows, number of variables, etc.)
    • What are other properties? (What is the date range? Is it geo-data? If so, include a quick plot of locations, etc.)
    • Show the fundamental distributions of the data (similar to the work we did on SF crime data for lecture 3)

But other than that, there are no constraints. And we do appreciate funny/inventive/beautiful movies, although the academic content is most important. Note that we'll display the movie to the entire class.

(The maximum length is 2 minutes, but it's OK if the movie is shorter.)

Handing in the assignment: Simply upload your video to YouTube or another video hosting site (the higher the resolution the better) and submit the link to Peergrade.
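To make the preliminary data-analysis questions above concrete (size, number of rows and variables, date range, basic distributions), here is a minimal sketch using only the Python standard library. The toy CSV, its column names, and the `summarize` helper are all hypothetical stand-ins for illustration; your own dataset will need its own column names and date format, and a library such as pandas may be more convenient for larger files.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical toy sample standing in for a real city dataset.
SAMPLE_CSV = """date,category,district
2019-01-03,THEFT,NORTH
2019-02-14,ASSAULT,SOUTH
2019-02-20,THEFT,SOUTH
2019-03-01,THEFT,NORTH
"""


def summarize(csv_text, date_column, category_column):
    """Return size, shape, date range, and category counts for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    # Assumes ISO dates (YYYY-MM-DD); adjust the format for your data.
    dates = [datetime.strptime(r[date_column], "%Y-%m-%d") for r in rows]
    return {
        "size_bytes": len(csv_text.encode("utf-8")),
        "n_rows": len(rows),
        "n_variables": len(reader.fieldnames),
        "date_range": (
            min(dates).date().isoformat(),
            max(dates).date().isoformat(),
        ),
        # Counts per category: the simplest "fundamental distribution".
        "category_counts": Counter(r[category_column] for r in rows),
    }


summary = summarize(SAMPLE_CSV, "date", "category")
for key, value in summary.items():
    print(key, value)
```

The counts in `category_counts` are the kind of numbers you would turn into a bar chart when showing the fundamental distributions in your movie.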

Project Assignment B

The deliverables for the Final project are:

A website with your visualizations and accompanying text. I recommend you structure it as a kind of narrative data story (cf. the Segel and Heer paper we read during Lectures 7 and 8). The website should tell the story about the data that you're interested in getting across. In the simplest, most minimalist case, the website can be a very nice Jupyter Notebook hosted on nbviewer.

  • It should contain visualizations to let the reader explore the data that you're interested in getting across. It is a plus if some of them are interactive.

Your analysis behind the scenes can be as technical and advanced as you like (in fact the goal is to show you can combine data analysis, machine learning, and data visualization). The website itself, however, should not be technical. It should aim at using visualization and explanation to get your data-driven insights across to a non-scientific reader.

The idea is that you can create much more complex, dynamic and interactive analysis (and visualizations) using the possibilities available when you're creating a website. So it is a way for you to present your work in a way that everyone can understand (like something you could show your parents).

An explainer Jupyter Notebook. The explainer notebook should contain all the behind-the-scenes data-analysis stuff, details on the dataset, why you've selected these particular visualizations, explanations of methodology, etc.

More about the website

The main point of the website is to present your idea/analyses to the world in a way that showcases your use of what you've learned in class. The website should be self-contained and tell the story without the need for the details in the explainer notebook (the purpose of the explainer notebook is to provide additional details for interested/scientific readers).

More on the explainer notebook

The notebook should contain your analysis and code. Please structure it into the following sections:

  1. Motivation.
    • What is your dataset?
    • Why did you choose this/these particular dataset(s)?
    • What was your goal for the end user's experience?
  2. Basic stats. Let's understand the dataset better.
    • Write about your choices in data cleaning and preprocessing.
    • Write a short section that discusses the dataset stats, containing key points/plots from your exploratory data analysis.
  3. Data Analysis.
    • Describe your data analysis and explain what you've learned about the dataset.
    • If relevant, talk about your machine learning.
  4. Genre. Which genre of data story did you use?
    • Which tools did you use from each of the 3 categories of Visual Narrative (Figure 7 in Segel and Heer)? Why?
    • Which tools did you use from each of the 3 categories of Narrative Structure (Figure 7 in Segel and Heer)? Why?
  5. Visualizations.
    • Explain the visualizations you've chosen.
    • Why are they right for the story you want to tell?
  6. Discussion. Think critically about your creation.
    • What went well?
    • What is still missing? What could be improved? Why?
  7. Contributions. Who did what?
    • You should write (just briefly) which group member was mainly responsible for which elements of the assignment. (I want you guys to understand every part of the assignment, but usually there is someone who took the lead role on certain portions of the work. That's what you should explain.)
    • It is not OK simply to write "All group members contributed equally".
  8. Make sure that you use references when they're needed and follow academic standards.

Handing in the assignment: Simply upload the link to your website to Peergrade.