The project is broken up into three main systems:
- Frontend App: what the user sees
- Backend API: how the app gets data
- Offline Pipelines: how we process the data
All three of these systems are part of this codebase. This guide will help you install and run all three parts.
We will set up the project codebase using the command line. Open a terminal window to follow these steps.
If you are on Windows, install Cygwin and add it to your PATH in your system environment variables. This will allow you to run some terminal commands that are normally not available on Windows.
When you see a code block like this, unless otherwise stated, each line is a command to enter into your terminal.
# This command will list all the folders and files in your current directory:
ls
# Windows users who do not have Cygwin can use this command:
dir
To set up TransitHealth on Cloud9, a cloud-based development environment, follow this guide.
To set up TransitHealth on your own machine, continue following this guide.
The code for this project is hosted on GitHub. If you do not have a GitHub account, visit the website to create one.
We will use git for version control: tracking different versions of our code and collaborating on development. If git is not installed on your computer, download it here.
If you have not used git before, configure your name and email.
git config --global user.name "Your Name"
git config --global user.email "Your Email"
Clone the repository and use cd to enter the folder. This first clone can take 4-10 minutes because you are also cloning the compressed database files.
git clone git@github.com:scarletstudio/transithealth.git
cd transithealth
Now that you have the codebase cloned, you can install and run the app. Later, you will make your own changes and commit and push them, which is how we send our changes to the main repository.
Environment variables help us configure our programs without changing code.
Run this command to create an environment configuration file:
touch .env
Then open the file in the text editor of your choice and add these contents:
# Tells the server what port to serve the API on
PORT=5000
# Tells the server what origins to accept requests from (comma-separated list, if multiple origins)
ALLOW=http://localhost:8001
# Tells the server where to find the SQLite database file
DATABASE=pipeline/database.db
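To make the role of each variable concrete, here is a minimal Python sketch of how a program might read a file like this. It is an illustration only: the `parse_env` helper is hypothetical, and the project's backend may load its configuration differently (for example, through a dotenv library).

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:  # ignore lines that have no equals sign
            config[key.strip()] = value.strip()
    return config


# Same contents as the .env file described above.
example = """
# Tells the server what port to serve the API on
PORT=5000
ALLOW=http://localhost:8001
DATABASE=pipeline/database.db
"""

config = parse_env(example)
allowed_origins = config["ALLOW"].split(",")
print(config["PORT"], allowed_origins, config["DATABASE"])
```

Note that every value is read as a string, so a program that needs the port as a number has to convert it itself.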
Our frontend app uses a special kind of environment configuration file, called .env.local. Create this file in the app/ directory.
touch app/.env.local
Then open the file in the text editor of your choice and add these contents:
# Tells the frontend app where to send requests to for the backend API (no slash at the end)
NEXT_PUBLIC_API=http://localhost:5000
Sometimes these files contain secret information (not the case for our project), like database passwords, so we do not want these files to be committed to the repository. If you look at the file .gitignore and search for .env, you will find that we have told git to ignore both of these files.
- If you run the frontend app on a different URL or port number, update your .env file to make sure it is in the ALLOW list for the backend API.
- If you run the backend API on a different URL or port number, update your app/.env.local file to make sure it is set as the NEXT_PUBLIC_API so the frontend app can reach it.
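The two rules above amount to a consistency check between the two files. The sketch below expresses that check in Python. The function names are hypothetical, and the exact-match comparison of origins is an assumption about how the backend treats the ALLOW list, so treat this as a debugging aid rather than a description of the server.

```python
from urllib.parse import urlsplit


def origin_is_allowed(frontend_origin, allow_value):
    """Return True if frontend_origin appears in the comma-separated ALLOW value."""
    allowed = [origin.strip() for origin in allow_value.split(",")]
    return frontend_origin in allowed


def api_url_uses_port(next_public_api, port_value):
    """Return True if the NEXT_PUBLIC_API URL points at the backend PORT."""
    return urlsplit(next_public_api).port == int(port_value)


# Values from the default configuration in this guide.
print(origin_is_allowed("http://localhost:8001", "http://localhost:8001"))
print(api_url_uses_port("http://localhost:5000", "5000"))
```

If either check prints False after you change a URL or port, the other file probably still holds the old value.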
Now we can set up the Python parts of our project. If Python and its package manager pip are not on your system, visit the website to install them.
Use Python version 3.6 or higher.
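If you are unsure which version you have, a short script along these lines can check it. This is only a sketch: the `meets_minimum` helper is a hypothetical name, and running `python3 --version` in your terminal gives the same information.

```python
import sys

MINIMUM = (3, 6)  # minimum version stated in this guide


def meets_minimum(version_info, minimum=MINIMUM):
    """Compare the (major, minor) part of a version tuple to the minimum."""
    return tuple(version_info[:2]) >= minimum


if meets_minimum(sys.version_info):
    print("Python version OK:", sys.version.split()[0])
else:
    print("Please upgrade Python to 3.6 or higher.")
```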
Virtual environments allow us to manage the dependencies of our project, without affecting the rest of our computer.
If the virtual environment package for Python is not installed, you can install it with pip:
pip3 install virtualenv
Create a virtual environment, which will be stored in a folder called .venv/.
virtualenv .venv
If the above command does not work, try this:
python3 -m virtualenv .venv
Activate the virtual environment. You will do this at the start of every session.
# For Mac/Linux users:
source .venv/bin/activate
# For Windows users:
.venv\Scripts\activate.bat
If your virtual environment is active, you will see (.venv) at the left of your terminal line.
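If your terminal prompt is customized and does not show the (.venv) marker, you can also ask Python itself. The sketch below relies on standard attributes of the `sys` module; the function name is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases point sys.base_prefix at the system
    # Python; older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


print("Virtual environment active:", in_virtual_environment())
```

Run it with the same python3 command you use for the project, since the answer describes the interpreter that executes the script.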
At the end of your session, you can deactivate the virtual environment by typing:
deactivate
Install the project Python dependencies.
pip3 install -r requirements.txt
Change directories to the app/ folder.
cd app
We will use Yarn, a package manager for Node.js. If Node.js is not on your system, visit the website to install it first.
Use Node version 14 or higher.
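To confirm the Node.js requirement from a script, you could parse the output of `node --version`, which prints text such as v14.17.0. The helper names below are hypothetical, and the sketch assumes the `node` executable is on your PATH.

```python
import shutil
import subprocess


def parse_major(version_text):
    """Return the major version from text like 'v14.17.0'."""
    return int(version_text.strip().lstrip("v").split(".")[0])


def node_major_version():
    """Return the installed Node.js major version, or None if node is not found."""
    node = shutil.which("node")
    if node is None:
        return None
    result = subprocess.run(
        [node, "--version"], stdout=subprocess.PIPE, universal_newlines=True
    )
    return parse_major(result.stdout)


if __name__ == "__main__":
    major = node_major_version()
    if major is None:
        print("Node.js was not found on your PATH.")
    elif major >= 14:
        print("Node.js version OK:", major)
    else:
        print("Please upgrade Node.js to version 14 or higher.")
```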
Then, if Yarn is not on your system, visit the website to install it.
Now you can install the project JavaScript dependencies.
yarn install
After the installation succeeds, go back up to the project root directory.
cd ..
Our offline pipeline pulls in data from outside sources, processes it, and creates our database. The pipeline is run by a tool called Make.
For more information about the offline pipeline directory structure, visit its README.
Make should already be installed on Linux or Mac systems.
For Windows users:
- Install Cygwin and add it to your PATH in your system environment variables. This will allow you to run our Makefile, which contains commands that normally do not work on Windows.
- Install Chocolatey. Then use it to install Make by running choco install make.
Make sure that your virtual environment is activated and that you are in the pipeline/ directory.
source .venv/bin/activate
cd pipeline
Run this command to unpack the compressed database into a file that the backend can read.
make uncompressed
If you would like to run the entire pipeline (except for the files that take the longest to make), then follow these steps. Otherwise, skip to the next heading.
First, download the latest archive from this Drive link. This contains the files that take the longest to make.
https://drive.google.com/file/d/1UG0G8PemaT1YU_BKaOfN-PIq191KvceV/view?usp=sharing
Download the file and move it to pipeline/archive.tgz. Then unpack its contents using this command:
make unpack-archive
Now you can run the rest of the pipeline. This command will clear all files except the ones from the archive and then rerun the entire pipeline:
make clean-except && make
Now that you have unpacked the database, you can run basic tests to check its contents.
First, go back up to the project root directory. Then, run our unit testing command pytest. This will run all unit tests for the project.
cd ..
pytest
If you get a message that all tests are passing, you are good to go! Otherwise, ask for help.
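If you want to look inside the database yourself, Python's built-in sqlite3 module can list its tables. The sketch below is not part of the project's test suite. The `list_tables` helper and the `example_rides` table are hypothetical, and the demonstration uses a throwaway in-memory database so that it runs anywhere.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in a SQLite database, sorted by name."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Demonstration on an in-memory database with a made-up table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE example_rides (id INTEGER PRIMARY KEY, n_trips INTEGER)")
print(list_tables(demo))
demo.close()
```

To inspect the real database, connect to pipeline/database.db from the project root instead of ":memory:". Be aware that sqlite3.connect creates an empty file if the path does not exist, so an empty list of tables usually means the path is wrong or the database has not been unpacked yet.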
Our backend API serves requests to the frontend and computes metrics from the database produced by the offline pipeline.
For more information about the backend API directory structure, visit its README.
You can run the API from the project root directory. Make sure your virtual environment is activated.
source .venv/bin/activate
Use this command to run the server in development mode. If you make changes, it will automatically reload.
./api/dev.sh
Now you can open http://localhost:5000 in your browser. You should see a welcome message telling you the API is active.
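You can make the same check from Python instead of a browser. The sketch below uses only the standard library and reports the HTTP status code of a URL. The `fetch_status` name is hypothetical, and the choice to bypass any system proxy is an assumption that suits a server running on your own machine.

```python
import urllib.error
import urllib.request

# An empty ProxyHandler disables proxy detection, which is what we want
# for a server on localhost.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_status(url, timeout=5):
    """Return the HTTP status code for url, or None if nothing answers."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: no server is reachable.
        return None
```

With the development server running, calling fetch_status("http://localhost:5000") should return 200. A result of None suggests that the server is not running or is listening on a different port.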
If you run the backend API on a different URL or port number, update your app/.env.local file to make sure it is set as the NEXT_PUBLIC_API so the frontend app can reach it.
If this command does not succeed, check:
- That your virtual environment is activated (Step 3)
- That the Python dependencies are installed (Step 4)
- That your .env file is present in the project root directory and has the correct values with no extra whitespace (Step 2)
Keep the server running and open a new terminal to run the frontend app for step 8.
Later, you can stop the server by pressing Ctrl + C in its terminal (on Mac as well, where Cmd + C only copies text). You may have to press twice.
Our frontend app helps users visualize data and sends requests to the backend API.
For more information about the frontend app directory structure, visit its README.
Use this command to start the app in development mode. If you make changes, it will automatically reload.
./app/dev.sh
Now you can open http://localhost:8001/transithealth in your browser. You should get the app home page.
If you run the frontend app on a different URL or port number, update your .env file to make sure it is in the ALLOW list for the backend API.
If this command does not succeed, check:
- That the Node.js/Yarn dependencies are installed (Step 5)
- That your .env.local file is present in the app/ directory and has the correct values with no extra whitespace (Step 2)
- That you put /transithealth at the end of your URL
You can stop the frontend by pressing Ctrl + C in its terminal (on Mac as well, where Cmd + C only copies text). You may have to press twice.
You can close this terminal and return to the terminal that was running the backend API server. Shut down that server and proceed to step 9 to run the Jupyter notebook server.
Jupyter notebooks allow us to write Python (and other languages, like SQL!) and interact with the output. They are a great tool for data exploration, debugging, and prototyping.
Create a new folder with your username. This is where your notebooks will live. Change the command below to replace YOUR_HAWK_USERNAME with your Hawk username. By convention, folder names are lowercase.
mkdir notebooks/YOUR_HAWK_USERNAME
Copy the example notebook into your folder. Change the command below to replace YOUR_HAWK_USERNAME with your Hawk username.
cp "notebooks/example/Example Notebook.ipynb" "notebooks/YOUR_HAWK_USERNAME/My Example Notebook.ipynb"
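If the mkdir and cp commands are not available in your terminal (for example, on Windows without Cygwin), the same two steps can be done with Python's standard library. This is a sketch of an equivalent, not a script that ships with the project, and the `create_notebook_folder` name is hypothetical.

```python
import shutil
from pathlib import Path


def create_notebook_folder(project_root, username):
    """Create notebooks/<username>/ and copy the example notebook into it."""
    notebooks = Path(project_root) / "notebooks"
    source = notebooks / "example" / "Example Notebook.ipynb"
    folder = notebooks / username.lower()  # folder names are lowercase
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / "My Example Notebook.ipynb"
    if not target.exists():  # never overwrite a notebook you already edited
        shutil.copyfile(str(source), str(target))
    return target


# Example call from the project root, replacing YOUR_HAWK_USERNAME first:
# create_notebook_folder(".", "YOUR_HAWK_USERNAME")
```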
You can start the Jupyter notebook server from the project root directory.
./notebooks/start.sh
This will open a page in your browser with all the folders in our project. Click on the notebooks/ folder and then open your folder. Click on your example notebook to launch it.
Using this command also ensures that the notebook server launches with our Python virtual environment, so you can use any module we have installed.
Run through the commands in the sample notebook. Press Shift + Enter to execute a cell. You can also do this step later.
Press the save icon in the top left corner of your notebook.
You can stop the notebook server by pressing Ctrl + C in its terminal (on Mac as well, where Cmd + C only copies text). You may have to press twice.
Before starting this step, contact your mentor or Vinesh to get write access to the GitHub repository, which will allow you to push your branch and work on the project. Let them know your GitHub username. You can find this on your GitHub profile or in the URL of your GitHub account page.
Now that you have made some changes, you will commit and push them to the main project repository.
First, we will create a new branch. Putting your change on a different branch allows you to work while others make changes to the repository and lets you submit your work for review, before merging it into the main repository.
Replace YOUR_BRANCH_NAME with first_branch_YOUR_HAWK_USERNAME, where YOUR_HAWK_USERNAME is your Hawk username. By convention, branch names are lowercase.
git checkout -b YOUR_BRANCH_NAME
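The naming rule above is simple enough to express as a small function, which can help if you want to double-check the name before creating the branch. The helper and the username jdoe1 are hypothetical examples.

```python
def first_branch_name(hawk_username):
    """Build the branch name first_branch_<username> in lowercase."""
    username = hawk_username.strip().lower()
    if not username or any(character.isspace() for character in username):
        raise ValueError("username must be non-empty and contain no whitespace")
    return "first_branch_" + username


# Prints the full command for a made-up username.
print("git checkout -b " + first_branch_name("JDoe1"))
```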
Run this command to check what files have been changed. New files show under "Untracked files" and modified files show as "not staged for commit." The only change should be the notebooks folder you added and your new example notebook. If you see other changes, ask for help.
git status
Use this command to add all of the changes to your commit. When you check the status again, they should show as "to be committed."
git add -A
git status
Create a commit and add a commit message.
git commit -m "Create my notebooks folder."
Now try to push the commit to the repository.
git push
The first time you push a commit from a new branch, it will fail and tell you that the branch is only on your local machine, not on the remote repository. Git will show you a command in the failure output. Run that command to push your branch to the repository. From then on, you will be able to push commits from this branch. The command will be like this:
git push --set-upstream origin YOUR_BRANCH_NAME
After pushing, open a pull request so we can merge your new notebook folder into the main branch.
- Go to the project repository on GitHub
- Click on the Pull Requests tab
- Click New pull request in the top-right corner
- Select your branch name from the dropdown labeled compare
- Leave main as the value for the dropdown labeled base
- Click Create pull request in the top-right corner
- Fill out the title and description of your pull request
- Click Create pull request in the bottom-right below the description
- Ask your mentor to review and merge the pull request
You made it through all the setup steps! Now you are ready to start contributing to the project.