Skip to content

Assessment for potential Cultivo data science candidates

License

Notifications You must be signed in to change notification settings

farukht/data-science-assessment

 
 

Repository files navigation

Data Engineering Take-Home Assignment: Nature Conservation & Geospatial Data

Context

Assume you have been hired as a Data Engineer for an organization focused on nature conservation. The organization is working on a project to monitor and protect natural habitats using satellite data, wildlife sensor data, and geospatial information. Your task is to design and implement a data pipeline that ingests, processes, and analyzes this data to help identify areas needing immediate conservation attention as well as build a model that provides helpful insights related our organization's interests.

Objective

Your goal in this assessment is to showcase your curiousity and creativity to design rigorous models and derive interesting insights.

You'll be given two tasks.

The first is a design task, in which we expect you to diagram and describe how you'd set up a process to injest this data from a live streamed source, assuming you are also paying montoring services to supply this data from scratch. Think about how you might transform and store the data efficiently for querying and analysis and feed it into your model.

The second task will require you devise interesting questions from preliminary explorations of a subset of migration data, found alongside this notebook, and construct a rigorous model to answer them. Please demonstrate all of your process using this notebook, and most importantly your outputs.

Setup

Feel free to set up the jupyter notebook in this container using conda, or your own kernal / virtual environment. To make it easier, you can set up the notebook using this docker with the potentialy libraries you might need.

To start using a prepared Docker image,

  • 1 navigate to this shared folder in your terminal, and then load up docker and run the docker file to pull in needed libraries
docker build -t geospatial-notebook .
docker run -p 8888:8888 -v $(pwd):/home/jovyan/work geospatial-notebook

When the container runs, it will display a URL with a token (something like http://127.0.0.1:8888/?token=...). It will probably be something like http://127.0.0.1:8888/tree You can copy this URL into your browser, and you'll open to a Jupyter lab. Your existing notebook will be available inside the container under the work directory.

Anytime you want to work again, just run the following command to start the Docker container and access your notebooks:

docker run -p 8888:8888 -v $(pwd):/home/jovyan/work geospatial-notebook

About

Assessment for potential Cultivo data science candidates

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 100.0%