# Data Science Pair Interview

## Requirements

- Python 3.11 (you can change this in the `pyproject.toml` file)
- [Poetry](https://python-poetry.org/)
- [Docker](https://www.docker.com/)
- [Ngrok](https://ngrok.com/)
- [GNU Make](https://www.gnu.org/software/make/)

I recommend using [asdf](https://asdf-vm.com/) to manage your Python versions and Poetry installation.

## Usage

### Get Set Up

Create your local environment with:

```shell
make install
```

Ensure you have created and validated your account with [Ngrok](https://ngrok.com/).

### Make Your Dataset

`src/interview/data.py` contains an example function that builds the classic [Iris dataset](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html). You can extend it as you see fit to create one or more custom datasets relevant to your business.

A Poetry script runs the `make_data` function and builds the dataset. Run it with:

```shell
make data
```

### Run the Notebook with Docker

Build the container and run the notebook with:

```shell
make build run
```

This copies your data and notebook to the container, installs any packages you specified in your `pyproject.toml` file, then runs the notebook in the container. Jupyter will be available on port 8888. Take note of the authentication token.

### Set Up a Tunnel

Start Ngrok with `ngrok http 8888`. This gives you the URL to share with the candidate, as illustrated below.

![Ngrok](ngrok.png)

By following this link, the candidate can log in to Jupyter Lab running in the container on your machine, using the Jupyter authentication token you took note of above.
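
Before starting the tunnel, it can help to confirm that Jupyter is actually accepting connections on port 8888. The snippet below is a minimal sketch using only the Python standard library. The helper name `is_port_open` and its default host are assumptions made for illustration and are not part of this repository.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8888, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it only checks that something is listening on the port,
    not that the listener is Jupyter.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Port 8888 is accepting connections; you can start the tunnel.")
    else:
        print("Nothing is listening on port 8888; check that the container is running.")
```

If the check fails, the container has probably not finished starting, or the port was not published to the host.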