# Data Science Pair Interview

## Requirements

- Python 3.11 (you can change this in the `pyproject.toml` file)
- [Docker](https://www.docker.com/)
- [Ngrok](https://ngrok.com/)
- [GNU Make](https://www.gnu.org/software/make/)

## Usage

### Get Set Up

Create your local environment with

```shell
poetry install
```

Ensure you have created and validated your account with [Ngrok](https://ngrok.com/).

### Make your Dataset

`src/interview/data.py` contains an example function that builds the classic [Iris dataset](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html). You can extend it as you see fit to create one or more custom datasets relevant to your business. A Poetry script runs the `make_data` function and builds the dataset; run it with:

```shell
make data
```
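As a rough illustration of extending `src/interview/data.py` with a custom dataset, the sketch below writes a small synthetic table to CSV using only the standard library. The function name `make_custom_data`, the column names, and the output path `data/customers.csv` are hypothetical and not part of this repository; the real `make_data` function may be structured differently.

```python
import csv
import random
from pathlib import Path


def make_custom_data(path="data/customers.csv", n_rows=100, seed=42):
    """Write a small synthetic dataset to CSV and return its path.

    Hypothetical example: the columns and values are invented for illustration.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same data
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["customer_id", "monthly_spend", "churned"])
        for i in range(n_rows):
            spend = round(rng.uniform(5.0, 500.0), 2)
            # Made-up rule: low spenders are more likely to churn
            churned = int(rng.random() < (0.6 if spend < 50 else 0.1))
            writer.writerow([i, spend, churned])
    return out
```

Fixing the seed keeps the dataset identical across runs, so every candidate works from the same data.
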
### Run the Notebook with Docker

Build the container and run the notebook with

```shell
make build run
```

This copies your data and notebook to the container, installs any packages you specified in your `pyproject.toml` file, then runs the notebook in the container. Jupyter will be available on port 8888. Take note of the authentication token shown in the Jupyter startup output.
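Jupyter typically prints its access URL at startup, with the token included as a `token` query parameter. The helper below is a minimal sketch for pulling the token out of that output; `extract_token` and the sample log line are illustrative, and the exact log format may vary between Jupyter versions.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_token(log_text):
    """Return the first Jupyter token found in log_text, or None."""
    for url in re.findall(r"https?://\S+", log_text):
        token = parse_qs(urlparse(url).query).get("token")
        if token:
            return token[0]
    return None


sample = "    http://127.0.0.1:8888/lab?token=0123abcd"  # illustrative log line
print(extract_token(sample))
```
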
### Set Up a Tunnel

Start Ngrok with `ngrok http 8888`. Ngrok prints a public forwarding URL, which you share with the candidate, as illustrated below.

![Ngrok](ngrok.png)

By following this link, the candidate can log in to Jupyter Lab running in the container on your machine, using the Jupyter authentication token you noted above.
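
Jupyter also accepts the token as a `token` query parameter in the URL, so you can optionally send the candidate a single link that logs them in directly. The sketch below combines the Ngrok forwarding URL with the token; `build_share_url` and the example values are hypothetical, and the `/lab` path assumes the container serves Jupyter Lab at its default location.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_share_url(public_url, token, path="/lab"):
    """Combine a tunnel URL and a Jupyter token into one login link."""
    parts = urlsplit(public_url)
    query = urlencode({"token": token})
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


# Placeholder values only; use the URL Ngrok prints and your own token.
print(build_share_url("https://example.ngrok-free.app", "0123abcd"))
```

Anyone holding this link has full access to the notebook server, so share it only with the candidate.
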