# Data Science Pair Interview

## Requirements

- Python 3.11 (you can change this in the `pyproject.toml` file)
- [Poetry](https://python-poetry.org/)
- [Docker](https://www.docker.com/)
- [Ngrok](https://ngrok.com/)
- [GNU Make](https://www.gnu.org/software/make/)

I recommend using [asdf](https://asdf-vm.com/) to manage your Python versions and Poetry installation.

## Usage

### Get Set Up

Create your local environment with:

```shell
make install
```

Ensure you have created and validated your account with [Ngrok](https://ngrok.com/).

### Make Your Dataset

`src/interview/data.py` contains an example function to build the classic [Iris dataset](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html). You can extend this as you see fit to create one or more custom datasets relevant to your business. There is a Poetry script to run the `make_data` function and build the dataset, which you can run with:

```shell
make data
```

### Run the Notebook with Docker

Build the container and run the notebook with:

```shell
make build run
```

This will copy your data and notebook to the container, install any packages you specified in your `pyproject.toml` file, then run the notebook in the container. Jupyter will be available on port 8888. Take note of the authentication token.
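
If you would rather not copy the token by hand, a few lines of Python can pull it out of the container's startup output. This is only a sketch: it assumes the output contains a URL ending in `?token=<value>`, which is how Jupyter normally prints it. The `extract_token` helper and the sample log line are hypothetical and not part of this repository.

```python
import re
from typing import Optional

# Hypothetical helper, not part of this repository.
# Assumes the Jupyter startup output contains a URL with "?token=<value>".
TOKEN_PATTERN = re.compile(r"[?&]token=([0-9A-Za-z]+)")


def extract_token(log_text: str) -> Optional[str]:
    """Return the first Jupyter token found in the text, or None if absent."""
    match = TOKEN_PATTERN.search(log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up log line, for illustration only.
    sample = "http://127.0.0.1:8888/lab?token=0123abcd4567ef"
    print(extract_token(sample))
```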

### Set Up a Tunnel

Start Ngrok with `ngrok http 8888`. This will give you the URL you will share with the candidate, as illustrated below.

![Ngrok](ngrok.png)

By following this link, the candidate can log in to Jupyter Lab running in the container on your machine, using the Jupyter authentication token you noted earlier.
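
To save the candidate a step, the token can also be appended to the Ngrok URL as a `token` query parameter, which Jupyter accepts for login. The following is a minimal sketch rather than part of this repository: `build_share_link` is a hypothetical helper, and the forwarding URL and token shown are placeholders for the values Ngrok and Jupyter give you.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_share_link(forwarding_url: str, token: str) -> str:
    """Combine the Ngrok forwarding URL and Jupyter token into one link.

    Hypothetical helper: assumes Jupyter Lab is served under "/lab".
    """
    parts = urlsplit(forwarding_url)
    # Drop any trailing slash before adding the Jupyter Lab path.
    path = parts.path.rstrip("/") + "/lab"
    query = urlencode({"token": token})
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(build_share_link("https://example.ngrok-free.app", "abc123"))
```

Anyone who has this link can open the notebook, so share it only with the candidate.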