Setting Up a Development Environment
Prerequisites
- Python 3.8+
- Poetry 1.1.7 or higher
- Clone the repo to your local machine and initialize the submodules:
git clone https://github.com/SAME-Project/same-project.git
cd same-project
git submodule update --init --recursive
- Download and install Poetry, which manages dependencies and virtual environments for the SAME project, then use it to install the project's Python dependencies:
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/install-poetry.py | python3 -
poetry install
To install the AML dependencies, which are now optional, run:
poetry install --extras azureml
Using the repo
The SAME Python project assumes you are working inside a virtual environment managed by Poetry. Before running any commands, start the virtual environment:
poetry shell
NOTE: From this point forward, all commands must be run inside that virtual environment. If you see an error like "zsh: command not found", it could be because you are not inside the venv. You can check this by running:
which python3
This should print a path like:
.../pypoetry/virtualenvs/same-project-88mixeKa-py3.8/bin/python3
If it instead reports something like /usr/bin/python or /usr/local/bin/python, you are using the system Python, and things will not work as expected.
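The same check can be made from inside Python instead of by reading the interpreter path. The following is a minimal sketch and not part of the SAME project: the helper name in_virtualenv is hypothetical, and it assumes a venv-style environment such as the ones Poetry creates, where sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter appears to run inside a virtual environment.

    Both arguments default to the running interpreter's values; they are
    parameters only so the comparison can be exercised without a real venv.
    """
    if prefix is None:
        prefix = sys.prefix
    if base_prefix is None:
        # Assumption: Python 3.3+, where sys.base_prefix always exists.
        base_prefix = sys.base_prefix
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("virtual environment active:", sys.prefix)
    else:
        print("system Python in use; run 'poetry shell' first")
```

Running this with the interpreter you intend to use for the SAME commands reports whether that interpreter belongs to a virtual environment; it does not confirm that the environment is the one Poetry created for this project.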
How to execute against a notebook from source code
From the root of the project, execute:
same <cli-arguments>
Running tests
To run all the tests against the CLI and SDK:
pytest
To run a subset of the tests in a single file:
pytest test/cli/test_<file>.py -k "test_<name>"
How to set up private test environments
Local Kubeflow cluster on Minikube
If you wish, you can set up a local Kubeflow cluster to run the CLI tests against:
- Start a minikube cluster in the devcontainer:
Note: Kubeflow currently defines its Custom Resource Definitions (CRDs) under apiextensions.k8s.io/v1beta1, which was removed in Kubernetes v1.22, so minikube must start the cluster with a version below 1.22. See kubeflow/kfctl issue #500.
minikube start --kubernetes-version=v1.21.5
Starting minikube will also change the default kubeconfig context to the minikube cluster. You can check this with:
kubectl config get-contexts
- Deploy Kubeflow to the minikube cluster:
export PIPELINE_VERSION=1.7.0
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"
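The version constraint noted in the first step (a Kubernetes version below 1.22) can be checked before starting the cluster. This is a minimal sketch and not part of the SAME project or minikube: the function names are hypothetical, and it only handles plain version strings such as v1.21.5, without pre-release suffixes.

```python
import re


def parse_kubernetes_version(version):
    """Split a version string such as 'v1.21.5' into (major, minor, patch)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    major, minor, patch = match.groups()
    # A missing patch component (e.g. '1.21') is treated as 0.
    return int(major), int(minor), int(patch or 0)


def is_below_1_22(version):
    """Return True when the version is older than Kubernetes v1.22."""
    # Only major and minor matter for the constraint described above.
    return parse_kubernetes_version(version)[:2] < (1, 22)


if __name__ == "__main__":
    for candidate in ("v1.21.5", "v1.22.0"):
        print(candidate, is_below_1_22(candidate))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "1.9" would sort after "1.21".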
Kubeflow cluster on Azure Kubernetes Service (AKS)
From any Azure subscription where you are at least a Contributor, you can create and provision a new AKS cluster with Kubeflow:
- Create a new AKS cluster either using the Azure CLI or Azure Portal.
The linked instructions also update your kubeconfig to use the new cluster as the current context when you run az aks get-credentials, but you can also switch to it manually with:
kubectl config use-context <context name>
- Deploy Kubeflow to the cluster.
Note: The linked document references a non-existent v1.3.0 release; use the v1.2.0 release instead. See kubeflow/kfctl issue #495.
Azure Machine Learning (AML) workspace and compute
- Create a new Service Principal for running tests against your private AML instance.
As mentioned in the instructions, make sure to take note of the output of the command, as you will need the clientId, clientSecret, and tenantId values to configure the .env.sh file to run the AML tests.
- Create a new Azure Machine Learning Workspace.
You will need the --resource-group and --workspace-name values you specified during workspace creation to configure the .env.sh file to run the AML tests. You will also need the ID of the subscription in which you created the AML workspace. You can check this by running:
az account show --query id
- Create an AML Compute cluster or AML Compute Instance.
You will need the --name that you specified during compute cluster/instance creation to configure the .env.sh file to run the AML tests.