Manifest Deployment Guide

Deploying Kubeflow with Amazon Cognito, RDS and S3

Note: Helm installation option is still in preview.

This guide describes how to deploy Kubeflow on Amazon EKS using Cognito as your identity provider, RDS as your database, and S3 as your artifact storage.

Prerequisites

Refer to the general prerequisites guide and the RDS and S3 setup guide in order to:

  1. Install the CLI tools
  2. Clone the repositories
  3. Create an EKS cluster
  4. Create an S3 Bucket
  5. Create an RDS Instance
  6. Configure AWS Secrets for RDS and S3
  7. Install AWS Secrets and Kubernetes Secrets Store CSI driver
  8. Configure an RDS endpoint and an S3 bucket name for Kubeflow Pipelines
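
Before continuing, it can help to confirm that the command-line tools used later in this guide are on your PATH. The following is a minimal sketch and is not part of the Kubeflow on AWS repository: the function name missing_tools is hypothetical, and the example tool list covers only the commands that appear in this guide (make, kubectl, pip, python). The prerequisites guide may require additional tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of each tool that is NOT found on PATH.
# Returns 0 when every listed tool is present, 1 otherwise.
missing_tools() {
  local status=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example call (tool list assumed from the commands used in this guide):
#   missing_tools make kubectl pip python
```

An empty result with a zero exit status means that all listed tools were found.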

Configure Custom Domain and Cognito

  1. Follow Section 2.0 of the Cognito setup guide in order to:
    1. Create a custom domain
    2. Create TLS certificates for the domain
    3. Create a Cognito user pool
    4. Configure Ingress
  2. Deploy Kubeflow.
    1. Install Kubeflow using the command that matches your installation option:
      Kustomize:
        make deploy-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=cognito-rds-s3
      Helm:
        make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=cognito-rds-s3
  3. Follow the rest of the Cognito guide from section 5.0 (Updating the domain with ALB address) in order to:
    1. Add/Update the DNS records in a custom domain with the ALB address
    2. Create a user in a Cognito user pool
    3. Create a profile for the user from the user pool
    4. Connect to the central dashboard
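
The deploy command in step 2 varies only in the value of INSTALLATION_OPTION. As an illustration, a small wrapper can reject unsupported values before make runs. This is a sketch under assumptions: the function kubeflow_make_cmd is hypothetical, it prints the command rather than running it, and it accepts only the two installation options shown in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the make command for a given target and
# installation option. Only "kustomize" and "helm" are accepted, since
# those are the two options this guide shows.
kubeflow_make_cmd() {
  local target="$1"
  local option="$2"
  case "$option" in
    kustomize|helm) ;;
    *)
      echo "unsupported INSTALLATION_OPTION: $option" >&2
      return 1
      ;;
  esac
  printf 'make %s INSTALLATION_OPTION=%s DEPLOYMENT_OPTION=cognito-rds-s3\n' \
    "$target" "$option"
}

# Example call:
#   kubeflow_make_cmd deploy-kubeflow kustomize
```

The same function works for the delete-kubeflow target used when uninstalling. Printing the command first lets you review it before running it from the root of the repository.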

Uninstall Kubeflow

Note: Delete all the resources you might have created in your profile namespaces before running these steps.

  1. Run the following command to delete the profiles, ingress, and corresponding ingress-managed load balancer:

     kubectl delete profiles --all
    
  2. Delete the Kubeflow deployment using the command that matches your installation option.

     Note: Make sure you have the correct INSTALLATION_OPTION and DEPLOYMENT_OPTION values set for your chosen installation.

     Kustomize:
       make delete-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=cognito-rds-s3
     Helm:
       make delete-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=cognito-rds-s3

  3. To delete the remaining resources (subdomain, certificates, etc.), run the following commands from the root of your repository:

    Note: Make sure that the configuration file created by the script exists at tests/e2e/utils/cognito_bootstrap/config.yaml. If you did not use the script, create a YAML file at that path and enter the name, ARN, or ID of each resource that you created, using the following sample as a reference:

    • Sample config file:
    cognitoUserpool:
        ARN: arn:aws:cognito-idp:us-west-2:123456789012:userpool/us-west-2_yasI9dbxF
        appClientId: 5jmk7ljl2a74jk3n0a0fvj3l31
        domainAliasTarget: xxxxxxxxxx.cloudfront.net
        domain: auth.platform.example.com
        name: kubeflow-users
    kubeflow:
        alb:
            serviceAccount:
                name: alb-ingress-controller
                namespace: kubeflow
                policyArn: arn:aws:iam::123456789012:policy/alb_ingress_controller_kube-eks-clusterxxx
    cluster:  
        name: kube-eks-cluster
        region: us-west-2
    route53:
        rootDomain:
            certARN: arn:aws:acm:us-east-1:123456789012:certificate/9d8c4bbc-3b02-4a48-8c7d-d91441c6e5af
            hostedZoneId: XXXXX
            name: example.com
        subDomain:
            us-west-2-certARN: arn:aws:acm:us-west-2:123456789012:certificate/d1d7b641c238-4bc7-f525-b7bf-373cc726
            hostedZoneId: XXXXX
            name: platform.example.com
            us-east-1-certARN: arn:aws:acm:us-east-1:123456789012:certificate/373cc726-f525-4bc7-b7bf-d1d7b641c238
    
    • Run the following command to install the script dependencies and delete the resources:

    Note: You can rerun the script in case some resources fail to delete.

    cd tests/e2e
    pip install -r requirements.txt
    PYTHONPATH=.. python utils/cognito_bootstrap/cognito_resources_cleanup.py
    cd -
    
  4. To delete the remaining RDS and S3 resources, make sure that you have the configuration file created by the script in tests/e2e/utils/rds-s3/metadata.yaml, then run the following command from the tests/e2e directory:

    PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-cleanup.py
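
Both cleanup scripts depend on a configuration file being present, so a quick check before running them can save a failed run. The following is a minimal sketch, assuming the two file paths named in the steps above: the function check_cleanup_configs is hypothetical and is not part of the repository, and it only checks that the files exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: verify that the configuration files the
# cleanup scripts read are present under the given repository root
# (defaults to the current directory).
# Prints each missing path and returns 1 if any file is missing.
check_cleanup_configs() {
  local repo_root="${1:-.}"
  local status=0
  local rel
  for rel in \
    tests/e2e/utils/cognito_bootstrap/config.yaml \
    tests/e2e/utils/rds-s3/metadata.yaml; do
    if [ ! -f "$repo_root/$rel" ]; then
      echo "missing: $rel"
      status=1
    fi
  done
  return "$status"
}

# Example call from the root of the repository:
#   check_cleanup_configs .
```
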
Last modified September 23, 2022: Change uninstall order (2ac18a0)