This repository contains some of the principal resources you will probably want to use when starting to work on the IaC module.
```
.
└── terraforms-templates
    ├── aws
    │   ├── assests
    │   └── modules
    │       ├── ec2
    │       ├── eks
    │       ├── networking
    │       ├── rds
    │       └── s3
    ├── gcp
    │   ├── modules
    │   │   ├── cloudsql
    │   │   ├── compute-engine
    │   │   ├── gke
    │   │   └── vpc
    │   └── nfs
    └── kubernetes
```
The current architecture was implemented following the Provision an EKS Cluster guide from HashiCorp Learn.
- AWS account configured. For this example we are using the default profile and the us-east-2 region
- kubectl CLI
- AWS CLI
- Cluster version: 1.20
- Terraform >= 0.13
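Before running Terraform, it can help to confirm the tooling and credentials are in place. A minimal sanity check, assuming the default profile and the us-east-2 region:

```sh
aws sts get-caller-identity   # confirms the account and credentials in use
aws configure get region      # should print us-east-2 for this example
kubectl version --client      # confirms kubectl is installed
terraform version             # should report Terraform >= 0.13
```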
Go to the `aws/` directory and execute the following Terraform commands:

```sh
terraform init
terraform apply --var-file=terraform.tfvars
```
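Optionally, you can preview the changes before applying them:

```sh
terraform plan --var-file=terraform.tfvars
```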
Once the cluster is created (it will take a few minutes), set the kubectl context:

```sh
aws eks --region $(terraform output -raw region) update-kubeconfig --name $(terraform output -raw cluster_name)
```
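To confirm the context was set correctly, list the worker nodes; they should eventually report a `Ready` status:

```sh
kubectl get nodes
```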
The current architecture was implemented following the [Provision a GKE Cluster guide](https://learn.hashicorp.com/tutorials/terraform/gke?in=terraform/kubernetes).
- GCP account configured
- kubectl CLI
- gcloud CLI
- Cluster version: 1.20
- Terraform >= 0.13
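As with AWS, a quick sanity check of the GCP tooling, assuming you have already authenticated and set a default project:

```sh
gcloud auth list                 # confirms an active account
gcloud config get-value project  # the project Terraform will provision into
terraform version                # should report Terraform >= 0.13
```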
Go to the `gcp/` directory and execute the following Terraform commands:

```sh
terraform init
terraform apply --var-file=terraform.tfvars
```
Once the cluster is created (it will take a few minutes), set the kubectl context:

```sh
gcloud container clusters get-credentials $(terraform output -raw kubernetes_cluster_name) --region $(terraform output -raw location)
```
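You can verify that kubectl now points at the GKE cluster:

```sh
kubectl config current-context
kubectl get nodes
```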
To destroy the EKS cluster and all of its services, run:

```sh
terraform destroy --var-file=terraform.tfvars
```
**Important notes:** Remember that you are working with cloud services and creating the infrastructure that ties them together. Due to different conditions, including internet disconnections and server problems, some of your services may remain running in your cloud provider. We encourage you to double-check every time you run this command and make sure that all the services were shut down properly.
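For example, one quick way to double-check (shown for the AWS case; adjust for your provider):

```sh
terraform state list                      # should print nothing after a clean destroy
aws eks list-clusters --region us-east-2  # the cluster should no longer be listed
```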
- Helm 3
- Kubernetes cluster version: 1.20
To work with Airflow we will use an NFS service, which we will create in the Kubernetes cluster.
First, create a namespace for the NFS server by executing the following from your respective cloud provider path:

```sh
kubectl create namespace nfs
```
Then create the NFS server using the YAML file:

```sh
kubectl -n nfs apply -f nfs/nfs-server.yaml
```
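Before continuing, you may want to confirm the NFS server pod and service are up:

```sh
kubectl -n nfs get pods,svc
```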
Now export the NFS server's cluster IP:

```sh
export NFS_SERVER=$(kubectl -n nfs get service/nfs-server -o jsonpath="{.spec.clusterIP}")
```
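You can verify the variable was set:

```sh
echo $NFS_SERVER   # should print an internal cluster IP
```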
Finally, to install Airflow, go to the `kubernetes/` directory.
Create a namespace for the storage deployment:

```sh
kubectl create namespace storage
```
Add the chart repository for the nfs-provisioner:

```sh
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
```
Install the nfs-subdir-external-provisioner:

```sh
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --namespace storage \
  --set nfs.server=$NFS_SERVER \
  --set nfs.path=/
```
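If the installation succeeded, the provisioner pod should be running and a new StorageClass should be available (the chart names it `nfs-client` by default):

```sh
kubectl -n storage get pods
kubectl get storageclass
```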
Here we are using the official Airflow Helm chart as an example, but any other Airflow distribution can be installed as well.
Create the namespace:

```sh
kubectl create namespace airflow
```
Add the chart repository and confirm:
```sh
helm repo add apache-airflow https://airflow.apache.org
```
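To refresh the local chart cache and confirm the chart is visible:

```sh
helm repo update
helm search repo apache-airflow/airflow
```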
Update the `airflow-values.yaml` file attributes: the `repo`, `branch`, and `subPath` of your DAGs.
```yaml
gitSync:
  enabled: true
  # git repo clone url
  # ssh examples ssh://[email protected]/apache/airflow.git
  # [email protected]:apache/airflow.git
  # https example: https://github.com/mmendoza17/data-bootcamp-terraforms-mmendoza
  repo: https://github.com/eiffela65/Airflow-Templates
  branch: main
  rev: HEAD
  depth: 1
  # the number of consecutive failures allowed before aborting
  maxFailures: 0
  # subpath within the repo where dags are located
  # should be "" if dags are at repo root
  subPath: ""
```
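If you don't already have an `airflow-values.yaml` to edit, one way to get a baseline is to export the chart's default values and then adjust the `gitSync` section shown above:

```sh
helm show values apache-airflow/airflow > airflow-values.yaml
```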
Install the airflow chart from the repository:
```sh
helm install airflow -f airflow-values.yaml apache-airflow/airflow --namespace airflow
```
We can verify that our pods are up and running by executing:
```sh
kubectl get pods -n airflow
```
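If you prefer to block until everything is up instead of polling, you can also wait on the pods:

```sh
kubectl wait --for=condition=Ready pods --all -n airflow --timeout=300s
```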
The Helm chart output shows how to connect:
```
You can now access your dashboard(s) by executing the following command(s) and visiting the corresponding port at localhost in your browser:

Airflow Webserver:  kubectl port-forward svc/airflow-webserver 8080:8080 --namespace airflow
Flower dashboard:   kubectl port-forward svc/airflow-flower 5555:5555 --namespace airflow

Default Webserver (Airflow UI) Login credentials:
    username: admin
    password: admin

Default Postgres connection credentials:
    username: postgres
    password: postgres
    port: 5432

You can get Fernet Key value by running the following:

    echo Fernet Key: $(kubectl get secret --namespace airflow airflow-fernet-key -o jsonpath="{.data.fernet-key}" | base64 --decode)
```
As you can see, we can reach the dashboards by running:

```sh
kubectl port-forward svc/airflow-webserver 8080:8080 --namespace airflow
kubectl port-forward svc/airflow-flower 5555:5555 --namespace airflow
```
This solution was based on the [Provision a GKE Cluster guide](https://learn.hashicorp.com/tutorials/terraform/gke?in=terraform/kubernetes), which contains Terraform configuration files to provision a GKE cluster on GCP.