Istio Service Mesh with Cert-Manager and AWS Load Balancer Controller

In this post, I’m going to go through the steps to set up the Istio service mesh using Terraform. Cert-manager is used to fetch Let's Encrypt SSL certificates, and Istio will use the AWS Load Balancer Controller to spin up Network Load Balancers. Using a service mesh like Istio gives us flexibility with traffic routing, observability for service-to-service communication flowing through Istio’s gateways, and security through its authorization policies.

Cert-manager and the AWS Load Balancer Controller need to be installed beforehand to use the demo app or the example Istio ingress gateways. You can see how to set those up here and here. If you’re not on AWS, you should be able to swap out the annotations on the gateways.

These are the providers that we’ll be using in the environment. You may need to adjust how the helm and kubectl providers are getting the cluster name and token for your environment.

Providers/Versions

providers.tf

locals {
  env    = "sandbox"
  region = "us-east-1"
}

provider "aws" {
  region = local.region
  default_tags {
    tags = {
      env       = local.env
      terraform = true
    }
  }
}

provider "helm" {
  kubernetes {
    host                   = module.eks-cluster.endpoint
    cluster_ca_certificate = base64decode(module.eks-cluster.certificate)
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      # This requires the awscli to be installed locally where Terraform is executed
      args    = ["eks", "get-token", "--cluster-name", module.eks-cluster.name]
      command = "aws"
    }
  }
}

provider "kubectl" {
  apply_retry_count      = 5
  host                   = module.eks-cluster.endpoint
  cluster_ca_certificate = base64decode(module.eks-cluster.certificate)
  load_config_file       = false

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks-cluster.name]
  }
}

versions.tf

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubectl = {
      source  = "alekc/kubectl"
      version = "~> 2.0.3"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.11.0"
    }
  }
  required_version = "~> 1.5.7"
}

To have Istio's sidecar proxy automatically injected into all pods in a specific namespace, we can label the namespace like so:

resource "kubernetes_namespace" "env" {
  metadata {
    name = var.env
    labels = {
      istio-injection = "enabled"
    }
  }
}

Module

Initialize the module where needed. The list of domains will be used by the cluster-wide SSL certs further below.

module "istio" {
  source        = "../../aws/eks-addons/istio"
  cluster_name  = aws_eks_cluster.cluster.name
  env           = var.env
  istio_version = "1.20.1"
  domains       = ["*.${local.env}.example.com"]
  depends_on = [
    aws_eks_node_group.core
  ]
}

Module files

Here we’re installing the base Istio Helm chart, which consists of the CRDs required before installing other components. We’re setting the defaultRevision value to default, as recommended by the docs. Next is istiod, the service discovery component, with a node affinity set to make sure it stays on my core nodes.

main.tf

resource "helm_release" "istio" {
  namespace        = "istio-system"
  create_namespace = true
  name             = "istio-base"
  repository       = "https://istio-release.storage.googleapis.com/charts"
  chart            = "base"
  version          = var.istio_version

  values = [
    <<-EOT
    defaultRevision: default
    EOT
  ]
}

resource "helm_release" "istiod" {
  namespace        = "istio-system"
  create_namespace = true
  name             = "istiod"
  repository       = "https://istio-release.storage.googleapis.com/charts"
  chart            = "istiod"
  version          = var.istio_version
  depends_on       = [helm_release.istio]

  values = [
    <<-EOT
    pilot:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - core
    EOT
  ]
}

These two resources are the ingress gateways, which in this case will set up two Network Load Balancers using the AWS Load Balancer Controller. Since I have workloads for both public and internal use, I’m creating both load balancers here. Several more annotation values can be found here.

One thing to note is the label applied to each ingress gateway. These labels will be referenced when creating a Gateway later.

resource "helm_release" "istio_ingress_external" {
  namespace        = "istio-ingress"
  create_namespace = true
  name             = "istio-gateway-external"
  repository       = "https://istio-release.storage.googleapis.com/charts"
  chart            = "gateway"
  version          = var.istio_version

  values = [
    <<-EOT
    labels:
      istio: "ingressgateway-external"
    service:
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-name: "${var.env}-network-external"
        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
        service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
        service.beta.kubernetes.io/aws-load-balancer-attributes: "load_balancing.cross_zone.enabled=true"
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - core
    EOT
  ]
  depends_on = [helm_release.istiod]
}

resource "helm_release" "istio_ingress_internal" {
  namespace        = "istio-ingress"
  create_namespace = false
  name             = "istio-gateway-internal"
  repository       = "https://istio-release.storage.googleapis.com/charts"
  chart            = "gateway"
  version          = var.istio_version

  values = [
    <<-EOT
    labels:
      istio: "ingressgateway-internal"
    service:
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-name: "${var.env}-network-internal"
        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
        service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
        service.beta.kubernetes.io/aws-load-balancer-attributes: "load_balancing.cross_zone.enabled=true"
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - core
    EOT
  ]
  depends_on = [
    helm_release.istiod,
    helm_release.istio_ingress_external
  ]
}

For TLS termination on our applications, I’m creating two cluster-wide certificates: one for staging/testing and one for production. This uses cert-manager to fetch Let's Encrypt certificates and create a keypair secret that will be mounted on the Gateway.

manifests.tf

locals {
  certs          = ["prod", "staging"]
  load_balancers = ["external", "internal"]
}

resource "kubectl_manifest" "cluster_certs" {
  for_each = { for cert in local.certs : cert => cert }
  yaml_body = templatefile("${path.module}/files/cluster_cert.yaml.tftpl", {
    DOMAINS = var.domains
    TYPE    = each.value
  })
  depends_on = [
    helm_release.istio_ingress_external,
    helm_release.istio_ingress_internal
  ]
}

Two shared Gateways are created: one for the external load balancer and one for the internal.

resource "kubectl_manifest" "gateways" {
  for_each = { for lb in local.load_balancers : lb => lb }
  yaml_body = templatefile("${path.module}/files/gateways.yaml.tftpl", {
    DOMAINS = var.domains
    LB      = each.value
  })
  depends_on = [
    helm_release.istio_ingress_external,
    helm_release.istio_ingress_internal
  ]
}

cluster_cert.yaml.tftpl

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: letsencrypt-${TYPE}
  namespace: istio-ingress
spec:
  secretName: letsencrypt-${TYPE}
  dnsNames:
  %{ for domain in DOMAINS ~}
  - "${domain}"
  %{ endfor ~}
  issuerRef:
    name: letsencrypt-${TYPE}
    kind: ClusterIssuer

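The Certificate template above references a ClusterIssuer named letsencrypt-${TYPE}, which cert-manager expects to already exist. As a rough sketch (not part of this module), a matching production issuer using a DNS-01 solver — required for wildcard domains like the ones used here — might look like the following; the contact email, Route 53 solver, and region are assumptions for illustration:

```yaml
# Hypothetical ClusterIssuer paired with the letsencrypt-prod Certificate.
# DNS-01 is required because the certificates cover wildcard domains.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com                # assumption: your contact email
    privateKeySecretRef:
      name: letsencrypt-prod-account-key    # ACME account key secret
    solvers:
    - dns01:
        route53:
          region: us-east-1                 # assumption: hosted zone region
```

A letsencrypt-staging issuer would look the same but point at the Let's Encrypt staging ACME endpoint.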
gateways.yaml.tftpl

This is our Gateway manifest that will manage inbound traffic for the cluster. In this example, HTTP traffic is redirected to HTTPS, and we reference the SSL cert secret created earlier.

The selector value points the Gateway to the specific ingress gateway controller created earlier.

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: gateway-${LB}
  namespace: istio-ingress
spec:
  selector:
    istio: ingressgateway-${LB}
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: letsencrypt-prod
    hosts:
  %{ for domain in DOMAINS ~}
    - "${domain}"
  %{ endfor ~}
  - port:
      number: 80
      name: http
      protocol: HTTP
    tls:
      httpsRedirect: true
    hosts:
  %{ for domain in DOMAINS ~}
    - "${domain}"
  %{ endfor ~}

variables.tf

variable "cluster_name" {
  type = string
}
variable "env" {
  type = string
}
variable "istio_version" {
  type = string
}
variable "domains" {
  type = list(string)
}

Demo

Instead of the typical Ingress manifest, we create a VirtualService that sends traffic to our app and binds to a named Gateway created earlier. A VirtualService allows finely tuned routing; in this example, however, I just want all traffic on one endpoint sent to my service “demo-app” listening on port 8000. Any existing apps will need to be restarted for the proxy sidecar container to be injected.

Since the shared Gateway was installed in the istio-ingress namespace, I’m specifying its exact location with a namespace prefix.

ingress.yaml

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: demo-app
  namespace: sandbox
spec:
  hosts:
  - "demo.sandbox.example.com"
  gateways:
  - istio-ingress/gateway-external
  http:
  - match:
    - uri:
        exact: /
    route:
    - destination:
        host: demo-app
        port:
          number: 8000
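
The VirtualService assumes a Service named demo-app exists in the sandbox namespace and listens on port 8000. A minimal sketch of such a backend — the image, args, and labels are assumptions; any HTTP server would do — could be:

```yaml
# Hypothetical backend for the VirtualService above.
# The sandbox namespace is labeled for injection, so the pod gets a sidecar.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
  namespace: sandbox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: demo-app
        image: hashicorp/http-echo   # assumption: simple HTTP echo server
        args: ["-listen=:8000", "-text=hello from the mesh"]
        ports:
        - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: demo-app
  namespace: sandbox
spec:
  selector:
    app: demo-app
  ports:
  - port: 8000
    targetPort: 8000
```

With DNS for demo.sandbox.example.com pointed at the external NLB, requests should flow NLB → ingress gateway → VirtualService → demo-app.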