GitLab Runner on Kubernetes

How to configure GitLab Runner with the Kubernetes executor using Helm, Longhorn RWX for the cache, and Sealed Secrets for secure token management.

Overview

GitLab Runner is the agent that executes the jobs defined in .gitlab-ci.yml pipelines. When it is installed in a Kubernetes cluster with the Kubernetes executor, it creates an isolated Pod for each job and scales automatically, with no build machines to manage.

Architecture

GitLab (VM / server)
        │
        │  registers via token (glrt-...)
        ▼
Runner Manager Pod  ──── Namespace: gitlab-runners
        │
        │  dynamically creates (1 Pod per job)
        ▼
        ▼
┌──────────────────────────────────┐
│          Job Pod                 │
│  ┌─────────────┐ ┌────────────┐  │
│  │  container  │ │   helper   │  │
│  │  (build)    │ │ (git/arts) │  │
│  └─────────────┘ └────────────┘  │
│  PVC cache (Longhorn RWX) /cache │
└──────────────────────────────────┘

Prerequisites

Requirement	Minimum version
Kubernetes	1.21+
Helm	3.x
Longhorn	1.4+ (for the RWX cache)
GitLab	16.0+ (tokens `glrt-`)

1. Create the Runner in GitLab

Before installing anything in the cluster, create the runner in GitLab to obtain its token.

Go to Admin → CI/CD → Runners
Click New instance runner
Set the tags (e.g. k8s, cloudops, docker)
Click Create runner
Copy the glrt-... token immediately, because GitLab shows it only once

⚠️ Starting with GitLab 16, tokens use the glrt-... format (runner authentication tokens). The older registration token format is deprecated.

To check whether a token is still valid:

curl -s "https://gitlab.empresa.com/api/v4/runners/verify" \
  --request POST \
  --form "token=glrt-YOUR_TOKEN_HERE"

A 200 OK response containing "token_expires_at": null confirms that the token is valid and has no expiration date.
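
The check above can be wrapped in a small script that branches on the HTTP status code. This is a minimal sketch: the function names verify_runner_token and explain_verify_status are illustrative, and the status meanings are the ones used in this guide (200 valid, 403 expired or revoked).

```shell
#!/bin/sh
# Hypothetical helpers, not part of GitLab or gitlab-runner.

# Print only the HTTP status code returned by the verify endpoint.
verify_runner_token() {
  gitlab_url="$1"
  token="$2"
  curl -s -o /dev/null -w '%{http_code}' \
    --request POST \
    --form "token=${token}" \
    "${gitlab_url}/api/v4/runners/verify"
}

# Translate a status code into a readable verdict.
explain_verify_status() {
  case "$1" in
    200) echo "token is valid" ;;
    403) echo "token is expired or revoked" ;;
    *)   echo "unexpected response (HTTP $1)" ;;
  esac
}

# Usage:
#   status="$(verify_runner_token "https://gitlab.empresa.com" "glrt-YOUR_TOKEN_HERE")"
#   explain_verify_status "$status"
```

Keeping the network call separate from the interpretation makes it easy to reuse the verdict logic in a scheduled check or a CI job.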

2. Prepare the Cluster

2.1 Create the Namespace

kubectl create namespace gitlab-runners

2.2 Create the Secret with the Token

kubectl create secret generic gitlab-runner-token \
  --namespace gitlab-runners \
  --from-literal=runner-token="glrt-YOUR_TOKEN_HERE" \
  --from-literal=runner-registration-token=""

The chart expects both keys to exist in the Secret; runner-registration-token stays empty when a glrt- authentication token is used.

Confirm that it was created:

kubectl get secret gitlab-runner-token -n gitlab-runners

💡 Security tip: In GitOps environments, use Bitnami Sealed Secrets to encrypt the Secret before committing it to the repository. The plaintext token must never be stored in Git.
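
As a rough illustration of the Sealed Secrets flow, the sketch below renders the Secret manifest to stdout so it can be piped straight into kubeseal without touching the disk. The function name render_runner_secret and the output file name are hypothetical. It assumes the kubeseal CLI is installed and the Sealed Secrets controller is running in the cluster. The empty runner-registration-token key is included on the assumption that the chart expects it, which is worth checking against the chart version in use.

```shell
#!/bin/sh
# Hypothetical helper: print a plain Secret manifest for the runner token.
# $1 = runner token, $2 = namespace (defaults to gitlab-runners).
render_runner_secret() {
  token="$1"
  namespace="${2:-gitlab-runners}"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-token
  namespace: ${namespace}
type: Opaque
stringData:
  runner-token: "${token}"
  runner-registration-token: ""
EOF
}

# Usage (only the sealed file is committed to Git):
#   render_runner_secret "glrt-YOUR_TOKEN_HERE" \
#     | kubeseal --format yaml > gitlab-runner-token-sealed.yaml
#   kubectl apply -f gitlab-runner-token-sealed.yaml
```

The controller decrypts the SealedSecret in the cluster and creates a regular Secret with the same name, so the Helm values do not need to change.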

2.3 Create the Cache PVC (Longhorn RWX)

The cache shared between jobs uses a PVC with the ReadWriteMany access mode, which Longhorn supports natively through an internal NFSv4 share.

# runner-cache-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: runner-cache
  namespace: gitlab-runners
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn
  resources:
    requests:
      storage: 20Gi

kubectl apply -f runner-cache-pvc.yaml

# Check that it reached the Bound state
kubectl get pvc -n gitlab-runners

The PVC must be Bound before you continue. If it stays Pending, check that Longhorn is able to serve RWX volumes (see the Troubleshooting section).
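
In a script, the manual check can be replaced by a polling loop. The sketch below is one possible way to do it; wait_for_pvc_bound and the POLL_SECONDS variable are hypothetical names, and the defaults (30 checks, 5 seconds apart) are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: poll until the PVC reports the Bound phase.
# $1 = PVC name, $2 = namespace, $3 = maximum number of checks.
wait_for_pvc_bound() {
  pvc="${1:-runner-cache}"
  ns="${2:-gitlab-runners}"
  tries="${3:-30}"
  i=0
  phase=""
  while [ "$i" -lt "$tries" ]; do
    phase="$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.status.phase}' 2>/dev/null)"
    if [ "$phase" = "Bound" ]; then
      echo "PVC $pvc is Bound"
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_SECONDS:-5}"
  done
  echo "PVC $pvc is still ${phase:-unknown} after $tries checks" >&2
  return 1
}

# Usage:
#   kubectl apply -f runner-cache-pvc.yaml
#   wait_for_pvc_bound runner-cache gitlab-runners || exit 1
```

A non-zero exit code lets an installation script stop before the Helm step when the volume is not ready.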

3. Install via Helm

3.1 Add the Repository

helm repo add gitlab https://charts.gitlab.io
helm repo update

3.2 Create values.yaml

# values.yaml
gitlabUrl: "https://gitlab.empresa.com"

# The token is never written to values.yaml; the chart reads it from the
# Secret referenced by runners.secret further down.

# Maximum number of concurrent jobs
concurrent: 20
# Seconds between checks for new jobs
checkInterval: 30

rbac:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["pods", "pods/exec", "pods/attach", "secrets", "configmaps"]
      verbs: ["get", "list", "watch", "create", "patch", "delete", "update"]
    - apiGroups: [""]
      resources: ["pods/log"]
      verbs: ["get", "list", "watch"]

serviceAccount:
  create: true
  name: "gitlab-runner"

# Start with 1 replica; increase to 2 after validating that everything works
replicas: 1

resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"

runners:
  # Name of the Secret created in step 2.2 (holds the runner-token key)
  secret: "gitlab-runner-token"
  config: |
    [[runners]]
      name = "cloudops-k8s-runner"
      executor = "kubernetes"

      [runners.kubernetes]
        namespace = "gitlab-runners"
        image = "alpine:3.19"
        privileged = false
        pull_policy = ["if-not-present", "always"]

        # Resources per job (build container)
        cpu_request = "100m"
        cpu_limit = "2"
        memory_request = "256Mi"
        memory_limit = "2Gi"

        # Resources for the helper container
        helper_cpu_request = "50m"
        helper_cpu_limit = "500m"
        helper_memory_request = "128Mi"
        helper_memory_limit = "512Mi"

        # Timeout, in seconds, for the job Pod to become ready
        poll_timeout = 600

        [runners.kubernetes.pod_labels]
          team = "cloudops"

        # Mounts the cache PVC in every job Pod
        [[runners.kubernetes.volumes.pvc]]
          name = "runner-cache"
          mount_path = "/cache"
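
To get a feel for what these numbers ask of the cluster, the worst case (all 20 job slots busy) can be worked out by multiplying the concurrency by the per-Pod figures. The sketch below does this arithmetic for the build and helper containers only; service containers, if a pipeline uses them, are not counted, and total_for_jobs is a hypothetical helper name.

```shell
#!/bin/sh
# total = concurrent jobs * (build container + helper container)
total_for_jobs() {
  echo $(( $1 * ($2 + $3) ))
}

# Figures taken from values.yaml: concurrent = 20, CPU in millicores, memory in Mi.
echo "CPU requests:    $(total_for_jobs 20 100 50)m"     # 3000m
echo "Memory requests: $(total_for_jobs 20 256 128)Mi"   # 7680Mi
echo "CPU limits:      $(total_for_jobs 20 2000 500)m"   # 50000m
echo "Memory limits:   $(total_for_jobs 20 2048 512)Mi"  # 51200Mi
```

The scheduler places Pods based on requests, so roughly 3 CPU cores and 7.5 GiB of memory must be free for 20 jobs to start at once; the limits (50 cores, 50 GiB) show how far actual usage could grow.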

3.3 Install

helm install gitlab-runner gitlab/gitlab-runner \
  --namespace gitlab-runners \
  -f values.yaml

For future upgrades:

helm upgrade gitlab-runner gitlab/gitlab-runner \
  --namespace gitlab-runners \
  -f values.yaml \
  --atomic  # rolls back automatically if the upgrade fails

4. Verify the Installation

# The manager Pod should reach the Running state
kubectl get pods -n gitlab-runners -w

# Logs to confirm that it connected to GitLab
kubectl logs -n gitlab-runners -l app=gitlab-runner --tail=50

Expected log output on a successful connection:

Configuration loaded                                builds=0
Registering runner... succeeded                     runner=glrt-xxx
Starting multi-runner from /home/gitlab-runner/.gitlab-runner/config.toml

In GitLab, under Admin → CI/CD → Runners, the runner should appear with the status 🟢 Online.
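
If the runner does not come online, it can help to narrow the logs down to lines that look like problems. The filter below is only a sketch: filter_runner_problems is a hypothetical name and the keyword list is a guess at useful patterns, not an official list of runner log severities.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention a likely problem.
filter_runner_problems() {
  grep -iE 'error|fatal|warning|forbidden|unauthorized'
}

# Usage:
#   kubectl rollout status deployment/gitlab-runner -n gitlab-runners --timeout=120s
#   kubectl logs -n gitlab-runners -l app=gitlab-runner --tail=200 | filter_runner_problems
```

An empty result together with a successful rollout status is a reasonable sign that the manager Pod started cleanly.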

5. Multiple Clusters

In environments with several Kubernetes clusters, register one runner in each cluster, each with its own token, and use tags to route the jobs:

Cluster	Runner tags
cluster-dev	`k8s`, `k8s-dev`
cluster-stg	`k8s`, `k8s-stg`
cluster-prd	`k8s`, `k8s-prd`

In the .gitlab-ci.yml pipeline:

build:
  tags:
    - k8s
  script:
    - echo "Runs on any k8s cluster"

deploy-prod:
  tags:
    - k8s-prd
  script:
    - kubectl apply -f manifests/
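
Rolling the same chart out to several clusters can be scripted with one Helm call per kubeconfig context. This is a sketch under assumptions: the context names match the cluster names in the table, the per-cluster override files (values-dev.yaml and so on) are hypothetical, and each cluster already has its namespace and its own gitlab-runner-token Secret.

```shell
#!/bin/sh
# Hypothetical helper: install or upgrade the runner in one cluster.
# $1 = kubeconfig context, $2 = per-cluster values file layered on values.yaml.
install_runner() {
  context="$1"
  overrides="$2"
  helm upgrade --install gitlab-runner gitlab/gitlab-runner \
    --kube-context "$context" \
    --namespace gitlab-runners \
    -f values.yaml \
    -f "$overrides"
}

# Usage:
#   install_runner cluster-dev values-dev.yaml
#   install_runner cluster-stg values-stg.yaml
#   install_runner cluster-prd values-prd.yaml
```

Later -f files take precedence over earlier ones, so the shared values.yaml stays identical across clusters and only the differences live in the override files.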

6. Using the Runner with Kaniko (building images without Docker)

The Kubernetes executor does not support docker build directly unless job Pods run with privileged = true. The recommended alternative is Kaniko, which builds images without a Docker daemon or elevated privileges.

# .gitlab-ci.yml
build-image:
  tags:
    - k8s
  image:
    name: gcr.io/kaniko-project/executor:v1.23.0-debug
    entrypoint: [""]
  script:
    - /kaniko/executor
      --context "${CI_PROJECT_DIR}"
      --dockerfile "${CI_PROJECT_DIR}/Dockerfile"
      --destination "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}"
      --cache=true
      --cache-repo "${CI_REGISTRY_IMAGE}/cache"
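
Pushing to a private registry usually requires credentials, which Kaniko reads from /kaniko/.docker/config.json. The sketch below shows one way to generate that file from GitLab's predefined CI_REGISTRY, CI_REGISTRY_USER and CI_REGISTRY_PASSWORD variables; kaniko_auth_config is a hypothetical helper, and whether this step is needed depends on how the registry is set up.

```shell
#!/bin/sh
# Hypothetical helper: print a Docker-style config.json for one registry.
# $1 = registry host, $2 = user, $3 = password or token.
kaniko_auth_config() {
  auth="$(printf '%s:%s' "$2" "$3" | base64 | tr -d '\n')"
  printf '{"auths":{"%s":{"auth":"%s"}}}\n' "$1" "$auth"
}

# Inside the job, before calling /kaniko/executor:
#   kaniko_auth_config "$CI_REGISTRY" "$CI_REGISTRY_USER" "$CI_REGISTRY_PASSWORD" \
#     > /kaniko/.docker/config.json
```

The auth value is the base64 encoding of user:password, the same format Docker writes after docker login.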

7. Troubleshooting

Runner shows as Stale / Offline

The runner becomes Stale when its Pod stops contacting GitLab. To narrow down the cause:

# Check whether the Pod exists
kubectl get pods -n gitlab-runners

# If it does not exist, the Deployment was deleted: reinstall following this guide
# If it exists, check the logs
kubectl logs -n gitlab-runners -l app=gitlab-runner --tail=100

Invalid token (403)

# Check the token
curl -s "https://gitlab.empresa.com/api/v4/runners/verify" \
  --request POST \
  --form "token=glrt-TOKEN"

# 200 = valid | 403 = expired or revoked

If the token is invalid, create a new one under Admin → CI/CD → Runners → New instance runner and update the Secret:

kubectl delete secret gitlab-runner-token -n gitlab-runners

# runner-registration-token stays empty; the chart expects the key to exist
kubectl create secret generic gitlab-runner-token \
  --namespace gitlab-runners \
  --from-literal=runner-token="glrt-NEW_TOKEN" \
  --from-literal=runner-registration-token=""

# Restart the runner so that it picks up the new Secret
kubectl rollout restart deployment/gitlab-runner -n gitlab-runners

Cache PVC stuck in Pending

kubectl describe pvc runner-cache -n gitlab-runners

Common causes:

Longhorn cannot serve RWX volumes → Longhorn exposes RWX volumes through an NFSv4 share, so check that an NFSv4 client (nfs-common on Debian/Ubuntu, nfs-utils on RHEL-based systems) is installed on every node
Incorrect storageClassName → check with kubectl get storageclass

Jobs stay Pending for a long time

# Check the events in the namespace
kubectl get events -n gitlab-runners --sort-by='.lastTimestamp'

# Check the resources available on the nodes
kubectl describe nodes | grep -A5 "Allocated resources"
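
To see why a specific job Pod is not being scheduled, the Events section of each Pending Pod is usually the most direct source. The loop below is a sketch; describe_pending_pods is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print the Events section of every Pending Pod.
# $1 = namespace (defaults to gitlab-runners).
describe_pending_pods() {
  ns="${1:-gitlab-runners}"
  kubectl get pods -n "$ns" --field-selector=status.phase=Pending -o name |
    while read -r pod; do
      echo "== $pod"
      # Everything from the "Events:" line to the end of the describe output.
      kubectl describe -n "$ns" "$pod" </dev/null | sed -n '/^Events:/,$p'
    done
}

# Usage:
#   describe_pending_pods gitlab-runners
```

Messages such as FailedScheduling with "Insufficient cpu" or "Insufficient memory" point to the per-job requests in values.yaml being larger than what the nodes have free.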
