2024-11-07 5 min read

GitOps with ArgoCD: Tutorial vs. Production Reality

ArgoCD tutorials make GitOps look simple. Production deployments reveal hidden complexity around secrets, access control, and multi-cluster management.

Every ArgoCD tutorial shows the same clean story: push to Git, watch resources sync automatically, deploy without SSH-ing into servers. It works perfectly in the demo. Then you hit production, and reality arrives with considerably more teeth.

The gap between "getting ArgoCD running" and "running ArgoCD safely at scale" is where most teams stumble. We've seen this pattern repeatedly at LavaPi—the concepts are sound, but implementation details matter enormously. Here's what actually trips people up.

Secrets Management Becomes Your Bottleneck

Tutorials gloss over this: "Store your credentials securely." In practice, secrets in GitOps create a genuine architectural problem. You cannot commit plaintext secrets to Git without ending your security career. So what do you do?

The naive approach: Use Bitnami Sealed Secrets or similar tooling, encrypt everything, commit encrypted values. This works until your team needs to rotate credentials or onboard a new cluster. You're suddenly managing encryption keys across environments, and a leaked or lost sealing key compromises every secret it protects.

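The production approach: Decouple secret storage from Git entirely. Keep the values in an external secret manager (Vault, AWS Secrets Manager, Azure Key Vault) and have an in-cluster controller such as the External Secrets Operator fetch them at runtime, so ArgoCD syncs only references. The operator's SecretStore resource tells it where the backend lives and how to authenticate: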

yaml
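# SecretStore is a CRD from the External Secrets Operator, not a core v1 kind.
# Check which API version your installed release serves (v1beta1 or v1).
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: "https://vault.internal:8200"
      # Mount path of the KV secrets engine in Vault
      path: "secret"
      auth:
        # Authenticate with the pod's service account via Vault's Kubernetes auth method
        kubernetes:
          mountPath: "kubernetes"
          role: "argocd-role"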

Now your Git repository remains clean—it holds configuration, not secrets. But this introduces new complexity: you need Kubernetes auth configured correctly, RBAC policies set up properly, and monitoring around secret access patterns.
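To make "configuration, not secrets" concrete, here is a minimal sketch of what could live in Git instead of the secret itself: an External Secrets Operator ExternalSecret that points at the vault-backend store. The resource name, namespace, Vault key, and property below are hypothetical, and the API version is an assumption that should match whatever your installed operator serves.

```yaml
# Sketch only: names, namespace, and the Vault path are illustrative assumptions.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: backend-db-credentials      # hypothetical
  namespace: backend-api            # hypothetical; a SecretStore is namespaced,
                                    # so it must exist in this same namespace
spec:
  refreshInterval: 1h               # how often the operator re-reads the backend
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: backend-db-credentials    # Kubernetes Secret the operator creates
  data:
  - secretKey: password             # key inside the generated Secret
    remoteRef:
      key: backend/db               # hypothetical path under the "secret" mount
      property: password            # hypothetical field at that path
```

ArgoCD syncs this manifest like any other resource. Only the pointer is versioned, so rotating the credential in Vault needs no commit.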

RBAC and Multi-Tenant Deployments

The tutorial shows one cluster, one team, one namespace. Production has multiple teams wanting to deploy independently without nuking each other's workloads.

ArgoCD's RBAC works, but it requires thought. A common mistake: granting broad permissions to service accounts that GitOps tooling uses. If an attacker compromises one repository, they can potentially escalate into cluster admin roles.
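As one illustration of keeping permissions narrow, the sketch below grants a team role access only to applications inside its own project through the argocd-rbac-cm ConfigMap. The project name team-backend, the role name, and the SSO group backend-devs are assumptions chosen for the example, as is the argocd namespace.

```yaml
# Sketch only: role, project, and group names are illustrative assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.csv: |
    # Members of the role may view and sync applications in their project only
    p, role:team-backend, applications, get, team-backend/*, allow
    p, role:team-backend, applications, sync, team-backend/*, allow
    # Map an SSO group (hypothetical) onto the role
    g, backend-devs, role:team-backend
```

Leaving out create, update, and delete means changes still have to arrive through Git rather than through the UI or CLI.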

What we recommend:

  • Namespace-scoped Application resources
  • Separate AppProjects per team with explicit resource whitelisting
  • Service account tokens with minimal permissions
  • Distinct Git repositories per team (or strict path-based RBAC)
yaml
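apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-backend
  namespace: argocd   # AppProjects live in the ArgoCD control-plane namespace
spec:
  # Only repositories under the backend group may be used as sources
  sourceRepos:
  - 'https://git.company.com/backend/*'
  # Deployments are limited to backend-* namespaces on the local cluster
  destinations:
  - namespace: 'backend-*'
    server: 'https://kubernetes.default.svc'
  # Namespace and ClusterRole are cluster-scoped kinds, so they are listed
  # in the cluster-level blacklist rather than the namespace-level one
  clusterResourceBlacklist:
  - group: ''
    kind: Namespace
  - group: 'rbac.authorization.k8s.io'
    kind: ClusterRole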

This confines the team to namespaces matching backend-* on the local cluster and stops it from creating cluster-level resources such as Namespaces and ClusterRoles. Still, auditing who changed what and when requires additional tooling beyond ArgoCD itself.
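For context, an Application bound to that project might look like the sketch below. The repository URL, path, revision, and target namespace are hypothetical, picked only so that they fall inside the project's sourceRepos and destinations patterns.

```yaml
# Sketch only: repo, path, and namespace are illustrative assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: backend-api                 # hypothetical
  namespace: argocd
spec:
  project: team-backend             # ties the app to the project's restrictions
  source:
    repoURL: https://git.company.com/backend/api   # matches the sourceRepos glob
    targetRevision: main
    path: deploy/overlays/staging   # hypothetical path inside the repository
  destination:
    server: https://kubernetes.default.svc
    namespace: backend-staging      # matches the backend-* destination pattern
  syncPolicy:
    automated:
      prune: true                   # delete resources that were removed from Git
      selfHeal: true                # revert manual changes made in the cluster
```

If the destination namespace or repository fell outside the project's patterns, ArgoCD would refuse to sync the application.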

Observability and Drift Detection at Scale

ArgoCD's sync status looks good on the dashboard until you have 200 applications across five clusters. Then drift detection becomes a puzzle: is the cluster actually synced, or is ArgoCD just not detecting the drift?

Tutorials don't show you:

  • Custom health assessments for complex resources
  • Monitoring ArgoCD's own reliability
  • Handling webhook delivery failures
  • Reconciliation loops that consume too much API bandwidth

The solution combines ArgoCD instrumentation with external monitoring:

bash
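# Terminal 1: forward the application controller's metrics port to localhost
kubectl port-forward -n argocd svc/argocd-metrics 8082:8082
# Terminal 2: scrape once and keep only the sync-related series
curl -s http://localhost:8082/metrics | grep argocd_app_sync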

Watch for argocd_app_sync_duration_seconds spikes, argocd_app_info counts growing unexpectedly, and webhook failures. Set up alerts; don't trust the dashboard alone.
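One way to turn that advice into alerts is sketched below as a PrometheusRule. It assumes the Prometheus Operator is installed and already scraping ArgoCD. The rule names, durations, and severities are placeholders to tune, and the expressions use the argocd_app_info and argocd_app_sync_total series exposed by the application controller.

```yaml
# Sketch only: assumes the Prometheus Operator CRDs; thresholds are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-alerts               # hypothetical
  namespace: argocd
spec:
  groups:
  - name: argocd
    rules:
    # An application has reported OutOfSync continuously for 15 minutes
    - alert: ArgoAppOutOfSync
      expr: argocd_app_info{sync_status="OutOfSync"} == 1
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: "{{ $labels.name }} has been OutOfSync for 15 minutes"
    # At least one sync ended in Error or Failed during the last 10 minutes
    - alert: ArgoAppSyncFailed
      expr: increase(argocd_app_sync_total{phase=~"Error|Failed"}[10m]) > 0
      labels:
        severity: critical
      annotations:
        summary: "Sync failed for {{ $labels.name }}"
```

Alerting on the absence of these series is also worth considering, since a crashed controller reports nothing at all.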

The Real Cost of GitOps

GitOps isn't "deploy faster." It's "deploy more predictably, with full audit trails, at the cost of significant upfront infrastructure work."

The tutorial environment has none of that infrastructure. Production needs secrets isolation, RBAC boundaries, observability hooks, and someone who understands how everything connects. When teams skip these pieces to go live quickly, they end up rebuilding them under pressure later.

The payoff is real: immutable audit trails, reduced human error, easier disaster recovery. But respect the complexity. Plan for secrets, security, and monitoring before the first production sync, not after credentials have already landed in Git.

LavaPi Team

Digital Engineering Company
