Deployment Operations

Comprehensive deployment procedures and operational workflows for the RCIIS DevOps platform.

Overview

The RCIIS platform uses GitOps-based deployment with ArgoCD, ensuring declarative, version-controlled, and auditable deployments across all environments.

Deployment Architecture

GitOps Workflow

  1. Code Commit: Developers push code changes to feature branches
  2. CI Pipeline: Automated build, test, and image creation
  3. Chart Update: Automatic chart version bump and registry push
  4. GitOps Repository: Update deployment configurations
  5. ArgoCD Sync: Automatic deployment to target environments
  6. Validation: Post-deployment testing and monitoring
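
Step 3 (the chart version bump) can be sketched as a small shell helper. This is only an illustration: it assumes plain MAJOR.MINOR.PATCH versions without leading zeros, and the function name bump_patch is hypothetical rather than part of the RCIIS pipeline.

```shell
# Hypothetical helper: increment the patch component of a MAJOR.MINOR.PATCH version.
bump_patch() {
  version="$1"
  patch="${version##*.}"     # text after the last dot
  prefix="${version%.*}"     # text before the last dot
  printf '%s.%s\n' "$prefix" "$((patch + 1))"
}

# Example: bump_patch 1.2.3 prints 1.2.4
```

A CI job could write the result into Chart.yaml before pushing the chart to the registry.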

Environment Progression

  • Local: Developer workstations for initial testing
  • Testing: Automated CI/CD testing environment
  • Staging: Pre-production validation and user acceptance
  • Production: Live production environment (planned)

Deployment Procedures

Standard Application Deployment

Prerequisites:

  • Code merged to master branch
  • All tests passing in CI pipeline
  • Chart version updated
  • Deployment configuration reviewed

Deployment Steps:

# 1. Verify current state
argocd app list
argocd app get nucleus-staging

# 2. Check for drift
argocd app diff nucleus-staging

# 3. Sync application
argocd app sync nucleus-staging

# 4. Monitor deployment
kubectl get pods -n nucleus --watch
kubectl rollout status deployment/nucleus -n nucleus

# 5. Verify health
kubectl get ingress -n nucleus
curl -f https://nucleus-staging.devops.africa/health
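
The health check in step 5 may fail for a short time while new pods start. A minimal sketch of a retry wrapper is shown below; the function name retry and the attempt and delay values are illustrative assumptions, not part of the platform tooling.

```shell
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    i=$((i + 1))
    # Only sleep when another attempt will follow
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (not executed here):
# retry 10 6 curl -fsS https://nucleus-staging.devops.africa/health
```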

Infrastructure Deployment

Core Infrastructure Components:

# Deploy infrastructure in order
argocd app sync cert-manager-staging
argocd app sync ingress-nginx-staging
argocd app sync argocd-staging

# Wait for readiness
kubectl wait --for=condition=ready pod -l app=cert-manager -n cert-manager --timeout=300s
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=ingress-nginx -n ingress-nginx --timeout=300s

# Verify certificates
kubectl get certificates -A
kubectl get clusterissuer
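
The ordered sync above can be wrapped so that each application must become healthy before the next one starts. This is a sketch under assumptions: the function name sync_in_order and the ARGOCD override variable are hypothetical, and the 300 second timeout is simply carried over from the examples in this document.

```shell
# Hypothetical helper: sync ArgoCD applications one at a time, in the order given,
# and stop at the first application that fails to sync or become healthy.
# ARGOCD can be overridden (for example ARGOCD=echo) for a dry run.
sync_in_order() {
  for app in "$@"; do
    "${ARGOCD:-argocd}" app sync "$app" --timeout 300 || return 1
    "${ARGOCD:-argocd}" app wait "$app" --health --timeout 300 || return 1
  done
}

# Example (not executed here):
# sync_in_order cert-manager-staging ingress-nginx-staging argocd-staging
```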

Database Deployment and Migration

SQL Server Deployment:

# Deploy database
helm install mssql bitnami/mssql \
  --namespace database \
  --create-namespace \
  --values apps/rciis/database/staging/values.yaml

# Wait for database ready
kubectl wait --for=condition=ready pod -l app=mssql -n database --timeout=600s

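# Run migrations
# Note: dotnet ef needs the .NET SDK and the dotnet-ef tool inside the container, and
# --project normally takes a project directory or .csproj file rather than a compiled DLL
kubectl exec deployment/nucleus -n nucleus -- \
  dotnet ef database update --project /app/Nucleus.Data.dll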

Migration Verification:

# Check migration status
kubectl exec deployment/nucleus -n nucleus -- \
  dotnet ef migrations list --project /app/Nucleus.Data.dll

# Verify database schema
# $SA_PASSWORD is expanded by the local shell, so export it before running this command
kubectl exec -it mssql-0 -n database -- \
  /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "$SA_PASSWORD" \
  -Q "SELECT name FROM sys.tables"

Message Queue Deployment

Kafka Cluster Deployment:

# Deploy Strimzi operator
kubectl apply -f apps/rciis/strimzi/staging/

# Wait for operator
kubectl wait --for=condition=ready pod -l name=strimzi-cluster-operator -n kafka --timeout=300s

# Deploy Kafka cluster
kubectl apply -f apps/rciis/strimzi/staging/extra/kafka.yaml

# Wait for Kafka ready
kubectl wait --for=condition=Ready kafka/kafka-cluster -n kafka --timeout=600s

# Create topics and users
kubectl apply -f apps/rciis/strimzi/staging/extra/

Kafka Verification:

# Check cluster status
kubectl get kafka kafka-cluster -n kafka

# Verify topics
kubectl get kafkatopic -n kafka

# Verify broker connectivity by listing topics from inside a broker pod
kubectl exec kafka-cluster-kafka-0 -n kafka -- \
  bin/kafka-topics.sh --bootstrap-server localhost:9092 --list

Rolling Updates

Application Updates

Zero-Downtime Deployment Strategy:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nucleus
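spec:
  selector:
    matchLabels:
      app: nucleus            # assumed label; must match the pod template labels
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1       # use 0 when running a single replica to avoid downtime
      maxSurge: 1
  template:
    metadata:
      labels:
        app: nucleus
    spec: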
      containers:
      - name: nucleus
        image: harbor.devops.africa/rciis/nucleus:v1.2.3
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

Update Procedure:

# Update image tag (yq v4 syntax)
yq -i '.image.tag = "v1.2.3"' apps/rciis/nucleus/staging/values.yaml

# Commit changes
git add apps/rciis/nucleus/staging/values.yaml
git commit -m "Update nucleus to v1.2.3"
git push origin master

# Trigger the sync and wait until the application is healthy
argocd app sync nucleus-staging --timeout 300
argocd app wait nucleus-staging --health --timeout 300

# Watch rollout
kubectl rollout status deployment/nucleus -n nucleus --timeout=300s

# Verify new version
kubectl get pods -n nucleus -o wide
kubectl exec deployment/nucleus -n nucleus -- cat /app/version.txt
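
To confirm that the rollout picked up the intended tag, the image reference reported by the Deployment can be compared with the expected version. The helper below is a sketch: the name image_tag is hypothetical, and it does not handle digest references (name@sha256:...).

```shell
# Hypothetical helper: extract the tag from a container image reference.
image_tag() {
  ref="$1"
  name="${ref##*/}"            # drop registry host (and port) and repository path
  case "$name" in
    *:*) echo "${name##*:}" ;; # text after the colon is the tag
    *)   echo "latest" ;;      # untagged references default to "latest"
  esac
}

# Example (not executed here):
# image=$(kubectl get deployment nucleus -n nucleus \
#   -o jsonpath='{.spec.template.spec.containers[0].image}')
# [ "$(image_tag "$image")" = "v1.2.3" ] && echo "version matches"
```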

Infrastructure Updates

Helm Chart Updates:

# Update Helm repositories
helm repo update

# Check for available updates
helm list -A
helm search repo ingress-nginx --versions

# Update chart version
helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --values apps/infra/ingress-nginx/local/values.yaml \
  --version 4.8.0

# Verify update
kubectl get pods -n ingress-nginx
kubectl get deployment ingress-nginx-controller -n ingress-nginx -o yaml | grep image:

Rollback Procedures

Application Rollback

Quick Rollback:

# Check rollout history
kubectl rollout history deployment/nucleus -n nucleus

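# Rollback to previous version
# Note: if ArgoCD self-heal is enabled, it reverts this change on the next sync; revert the Git commit for a lasting rollback
kubectl rollout undo deployment/nucleus -n nucleus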

# Rollback to specific revision
kubectl rollout undo deployment/nucleus -n nucleus --to-revision=2

# Verify rollback
kubectl rollout status deployment/nucleus -n nucleus

ArgoCD Rollback:

# View application history
argocd app history nucleus-staging

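# Rollback to a specific history ID (from the ID column of argocd app history)
# Note: ArgoCD refuses to roll back an application while automated sync is enabled
argocd app rollback nucleus-staging 123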

# Force sync if needed
argocd app sync nucleus-staging --force --prune

Database Rollback

Migration Rollback:

# List applied migrations
kubectl exec deployment/nucleus -n nucleus -- \
  dotnet ef migrations list --project /app/Nucleus.Data.dll

# Rollback to specific migration
kubectl exec deployment/nucleus -n nucleus -- \
  dotnet ef database update PreviousMigrationName --project /app/Nucleus.Data.dll

# Verify rollback
# $SA_PASSWORD is expanded by the local shell, so export it before running this command
kubectl exec -it mssql-0 -n database -- \
  /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "$SA_PASSWORD" \
  -Q "SELECT * FROM __EFMigrationsHistory ORDER BY MigrationId DESC"
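
Choosing the target for a migration rollback means finding the migration applied just before the latest one. The sketch below assumes its input contains only migration names, one per line and oldest first, with unapplied entries marked "(Pending)" as recent EF Core tooling prints them; build log lines must be filtered out beforehand. The function name previous_migration and the migration names in the example are hypothetical.

```shell
# Hypothetical helper: read migration names on stdin and print the one applied
# immediately before the most recent applied migration.
# Exits with status 1 when fewer than two migrations are applied.
previous_migration() {
  awk '
    /\(Pending\)/ || /^[[:space:]]*$/ { next }  # skip pending entries and blank lines
    { prev = last; last = $1 }                  # remember the last two applied names
    END { if (prev != "") print prev; else exit 1 }
  '
}

# Example:
# printf '20240101_Init\n20240201_AddUsers\n20240301_AddOrders (Pending)\n' | previous_migration
# prints 20240101_Init
```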

Blue-Green Deployments

Blue-Green Strategy Setup

Blue Environment (Current):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nucleus-blue
  labels:
    app: nucleus
    version: blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nucleus
      version: blue

Green Environment (New):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nucleus-green
  labels:
    app: nucleus
    version: green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nucleus
      version: green

Traffic Switching:

# Deploy green environment
kubectl apply -f nucleus-green-deployment.yaml

# Wait for green pods ready
kubectl wait --for=condition=ready pod -l app=nucleus,version=green -n nucleus --timeout=300s

# Test green environment (port-forward blocks, so run it in the background)
kubectl port-forward deployment/nucleus-green 8080:8080 -n nucleus &
PF_PID=$!
sleep 3
curl -f http://localhost:8080/health
kill "$PF_PID"

# Switch traffic to green
kubectl patch service nucleus -p '{"spec":{"selector":{"version":"green"}}}' -n nucleus

# Verify traffic switch
kubectl get endpoints nucleus -n nucleus

# Remove blue environment once green is verified (keep it until then for a fast rollback)
kubectl delete deployment nucleus-blue -n nucleus
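
Switching traffic and switching back are the same operation with a different color, so the patch body can be generated rather than typed by hand. A minimal sketch follows; the function name selector_patch is hypothetical, and the service name nucleus is taken from the commands above.

```shell
# Hypothetical helper: build the Service selector patch for a given color.
# Rejects anything other than "blue" or "green" to avoid routing traffic nowhere.
selector_patch() {
  case "$1" in
    blue|green) printf '{"spec":{"selector":{"version":"%s"}}}\n' "$1" ;;
    *) echo "unknown color: $1" >&2; return 1 ;;
  esac
}

# Example (not executed here), rolling back to blue:
# kubectl patch service nucleus -n nucleus -p "$(selector_patch blue)"
```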

Canary Deployments

Canary Strategy with Flagger

Canary Configuration:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: nucleus
  namespace: nucleus
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nucleus
  progressDeadlineSeconds: 60
  service:
    port: 80
    targetPort: 8080
  analysis:
    interval: 30s
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99          # minimum success rate in percent
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500         # maximum request duration in milliseconds
      interval: 30s
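
The analysis settings determine how long a canary run takes. Following the formulas in the Flagger documentation, the minimum time to promotion is interval × (maxWeight / stepWeight), and a failing canary is rolled back after interval × threshold. The sketch below works this out for the values above; the function name canary_timing is hypothetical.

```shell
# Hypothetical helper: estimate canary timing from the analysis settings.
# Arguments: interval in seconds, stepWeight, maxWeight, threshold (failed checks).
canary_timing() {
  interval_s="$1"; step_weight="$2"; max_weight="$3"; threshold="$4"
  steps=$((max_weight / step_weight))
  echo "steps=$steps min_promote_s=$((interval_s * steps)) rollback_s=$((interval_s * threshold))"
}

# Example: canary_timing 30 10 50 5 prints steps=5 min_promote_s=150 rollback_s=150
```

With the configuration shown, a healthy release therefore needs at least 150 seconds of analysis before promotion.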

Canary Deployment Process:

# Start canary deployment
kubectl set image deployment/nucleus nucleus=harbor.devops.africa/rciis/nucleus:v1.2.3 -n nucleus

# Monitor canary progress
kubectl get canary nucleus -n nucleus --watch

# Check canary metrics
kubectl describe canary nucleus -n nucleus

# Flagger promotes or rolls back automatically based on the analysis; check the outcome
kubectl get canary nucleus -n nucleus -o jsonpath='{.status.phase}'

Health Checks and Validation

Post-Deployment Validation

Application Health Checks:

# Health endpoint validation
curl -f https://nucleus-staging.devops.africa/health

# API functionality test
curl -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     https://nucleus-staging.devops.africa/api/status

# Database connectivity test (run the compiled assembly directly; dotnet run expects a project file)
kubectl exec deployment/nucleus -n nucleus -- \
  dotnet /app/Tools/DatabaseCheck.dll

Infrastructure Validation:

# Certificate validation
kubectl get certificates -A
openssl s_client -connect nucleus-staging.devops.africa:443 -servername nucleus-staging.devops.africa < /dev/null

# Ingress validation
kubectl get ingress -A
kubectl describe ingress nucleus-ingress -n nucleus

# Service mesh validation (if applicable)
kubectl get virtualservice -A
kubectl get destinationrule -A

Smoke Tests

Automated Smoke Test Suite:

#!/bin/bash
# smoke-test.sh

set -euo pipefail

echo "Running smoke tests for nucleus-staging..."

# Test health endpoint
echo "Testing health endpoint..."
curl -f https://nucleus-staging.devops.africa/health

# Test authentication
echo "Testing authentication..."
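AUTH_RESPONSE=$(curl -sf -X POST https://nucleus-staging.devops.africa/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username":"test@example.com","password":"test123"}')

TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.token')
if [ -z "$TOKEN" ] || [ "$TOKEN" = "null" ]; then
  echo "Authentication failed: no token in response" >&2
  exit 1
fi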

# Test protected endpoint
echo "Testing protected endpoint..."
curl -f -H "Authorization: Bearer $TOKEN" \
     https://nucleus-staging.devops.africa/api/declarations

# Test database connection
echo "Testing database connection..."
kubectl exec deployment/nucleus -n nucleus -- \
  /app/tools/db-check --connection-string "$DB_CONNECTION"

# Test Kafka connectivity
echo "Testing Kafka connectivity..."
kubectl exec deployment/nucleus -n nucleus -- \
  /app/tools/kafka-check --bootstrap-servers "kafka-cluster-kafka-bootstrap:9092"

echo "All smoke tests passed!"
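
Because the script above aborts on the first failing command, one broken check hides the results of the others. An alternative is to record each result and fail at the end. This is a sketch of that pattern; the names run_check and FAILED are hypothetical.

```shell
# Hypothetical pattern: run every check, report PASS or FAIL, and count failures.
FAILED=0

run_check() {
  name="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $name"
  else
    echo "FAIL $name"
    FAILED=$((FAILED + 1))
  fi
}

# Example (not executed here):
# run_check health curl -f https://nucleus-staging.devops.africa/health
# run_check ingress kubectl get ingress -n nucleus
# [ "$FAILED" -eq 0 ] || exit 1
```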

Monitoring Deployments

Deployment Metrics

Key Deployment Metrics:

# Share of deployments whose Progressing condition is true (gauge from kube-state-metrics, so no rate())
avg(kube_deployment_status_condition{condition="Progressing",status="true"})

# Pod restart rate during deployment
rate(kube_pod_container_status_restarts_total[5m])

# Application response time during deployment
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

# Error rate during deployment (aggregate both sides so the label sets match)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

Deployment Alerts:

groups:
- name: deployment.rules
  rules:
  - alert: DeploymentFailed
    expr: kube_deployment_status_condition{condition="Progressing",status="false"} == 1
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "Deployment {{ $labels.deployment }} failed"
      description: "Deployment has been failing for more than 10 minutes"

  - alert: HighErrorRateDuringDeployment
    expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "High error rate during deployment"
      description: "Error rate is {{ $value | humanizePercentage }}"
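
The 0.05 threshold in HighErrorRateDuringDeployment is a plain ratio of 5xx requests to all requests, which can be checked by hand from two counts. A minimal sketch, assuming the counts were read from a dashboard or query; the function name error_rate_exceeds is hypothetical.

```shell
# Hypothetical helper: succeed when errors/total is strictly above the threshold.
# A total of zero is treated as "not exceeded" to avoid dividing by zero.
error_rate_exceeds() {
  errors="$1"; total="$2"; threshold="$3"
  awk -v e="$errors" -v t="$total" -v th="$threshold" \
    'BEGIN { exit !(t > 0 && e / t > th) }'
}

# Example: 6 errors out of 100 requests is 6%, above the 5% threshold
# error_rate_exceeds 6 100 0.05 && echo "alert would fire"
```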

Troubleshooting Deployments

Common Deployment Issues

Pod Startup Failures:

# Check pod status
kubectl get pods -n nucleus
kubectl describe pod <pod-name> -n nucleus

# Check logs
kubectl logs <pod-name> -n nucleus --previous
kubectl logs deployment/nucleus -n nucleus --tail=100

# Check events
kubectl get events -n nucleus --sort-by=.metadata.creationTimestamp --field-selector involvedObject.name=<pod-name>

Image Pull Issues:

# Check image pull secrets
kubectl get secrets -n nucleus | grep docker
kubectl describe secret harbor-registry -n nucleus

# Test image pull manually
docker pull harbor.devops.africa/rciis/nucleus:v1.2.3

# Check registry connectivity
kubectl run debug --image=busybox -it --rm -- /bin/sh
# Inside pod: wget -O- https://harbor.devops.africa/v2/

Resource Constraints:

# Check resource usage
kubectl top nodes
kubectl top pods -n nucleus

# Check resource quotas
kubectl describe resourcequota -n nucleus

# Check node capacity
kubectl describe node <node-name>
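
The three groups of commands above map roughly onto the STATUS column of kubectl get pods. The sketch below suggests which group to try first for a given status. The mapping is a simplifying assumption (a Pending pod, for instance, can have causes other than resources), and the function name triage_hint is hypothetical.

```shell
# Hypothetical helper: map a pod status to the troubleshooting group to start with.
triage_hint() {
  case "$1" in
    ImagePullBackOff|ErrImagePull) echo "image-pull" ;;   # see Image Pull Issues
    CrashLoopBackOff|Error)        echo "pod-startup" ;;  # see Pod Startup Failures
    Pending|OOMKilled|Evicted)     echo "resources" ;;    # see Resource Constraints
    *)                             echo "unknown" ;;
  esac
}

# Example: triage_hint ImagePullBackOff prints image-pull
```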

Diagnostic Commands

# Check deployment status
kubectl get deployments -A
kubectl rollout status deployment/nucleus -n nucleus

# Check replica sets
kubectl get replicasets -n nucleus
kubectl describe replicaset <rs-name> -n nucleus

# Check ArgoCD application status
argocd app get nucleus-staging
argocd app diff nucleus-staging

# Check Helm releases
helm list -A
helm status nucleus -n nucleus

Best Practices

Deployment Strategy

  1. Automated Testing: Comprehensive test coverage before deployment
  2. Gradual Rollout: Use rolling updates or canary deployments
  3. Health Checks: Robust liveness and readiness probes
  4. Monitoring: Real-time monitoring during deployments

Change Management

  1. Code Review: All changes reviewed before deployment
  2. Documentation: Update documentation with changes
  3. Communication: Notify stakeholders of significant deployments
  4. Rollback Plan: Always have a rollback strategy

Security

  1. Image Scanning: Scan container images for vulnerabilities
  2. Secret Management: Secure handling of sensitive configuration
  3. Access Control: Limit deployment permissions
  4. Audit Trail: Maintain complete deployment history

Performance

  1. Resource Planning: Adequate resource allocation
  2. Scaling: Plan for traffic increases
  3. Optimization: Regular performance optimization
  4. Capacity Management: Monitor and plan capacity

For advanced deployment strategies and troubleshooting, refer to the Kubernetes and ArgoCD documentation.