DISTRIBUTED CONFIGURATION WITH NOMAD
James Rasell: @jrasell
JAMES RASELL - WHO?
▸ Distributed Systems Engineer
▸ Background in Infra & Ops
▸ Generally automate the things other people don’t want to
▸ Creator of Sherpa, Levant and Nomad-Toast
▸ Is Butters Scotch?
NOMAD
NOMAD
QUICK OVERVIEW
▸ Nomad is an easy-to-use, flexible, and performant workload orchestrator
▸ Datacentre- and region-aware, and can scale beyond 10,000 nodes per cluster
▸ It has native integration with Consul for service discovery and Vault for secret management
▸ Runs a variety of workloads, including Docker, Java, and QEMU, via three scheduler types (service, batch, and system)
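▸ For illustration, a minimal Nomad job (a sketch; the job name and image are made up) running a Docker workload under the service scheduler:
job "example" {
  datacenters = ["dc1"]
  type        = "service" # or "batch"/"system" for the other scheduler types

  group "web" {
    count = 1

    task "app" {
      driver = "docker" # the java and qemu drivers follow the same shape

      config {
        image = "nginx:stable"
      }
    }
  }
}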
ARCHITECTURE OVERVIEW
ARCHITECTURE OVERVIEW
CONSIDERATIONS
▸ Secure and segregated Vault/Consul cluster providing secrets and PKI management
▸ All servers/instances run the Nomad client, even the Nomad servers themselves
▸ Flexible workload placement using client meta and class parameters
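▸ As a sketch (the class and meta values here are illustrative), placement is driven by tagging the client and constraining against those tags in the job:
# Client agent configuration: advertise a class and arbitrary metadata.
client {
  enabled    = true
  node_class = "workload-pool"

  meta {
    "role" = "database"
  }
}

# Job side: only place allocations on matching clients.
constraint {
  attribute = "${node.class}"
  value     = "workload-pool"
}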
BOOTSTRAPPING
BOOTSTRAPPING
TOOLS, METHOD, PROCESS
▸ Same process, tools, and methodology used for local dev as used in datacentre environments
▸ Utilised Bash, Terraform, and libvirt. Nothing else
▸ Locally the process was fully automated and would complete in under 8 minutes
▸ DC build was (slightly) more controlled
provisioner "remote-exec" { inline = [ "sudo bash -x /var/tmp/infra/3_configure_control_plane_servers.sh", ] }
variable "control_plane_node_nomad_client_tls_key" { description = "The TLS key for the control plane Nomad client." type = "string" }
control_plane_node_nomad_client_tls_key = “${module.vault_data.nomad_client_pki_private_key}"
provisioner "file" { content = “${var.control_plane_node_nomad_client_tls_key}" destination = “/var/tmp/infra/nomad_client_key.pem” }
BOOTSTRAPPING
PROCESS STEPS
▸ Dependencies meant the base infrastructure needed to be built in a strictly controlled order
▸ The Vault cluster in particular was built in five stages
▸ VMs > Vault install > Vault init/unseal > Vault PKI > Vault Gotun > Nomad server install > Vault Nomad client install > workload pool Nomad client install
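▸ A sketch of how that ordering was enforced in Terraform (module and output names are illustrative): threading one stage's outputs into the next stage's inputs makes the dependency graph serialise the builds:
module "vault_pki" {
  source = "./modules/vault-pki"
}

module "nomad_servers" {
  source = "./modules/nomad-servers"

  # The Nomad servers cannot be provisioned until the Vault PKI
  # stage has issued their certificates.
  server_tls_key = "${module.vault_pki.nomad_server_pki_private_key}"
}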
POST BOOTSTRAP
POST BOOTSTRAP
FIRST STEPS AFTER BOOTSTRAP PROCESS
▸ Stop the gotun process running as a Vault proxy and remove binary (batch job)
▸ Start Fabio for Vault proxying, using traffic shaping to direct traffic to the active node (system job)
▸ Start the Consul server job (service job) and then the Consul client job (system job)
template {
  data = <<EOH
#!/bin/bash
set -e

# Stop the gotun service and clean up any processes that survive it.
systemctl stop gotun

REMAINING_PS=$(pgrep gotun || true)
if [[ -n ${REMAINING_PS} ]]; then
  echo "${REMAINING_PS}" | xargs kill
fi

rm -f /usr/local/bin/gotun
rm -f /etc/gotun/config.yaml
EOH

  destination = "local/stop-gotun.sh"
  change_mode = "noop"
  perms       = "777"
}
consul kv put fabio/config/vault "route weight vault / weight 1.00 tags \"active\""
service_tags = "fabio-vault-urlprefix-/ proto=http"
TLS MANAGEMENT
TLS
SHORT TTL TLS AS DEFAULT
▸ Full TLS encryption used from day 1 on infrastructure applications and platform applications
▸ The longest TTL used for a TLS certificate was 720h; short-lived apps used the shortest possible TTLs
▸ TLS took a while to get stable, and would likely have been impossible or hugely time consuming as an afterthought
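▸ As a sketch (the role name and domain are illustrative), the 720h ceiling maps directly onto a Vault PKI role:
vault write pki/roles/nomad-client \
    allowed_domains="global.nomad" \
    allow_subdomains=true \
    max_ttl="720h"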
TLS
CERT-MANAGER APPLICATION
▸ Wrote a small application to perform TLS expiry and IPSAN difference checks
▸ Certificates can be automagically replaced if either check fails
▸ Application can run arbitrary commands after replacing a certificate to force TLS rotation
▸ Managed and maintained Nomad, Consul and Vault certificates (batch cron job)
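▸ A sketch of how such a batch cron job might be declared (the schedule and subcommand are illustrative, not the real app's interface):
job "cert-manager" {
  datacenters = ["dc1"]
  type        = "batch"

  periodic {
    cron             = "0 */6 * * *" # illustrative schedule
    prohibit_overlap = true
  }

  group "tls" {
    task "cert-manager" {
      driver = "exec"

      config {
        command = "cert-manager" # the custom application
        args    = ["check"]      # hypothetical subcommand
      }
    }
  }
}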
TLS
CERT-MANAGER APPLICATION
▸ Used interesting deployment logic to ensure it would run as a batch on all clients of a class. System Batch, pretty please.
▸ Performed restart calls on applications that didn't support SIGHUP reload (see the sketch below)
▸ Please, please, please: always plan for TLS reload support in both the app and its downstream connections
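▸ Where an app does support SIGHUP, Nomad's template stanza can deliver the reload natively; a sketch (file names are illustrative):
template {
  source      = "local/cert.pem.tmpl"
  destination = "local/cert.pem"

  # Send SIGHUP to the task when the rendered certificate changes,
  # instead of relying on an external restart call.
  change_mode   = "signal"
  change_signal = "SIGHUP"
}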
TLS
BUNDLE SPLITTING
▸ Vault TLS bundles require splitting into component parts for use
▸ Used template stanzas and `hairyhenderson/gomplate` to perform splitting magic
template {
  data = <<EOH
{{ with secret "pki/issue/rand" "ttl=15m" "common_name=rand.common" "ip_sans=1.1.1.1" "format=pem" }}
{{ .Data | toJSON }}
{{ end }}
EOH

  destination = "local/bundle.json"
  change_mode = "restart"
}
template {
  left_delimiter  = "(("
  right_delimiter = "))"

  data = <<EOH
{{- printf "%s\n" (datasource "bundle").private_key -}}
EOH

  destination = "local/rand.pem.tmpl"
  perms       = "600"
  change_mode = "noop"
}
config {
  command = "gomplate"

  args = [
    "-d", "bundle=file://${NOMAD_TASK_DIR}/bundle.json?type=application/json",
    "-f", "local/ca.pem.tmpl", "-o", "local/ca.pem",
    "-f", "local/rand.pem.tmpl", "-o", "local/rand.pem",
    "-f", "local/rand-key.pem.tmpl", "-o", "local/rand-key.pem",
    "--", "${NOMAD_TASK_DIR}/app-run-command",
    "--tls-key-path=local/rand-key.pem",
  ]
}
COCKROACH DB
MOST PEOPLE GET REALLY EXCITED ABOUT RUNNING A DATABASE INSIDE OF A CLUSTER MANAGER LIKE NOMAD; THIS IS GOING TO MAKE YOU LOSE YOUR JOB. GUARANTEED.
Kelsey Hightower - HashiConf 2016
COCKROACH DB
CLUSTERING SETUP
▸ CRDB servers placed across hosts using constraints to ensure HA & redundancy (service job)
▸ Ephemeral disks used to lower impact of allocations restarts or failures
▸ CRDB requires a one-off initialisation to make the cluster ready for use (batch job)
▸ Careful attention paid to job parameters and continual assessment
args = [
    "${NOMAD_TASK_DIR}/cockroach",
    "init",
    "--certs-dir=${NOMAD_TASK_DIR}",
    "--host=${meta.host}",
  ]
}
group "db-cluster" { ephemeral_disk { sticky = true }
count = 3
constraint { distinct_hosts = true }
constraint { operator = "=" attribute = "${meta.role}" value = “foobar" }
COCKROACH DB
SCHEMA MANAGEMENT
▸ Table schema stored alongside application code
▸ A small custom application was used to apply schema changes, and roll them back if needed, using `gobuffalo/packr` (batch job)
▸ DB can be seeded with data for development and testing purposes (batch job)
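▸ A sketch of the schema-apply batch job (the binary name and address are illustrative):
job "db-schema" {
  datacenters = ["dc1"]
  type        = "batch"

  group "migrate" {
    task "apply" {
      driver = "exec"

      config {
        command = "schema-manager" # hypothetical custom app bundling schema via packr
        args    = ["apply", "--addr=cockroach.service.consul:26257"]
      }
    }
  }
}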
BACKUPS
BACKUPS
BACKING UP DATA
▸ Cockroach DB tables and Consul backed up regularly to external storage using custom apps/wrappers (batch job)
▸ The backup applications had restore commands to allow for easy testing of backup data
▸ Remember: a backup isn’t a real backup until the restore is proved to work
▸ In a full DR situation, the platform could be fully restored in 15 minutes
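▸ A sketch of the Consul backup as a periodic batch job (the schedule is illustrative, and the real wrapper also shipped the snapshot to external storage):
job "consul-backup" {
  datacenters = ["dc1"]
  type        = "batch"

  periodic {
    cron             = "0 2 * * *"
    prohibit_overlap = true
  }

  group "backup" {
    task "snapshot" {
      driver = "exec"

      config {
        command = "/bin/sh"
        args    = ["-c", "consul snapshot save ${NOMAD_TASK_DIR}/backup.snap"]
      }
    }
  }
}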
MISC.
MISCELLANEOUS
INGRESS AND DISCOVERY
▸ A separate Fabio used to provide external access to services and UIs (system job)
▸ Internal application service discovery performed using a gRPC Consul resolver
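▸ A sketch of the external Fabio declared as a system job (addresses are illustrative):
job "fabio-ingress" {
  datacenters = ["dc1"]
  type        = "system" # one instance on every eligible client

  group "fabio" {
    task "fabio" {
      driver = "exec"

      config {
        command = "fabio"
        args    = ["-proxy.addr", ":9999", "-ui.addr", ":9998"]
      }
    }
  }
}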
MISCELLANEOUS
PATH TO PRODUCTION
▸ The entire platform can be built on a local developer machine giving an exact replica of production
▸ Infra tooling and process can be developed and tested in the same manner as application code
▸ All deployments performed using TeamCity, Levant and Nomad-Toast (service job) for automation and observability
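▸ For example, a single CI deployment step might look like this (file names are illustrative):
levant deploy -var-file=production.yaml templates/app.nomad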
MISCELLANEOUS
MONITORING
▸ Consul health checking used extensively, with alerts routed to OpsGenie (service job); see the sketch below
▸ All logs shipped to Humio using FileBeat for log shipping (system job)
▸ Consul, Nomad, Vault, and app metrics shipped to Circonus for analysis and alerting (service job)
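▸ A sketch of a Consul health check wired in through the Nomad service stanza (the service name and endpoint are illustrative):
service {
  name = "example-app"
  port = "http"

  check {
    type     = "http"
    path     = "/healthz"
    interval = "10s"
    timeout  = "2s"
  }
}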
FINAL THOUGHT
FINAL THOUGHT
KEY POINTS
▸ Nomad managed all tasks apart from initial minimal bootstrapping
▸ Rely on a single mechanism for running all tasks and workloads
▸ A number of small applications were written to perform useful automation tasks