
1. What to Prepare Before Starting

  • Create three VMs with the following specification:
    • 4 vCPUs
    • 4 GB RAM
    • 100 GB Storage Disk
    • Operating System : Ubuntu 18.04 LTS
    • 6 x Network Interfaces
  • Networking map:

    interface    network name   ip address        mtu    bonding
    ens3, ens4   oam            10.101.1.xx/24    1500   bondm
    ens5, ens6   bond0          -                 9000   bond0
    ens7, ens8   bond1          -                 1500   bond1

    interface   vlan id   cidr              mtu    bond master   purpose
    bond0.5     5         192.168.5.0/24    9000   bond0         internal
    bond0.6     6         192.168.6.0/24    9000   bond0         ceph replication
    bond0.8     8         192.168.8.0/24    9000   bond0         overlay
    bond0.10    10        10.11.12.0/24     9000   bond0         external
    bond0.11    11        10.11.12.0/24     9000   bond0         dns
    bond1.7     7         7.8.9.0/24        9000   bond1         ceph access

2. MAAS: Install & Configure

2.1. Add Necessary Repository and Install the Required Packages.

  • Map the node hostnames in /etc/hosts and generate an SSH key pair.
cat << EOF >> /etc/hosts

# MAAS Cluster
10.101.1.6   sofyan01-maas01-rack01.cloud.sofyan.dev   sofyan01-maas01-rack01
10.101.1.8   sofyan02-maas02-rack02.cloud.sofyan.dev   sofyan02-maas02-rack02
10.101.1.10  sofyan03-maas03-rack03.cloud.sofyan.dev   sofyan03-maas03-rack03
10.101.1.5   maas-vip
EOF
ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa <<<y >/dev/null 2>&1
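Appending with cat << EOF >> /etc/hosts adds the same lines again every time the step is re-run. A minimal sketch of an idempotent alternative follows; add_host_entry and the HOSTS_FILE override are illustrative names introduced here, not part of any tool.

```shell
# Append "IP   NAMES..." to the hosts file only when the IP is not listed yet.
# HOSTS_FILE (hypothetical override) defaults to /etc/hosts so the function
# can be tried against a scratch file first.
add_host_entry() {
    ip="$1"; shift
    file="${HOSTS_FILE:-/etc/hosts}"
    # -F: match dots literally; -w: do not match 10.101.1.6 inside 10.101.1.60
    if ! grep -qwF -- "$ip" "$file" 2>/dev/null; then
        printf '%s   %s\n' "$ip" "$*" >> "$file"
    fi
}
```

For example, add_host_entry 10.101.1.6 sofyan01-maas01-rack01.cloud.sofyan.dev sofyan01-maas01-rack01 can be run repeatedly without growing the file.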
  • Install MAAS and the required packages.
sudo apt-add-repository -y ppa:maas/stable
sudo apt update
sudo apt-get install maas jq wget sshpass -y

2.2. Configure PostgreSQL Replicated Cluster

  • Install PostgreSQL Automatic Failover (PAF).
wget https://github.com/ClusterLabs/PAF/releases/download/v2.3.0/resource-agents-paf_2.3.0-1_all.deb
sudo dpkg -i resource-agents-paf_2.3.0-1_all.deb
sudo bash -c "cat << EOF > /etc/tmpfiles.d/postgresql-part.conf
## Directory for PostgreSQL temp stat files
d /run/postgresql/10-main.pg_stat_tmp 0700 postgres postgres - -
EOF"

systemd-tmpfiles --create /etc/tmpfiles.d/postgresql-part.conf
  • Configure PostgreSQL.
su - postgres -c "cat << EOF >> /etc/postgresql/10/main/postgresql.conf

listen_addresses = '*'
max_connections = 300
wal_level = hot_standby
synchronous_commit = on
archive_mode = off
max_wal_senders = 10
wal_keep_segments = 256
hot_standby = on
restart_after_crash = off
hot_standby_feedback = on
EOF"

# Comment out the default replication rules; explicit per-node rules are appended in the next step.
sed -i -E 's/^(local|host)[[:space:]]+replication/#&/' /etc/postgresql/10/main/pg_hba.conf
  • Edit pg_hba.conf (HBA stands for host-based authentication) to allow replication and MAAS database access from the VIP and each node.
su - postgres -c "cat << EOF >> /etc/postgresql/10/main/pg_hba.conf

host replication postgres 10.101.1.5/32 trust
host replication postgres 10.101.1.6/32 trust
host replication postgres 10.101.1.8/32 trust
host replication postgres 10.101.1.10/32 trust

host maasdb maas 10.101.1.5/32 md5
host maasdb maas 10.101.1.6/32 md5
host maasdb maas 10.101.1.8/32 md5
host maasdb maas 10.101.1.10/32 md5
EOF"
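The pg_hba.conf rules above repeat one pattern per address, so they can also be generated from a list. This is a sketch only; gen_hba_entries is a hypothetical helper name, and the trust/md5 methods are copied from the rules above.

```shell
# Print one replication rule and one maasdb rule for every address given.
gen_hba_entries() {
    for ip in "$@"; do
        printf 'host replication postgres %s/32 trust\n' "$ip"
    done
    for ip in "$@"; do
        printf 'host maasdb maas %s/32 md5\n' "$ip"
    done
}

# Example (not executed here):
#   gen_hba_entries 10.101.1.5 10.101.1.6 10.101.1.8 10.101.1.10 \
#       >> /etc/postgresql/10/main/pg_hba.conf
```

Keeping the address list in one place makes it harder to forget a node when the cluster grows.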
  • Create the recovery.conf.pcmk file and assign a temporary IP address to broam9 (maas01 only).

su - postgres -c "cat << EOF > /etc/postgresql/10/main/recovery.conf.pcmk
standby_mode = on
primary_conninfo = 'host=10.101.1.5 port=5432 user=postgres application_name=sofyan01-maas01-rack01 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = ''
recovery_target_timeline = 'latest'
EOF"
systemctl restart postgresql
ip address add 10.101.1.5/24 dev broam9
  • Take a base backup of a running PostgreSQL database cluster (maas02 & maas03).
systemctl stop postgresql
su - postgres -c "rm -rf ~/10/main/"
su - postgres -c "pg_basebackup -h maas-vip -D ~postgres/10/main/ -U postgres -v -X stream -P"
su - postgres -c "cp /usr/share/postgresql/10/recovery.conf.sample /var/lib/postgresql/10/main/recovery.conf"
  • Create recovery.conf and recovery.conf.pcmk file (maas02 only).
su - postgres -c "cat << EOF > /etc/postgresql/10/main/recovery.conf.pcmk
standby_mode = on
primary_conninfo = 'host=10.101.1.5 port=5432 user=postgres application_name=sofyan02-maas02-rack02 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = ''
recovery_target_timeline = 'latest'
EOF"
su - postgres -c "cat << EOF > /var/lib/postgresql/10/main/recovery.conf
standby_mode = on
primary_conninfo = 'host=10.101.1.5 port=5432 user=postgres application_name=sofyan02-maas02-rack02 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = ''
recovery_target_timeline = 'latest'
EOF"

systemctl start postgresql
  • Create recovery.conf and recovery.conf.pcmk file (maas03 only).
su - postgres -c "cat << EOF > /etc/postgresql/10/main/recovery.conf.pcmk
standby_mode = 'on'
primary_conninfo = 'host=10.101.1.5 port=5432 user=postgres application_name=sofyan03-maas03-rack03 keepalives_interval=5 keepalives_count=5'
restore_command = ''
recovery_target_timeline = 'latest'
EOF"
su - postgres -c "cat << EOF > /var/lib/postgresql/10/main/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=10.101.1.5 port=5432 user=postgres application_name=sofyan03-maas03-rack03 keepalives_interval=5 keepalives_count=5'
restore_command = ''
recovery_target_timeline = 'latest'
EOF"

systemctl start postgresql
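The recovery files on the three nodes differ only in application_name, so one template can serve all of them. The sketch below assumes the same keepalive settings on every node (keepalives_idle=60 included); render_recovery_conf is a hypothetical helper name.

```shell
# Print a recovery.conf body for the given primary address and node name.
render_recovery_conf() {
    vip="$1"
    app_name="$2"
    cat <<EOF
standby_mode = on
primary_conninfo = 'host=${vip} port=5432 user=postgres application_name=${app_name} keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = ''
recovery_target_timeline = 'latest'
EOF
}

# Example (not executed here), run as the postgres user on maas02:
#   render_recovery_conf 10.101.1.5 sofyan02-maas02-rack02 \
#       > /etc/postgresql/10/main/recovery.conf.pcmk
```

The application_name should match the node name used in the Corosync nodelist, as it does in the files above.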
  • Verify the PostgreSQL Cluster is Replicated (maas01).
su - postgres -c 'psql -c "select client_addr,sync_state from pg_stat_replication;"' > /root/postgresql_replica.log
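Instead of reading the log by eye, the check can be scripted. The sketch below assumes the query is run with psql -At, which prints unaligned rows separated by "|"; check_standbys is a hypothetical helper name and the expected count of 2 reflects the two standbys in this setup.

```shell
# Read "client_addr|sync_state" rows on stdin and succeed only when at
# least the expected number of standbys is connected.
check_standbys() {
    expected="$1"
    count=$(awk -F'|' 'NF == 2 && $1 != "" { n++ } END { print n + 0 }')
    echo "connected standbys: $count"
    [ "$count" -ge "$expected" ]
}

# Example (not executed here), on maas01:
#   su - postgres -c 'psql -Atc "select client_addr,sync_state from pg_stat_replication;"' \
#       | check_standbys 2
```

A non-zero exit status means fewer standbys are streaming than expected, which is worth fixing before PostgreSQL is handed over to Pacemaker.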
  • Stop PostgreSQL on all nodes.
systemctl disable --now postgresql@10-main

2.3. Setup HA for PostgreSQL

  • Install HAProxy, Pacemaker, Corosync, and the required packages (all nodes).
apt-get install haproxy pacemaker corosync -y
apt-get install pcs crmsh -y
  • Configure Corosync.
cat << EOF > /etc/corosync/corosync.conf
totem {
  version: 2
  token: 3000
  token_retransmits_before_loss_const: 10
  join: 60
  consensus: 3600
  vsftype: none
  max_messages: 20
  clear_node_high_bit: yes
  secauth: off
  threads: 0
  ip_version: ipv4
  rrp_mode: none
  transport: udpu
}

quorum {
  provider: corosync_votequorum
}

nodelist {
  node {
    ring0_addr: sofyan01-maas01-rack01
    nodeid: 1000
  }
  node {
    ring0_addr: sofyan02-maas02-rack02
    nodeid: 1001
  }
  node {
    ring0_addr: sofyan03-maas03-rack03
    nodeid: 1002
  }
}


logging {
  fileline: off
  to_stderr: yes
  to_logfile: no
  to_syslog: yes
  syslog_facility: daemon
  debug: off
  logger_subsys {
    subsys: QUORUM
    debug: off
  }
}
EOF
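The nodelist section is the only part of corosync.conf that changes with cluster membership. As an illustration, it could be generated from the hostnames; gen_nodelist is a hypothetical helper name, and numbering from 1000 simply mirrors the node IDs used above.

```shell
# Print a corosync nodelist block, assigning node IDs 1000, 1001, ...
gen_nodelist() {
    nodeid=1000
    echo "nodelist {"
    for host in "$@"; do
        printf '  node {\n    ring0_addr: %s\n    nodeid: %d\n  }\n' "$host" "$nodeid"
        nodeid=$((nodeid + 1))
    done
    echo "}"
}

# Example (not executed here):
#   gen_nodelist sofyan01-maas01-rack01 sofyan02-maas02-rack02 sofyan03-maas03-rack03
```

Each ring0_addr must resolve on every node, which is what the /etc/hosts mapping from section 2.1 provides.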
  • Setup Pacemaker.
pcs cluster auth sofyan01-maas01-rack01 sofyan02-maas02-rack02 sofyan03-maas03-rack03 -u hacluster -p P@ssw0rd
pcs cluster setup --name ha-pgsql-maas sofyan01-maas01-rack01 sofyan02-maas02-rack02 sofyan03-maas03-rack03 --force
pcs cluster disable --all
pcs cluster start --all
  • Create pacemaker config file.
pcs cluster cib /root/pgsql_cfg
pcs -f /root/pgsql_cfg property set no-quorum-policy="ignore"
pcs -f /root/pgsql_cfg property set stonith-enabled="false"
pcs -f /root/pgsql_cfg resource defaults resource-stickiness="INFINITY"
pcs -f /root/pgsql_cfg resource defaults migration-threshold="1"

pcs -f /root/pgsql_cfg resource create pgsql ocf:heartbeat:pgsqlms \
    bindir="/usr/lib/postgresql/10/bin"                             \
    pgdata="/etc/postgresql/10/main"                                \
    datadir="/var/lib/postgresql/10/main"                           \
    op start   on-fail="restart" \
    op monitor interval="3s"  on-fail="restart" role="Master" \
    op monitor interval="4s" on-fail="restart"  role="Slave" \
    op promote on-fail="restart" \
    op demote  on-fail="stop" \
    op stop    on-fail="block" \
    op notify  timeout="60s"
pcs -f /root/pgsql_cfg resource master ms_pgsql pgsql notify=true

pcs -f /root/pgsql_cfg resource create res_pgsql_vip ocf:heartbeat:IPaddr2 \
    nic="broam9" \
    ip=10.101.1.5 \
    cidr_netmask=24 \
    op start   interval="0s"  on-fail="restart" \
    op monitor interval="4s" on-fail="restart" \
    op stop    interval="0s"  on-fail="block"

pcs -f /root/pgsql_cfg resource create res_maas_vip ocf:heartbeat:IPaddr2 \
    nic="broam9" \
    ip=10.101.1.11 \
    cidr_netmask=24 \
    op start   interval="0s"  on-fail="restart" \
    op monitor interval="4s" on-fail="restart" \
    op stop    interval="0s"  on-fail="block"

pcs -f /root/pgsql_cfg constraint colocation add res_pgsql_vip with master ms_pgsql INFINITY
pcs -f /root/pgsql_cfg constraint order promote ms_pgsql then start res_pgsql_vip symmetrical=false kind=Mandatory
pcs -f /root/pgsql_cfg constraint order demote ms_pgsql then stop res_pgsql_vip symmetrical=false kind=Mandatory

pcs cluster cib-push /root/pgsql_cfg
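Once the configuration is pushed, Pacemaker places res_pgsql_vip on the node running the PostgreSQL master. One quick way to see whether the current node holds a VIP is to look for the address in the output of ip -o -4 addr show. The sketch below assumes that output format; holds_address is a hypothetical helper name.

```shell
# Succeed if the given IPv4 address appears in `ip -o -4 addr show`
# output supplied on stdin.
holds_address() {
    addr="$1"
    # The trailing "/" keeps 10.101.1.5 from matching 10.101.1.50
    grep -qF -- "inet ${addr}/"
}

# Example (not executed here):
#   ip -o -4 addr show dev broam9 | holds_address 10.101.1.5 \
#       && echo "this node holds the PostgreSQL VIP"
```

The same check with 10.101.1.11 shows which node currently carries the MAAS VIP.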
  • Configure HAProxy.
cat << EOF > /etc/haproxy/haproxy.cfg
frontend maas
    bind    *:80
    retries 3
    option  redispatch
    option  http-server-close
    default_backend maas

backend maas
    timeout server 900s
    balance source
    hash-type consistent
    server maas-api-0 10.101.1.6:5240 check
    server maas-api-1 10.101.1.8:5240 check
    server maas-api-2 10.101.1.10:5240 check
EOF

  • Restart the HAProxy service.

systemctl restart haproxy.service
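The server lines in the backend follow a fixed pattern (maas-api-N, port 5240), so they can be produced from the list of region controller addresses. This is an illustrative sketch; gen_backend_servers is a hypothetical helper name.

```shell
# Print one HAProxy "server" line per MAAS API address, numbered from 0.
gen_backend_servers() {
    i=0
    for ip in "$@"; do
        printf '    server maas-api-%d %s:5240 check\n' "$i" "$ip"
        i=$((i + 1))
    done
}

# Example (not executed here):
#   gen_backend_servers 10.101.1.6 10.101.1.8 10.101.1.10
```

Before restarting the service, the resulting file can be validated with haproxy -c -f /etc/haproxy/haproxy.cfg, which parses the configuration without starting the proxy.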
  • Download HAProxy OCF Resource Agent.
cd /usr/lib/ocf/resource.d/heartbeat
curl -O https://raw.githubusercontent.com/thisismitch/cluster-agents/master/haproxy
chmod +x haproxy
  • Add HAProxy resource.
crm configure primitive haproxy ocf:heartbeat:haproxy \
op start interval="0" on-fail="restart" \
op monitor interval="4s" on-fail="restart" \
op stop interval="0" on-fail="block"

crm configure clone haproxy-clone haproxy
crm configure colocation colocation-res_maas_vip-haproxy-clone inf: res_maas_vip haproxy-clone

2.4. Setup MAAS

  • Reconfigure MAAS (on all nodes).
sed -i 's/database_host: .*/database_host: 10.101.1.5/g' /etc/maas/regiond.conf
sed -i 's/maas_url: .*/maas_url: http:\/\/10.101.1.11:80\/MAAS/g' /etc/maas/regiond.conf
sed -i 's/- http.*/- http:\/\/10.101.1.11:80\/MAAS/g' /etc/maas/rackd.conf
systemctl restart maas-regiond.service
systemctl restart maas-rackd.service
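The escaped slashes in the URL substitutions are easy to get wrong. A sketch of the same regiond.conf change using "|" as the sed delimiter and anchored patterns is shown below; set_maas_endpoints is a hypothetical helper name, and the function assumes GNU sed, as shipped with Ubuntu.

```shell
# Point a MAAS region config file at a database host and MAAS URL.
# "|" as the delimiter means the slashes in the URL need no escaping.
set_maas_endpoints() {
    conf="$1"
    db_host="$2"
    url="$3"
    sed -i \
        -e "s|^database_host: .*|database_host: ${db_host}|" \
        -e "s|^maas_url: .*|maas_url: ${url}|" \
        "$conf"
}

# Example (not executed here):
#   set_maas_endpoints /etc/maas/regiond.conf 10.101.1.5 http://10.101.1.11:80/MAAS
```

Running it against a copy of the file first makes it easy to diff the result before touching the live configuration.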
  • Initialize MAAS (maas01).
maas init --admin-username root --admin-password P@ssw0rd --admin-email maas@sofyan.dev
  • Log in to the MAAS CLI, set the MAAS name, and add the SSH key (maas01).
maas login root http://localhost:5240/MAAS $(maas apikey --username=root)
maas root maas set-config name=maas_name value="Sofyan Cloud-1"
maas root sshkeys create "key=$(cat /root/.ssh/id_rsa.pub)"
  • Create fabrics (maas01).
maas root fabrics create name=default
maas root fabrics create name=vlan7-bondm
maas root fabrics create name=vlan9-bond1
  • Create spaces (maas01).
maas root spaces create name=oam-space
maas root spaces create name=internal-space
maas root spaces create name=ceph-replica-space
maas root spaces create name=overlay-space
maas root spaces create name=external-space
maas root spaces create name=ceph-access-space
maas root spaces create name=dns-space
  • Set subnet names (maas01).
maas root subnets read | jq -r '.[] | [.vlan.fabric_id, .vlan.vid, .id, .cidr|tostring] | join("     ")' > /tmp/subnet-and-id.txt
maas root subnet update "$(cat /tmp/subnet-and-id.txt | grep 10.101.- | awk '{print $3}')" name=oam
maas root subnet update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $3}')" name=internal
maas root subnet update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $3}')" name=ceph_replication
maas root subnet update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $3}')" name=overlay
maas root subnet update "$(cat /tmp/subnet-and-id.txt | grep 10.11.1- | awk '{print $3}')" name=external
maas root subnet update "$(cat /tmp/subnet-and-id.txt | grep 10.11.1- | awk '{print $3}')" name=dns
maas root subnet update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $3}')" name=ceph_access
  • Assign a space to each VLAN (maas01).
maas root subnets read | jq -r '.[] | [.vlan.fabric_id, .vlan.vid, .id, .cidr|tostring] | join("     ")' > /tmp/subnet-and-id.txt
maas root vlan update "$(cat /tmp/subnet-and-id.txt | grep 10.101.- | awk '{print $1}')" "$(cat /tmp/subnet-and-id.txt | grep 10.101.- | awk '{print $2}')" space=oam-space
maas root vlan update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $1}')" "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $2}')" space=internal-space
maas root vlan update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $1}')" "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $2}')" space=ceph-replica-space
maas root vlan update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $1}')" "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $2}')" space=overlay-space
maas root vlan update "$(cat /tmp/subnet-and-id.txt | grep 10.11.1- | awk '{print $1}')" "$(cat /tmp/subnet-and-id.txt | grep 10.11.1- | awk '{print $2}')" space=external-space
maas root vlan update "$(cat /tmp/subnet-and-id.txt | grep 10.11.1- | awk '{print $1}')" "$(cat /tmp/subnet-and-id.txt | grep 10.11.1- | awk '{print $2}')" space=dns-space
maas root vlan update "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $1}')" "$(cat /tmp/subnet-and-id.txt | grep 192.168.- | awk '{print $2}')" space=ceph-access-space
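Matching on partial CIDR strings becomes ambiguous when several subnets share a prefix. Since /tmp/subnet-and-id.txt lists "fabric_id vid subnet_id cidr" per line, a lookup by VLAN ID is one alternative. The sketch below assumes each VLAN ID appears only once across fabrics, as in the networking map in section 1; both function names are hypothetical.

```shell
# Look up values in the "fabric_id vid subnet_id cidr" listing produced
# by the jq command above. $1 is the VLAN ID, $2 the listing file.
subnet_id_for_vid() {
    awk -v vid="$1" '$2 == vid { print $3 }' "$2"
}

fabric_id_for_vid() {
    awk -v vid="$1" '$2 == vid { print $1 }' "$2"
}

# Example (not executed here), using VLAN 5 = internal from the networking map:
#   f=/tmp/subnet-and-id.txt
#   maas root subnet update "$(subnet_id_for_vid 5 "$f")" name=internal
#   maas root vlan update "$(fabric_id_for_vid 5 "$f")" 5 space=internal-space
```

The untagged oam network has no VLAN tag of its own, so it would still need to be selected by its CIDR.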
  • Register the other nodes as controllers (maas02 & maas03).
maas-rack register --url http://10.101.1.11:80/MAAS --secret "$(cat /home/jujumanage/secret)"

3. References

  • https://maas.io/docs/how-to-install-maas
  • https://maas.io/docs/how-to-manage-controllers
  • https://maas.io/docs/how-to-use-the-maas-cli
  • https://pgstef.github.io/2018/02/07/introduction_to_postgresql_automatic_failover.html
  • https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-resourceoperate-haar
  • https://www.digitalocean.com/community/tutorials/how-to-create-a-high-availability-setup-with-corosync-pacemaker-and-reserved-ips-on-ubuntu-14-04
  • https://wiki.clusterlabs.org/wiki/PgSQL_Replicated_Cluster