cfdb
Version information
This version is compatible with:
- Puppet Enterprise 2017.2.x, 2017.1.x, 2016.5.x, 2016.4.x
- Puppet >=4.0.0 <5.0.0
To start using this module, add the following to your Puppetfile:

```
mod 'codingfuture-cfdb', '0.9.3'
```
cfdb
Description
- Setup & auto tune service instances based on available resources:
- MySQL
- PostgreSQL
- a general framework to easily add new types is available
- Support auto-configurable clustering:
- Galera Cluster for MySQL
- repmgr for PostgreSQL
- High Availability and fail-over out-of-the-box
- Specialized HAProxy on each client system
- Automatic creation of databases
- Automatic upgrade of databases after DB software upgrade
- Complete management of user accounts (roles)
- automatic random generated password management
- full & read-only database roles
- custom grants support
- ensures max connections limits are met
- Easy access configuration
- automatic firewall setup on both client & server side
- automatic protection for roles x incoming hosts
- automatic protection for roles x max connections
- Automatic backup and automated restore
- Strict cgroup-based resource isolation on top of systemd integration
- Automatic DB connection availability checks
Terminology & Concepts

cluster
- infrastructure-wide unique associative name for a collection of distributed database instances. In standalone cases, there is only one instance per cluster.

instance
- database service process and related configuration. Each instance runs in its own cgroup for fair weighted usage of RAM, CPU and I/O resources.

primary instance
- the instance where dynamic setup of roles and databases occurs. It also gets all Read-Write users by default.

secondary instance
- a slave instance suitable for temporary automatic fail-over and Read-Only users. It is not allowed to define databases & roles in its configuration, unless the instance is switched to primary node in static configuration.

arbitrator
- an instance participating in quorum, but holding no data. Mitigates split-brain cases.

database
- a database instance per cluster. A role with the same name, holding all privileges on the database, is automatically created. There is no need to define such a role explicitly.

role
- a database user. Each role name is automatically prefixed with the related database name to prevent name collisions.

access
- a definition of a client system and user to access a particular role on a specific database of a specific cluster. All connection parameters are saved in a ".env" file in the user's home folder. It is possible to specify multiple access definitions per single system user using different .env variable prefixes. In case of a multi-node cluster, a local HAProxy reverse-proxy instance is implicitly created with the required High Availability configuration.
Technical Support
- Example configuration
- Commercial support: support@codingfuture.net
Setup
Please use librarian-puppet or the cfpuppetserver module to deal with dependencies. There is a known r10k issue RK-3 which prevents automatic installation of dependencies of dependencies.
IMPORTANT NOTES!!!
Please understand that PuppetDB is heavily used for auto-configuration. The drawback is that new facts are collected only on the run after the one that makes changes. The typical flow is:
- Provision instance with databases & roles
- Provision the same instance again (collect facts)
- Provision access locations
- Provision access locations again (collect facts)
- Provision instance (update configuration based on the access facts)
- Restart instances, if asked during provisioning
- Provision all nodes in cycle until there are no new changes and no restart is required
For cluster configuration:
- Provision primary node
- Provision primary node again (collect facts)
- Provision secondary nodes
- Provision secondary nodes again (collect facts)
- Provision primary node (configure firewall & misc. based on facts)
- Provision secondary nodes again (clone/setup)
- Restart instances, if asked during provisioning
- Provision all nodes in cycle until there are no new changes and no restart is required
As available system memory is automatically distributed among all registered services:

- Please make sure to restart services after the memory distribution changes (or you may run into trouble).
- Please avoid mixing codingfuture-derived services and custom ones on the same system, or see the next point.
- Please make sure to reserve RAM using cfsystem_memory_weight for any custom co-located services. Example:
```puppet
cfsystem_memory_weight { 'my_own_services':
    ensure => present,
    weight => 100,
    min_mb => 100,
    max_mb => 1024,
}
```
Examples
Please check codingfuture/puppet-test for an example of a complete infrastructure configuration and Vagrant provisioning.

Example for a host running standalone instances, a PXC arbitrator & PostgreSQL slaves, plus related accesses:
```yaml
cfdb::iface: main
cfdb::instances:
  mysrv1:
    type: mysql
    databases:
      - db1_1
      - db1_2
    iface: vagrant
    port: 3306
  mysrv2:
    type: mysql
    databases:
      db2:
        roles:
          readonly:
            readonly: true
          sandbox:
            custom_grant: 'GRANT SELECT ON $database.* TO $user; GRANT SELECT ON mysql.* TO $user;'
  myclust1:
    type: mysql
    is_arbitrator: true
    port: 4306
  myclust2:
    type: mysql
    is_arbitrator: true
    port: 4307
    settings_tune:
      cfdb:
        secure_cluster: true
  pgsrv1:
    type: postgresql
    iface: vagrant
    databases:
      pdb1: {}
      pdb2: {}
  pgclust1:
    type: postgresql
    # there are repmgr issues
    #is_arbitrator: true
    is_secondary: true
    port: 5300
    settings_tune:
      cfdb:
        node_id: 3
  pgclust2:
    type: postgresql
    # there are repmgr issues
    #is_arbitrator: true
    is_secondary: true
    port: 5301
    settings_tune:
      cfdb:
        secure_cluster: true
        node_id: 3
cfdb::access:
  vagrant_mysrv2_db2:
    cluster: mysrv2
    role: db2
    local_user: vagrant
    max_connections: 100
  vagrant_mysrv2_db2ro:
    cluster: mysrv2
    role: db2readonly
    local_user: vagrant
    config_prefix: 'DBRO_'
    max_connections: 200
  vagrant_mysrv2_db2sandbox:
    cluster: mysrv2
    role: db2sandbox
    local_user: vagrant
    config_prefix: 'DBSB_'
  vagrant_myclust1_db1:
    cluster: myclust1
    role: db1
    local_user: vagrant
    config_prefix: 'DBC1_'
  vagrant_myclust1_db2:
    cluster: myclust1
    role: db2
    local_user: vagrant
    config_prefix: 'DBC2_'
  vagrant_pgsrv1_pdb1:
    cluster: pgsrv1
    role: pdb1
    local_user: vagrant
    config_prefix: 'PDB1_'
```
For another related host running primary nodes of clusters:
```yaml
cfdb::iface: main
cfdb::mysql::is_cluster: true
cfdb::instances:
  myclust1:
    type: mysql
    is_cluster: true
    databases:
      db1:
        roles:
          ro:
            readonly: true
      db2: {}
    port: 3306
    # Not for production, but for vagrant bootup
    is_bootstrap: true
  myclust2:
    type: mysql
    is_cluster: true
    databases:
      - db1
      - db2
    port: 3307
    # Not for production, but for vagrant bootup
    is_bootstrap: true
    settings_tune:
      cfdb:
        secure_cluster: true
  pgclust1:
    type: postgresql
    is_cluster: true
    databases:
      pdb1:
        roles:
          ro:
            readonly: true
      pdb2: {}
    port: 5300
  pgclust2:
    type: postgresql
    is_cluster: true
    databases:
      - pdb3
      - pdb4
    port: 5301
    settings_tune:
      cfdb:
        secure_cluster: true
```
For a third related host running secondary nodes of the clusters:
```yaml
cfdb::iface: main
cfdb::mysql::is_cluster: true
cfdb::instances:
  myclust1:
    type: mysql
    is_secondary: true
    port: 3306
  myclust2:
    type: mysql
    is_secondary: true
    port: 3307
    settings_tune:
      cfdb:
        secure_cluster: true
  pgclust1:
    type: postgresql
    is_secondary: true
    port: 5300
  pgclust2:
    type: postgresql
    is_secondary: true
    port: 5301
    settings_tune:
      cfdb:
        secure_cluster: true
```
Implicitly created resources
```yaml
# for every instance
#------------------
cfnetwork::describe_service:
  "cfdb_${cluster}":
    server: "tcp/${port}"
# local cluster system user access to own instance
cfnetwork::service_port:
  "local:cfdb_${cluster}": {}
cfnetwork::client_port:
  "local:cfdb_${cluster}":
    user: $user
# client access to local cluster instance
cfnetwork::service_port:
  "${iface}:cfdb_${cluster}":
    src: $client_hosts

# for each Galera cluster (inter-node comms)
# > access to local instance ports
cfnetwork::describe_service:
  "cfdb_${cluster}_peer":
    server: "tcp/${port}"
  "cfdb_${cluster}_galera":
    server:
      - "tcp/${galera_port}"
      - "udp/${galera_port}"
  "cfdb_${cluster}_sst":
    server: "tcp/${sst_port}"
  "cfdb_${cluster}_ist":
    server: "tcp/${ist_port}"
cfnetwork::service_port:
  "${iface}:cfdb_${cluster}_peer":
    src: $peer_addr_list
  "${iface}:cfdb_${cluster}_galera":
    src: $peer_addr_list
  "${iface}:cfdb_${cluster}_sst":
    src: $peer_addr_list
  "${iface}:cfdb_${cluster}_ist":
    src: $peer_addr_list
# > access to remote cluster instances
cfnetwork::describe_service:
  "cfdb_${cluster}_peer_${host_underscore}":
    server: "tcp/${peer_port}"
  "cfdb_${cluster}_galera_${host_underscore}":
    server:
      - "tcp/${galera_port}"
      - "udp/${galera_port}"
  "cfdb_${cluster}_sst_${host_underscore}":
    server: "tcp/${sst_port}"
  "cfdb_${cluster}_ist_${host_underscore}":
    server: "tcp/${ist_port}"
cfnetwork::client_port:
  "${iface}:cfdb_${cluster}_peer_${host_underscore}":
    dst: $peer_addr
    user: $user
  "${iface}:cfdb_${cluster}_galera_${host_underscore}":
    dst: $peer_addr
    user: $user
  "${iface}:cfdb_${cluster}_sst_${host_underscore}":
    dst: $peer_addr
    user: $user
  "${iface}:cfdb_${cluster}_ist_${host_underscore}":
    dst: $peer_addr
    user: $user

# for each repmgr PostgreSQL cluster (inter-node comms)
# > access to local instance ports
cfnetwork::describe_service:
  "cfdb_${cluster}_peer":
    server: "tcp/${port}"
cfnetwork::service_port:
  "${iface}:cfdb_${cluster}_peer":
    src: $peer_addr_list
# > access to remote cluster instances
cfnetwork::describe_service:
  "cfdb_${cluster}_peer_${host_underscore}":
    server: "tcp/${peer_port}"
cfnetwork::client_port:
  "${iface}:cfdb_${cluster}_peer_${host_underscore}":
    dst: $peer_addr
    user: $user

# for each cluster node requiring SSH access (e.g. repmgr)
cfnetwork::client_port:
  "${iface}:cfssh:cfdb_${cluster}_${host_underscore}":
    dst: $peer_addr
    user: $user
cfnetwork::service_port:
  "${iface}:cfssh:cfdb_${cluster}_${host_underscore}":
    src: $peer_addr

# for every cfdb::access when HAProxy is NOT used
#------------------
cfnetwork::describe_service:
  "cfdb_${cluster}":
    server: "tcp/${port}"
cfnetwork::client_port:
  "any:cfdb_${cluster}:${local_user}":
    dst: $cluster_hosts
    user: $local_user

# for every cfdb::access when HAProxy IS used
#------------------
cfnetwork::describe_service:
  "cfdbha_${cluster}_${port}":
    server: "tcp/${port}"
cfnetwork::client_port:
  "any:${fw_service}:${host_underscore}":
    dst: $addr
    user: $cfdb::haproxy::user
```
class cfdb
parameters
This is a full-featured class to use with Hiera.

$instances = {}
- configurations for cfdb::instance resources (Hiera-friendly)

$access = {}
- configurations for cfdb::access resources (Hiera-friendly)

$iface = 'any'
- database network-facing interface

$root_dir = '/db'
- root folder under which instance home folders are created

$max_connections_default = 10
- default value for $cfdb::access::max_connections

$backup = true
- default value for $cfdb::instance::backup
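These parameters can be set at the top level via Hiera; a minimal sketch restating the defaults above:

```yaml
cfdb::iface: 'any'
cfdb::root_dir: '/db'
cfdb::max_connections_default: 10
cfdb::backup: true
```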
class cfdb::backup
parameters
This class is included automatically on demand.
$cron = { hour => 3, minute => 10 }
- default cron config for periodic auto-backup

$root_dir = '/mnt/backup'
- root folder for instance backup sub-folders
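As a sketch, the backup schedule and location can be tuned via Hiera (the cron values below are arbitrary examples, not the defaults):

```yaml
cfdb::backup::cron:
  hour: 2
  minute: 30
cfdb::backup::root_dir: '/mnt/backup'
```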
class cfdb::haproxy
parameters
This class is included automatically on demand.
$memory_weight = 1
- weighted amount of memory to reserve for HAProxy
- Note: an optimal minimal amount is automatically reserved based on the max number of connections

$memory_max = undef
- possible max memory limit

$cpu_weight = 100
- CPU weight for cgroup isolation

$io_weight = 100
- I/O weight for cgroup isolation

$settings_tune = {}
- do not use, unless you know what you are doing. Mostly left for exceptional in-field cases.
class cfdb::mysql
parameters
This class is included automatically on demand.
$is_cluster = false
- if true, Percona XtraDB Cluster is installed instead of Percona Server

$percona_apt_repo = 'http://repo.percona.com/apt'
- Percona APT repository location

$version = '5.7'
- version of Percona Server to use

$cluster_version = '5.6'
- version of Percona XtraDB Cluster (PXC) to use
class cfdb::postgresql
parameters
This class is included automatically on demand.
$version = '9.5'
- version of PostgreSQL to use

$default_extensions = true
- install the default extension list, if true
- Default list: 'asn1oid', 'debversion', 'ip4r', 'partman', 'pgespresso', 'pgextwlist', 'pgmp', 'pgrouting', 'pllua', 'plproxy', 'plr', 'plv8', "postgis-${postgis_ver}", 'postgis-scripts', 'powa', 'prefix', 'preprepare', 'repack', 'repmgr', 'contrib', 'plperl', 'plpython', 'pltcl'

$extensions = []
- custom list of extensions to install

$apt_repo = 'http://apt.postgresql.org/pub/repos/apt/'
- PostgreSQL APT repository location
type cfdb::access
parameters
This type defines a client with specific properties for auto-configuration of instances.

$cluster
- unique cluster name

$role
- unique role name within the cluster (note: roles defined under databases must be prefixed with the database name)

$local_user
- local user to create the .env configuration for. The user resource must be defined with the $home parameter.

$use_proxy = 'auto'
- do not change the default (for future use)

$max_connections = $cfdb::max_connections_default
- max number of client connections for the particular case

$config_prefix = 'DB_'
- variable prefix for the .env file. The following variables are defined:
  - 'HOST', 'PORT', 'SOCKET', 'USER', 'PASS', 'DB', 'TYPE'
  - 'CONNINFO' - only for PostgreSQL

$env_file = '.env'
- name of the dot-env file relative to $home of the user

$iface = $cfdb::iface
- DB network-facing interface

$custom_config = undef
- name of a custom resource to instantiate with the following parameters:
  - cluster - related cluster name
  - role - related role name
  - local_user - related local user name
  - max_connections - max connections allowed for this access
  - client_host - expected host for outgoing connections
  - config_vars - hash of configuration variables in lower case (see above)
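A client script typically sources the generated dot-env file and reads the prefixed variables. The sketch below fabricates a sample file to show the pattern; the real file is written by cfdb::access into the user's $home, and all values here are made up:

```shell
# Sketch: consume the variables cfdb::access writes to the dot-env file.
# The sample file below is fabricated for illustration only.
cat > /tmp/example.env <<'EOF'
DB_HOST=127.0.0.1
DB_PORT=3306
DB_USER=db2_app
DB_DB=db2
DB_TYPE=mysql
EOF

set -a              # auto-export everything sourced below
. /tmp/example.env
set +a

# prints: connect mysql://db2_app@127.0.0.1:3306/db2
echo "connect ${DB_TYPE}://${DB_USER}@${DB_HOST}:${DB_PORT}/${DB_DB}"
```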
type cfdb::database
parameters
This type must be used only on the primary instance of a cluster.

$cluster
- unique cluster name

$database
- database name

$roles = undef
- configuration for cfdb::role resources (Hiera-friendly)
type cfdb::instance
parameters
Defines and auto-configures instances.

$type
- type of cluster, e.g. mysql, postgresql

$is_cluster = false
- if true, the instance is configured with clustering in mind

$is_secondary = false
- if true, a secondary node is assumed

$is_bootstrap = false
- if true, forces cluster bootstrap (should be used only TEMPORARILY, for recovery purposes)

$is_arbitrator = false
- if true, assumes a witness node for quorum with no data

$memory_weight = 100
- relative memory weight for automatic configuration based on available RAM

$memory_max = undef
- max memory the instance can use in auto-configuration

$cpu_weight = 100
- relative CPU weight for cgroup isolation

$io_weight = 100
- relative I/O weight for cgroup isolation

$target_size = 'auto'
- expected database size in bytes ('auto' detects it based on partition size)

$settings_tune = {}
- very specific fine-tuning. See below.

$databases = undef
- configuration for cfdb::database resources

$iface = $cfdb::iface
- DB network-facing interface

$port = undef
- force a specific network port (mandatory, if $is_cluster)

$backup = $cfdb::backup
- if true, automatic scheduled backup is enabled

$backup_tune = { base_date => 'month' }
- overrides $type-specific backup script parameters. See below.

$ssh_key_type = 'ed25519'
- SSH key type for in-cluster communication

$ssh_key_bits = 2048
- SSH key bits for RSA
Backup & restore
Each instance has ~/bin/cfdb_backup and ~/bin/cfdb_restore scripts installed to perform manual backup and manual restore from backup, respectively. For safety reasons, restore asks to input two different confirmation phrases.
There are two types of backup: base and incremental. The type of backup is detected automatically based on the base_date option, which can be set through $cfdb::instance::backup_tune.

Possible values for base_date:

- 'year' - '%Y'
- 'quarter' - "%Y-Q$(( $(/bin/date +%m) / 4 + 1 ))"
- 'month' - '%Y-%m'
- 'week' - '%Y-W%W'
- 'day' - '%Y-%m-%d'
- 'daytime' - '%Y-%m-%d_%H%M%S'
- any other value accepted as a date format
If $cfdb::instance::backup is true, then a bin/cfdb_backup_auto symlink is created. These symlinks are automatically called in sequence during the system-wide cron-based backup to minimize stress on the system.
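For example, switching an instance to weekly base backups through $cfdb::instance::backup_tune (the instance name is reused from the examples above):

```yaml
cfdb::instances:
  mysrv1:
    type: mysql
    backup_tune:
      base_date: 'week'
```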
Other commands & configurations
/opt/codingfuture/bin/cfdb_backup_all is installed and used in a periodic cron job for sequential instance backup with minimized stress on the system.

MySQL
- ~/.my.cnf is properly configured for the mysql client to work without parameters.
- ~/bin/cfdb_sysbench is installed for easy sysbench invocation.

PostgreSQL
- ~/bin/cfdb_psql is installed to properly invoke psql with the required parameters.
- ~/bin/cfdb_repmgr is installed to properly invoke repmgr with the required parameters.
- ~/.pgpass is properly configured for the superuser and repmgr.
- ~/.pg_service.conf is properly configured to be used with ~/bin/cfdb_psql.
$settings_tune magic

MySQL

Quite simple: every key is a section name in the MySQL INI file, and each value is a hash of that section's variable => value pairs. Note: some configuration variables are enforced by CFDB.

However, there is a special "cfdb" section, which is interpreted differently. Its special keys are:

- optimize_ssd - if true, assume the data directory is located on high-IOPS hardware
- secure_cluster - if true, use Puppet PKI based TLS for inter-node communication
- shared_secret - DO NOT USE, for internal cluster purposes
- max_connections_roundto = 100 - ceil max_connections to a multiple of that
- listen = 0.0.0.0 - address to listen on, if external connections are detected based on cfdb::access or $is_cluster
- inodes_min = 1000 and inodes_max = 10000 - set gates for automatic calculation of mysqld.table_definition_cache and mysqld.table_open_cache
- open_file_limit_roundto = 10000 - ceil mysqld.open_file_limit to a multiple of that
- binlog_reserve_percent = 10 - percent of $target_size to reserve for binary logs
- default_chunk_size = 2 * GB - default for innodb_buffer_pool_chunk_size
- innodb_buffer_pool_roundto = 1GB or 128MB - rounding of the memory available for the InnoDB buffer pool. The default depends on the actual amount of RAM available.
- wsrep_provider_options = {} - overrides for some of the wsrep_provider_options of Galera Cluster
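As a sketch, a Hiera instance definition combining a plain mysqld INI override with the special cfdb section (the variable values are illustrative, not recommendations):

```yaml
cfdb::instances:
  mysrv1:
    type: mysql
    settings_tune:
      mysqld:
        max_allowed_packet: 67108864
      cfdb:
        optimize_ssd: true
        max_connections_roundto: 100
```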
PostgreSQL

Similar to MySQL. A section named 'postgresql' overrides some of the configuration values in postgresql.conf - some other variables are enforced.

However, there is also a special "cfdb" section:

- optimize_ssd - if true, assume the data directory is located on high-IOPS hardware
- secure_cluster - if true, use Puppet PKI based TLS for inter-node communication
- shared_secret - DO NOT USE, for internal cluster purposes
- strict_hba_roles = true - if true, the HBA configuration strictly matches each role to its hosts instead of using "all" for the match. This imitates the same host-based security as provided by MySQL.
- max_connections_roundto = 100 - ceil max_connections to a multiple of that
- listen = 0.0.0.0 - address to listen on, if external connections are detected based on cfdb::access or $is_cluster
- inodes_min = 1000 and inodes_max = 10000 - set gates for automatic calculation of inodes_used, which participates in: postgresql.max_files_per_process = 3 * (max_connections + inodes_used)
- open_file_limit_roundto = 10000 - ceil postgresql.max_files_per_process to a multiple of that
- shared_buffers_percent = 20 - percent of allowed RAM to reserve for shared_buffers
- temp_buffers_percent = 40 - percent of allowed RAM to reserve for temp_buffers
- temp_buffers_overcommit = 8 - ratio of allowed temp_buffers overcommit
- node_id - node ID for repmgr. If not set, the last digits of the hostname are used as the ID.
- upstream_node_id - upstream node ID for repmgr. If not set, the primary instance is used.
- locale = 'en_US.UTF-8' - locale to use for initdb
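A similar sketch for PostgreSQL, mixing a postgresql.conf override with the special cfdb section (the values are illustrative only):

```yaml
cfdb::instances:
  pgsrv1:
    type: postgresql
    settings_tune:
      postgresql:
        log_min_duration_statement: 1000
      cfdb:
        strict_hba_roles: true
        locale: 'en_US.UTF-8'
```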
HAProxy

- Top-level keys match haproxy.conf sections (e.g. global, defaults, frontend XXX, backend YYY, etc.)
- If a section is missing, it is created
- Top-level values must be hashes of section definitions
- A nil value is interpreted as "delete section"
- Due to the quite messy HAProxy configuration format, please check lib/puppet/provider/cfdb_haproxy/cfdbb.rb for how to properly override entries (some of them include a space, like "timeout client")
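A minimal sketch of such an override (the exact entry names, including the embedded spaces, should be verified against the provider source):

```yaml
cfdb::haproxy::settings_tune:
  defaults:
    'timeout client': '60s'
```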
Change Log
All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning.
0.9.1
- Major refactoring to support provider mixins per database type
- Fixed PostgreSQL HBA files with strict_hba_roles
- Fixed to check that DB services are running
- HAProxy improvements
- changed to always use custom cluster-aware health-check scripts
- changed to use special reverse-proxy socket for health checks
- changed to support frontend secure connection based on Puppet PKI
- fixed to properly support PostgreSQL UNIX sockets
- Fixed missing DB variable for .env files
- Added PostgreSQL-specific CONNINFO variable for cfdb::access
- Fixed to properly configure access & max connections on secondary servers
- Implemented automatic check for cfdb::access connection availability
- Fixed issues with roles not getting updated after transition error
- Fixed some cases when PostgreSQL roles were not getting created
- Added support for custom config variable resource for a sort of polymorphism in cfdb::access
0.9.0
Initial release
Dependencies
- puppetlabs-stdlib (>= 4.12.0)
- codingfuture-cfnetwork (>= 0.9.8)
- codingfuture-cfsystem (>= 0.9.14)
- dalen-puppetdbquery (>= 2.1.1)