Virtual cluster
On the host, you must set these sysctl settings:
    net.bridge.bridge-nf-call-arptables = 0
    net.bridge.bridge-nf-call-ip6tables = 0
    net.bridge.bridge-nf-call-iptables = 0
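As a quick sanity check, the three settings above can be verified from Python by reading the matching files under /proc/sys. This is only a minimal sketch: the helper names `sysctl_path` and `mismatched_sysctls` are hypothetical, and the `root` parameter exists solely so the check can be pointed at a test directory.

```python
from pathlib import Path

# Settings taken from the list above; all three must be 0 on the host.
REQUIRED = {
    "net.bridge.bridge-nf-call-arptables": "0",
    "net.bridge.bridge-nf-call-ip6tables": "0",
    "net.bridge.bridge-nf-call-iptables": "0",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as 'net.bridge.x' to its file under root."""
    return Path(root) / key.replace(".", "/")


def mismatched_sysctls(required=REQUIRED, root="/proc/sys"):
    """Return {key: found_value_or_None} for each setting that differs."""
    bad = {}
    for key, wanted in required.items():
        path = sysctl_path(key, root)
        # A missing file is reported as None rather than raising.
        found = path.read_text().strip() if path.is_file() else None
        if found != wanted:
            bad[key] = found
    return bad
```

On a correctly configured host, `mismatched_sysctls()` should return an empty dictionary. A value of `None` means the file is absent, which typically happens when the `br_netfilter` kernel module is not loaded.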
You must define 3 network interfaces on each host of your cluster.
- One interface connects to a virtual network in NAT mode
- Two interfaces connect to two virtual networks with an MTU of 9000 (each one simulates an Ethernet cable between two machines)
Inventories
The inventory must define these hosts:
cluster_machines
: Set of hosts in the cluster
hypervisors
: Set of hosts that launch virtual machines
standalone_machine
: Defines a cluster composed of a single host (replaces cluster_machines)
Prerequisite
...
The inventory must define these variables:
ansible_connection
: Protocol used to connect to the machine
ansible_python_interpreter
: Path to the Python interpreter binary
ansible_ssh_common_args
: Additional arguments for the SSH connection
ansible_user
: Login used to connect to the machine
Playbooks
Prerequisite
Once the host is installed, the ansible/playbooks/cluster_setup_prerequisdebian.yaml playbook must be run to finish the installation.
The inventory must define these variables to run the playbook:
admin_user
: Default user with admin privileges
admin_passwd
: Password hash (optional)
admin_ssh_keys
: (optional)
apply_network_config
: Boolean to apply the network configuration
admin_ip_addr
: IP address for SNMP
cpumachinesnort
: Range of allowed CPUs for non-RT machines
cpumachines
: Range of allowed CPUs for machines (RT and non-RT)
cpumachinesrt
: Range of allowed CPUs for RT machines
cpuovs
: Range of allowed CPUs for Open vSwitch
cpusystem
: Range of allowed CPUs for the system
cpuuser
: Range of allowed CPUs for the user
irqmask
: Sets the IRQBALANCE_BANNED_CPUS environment variable; see the irqbalance manual
livemigration_user
:
logstash_server_ip
: IP address for the logstash-seapath alias in /etc/hosts
main_disk
: Main disk device whose temperature is monitored
workqueuemask
: The bitwise negation of irqmask (= ~irqmask)
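The relation workqueuemask = ~irqmask can be sketched as below. This is an illustrative computation only: it assumes both masks are written as hexadecimal CPU bitmasks, and the function name `workqueue_mask`, its `cpu_count` parameter, and the sample values are hypothetical. The complement is truncated to the number of CPUs so that it stays a finite mask.

```python
def workqueue_mask(irqmask_hex, cpu_count):
    """Return ~irqmask limited to cpu_count bits, as a zero-padded hex string."""
    all_cpus = (1 << cpu_count) - 1          # one bit set per CPU
    irqmask = int(irqmask_hex, 16)
    if irqmask & ~all_cpus:
        raise ValueError("irqmask references CPUs beyond cpu_count")
    width = (cpu_count + 3) // 4             # hex digits needed for cpu_count bits
    return format(~irqmask & all_cpus, f"0{width}x")


# Hypothetical 16-CPU host where CPUs 4-15 are banned from IRQ balancing.
print(workqueue_mask("fff0", 16))  # prints 000f
```

Every CPU ends up in exactly one of the two masks, which is the property the definition above describes.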
Network
The inventory must define these variables to run the step:
br_rstp_priority
: TODO Multiple of 4096
cluster_ip_addr
: IP address for the team0 interface
gateway_addr
: IP address of a gateway; the gateway does not have to be functional
ip_addr
: IP address used to communicate with the host
network_interface
: Network interface used to communicate with the host
ntp_primary_server
: Address of an NTP server; it is the first server queried
ntp_secondary_server
: Address of an NTP server; it is the second server queried
syslog_server_ip
: Address of a syslog server
team0_0
: Network interface to connect to the team0 bridge
team0_1
: Other network interface to connect to the team0 bridge
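The "multiple of 4096" constraint on br_rstp_priority can be checked before running the playbook. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the 0 to 61440 bound comes from the standard RSTP bridge priority range (IEEE 802.1D), not from this page.

```python
RSTP_PRIORITY_STEP = 4096
RSTP_PRIORITY_MAX = 61440  # assumption: standard upper bound, 15 * 4096


def is_valid_rstp_priority(priority):
    """True if priority is a multiple of 4096 within the standard range."""
    return (
        isinstance(priority, int)
        and 0 <= priority <= RSTP_PRIORITY_MAX
        and priority % RSTP_PRIORITY_STEP == 0
    )


# All 16 acceptable values under these assumptions.
valid_values = [n * RSTP_PRIORITY_STEP for n in range(16)]
```

A lower value makes a bridge more likely to be elected root, so each host would normally get a distinct value from `valid_values`.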
...
Warnings
At the end of this step, make sure that:
- Each host in the cluster can ping the others (with both simple and fragmented packets)
- Hosts are synchronized with the NTP server (this is required for the shared storage)
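To exercise both ping cases, one option is to derive the payload sizes from the interface MTU. The sketch below only builds the command lines and does not run them. It assumes IPv4 without IP options and the Linux iputils `ping` options `-c`, `-s` and `-M`; the function names and the address used in the example are hypothetical.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_unfragmented_payload(mtu):
    """Largest ICMP payload that still fits in a single IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_commands(host, mtu):
    """Return a simple ping and one whose payload is too large for one packet."""
    too_big = max_unfragmented_payload(mtu) + 1
    return [
        ["ping", "-c", "3", host],
        # "-M dont" leaves the DF flag unset so the packet may be fragmented.
        ["ping", "-c", "3", "-M", "dont", "-s", str(too_big), host],
    ]


# Hypothetical peer address on a network with an MTU of 9000.
for command in ping_commands("192.0.2.10", 9000):
    print(" ".join(command))
```

With an MTU of 9000 the largest payload that avoids fragmentation is 8972 bytes, so the second command sends 8973 bytes.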
Shared storage (via ceph)
The inventory may define these hosts to run the step completely:
clients
: Set of hosts that must be ceph-client. These hosts access the storage cluster
mons
: Set of hosts that must be ceph-mon. These hosts maintain a map of the state of the cluster
osds
: Set of hosts that must be ceph-osd. These hosts interact with the logical disk to store the data
More details in the documentation here.
The inventory must define these variables to run the step:
ceph_cluster_network
: Address block of the cluster network
ceph_public_network
: Address block of the public network (i.e. facing the outside world)
ceph_osd_disk
: Device used to store data (only for ceph-osd hosts)
osd_pool_default_min_size
: Minimum number of available OSDs to ensure cluster success (best: ceil(osd_pool_default_size / 2.0))
osd_pool_default_size
: Number of OSDs in the cluster
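The recommended value for osd_pool_default_min_size follows directly from the formula given above. A minimal worked example, with a hypothetical function name:

```python
import math


def recommended_min_size(osd_pool_default_size):
    """Apply ceil(osd_pool_default_size / 2.0) from the variable description."""
    if osd_pool_default_size < 1:
        raise ValueError("osd_pool_default_size must be at least 1")
    return math.ceil(osd_pool_default_size / 2.0)


for size in (2, 3, 4, 5):
    print(size, "->", recommended_min_size(size))
```

For a cluster of three hosts, for instance, the formula gives 2.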
...
Ceph provides Ansible roles to configure the software; you can read the documentation here.
...
In this part, the playbook defines the scheduling and the prioritization (see the section).
Hardening
The ansible/playbooks/cluster_setup_hardening_debian.yaml
playbook enables system hardening and the ansible/playbooks/cluster_setup_unhardening_debian.yaml
playbook disables it.
The hardened elements are:
- the kernel, through its command-line parameters (see the section below), sysfs and modules;
- GRUB;
- the systemd services;
- the addition of bash profiles;
- the SSH server;
- the addition of sudo rules;
- the shadow password suite configuration;
- the secure tty;
- the audit daemon.
Kernel
The project uses a real-time kernel: the Linux kernel with the PREEMPT_RT patch. It therefore needs the following parameters:
cpufreq.default_governor=performance
: Use the performance governor by default (more details here).
hugepagesz=1G
: Use a size of 1 gigabyte for HugeTLB pages (more details here).
intel_pstate=disable
: Disable intel_pstate as the default scaling driver for supported processors (more details here).
isolcpus=nohz,domain,managed_irq
: nohz to disable the tick when a single task runs; domain to isolate from the general SMP balancing and scheduling algorithms; managed_irq to isolate from being targeted by managed interrupts. See the Scheduling and prioritization section.
no_debug_object
: Disable object debugging.
nosoftlockup
: Disable the soft-lockup detector (more details here).
processors.max_cstate=1 and intel_idle.max_cstate=1
: Discard all idle states deeper than idle state 1, for the acpi_idle and intel_idle drivers, respectively (more details here).
rcu_nocbs
: See the Scheduling and prioritization section.
rcu_nocb_poll
: Make the kthreads poll for callbacks.
rcutree.kthread_prio=10
: Set the SCHED_FIFO priority of the RCU per-CPU kthreads.
skew_tick=1
: Help to smooth jitter on systems running latency-sensitive applications.
tsc=reliable
: Disable clocksource verification at runtime, as well as the stability checks done at bootup.
On a hardened system, the kernel has these parameters:
init_on_alloc=1
: Fill newly allocated pages and heap objects with zeroes.
init_on_free=1
: Fill freed pages and heap objects with zeroes.
slab_nomerge
: Disable merging of slabs with similar size.
pti=on
: Enable Page Table Isolation of user and kernel address spaces.
slub_debug=ZF
: Enable red zoning (Z) and sanity checks (F) for all slabs (more details here).
randomize_kstack_offset=on
: Enable kernel stack offset randomization.
slab_common.usercopy_fallback=N
:
iommu=pt
: Get best performance using the SR-IOV (TODO).
security=yama
: Select the yama security module to enable at boot.
mce=0
: Disable the time in us to wait for other CPUs on machine checks.
rng_core.default_quality=500
: Set the value of the entropy for the system.
lsm=apparmor,lockdown,capability,landlock,yama,bpf
: Set the order of LSM initialization.
More details on the kernel's parameters here.
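After a reboot, it can be useful to confirm that the expected parameters actually reached the kernel command line. The sketch below compares a command-line string with a list of expected parameters. The function names and the `EXPECTED_SUBSET` selection are hypothetical, and the parser is deliberately simple: it splits on whitespace and does not handle quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value or None}."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline, required):
    """Return the entries of required that are absent or have another value."""
    present = parse_cmdline(cmdline)
    missing = []
    for item in required:
        name, sep, value = item.partition("=")
        if name not in present or (sep and present[name] != value):
            missing.append(item)
    return missing


# Illustrative subset of the parameters listed in this section.
EXPECTED_SUBSET = ["skew_tick=1", "tsc=reliable", "init_on_alloc=1",
                   "init_on_free=1", "slab_nomerge", "pti=on"]
```

On a Linux host, the string to check can be read from /proc/cmdline, for example `missing_params(open("/proc/cmdline").read(), EXPECTED_SUBSET)`; an empty list means every expected parameter is present with the expected value.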