Posts and stuff

Kubernetes is deprecating using docker as a runtime as of 1.20

A little background: I have a Kubernetes cluster with the masters running on Raspberry Pis running Raspbian and the workers on Intel NUCs running CentOS 7.

--------You can ignore this section if you aren't running Raspberry Pis--------
My Raspberry Pis were running kernel 4.19.75-v7l+ and was hitting a bug trying to do this conversion in kernel <4.14.159 or <4.19.89. I was getting the following error:
failed to "StartContainer" for "etcd" with RunContainerError: "failed to create containerd task: OCI runtime create failed: container_linux.go:349: starting container process caused \"process_linux.go:449: container init caused \\\"process_linux.go:415: setting cgroup config for procHooks process caused \\\\\\\"failed to write \\\\\\\\\\\\\\\"100000\\\\\\\\\\\\\\\" to \\\\\\\\\\\\\\\"/sys/fs/cgroup/cpu,cpuacct/system.slice/containerd.service/kubepods-besteffort-pod624587c2245d85257936b6b4eeee084f.slice:cri-containerd:56614f4494d205c14a16e47715d68fea52c62d6d4d0f99d51d83658bff0e7f10/cpu.cfs_period_us\\\\\\\\\\\\\\\":
To fix this, I had to run rpi-update to get a new kernel and modify the /boot/cmdline.txt to include the following and reboot.
cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory

Nothing needed to be done on CentOS for the kernel
--------End Raspberry Pi section--------

First thing to do is to drain the node you want to work on
kubectl drain node4 --ignore-daemonsets --delete-local-data
Once your node is drained you can convert.
Now we need to stop and remove docker from the box
systemctl stop docker
apt-get remove docker-ce docker-ce-cli
yum remove docker-ce docker-ce-cli
Debian only: I then did an install of containerd to ensure it was there and tracked as a package I installed.
apt-get install containerd.io
Generate a default config for containerd beyond what is there by default
containerd config default > /etc/containerd/config.toml
systemctl restart containerd
Now edit kubelet to use containerd, modify /var/lib/kubelet/kubeadm-flags.env add the following to the end of KUBELET_KUBEADM_ARGS
--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock
systemctl restart kubelet
Once I had done that, the apiserver and other master-node services started up without docker running.

You can verify things are running by checking the containers with (Images, containers and other things are namespaced with containerd):
ctr -n k8s.io container ls

If you are adding a new node to the cluster with just containerd you will need to modify the system a bit since Docker isn't there
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf

modprobe overlay
modprobe br_netfilter

# Setup required sysctl params, these persist across reboots.
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1

# Apply sysctl params without reboot
sudo sysctl --system

In order to run a manual command against etcd running inside the K8S cluster you have to pass a few flags
Be very careful with this as you can really mess things up
Get the members of the etcd cluster
kubectl exec -it -n kube-system etcd-k8sn1 -- etcdctl --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/peer.crt --key-file=/etc/kubernetes/pki/etcd/peer.key --endpoints= member list
Should get you output like:
3c06d56f4e8193e0: name=k8sn3 peerURLs= clientURLs= isLeader=false
3cc1193e53313d05: name=k8sn2 peerURLs= clientURLs= isLeader=false
a7447908c6035378: name=k8sn1 peerURLs= clientURLs= isLeader=false
efd47b5d63640666: name=k8sn4 peerURLs= clientURLs= isLeader=true
You can then remove a specific node
kubectl exec -it -n kube-system etcd-k8sn1 -- etcdctl --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/peer.crt --key-file=/etc/kubernetes/pki/etcd/peer.key --endpoints= member remove a7447908c6035378

In order to specify the UEFI boot parameters in a UCS boot policy you need to be using one of the following disk types In this example we will be booting to a RedHat Virtualization Host (RHVH) in UEFI mode with SAN boot
Configure a boot policy to point to your disk type, in this case SAN then add the UEFI parameters as follows
Examples screenshots are using the 3.2 version of the UCS interface (HTML5), so they may look a bit different if using the old Java client
  1. Select the Servers tab
  2. Expand Servers -> Policies
  3. Expand Boot Policies and select your boot policy
  4. Select the LUN/Disk in the Boot Order list
  5. Click Modify/Set Uefi Boot Parameters
  6. For RHVH, set the following (This will be the similar for other Linux distros besides the name)
  7. Boot Loader Name: RHVH
  8. Boot Loader Path: EFI\BOOT\BOOTX64.EFI

Do base 10 arithmetic in bash instead of octal
((newvar = 10#$var + 10))
Remove leading zeros in bash
Find processes & threads for users (useful if running into nproc limits)
ps h -Led -o user | sort | uniq -c | sort -n
Show threads per process for user bob
ps -o nlwp,pid,lwp,args -u bob | sort -n
Delete files when rm/bash says too many arguments

No xargs
find . -name '*.pdf' -exec rm {} +
Using xargs
find . -name "*.pdf" -print0 | xargs -0 rm
Pretty tree of processes
ps -eo 'pid,args' --forest
Scan/re-scan a fibre channel adapter and scsi host

FC host
echo 1 > /sys/class/fc_host/host/issue_lip
SCSI host
echo "- - -" > /sys/class/scsi_host/host/scan
Individual disk
echo 1 > /sys/class/scsi_disk/0\:0\:0\:0/device/rescan
Check for open tcp ports from bash without telnet or nc

Replace <hostname> and <port> with your host and port you are checking
timeout 1 bash -c "cat < /dev/null > /dev/tcp/<hostname>/<port>"
Check the return result (echo $?) afterward and if it is 0 we had a successful connection, 1 means there was an error connecting before the timeout expired, 124 means the timeout expired and there was no connection

Display info from istheinternetonfire.com in a pretty way

Requires cowsay and the lolcat ruby gem (gem install lolcat)
case $- in *i*)dig +short txt istheinternetonfire.com | cowsay -f moose | lolcat; esac

There doesn't seem to be a way to install LVM directly to a raw device in the CentOS/RedHat installer. With a little bit of custom kickstart we can make it do it. This may work on CentOS 6, but I haven't tested it.

This assumes you have two drives connected to your system, one for boot and one for the system (I'm using VMs, so this is as simple as adding two disks). This also assumes that the disk device names are sd*, so make adjustments according to your system.

My Setup

1st disk - 2GB for boot
2nd disk - 30GB for OS

%pre section of config

In your kickstart config you will need a %pre section that does the following in order:

1. Creates an LVM physical volume on the device. This does a force (-ff) and overwrites any data there so be careful. -y just tells pvcreate to say yes to any "are you sure" messages.

2. Creates an LVM volume group named vg_system on the physical volume

This is all being redirected to /dev/tty1 so you can see the output during %pre

pvcreate -y -ff /dev/sdb &> /dev/tty1
vgcreate vg_system /dev/sdb &> /dev/tty1

Main section of config

In this main section of your config you will need something similar to this, adjust for the partitions and sizes you want. In order, this will:

1. Wipe out sda and initialize the disk if new

2. Put a single boot partition on it that fills the drive

3. Tells the kickstart to use the existing volume group created in the %pre

4. Creates a logical volume in the volume group for root (/) that fills the disk asside from swap

5. Creates a logical volume in the volume group for swap of 4GB

clearpart --drives=sda --initlabel --all
part /boot --fstype=ext4 --size=1 --grow --ondisk=sda
volgroup vg_system --useexisting
logvol / --fstype=ext4 --name=lv_root --vgname=vg_system --grow --size=1024
logvol swap --name=lv_swap --vgname=vg_system --grow --size=4096 --maxsize=4096

If you don't have a base kickstart config to work with look on a CentOS/RedHat box in root's home directory for an anaconda-ks.cfg which is the kickstart equivalent of how that system was installed


Make sure the hostname of the system is set correctly and all firewalls and selinux are disabled (Or configured for the ports specified at the bottom)

Install the packages

pam-devel is necessary if you want to build libnss_winbind.so.2 and use pam/winbind for local auth

yum -y install gcc make wget python-devel gnutls-devel openssl-devel libacl-devel krb5-server krb5-libs krb5-workstation bind bind-libs bind-utils pam-devel ntp openldap-devel ncurses-devel

Download and compile Samba

You can replace -j3 with n+1 CPUs you have available on the system.

wget https://www.samba.org/samba/ftp/stable/samba-4.3.1.tar.gz
tar xvzf samba-4.3.1.tar.gz
cd samba-4.3.1
./configure --enable-selftest
make -j3 && make -j3 install

Run Samba domain creation and configuration

This will create your domain as domain.com using bind as the backend and setting rfc2307 attributes in AD so you can store and retrieve UID/GID in AD.

/usr/local/samba/bin/samba-tool domain provision --realm=domain.com --domain=DOMAIN --server-role=dc --dns-backend=BIND9_DLZ --adminpass "myadpassword" --use-rfc2307

Create an init script for Samba

Edit /etc/init.d/samba4

#! /bin/bash
# samba4 start and stop samba service
# chkconfig: - 90 10
# description: Activates/Deactivates all samba4 interfaces configured to start at boot time.
# config: /usr/local/samba/etc/smb.conf
# Provides:
# Required-Start: $local_fs $network
# Required-Stop: $local_fs $network
# Should-Start:
# Short-Description: start and stop samba service
# Description: start and stop samba service
# Source function library.
. /etc/init.d/functions

if [ -f /etc/sysconfig/samba4 ]; then
. /etc/sysconfig/samba4


start() {
        #Samba is already running, exit
        [ -e $lockfile ] && exit 1
        #Start service
        echo -n $"Starting $prog: "
        daemon $prog_path/$prog
        [ $RETVAL -eq 0 ] && touch $lockfile
        return $RETVAL
stop() {
        # Stop service.
        echo -n $"Shutting down $prog: "
        killproc $prog_path/$prog
        [ $RETVAL -eq 0 ] && rm -f $lockfile
        return $RETVAL

# See how we were called.
case "$1" in
                status smbd
                status samba
                echo $"Usage: $0 {start|stop|restart|status}"
                exit 1

exit $RETVAL
chmod 755 /etc/init.d/samba4

Configure Bind/DNS

Configure Bind to allow recursion (external resolving) to local subnets and point to the kerberos key that the samba-tool command generated. Also include the Samba Bind database.

acl "trusted" {;;
options {
        listen-on port 53 { any; };
        allow-query { any; };
        tkey-gssapi-keytab "/usr/local/samba/private/dns.keytab";
        allow-recursion {
                /* Only trusted addresses are allowed to use recursion. */
        notify yes;
include "/usr/local/samba/private/named.conf";

/usr/local/samba/private/named.conf looks like below. CentOS 6 comes with Bind 9.8 which is the default.

dlz "AD DNS Zone" {
    # For BIND 9.8.0
    database "dlopen /usr/local/samba/lib/bind9/dlz_bind9.so";

    # For BIND 9.9.0
    # database "dlopen /usr/local/samba/lib/bind9/dlz_bind9_9.so";

Start named and make sure it is on by default

chkconfig named on
/etc/init.d/named start

Change /etc/resolv.conf to point to itself for DNS

domain domain.com

Configure kerberos

edit /etc/krb5.conf. Case is important on the REALM here

        default_realm = DOMAIN.COM
        dns_lookup_realm = true
        dns_lookup_kdc = true

Configure NTP

Edit /etc/ntp.conf. I commented out the IPv6 stuff as it was causing errors in the logs, not necessary but makes the log nicer. Note the sections on ntpsigndsocket and mssntp, these are required for Windows machines to successfully sync time with this AD server.

restrict default nomodify notrap nopeer noquery
#restrict -6 default nomodify notrap nopeer noquery

# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict -6 ::1

ntpsigndsocket /usr/local/samba/var/lib/ntp_signd/

# Hosts on local network are less restricted.
restrict mask nomodify notrap kod nopeer mssntp

# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
server     # local clock
fudge stratum 10

# List of time sources managed by puppet class { "ntp": servers => [ ... ] }
server 0.gentoo.pool.ntp.org version 3 prefer
server 1.gentoo.pool.ntp.org version 3 prefer
server 2.gentoo.pool.ntp.org version 3 prefer
server 3.gentoo.pool.ntp.org version 3 prefer

# Drift file.  Put this in a directory which the daemon can write to.
# No symbolic links allowed, either, since the daemon updates the file
# by creating a temporary in the same directory and then rename()\'ing
# it to the file.
driftfile /var/lib/ntp/drift

# Logging
logfile /var/log/ntp
logconfig =syncevents +clockevents

Start ntpd and make sure it is on by default

chkconfig ntpd on
/etc/init.d/ntpd start

Start it up

Start samba4 and make sure it is on by default

chkconfig samba4 on
/etc/init.d/samba4 start


Open the following ports: UDP 53, 123, 135, 138, 389 TCP 88, 464, 139, 445, 389, 636, 3268, 3269 53 - DNS 88 - Kerberos 123 - NTP 135 - RCP 138 - NetBIOS 139 - NetBIOS session 389 - LDAP 445 - MS directory services 464 - Kerberos passwd 636 - SSL LDAP 3268 - Global catalog 3269 - SSL Global catalog