Problem
I was running Kubernetes upgrades with my script (kubify) but they were hanging on random machines. Not always the same one.
Running this command:
2023-06-22 00:37:07,984 kubify.py:628 DEBUG running ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -t -t ubuntu@10.10.2.140 sudo apt-mark unhold kubeadm && sudo apt update && sudo apt install -y kubeadm=1.25.11-00 && sudo apt-mark hold kubeadm
always led to a hang, but never on the same machine.
Looking at the machine in question, I found this output in ps:
ubuntu 2663021 0.0 0.0 7892 3520 pts/0 Ss+ Jun21 0:00 bash -c sudo apt-mark unhold kubeadm && sudo apt update && sudo apt install -y kubeadm=1.25.11-00 && sudo apt-mark hold kubeadm
root 2663603 0.0 0.1 11896 4548 pts/0 S+ Jun21 0:00 sudo apt install -y kubeadm=1.25.11-00
root 2663604 0.0 0.0 11896 892 pts/1 Ss Jun21 0:00 sudo apt install -y kubeadm=1.25.11-00
root 2663605 0.0 1.7 78816 69156 pts/1 S+ Jun21 0:03 apt install -y kubeadm=1.25.11-00
root 2663817 0.0 0.4 78816 19672 pts/1 S+ Jun21 0:00 apt install -y kubeadm=1.25.11-00
root 2663826 0.0 0.0 2888 996 pts/1 S+ Jun21 0:00 sh -c test -x /usr/lib/needrestart/apt-pinvoke && /usr/lib/needrestart/apt-pinvoke || true
root 2663827 0.0 0.4 25332 19228 pts/1 S+ Jun21 0:01 /usr/bin/perl -w /usr/share/debconf/frontend /usr/sbin/needrestart
root 2663877 0.0 0.6 31288 24940 pts/1 S+ Jun21 0:00 /usr/bin/perl /usr/sbin/needrestart
root 2663941 0.0 0.1 10820 4296 pts/1 S+ Jun21 0:00 whiptail --backtitle Package configuration --title Pending kernel upgrade --output-fd 11 --msgbox Newer kernel available The currently running kernel version is 5.15.0-73-generic which is not the expected kernel version 5.15.0-75-generic. Restarting the system to load the new kernel will not be handled automatically, so you should consider rebooting. 11 122
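The last process is the giveaway: needrestart had opened a whiptail dialog about a pending kernel upgrade and was waiting for someone to acknowledge it, which never happens in a scripted ssh session. So the machine needed to be rebooted to continue?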
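A hang like this can also be spotted from a script by looking for dialog programs in the remote process list. The following is a minimal sketch, not part of kubify: the name `find_blocking_dialogs` and the list of dialog programs are assumptions for illustration, and it expects text in the same `ps aux` column layout as the output shown earlier.

```python
# Hypothetical helper, not part of kubify: flag processes that usually mean
# an interactive prompt is sitting there waiting for input.
DIALOG_PROGRAMS = ("whiptail", "dialog")  # assumed list, extend as needed


def find_blocking_dialogs(ps_output: str) -> list[tuple[int, str]]:
    """Return (pid, command) for each dialog program found in `ps aux` text."""
    hits = []
    for line in ps_output.splitlines():
        # ps aux prints 10 fixed columns, then the command with its arguments.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something we cannot parse
        command = fields[10]
        # Compare only the program name, without any leading directory.
        program = command.split()[0].rsplit("/", 1)[-1]
        if program in DIALOG_PROGRAMS:
            hits.append((int(fields[1]), command))
    return hits
```

Fed the process list from this post, it would return only the whiptail entry (PID 2663941), which is the process holding up the apt install.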
Solution
First, detecting if a reboot was needed.
Enter /var/run/reboot-required, a file whose existence one can check for.
If you are curious why a reboot is needed, /var/run/reboot-required.pkgs has you covered.
$ cat /var/run/reboot-required.pkgs
linux-image-5.15.0-75-generic
linux-base
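The same check can be made from an upgrade script before it touches any packages. This is a minimal sketch, assuming the host is reachable over ssh as in the log line from the problem description. `reboot_required` and its injectable `run` parameter are hypothetical names, not kubify's actual code.

```python
import subprocess

REBOOT_FLAG = "/var/run/reboot-required"


def reboot_required(host: str, run=subprocess.run) -> bool:
    """Return True if REBOOT_FLAG exists on the remote host.

    `run` is injectable so the function can be exercised without a real host.
    """
    # `test -f` exits 0 when the file exists and 1 when it does not;
    # ssh passes the remote exit status through.
    result = run(["ssh", host, "test", "-f", REBOOT_FLAG], check=False)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # ssh itself exits 255 on connection errors; do not guess in that case.
    raise RuntimeError(f"could not check {host}: exit code {result.returncode}")
```

A caller could skip or postpone the package upgrade for any host where this returns True, instead of discovering the pending reboot through a hung session.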
So, reboot before installing the packages? Reboot periodically to always be on the newest kernel?
Given I use Ansible as much as I can, enter the reboot module.
I wrote up a simple little playbook to reboot a set of hosts, one at a time and only when needed:
# Reboot host(s) but only if necessary
---
- hosts: all
  become: yes
  gather_facts: no
  # Only do one host at a time.
  serial: 1
  tasks:
    - name: check if reboot required
      stat:
        path: /var/run/reboot-required
      register: reboot_required_path
    - name: reboot required found
      debug:
        msg: "Reboot-required file found on host."
      when: reboot_required_path.stat.exists
    - name: reboot host
      ansible.builtin.reboot:
        # How long to wait until retrying connection
        # after host is back up.
        post_reboot_delay: 300 # 5 min
        # How long to wait for machine to reboot
        # and respond to test command.
        reboot_timeout: 600 # 10 min
      when: reboot_required_path.stat.exists
    - name: pause for 2 minutes after rebooting a host
      pause:
        minutes: 2
      when: reboot_required_path.stat.exists