Problem
I was running Kubernetes upgrades with my script (kubify) but they were hanging on random machines. Not always the same one.
Running
2023-06-22 00:37:07,984 kubify.py:628 DEBUG running ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -t -t ubuntu@10.10.2.140 sudo apt-mark unhold kubeadm && sudo apt update && sudo apt install -y kubeadm=1.25.11-00 && sudo apt-mark hold kubeadm
always led to a hang, though never on the same machine.
Looking at the machine in question, I found this in the ps output:
ubuntu 2663021 0.0 0.0 7892 3520 pts/0 Ss+ Jun21 0:00 bash -c sudo apt-mark unhold kubeadm && sudo apt update && sudo apt install -y kubeadm=1.25.11-00 && sudo apt-mark hold kubeadm
root 2663603 0.0 0.1 11896 4548 pts/0 S+ Jun21 0:00 sudo apt install -y kubeadm=1.25.11-00
root 2663604 0.0 0.0 11896 892 pts/1 Ss Jun21 0:00 sudo apt install -y kubeadm=1.25.11-00
root 2663605 0.0 1.7 78816 69156 pts/1 S+ Jun21 0:03 apt install -y kubeadm=1.25.11-00
root 2663817 0.0 0.4 78816 19672 pts/1 S+ Jun21 0:00 apt install -y kubeadm=1.25.11-00
root 2663826 0.0 0.0 2888 996 pts/1 S+ Jun21 0:00 sh -c test -x /usr/lib/needrestart/apt-pinvoke && /usr/lib/needrestart/apt-pinvoke || true
root 2663827 0.0 0.4 25332 19228 pts/1 S+ Jun21 0:01 /usr/bin/perl -w /usr/share/debconf/frontend /usr/sbin/needrestart
root 2663877 0.0 0.6 31288 24940 pts/1 S+ Jun21 0:00 /usr/bin/perl /usr/sbin/needrestart
root 2663941 0.0 0.1 10820 4296 pts/1 S+ Jun21 0:00 whiptail --backtitle Package configuration --title Pending kernel upgrade --output-fd 11 --msgbox Newer kernel available The currently running kernel version is 5.15.0-73-generic which is not the expected kernel version 5.15.0-75-generic. Restarting the system to load the new kernel will not be handled automatically, so you should consider rebooting. 11 122
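The install was stuck on a whiptail dialog from needrestart, waiting for someone to acknowledge a pending kernel upgrade. So the machine needed to be rebooted to continue?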
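The giveaway in the ps output above is the whiptail process at the bottom of the tree: an interactive dialog that nobody is there to answer. As a rough sketch of how one might spot that automatically, the Python below scans `ps aux` style output for dialog programs. The function name `find_dialog_processes` and the list of program names are assumptions made for illustration; they are not part of kubify.

```python
# Hypothetical helper: find interactive dialog programs in `ps aux` style output.
# The program names below are an assumption; extend the tuple as needed.
DIALOG_PROGRAMS = ("whiptail", "dialog")


def find_dialog_processes(ps_output):
    """Return (pid, command) pairs for processes that look like blocking dialogs."""
    hits = []
    for line in ps_output.splitlines():
        # `ps aux` prints 11 columns; the last one is the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skips the header row and malformed lines
        pid = int(fields[1])
        command = fields[10].strip()
        # Compare only the program's base name, so /usr/bin/whiptail also matches.
        program = command.split()[0].rsplit("/", 1)[-1]
        if program in DIALOG_PROGRAMS:
            hits.append((pid, command))
    return hits
```

Fed the ps output above, this would flag PID 2663941, the whiptail message box about the pending kernel upgrade.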
Solution
First, detecting if a reboot was needed.
Enter /var/run/reboot-required, a file whose existence one can check for.
If you are curious why a reboot is needed, /var/run/reboot-required.pkgs has you covered.
$ cat /var/run/reboot-required.pkgs
linux-image-5.15.0-75-generic
linux-base
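The same check is easy to script. Here is a minimal sketch in Python, assuming it runs on the host being checked; the function name `reboot_status` is made up for illustration, and the two paths are the ones mentioned above.

```python
from pathlib import Path


def reboot_status(flag="/var/run/reboot-required",
                  pkgs="/var/run/reboot-required.pkgs"):
    """Return (reboot_needed, packages_that_asked_for_it)."""
    flag_path = Path(flag)
    pkgs_path = Path(pkgs)

    # The flag file's existence is the whole signal; its contents do not matter here.
    required = flag_path.exists()

    packages = []
    if required and pkgs_path.exists():
        # One package name per line, as in the cat output above.
        packages = [line.strip()
                    for line in pkgs_path.read_text().splitlines()
                    if line.strip()]
    return required, packages
```

Calling `reboot_status()` with no arguments checks the real paths; passing other paths makes it easy to try out against test files.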
So, reboot before installing the packages? Reboot periodically to always be on the newest kernel?
Given that I use Ansible as much as I can, enter the reboot module.
I wrote up a simple little playbook to reboot a set of hosts:
# Reboot host(s) but only if necessary
---
- hosts: all
  become: yes
  gather_facts: no
  # Only do one host at a time.
  serial: 1
  tasks:
    - name: check if reboot required
      stat:
        path: /var/run/reboot-required
      register: reboot_required_path
    - name: reboot required found
      debug:
        msg: "Reboot-required file found on host."
      when: reboot_required_path.stat.exists
    - name: reboot host
      ansible.builtin.reboot:
        # How long to wait until retrying connection
        # after host is back up.
        post_reboot_delay: 300 # 5 min
        # How long to wait for machine to reboot
        # and respond to test command.
        reboot_timeout: 600 # 10 min
      when: reboot_required_path.stat.exists
    - name: pause for 2 minutes after rebooting a host
      pause:
        minutes: 2
      when: reboot_required_path.stat.exists
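For a script like kubify, one option would be to run this playbook before starting the package upgrades. The Python below is only a sketch of that idea: the playbook filename `reboot-if-required.yml`, the inventory name, and both function names are hypothetical, while `-i` and `--limit` are standard `ansible-playbook` options.

```python
import subprocess


def reboot_playbook_command(inventory, playbook="reboot-if-required.yml", limit=None):
    """Build the ansible-playbook command line (the filenames are hypothetical)."""
    cmd = ["ansible-playbook", "-i", inventory, playbook]
    if limit:
        # Restrict the run to a single host or host pattern.
        cmd += ["--limit", limit]
    return cmd


def run_reboot_playbook(inventory, playbook="reboot-if-required.yml", limit=None):
    """Run the playbook; raises CalledProcessError if ansible-playbook fails."""
    cmd = reboot_playbook_command(inventory, playbook, limit)
    return subprocess.run(cmd, check=True)
```

For example, `run_reboot_playbook("hosts.ini", limit="10.10.2.140")` would reboot that one machine only if its reboot-required file exists, since the playbook itself does the check.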