
I Need a Reboot

Author
Jeffrey Forman

Problem

I was running Kubernetes upgrades with my script (kubify) but they were hanging on random machines. Not always the same one.

Running

2023-06-22 00:37:07,984 kubify.py:628 DEBUG running ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -t -t ubuntu@10.10.2.140 sudo apt-mark unhold kubeadm && sudo apt update && sudo apt install -y kubeadm=1.25.11-00 && sudo apt-mark hold kubeadm

always led to a hang, though never on the same machine.

Looking at the machine in question, I found this in the ps output:

ubuntu   2663021  0.0  0.0   7892  3520 pts/0    Ss+  Jun21   0:00         bash -c sudo apt-mark unhold kubeadm && sudo apt update && sudo apt install -y kubeadm=1.25.11-00 && sudo apt-mark hold kubeadm
root     2663603  0.0  0.1  11896  4548 pts/0    S+   Jun21   0:00           sudo apt install -y kubeadm=1.25.11-00
root     2663604  0.0  0.0  11896   892 pts/1    Ss   Jun21   0:00             sudo apt install -y kubeadm=1.25.11-00
root     2663605  0.0  1.7  78816 69156 pts/1    S+   Jun21   0:03               apt install -y kubeadm=1.25.11-00
root     2663817  0.0  0.4  78816 19672 pts/1    S+   Jun21   0:00                 apt install -y kubeadm=1.25.11-00
root     2663826  0.0  0.0   2888   996 pts/1    S+   Jun21   0:00                   sh -c test -x /usr/lib/needrestart/apt-pinvoke && /usr/lib/needrestart/apt-pinvoke || true
root     2663827  0.0  0.4  25332 19228 pts/1    S+   Jun21   0:01                     /usr/bin/perl -w /usr/share/debconf/frontend /usr/sbin/needrestart
root     2663877  0.0  0.6  31288 24940 pts/1    S+   Jun21   0:00                       /usr/bin/perl /usr/sbin/needrestart
root     2663941  0.0  0.1  10820  4296 pts/1    S+   Jun21   0:00                       whiptail --backtitle Package configuration --title Pending kernel upgrade --output-fd 11 --msgbox Newer kernel available  The currently running kernel version is 5.15.0-73-generic which is not the expected kernel version 5.15.0-75-generic.  Restarting the system to load the new kernel will not be handled automatically, so you should consider rebooting. 11 122

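The install was stuck on a whiptail dialog from needrestart, waiting for someone to acknowledge the pending kernel upgrade. So the machine needed to be rebooted to continue?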

Solution

First, detecting if a reboot was needed.

Enter /var/run/reboot-required, a file whose existence one can check for.

If you are curious why a reboot is required, /var/run/reboot-required.pkgs has you covered.

$ cat /var/run/reboot-required.pkgs
linux-image-5.15.0-75-generic
linux-base
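
The check is easy to script. Below is a minimal Python sketch, not taken from kubify. The function names `reboot_required` and `reboot_packages` are made up for illustration, and the paths are parameters so the functions can be pointed at other files.

```python
from __future__ import annotations

from pathlib import Path

# Default locations on Ubuntu, as shown above.
REBOOT_FLAG = Path("/var/run/reboot-required")
REBOOT_PKGS = Path("/var/run/reboot-required.pkgs")


def reboot_required(flag: Path = REBOOT_FLAG) -> bool:
    """Return True if the reboot-required flag file exists."""
    return flag.exists()


def reboot_packages(pkgs: Path = REBOOT_PKGS) -> list[str]:
    """Return the packages that requested a reboot, or [] if the file is absent."""
    if not pkgs.exists():
        return []
    # One package name per line; skip blank lines.
    return [line.strip() for line in pkgs.read_text().splitlines() if line.strip()]


if __name__ == "__main__":
    if reboot_required():
        print("reboot required by:", ", ".join(reboot_packages()) or "unknown")
    else:
        print("no reboot required")
```

Run on the machine itself, this only reads the two files and changes nothing.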

So, reboot before installing the packages? Reboot periodically to always be on the newest kernel?
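
For the first option, the same check could run over SSH before any apt command is sent to a host. This is a rough sketch, assuming the SSH options and `ubuntu` user from the log line above. `reboot_check_cmd`, `needs_reboot`, and the `run` parameter are hypothetical and not part of kubify.

```python
import subprocess
from typing import Callable, List

# SSH options copied from the kubify log line above.
SSH_OPTS = [
    "-o", "UserKnownHostsFile=/dev/null",
    "-o", "StrictHostKeyChecking=no",
]


def reboot_check_cmd(host: str, user: str = "ubuntu") -> List[str]:
    """Build an ssh command that exits 0 only if the flag file exists."""
    return ["ssh", *SSH_OPTS, f"{user}@{host}", "test -e /var/run/reboot-required"]


def needs_reboot(
    host: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if the remote host has a pending reboot.

    `test -e` exits 0 when the file exists and 1 when it does not.
    Any other exit code (ssh uses 255 for its own errors) raises,
    so a connection failure is not mistaken for "no reboot needed".
    """
    result = run(reboot_check_cmd(host), capture_output=True, timeout=30)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(f"ssh to {host} failed with exit code {result.returncode}")
```

A caller could skip or defer the upgrade on any host where `needs_reboot` returns True, rather than letting apt hang on the dialog.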

Given I use Ansible as much as I can, enter the reboot module.

I wrote up a simple little playbook to reboot a set of hosts:


# Reboot host(s) but only if necessary
---
- hosts: all
  become: yes
  gather_facts: no
  # Only do one host at a time.
  serial: 1

  tasks:
  - name: check if reboot required
    stat:
      path: /var/run/reboot-required
    register: reboot_required_path

  - name: reboot required found
    debug:
      msg: "Reboot-required file found on host."
    when: reboot_required_path.stat.exists

  - name: reboot host
    ansible.builtin.reboot:
      # Seconds to wait after the reboot command succeeds
      # before checking that the host is back up.
      post_reboot_delay: 300 # 5 min
      # How long to wait for machine to reboot
      # and respond to test command.
      reboot_timeout: 600 # 10 min
    when: reboot_required_path.stat.exists

  - name: pause for 2 minutes after rebooting a host
    pause:
      minutes: 2
    when: reboot_required_path.stat.exists

Docs