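How the reboot command executes

Last updated: November 19, 2025 (afternoon)

Everyone knows the reboot command, but how does it actually reboot the system, and how can a reboot itself trigger a bug? This article walks through the busybox implementation of reboot.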

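Origin of the problem

Today I hit a problem: after a reboot issued from code, the system misbehaved, while a reboot typed on the command line caused no trouble.
Tracing the code revealed the difference: the code actually runs reboot -f, with an extra -f flag. What does that flag change?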

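The reboot flow inside busybox

Handling of the reboot command

init/halt.c contains the implementation of the reboot command. Depending on the arguments, there are two kinds of reboot.

reboot
Sends SIGTERM to the init process, that is, the process with pid 1. If the command invoked is halt or poweroff, the signal sent is SIGUSR1 or SIGUSR2 respectively.
reboot -f
Calls the C library's reboot function directly.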
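The two paths can be summarized in a few lines of C. This is only an illustrative sketch, not busybox code: `signal_for_applet` and `path_for` are hypothetical names, and the mapping simply restates the description above.

```c
#include <signal.h>
#include <string.h>

/* Hypothetical helper (not from busybox): the signal a non-forced
 * reboot/halt/poweroff sends to pid 1, or 0 for an unknown name. */
int signal_for_applet(const char *applet)
{
    if (strcmp(applet, "reboot") == 0)
        return SIGTERM;
    if (strcmp(applet, "halt") == 0)
        return SIGUSR1;
    if (strcmp(applet, "poweroff") == 0)
        return SIGUSR2;
    return 0;
}

/* The two ways the request can reach the kernel. */
typedef enum {
    VIA_INIT_SIGNAL,   /* plain "reboot": kill(1, sig), init does the cleanup */
    DIRECT_SYSCALL     /* "reboot -f": reboot() is called right away */
} shutdown_path;

/* Hypothetical helper: -f selects the direct path. */
shutdown_path path_for(int force)
{
    return force ? DIRECT_SYSCALL : VIA_INIT_SIGNAL;
}
```

With this model, `reboot` corresponds to `kill(1, signal_for_applet("reboot"))`, while `reboot -f` never involves pid 1 at all.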

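Handling in the init process

The code is in init/init.c. The handler registered for SIGTERM is halt_reboot_pwoff; as its body shows, the same handler also serves the halt and poweroff signals.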

static void halt_reboot_pwoff(int sig)
{
	const char *m;
	unsigned rb;

	/* We may call run() and it unmasks signals,
	 * including the one masked inside this signal handler.
	 * Testcase which would start multiple reboot scripts:
	 *  while true; do reboot; done
	 * Preventing it:
	 */
	reset_sighandlers_and_unblock_sigs();

	run_shutdown_and_kill_processes();

	m = "halt";
	rb = RB_HALT_SYSTEM;
	if (sig == SIGTERM) {
		m = "reboot";
		rb = RB_AUTOBOOT;
	} else if (sig == SIGUSR2) {
		m = "poweroff";
		rb = RB_POWER_OFF;
	}
	message(L_CONSOLE, "Requesting system %s", m);
	pause_and_low_level_reboot(rb);
	/* not reached */
}

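The init process calls run_shutdown_and_kill_processes, which does the following:
(1) Runs the programs marked shutdown in /etc/inittab.
(2) Sends SIGTERM to all processes, calls sync, and sleeps for 1 s.
(3) Sends SIGKILL to all processes to kill them forcibly, then calls sync again.
This shows that with busybox's reboot, a process that does not exit on SIGTERM is given at most 1 s before it is killed.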

static void run_shutdown_and_kill_processes(void)
{
	/* Run everything to be run at "shutdown".  This is done _prior_
	 * to killing everything, in case people wish to use scripts to
	 * shut things down gracefully... */
	run_actions(SHUTDOWN);

	message(L_CONSOLE | L_LOG, "The system is going down NOW!");

	/* Send signals to every process _except_ pid 1 */
	kill(-1, SIGTERM);
	message(L_CONSOLE, "Sent SIG%s to all processes", "TERM");
	sync();
	sleep(1);

	kill(-1, SIGKILL);
	message(L_CONSOLE, "Sent SIG%s to all processes", "KILL");
	sync();
	/*sleep(1); - callers take care about making a pause */
}
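The TERM, wait, KILL sequence can be tried safely on a single child process instead of on every process. The following is a minimal sketch under that assumption; `spawn_stubborn_child` and `term_then_kill` are hypothetical helpers, not busybox functions.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that ignores SIGTERM and waits forever.
 * The pipe guarantees SIGTERM is already ignored when this returns. */
pid_t spawn_stubborn_child(void)
{
    int ready[2];
    char byte = 0;
    pid_t pid;

    if (pipe(ready) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        close(ready[0]);
        signal(SIGTERM, SIG_IGN);
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        close(ready[1]);
        for (;;)
            pause();
    }
    close(ready[1]);
    if (read(ready[0], &byte, 1) != 1) {
        /* child died before signalling readiness; caller will still reap it */
    }
    close(ready[0]);
    return pid;
}

/* Same order as the init code, but aimed at one pid: SIGTERM, grace period,
 * SIGKILL. Returns the signal that ended the child, 0 if it exited normally,
 * or -1 on error. Refuses pid <= 0 so it can never signal a process group
 * or "everything". */
int term_then_kill(pid_t pid, unsigned grace_seconds)
{
    int status = 0;

    if (pid <= 0)
        return -1;
    kill(pid, SIGTERM);
    sleep(grace_seconds);
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A child that ignores SIGTERM is ended by SIGKILL after the grace period, which is the per-process equivalent of the 1 s limit noted above.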

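Back in halt_reboot_pwoff, the next call is pause_and_low_level_reboot, which simply forks and calls the C library's reboot function.
The C library's reboot is in fact a thin wrapper around a system call.

The kernel's reboot system call

The implementation is in kernel/reboot.c.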
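To make the "thin wrapper" point concrete, here is a hedged sketch of what the library call roughly boils down to on Linux. `raw_reboot` is a hypothetical name; the constants come from the standard `<sys/reboot.h>` and `<linux/reboot.h>` headers, and the compile-time checks assume a glibc-style libc in which the RB_* values are the kernel command values.

```c
#define _GNU_SOURCE
#include <assert.h>         /* static_assert (C11) */
#include <linux/reboot.h>   /* LINUX_REBOOT_MAGIC*, LINUX_REBOOT_CMD_* */
#include <sys/reboot.h>     /* RB_AUTOBOOT, RB_HALT_SYSTEM, RB_POWER_OFF */
#include <sys/syscall.h>    /* SYS_reboot */
#include <unistd.h>         /* syscall() */

/* The libc constants are expected to equal the kernel's command values. */
static_assert(RB_AUTOBOOT == LINUX_REBOOT_CMD_RESTART, "restart value differs");
static_assert(RB_HALT_SYSTEM == LINUX_REBOOT_CMD_HALT, "halt value differs");
static_assert(RB_POWER_OFF == LINUX_REBOOT_CMD_POWER_OFF, "poweroff value differs");

/* Hypothetical wrapper: the library call only has to add the two magic
 * numbers. WARNING: calling this as root with a valid cmd really
 * restarts or halts the machine. */
long raw_reboot(unsigned int cmd)
{
    return syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
                   cmd, (void *)0);
}
```

The block is meant to be read and compiled, not executed with a real command: the static assertions verify the constant mapping at build time without ever entering the kernel.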

SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
		void __user *, arg)
{
	struct pid_namespace *pid_ns = task_active_pid_ns(current);
	char buffer[256];
	int ret = 0;

	/* We only trust the superuser with rebooting the system. */
	if (!ns_capable(pid_ns->user_ns, CAP_SYS_BOOT))
		return -EPERM;

	/* For safety, we require "magic" arguments. */
	if (magic1 != LINUX_REBOOT_MAGIC1 ||
			(magic2 != LINUX_REBOOT_MAGIC2 &&
			magic2 != LINUX_REBOOT_MAGIC2A &&
			magic2 != LINUX_REBOOT_MAGIC2B &&
			magic2 != LINUX_REBOOT_MAGIC2C))
		return -EINVAL;

	/*
	 * If pid namespaces are enabled and the current task is in a child
	 * pid_namespace, the command is handled by reboot_pid_ns() which will
	 * call do_exit().
	 */
	ret = reboot_pid_ns(pid_ns, cmd);
	if (ret)
		return ret;

	/* Instead of trying to make the power_off code look like
	 * halt when pm_power_off is not set do it the easy way.
	 */
	if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
		cmd = LINUX_REBOOT_CMD_HALT;

	mutex_lock(&system_transition_mutex);
	switch (cmd) {
	case LINUX_REBOOT_CMD_RESTART:
		kernel_restart(NULL);
		break;

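reboot_pid_ns checks whether the PID namespace of the process calling reboot is the same as init's PID namespace. If the namespaces differ (for example, reboot run inside a docker container), the system as a whole is not rebooted.
kernel_restart is the core implementation.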

void kernel_restart(char *cmd)
{
	kernel_restart_prepare(cmd);
	migrate_to_reboot_cpu();
	syscore_shutdown();
	if (!cmd)
		pr_emerg("Restarting system\n");
	else
		pr_emerg("Restarting system with command '%s'\n", cmd);
	kmsg_dump(KMSG_DUMP_SHUTDOWN);
	machine_restart(cmd);
}
EXPORT_SYMBOL_GPL(kernel_restart);

It runs the notifier chains, shuts down devices, and so on.
It then prints the familiar message reboot: Restarting system
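Going back to the top of the system call, the order of the first two checks matters: the capability is tested before the magic numbers. A small user-space model makes that order visible. This is a sketch, not kernel code; `model_reboot_checks` is a hypothetical function, and the capability check is reduced to a boolean parameter.

```c
#include <errno.h>
#include <linux/reboot.h>   /* LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2* */

/* User-space model of the first two checks in the reboot syscall.
 * Returns -EPERM, -EINVAL, or 0 when both checks pass. */
int model_reboot_checks(int has_cap_sys_boot,
                        unsigned int magic1, unsigned int magic2)
{
    /* Check 1: only a caller holding CAP_SYS_BOOT may proceed. */
    if (!has_cap_sys_boot)
        return -EPERM;

    /* Check 2: magic1 must match, and magic2 must be one of four values. */
    if (magic1 != (unsigned int)LINUX_REBOOT_MAGIC1 ||
        (magic2 != (unsigned int)LINUX_REBOOT_MAGIC2 &&
         magic2 != (unsigned int)LINUX_REBOOT_MAGIC2A &&
         magic2 != (unsigned int)LINUX_REBOOT_MAGIC2B &&
         magic2 != (unsigned int)LINUX_REBOOT_MAGIC2C))
        return -EINVAL;

    return 0;
}
```

In this model an unprivileged caller gets -EPERM even with correct magic numbers, and a privileged caller with a wrong magic number gets -EINVAL, matching the order of the checks in the kernel source quoted earlier.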

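Question: why did calling reboot in a different PID namespace restart the whole system?

Test environment: Ubuntu 24.04

The commands were as follows, and the whole system restarted.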

sudo unshare -p --mount-proc -f bash
reboot

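Cause:
Ubuntu's reboot actually invokes systemctl reboot, which asks systemd to reboot by sending it a message over IPC. The systemd that receives the message is the host's, outside the namespace, so the whole machine restarted.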

Observations:
(1) With busybox's reboot, the system does not restart and no other processes are killed; only the reboot process itself exits.

sudo unshare -p --mount-proc -f bash
busybox reboot

(2) Isolate the mount namespace as well and unmount everything, with the aim of cutting off communication between systemctl and systemd.

unshare -p --mount-proc -m -f bash
umount -a
reboot

This time the system does not restart; reboot fails with a connect error instead.
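The connect error is what any client sees when the socket file it expects is no longer reachable. A minimal sketch of that failure mode follows; `try_connect_unix` is a hypothetical helper, and which socket systemctl actually uses (for example one under /run) is an assumption not confirmed by this experiment.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: try to connect to a Unix domain stream socket.
 * Returns 0 on success, otherwise the errno of the failing call. */
int try_connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;
    int err = 0;

    if (strlen(path) >= sizeof(addr.sun_path))
        return ENAMETOOLONG;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);   /* length was checked above */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        err = errno;
    close(fd);
    return err;
}
```

When the directory holding the socket has been unmounted, the path no longer exists and the call fails with ENOENT, so the request never reaches the daemon on the other side.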

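Summary

reboot is the standard restart path: the request goes through init, which runs the shutdown actions and terminates processes before the system call is made.
reboot -f bypasses init and calls the kernel's reboot system call directly, so it may skip cleanup that applications in user space need.

Life is short; stay away from bugs. Leon, 2025-05-06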

https://leon0625.github.io/2025/04/16/556b72b699c2/

Author: leon.liu

Published: April 16, 2025
