备注
AI Translation Notice
This document was automatically translated by hunyuan-turbos-latest model, for reference only.
Source document: kernel/process_management/de_thread.md
Translation time: 2026-01-19 06:02:08
Translation model:
hunyuan-turbos-latest
Please report issues via Community Channel
Principle of DragonOS Multi-threaded Exec (De-thread) Mechanism
备注
Author: longjin longjin@dragonos.org
1. Overview
In the POSIX standard, the execve system call is used to execute a new program, replacing the current process image. For multi-threaded programs (Thread Group), POSIX requires that after exec succeeds:
PID remains unchanged: The process’s PID (typically referring to TGID, i.e., Thread Group ID, in the kernel) must remain unchanged.
Single-threaded: All other threads in the original process group must be terminated, leaving only the thread that executed
exec(which becomes the new single-threaded process).
If the thread executing exec is the leader of the thread group (main thread), the situation is relatively simple: it only needs to kill the other threads. However, if the thread executing exec is an ordinary thread (non-leader), the situation becomes complex: it must “transform” into the leader, take over the original leader’s PID, and clean up the original leader. This process is called “de-threading” (De-thread).
This document details the implementation principles, processes, and concurrency control mechanisms of de_thread in the DragonOS kernel.
2. Core Challenges
Identity Theft (Identity Theft): The non-leader thread executing
exec(hereinafter referred to as the Exec-Thread) must exchange identities at the kernel level, making it appear to the user space and parent process as having the original leader’s PID.Concurrency Mutual Exclusion:
Two threads within the same process group may simultaneously call
exec, or one callsexecwhile another callsexit_group. Only one can successfully start.In the critical section of identity exchange, the process state is unstable, and it must be prevented for the parent process to see and reclaim the intermediate-state old leader via the
waitsystem call.
Resource Inheritance and Cleanup: The Exec-Thread needs to inherit the parent-child process relationships, child process lists, etc., of the original leader, while also being responsible for reclaiming the resources occupied by the old leader.
3. Implementation Mechanism
DragonOS refers to the design of the Linux kernel but combines it with Rust’s ownership and concurrency safety features in the specific implementation.
3.1 Key Data Structures
In the SigHand (signal handling structure, shared by the thread group), the following fields are introduced to manage the state:
SignalFlags::GROUP_EXEC: Flag bit. Indicates that the current thread group is undergoing de-threading.group_exec_task:Weak<ProcessControlBlock>. Points to the thread currently executingexec.group_exec_wait_queue:WaitQueue. Used to wait for other threads in the group to exit.group_exec_notify_count:isize. Count of threads waiting to exit (used to wake up the waiting queue).
3.2 Core Process (de_thread)
The de_thread function is located in kernel/src/process/exec.rs and is the core of implementing this logic.
Phase 1: Initiation and Mutual Exclusion
Lock and Mutual Exclusion: Call
sighand.start_group_exec(). IfGROUP_EXECorGROUP_EXIThas been set, returnEAGAINto ensure mutual exclusion with concurrentexec/exit_group.Set Executor: Record the current thread in
group_exec_taskand cleargroup_exec_notify_count.Single-thread Fast Path: If the thread group is empty (only itself), directly set
exit_signal = SIGCHLDand end the de-threading process.
Phase 2: Terminate Sibling Threads
Build Cleanup List: Traverse the thread group, collecting all still-alive threads not in the exit path (including the old leader).
Send Signal: Send
SIGKILLto each thread in the cleanup list one by one.Set Count: Write the length of the cleanup list into
group_exec_notify_countto wake up the waiting queue when threads exit.
Phase 3: Wait for Synchronization
Enter Wait: Sleep on
group_exec_wait_queuein a killable manner.Wake-up Conditions:
Traverse
group_tasksto confirm no other threads are alive except the current thread;If the current thread is not the leader, also ensure the old leader has entered the
ZombieorDeadstate;If a fatal signal is received or the wait is interrupted, return
EAGAIN.Note: Other threads in the exit path (
exit_notify) will decrement the count and wake up the waiting queue; if the executor exits abnormally, it will also clean up theGROUP_EXECflag in this path.
Phase 4: Identity Exchange (Identity Theft)
Only executed when the Exec-Thread is not the leader
PID/TID Exchange: Call
ProcessManager::exchange_tid_and_raw_pids.Swap the PID mapping relationships of the two in the global process table and swap the TID/PID fields inside the PCB.
Result: The Exec-Thread obtains the original TGID, and the Old-Leader obtains a temporary PID.
Signal Semantics Adjustment:
The
exit_signalof the new leader (Exec-Thread) is set toSIGCHLD;The
exit_signalof the Old-Leader is set toINVALIDto avoid being treated as an ordinary child process.
Structure Adjustment: Set the Exec-Thread as the new
group_leaderand clear both sides ofgroup_tasks(only one thread remains after de-threading).Relationship Inheritance:
Child Processes: Migrate the entire
childrenlist of the Old-Leader to under the Exec-Thread;Parent Process: The Exec-Thread inherits the
parent/real_parentof the Old-Leader and updatesfork_parentto itself.
Phase 5: Resource Cleanup
Reclaim Old Leader: If the Old-Leader is already a Zombie or Dead, the Exec-Thread directly marks it as Dead and releases the PID resource.
Completion: Regardless of success or failure, the
GROUP_EXECflag is ultimately cleared, and waiters are awakened.
3.3 Concurrency Protection: Preventing Parent Process from Incorrectly Reclaiming
This is the most delicate part of the implementation.
Problem Scenario:
Between Phase 3 and Phase 4, the Old-Leader has received SIGKILL and exited into a Zombie state. At this time, if the parent process calls wait(), it may find the Old-Leader (PID=TGID) is a Zombie and reclaim it.
If the parent process reclaims the Old-Leader before the Exec-Thread completes the PID exchange, the Exec-Thread will not be able to steal that PID (since the process corresponding to that PID has disappeared), or it may cause logical confusion.
Solution:
In the kernel/src/process/exit.rs’s do_wait logic, a check function reap_blocked_by_group_exec is added:
fn reap_blocked_by_group_exec(child_pcb: &Arc<ProcessControlBlock>) -> bool {
// 如果子进程是 Leader,且标记了 GROUP_EXEC,
// 且执行 exec 的不是它自己,那么说明它正在等待被“篡位”,父进程不能回收它。
if !child_pcb.is_thread_group_leader() {
return false;
}
if !child_pcb.sighand().flags_contains(SignalFlags::GROUP_EXEC) {
return false;
}
let exec_task = child_pcb.sighand().group_exec_task();
exec_task
.as_ref()
.map(|t| !Arc::ptr_eq(t, child_pcb))
.unwrap_or(true)
}
This check ensures that even if the Old-Leader becomes a Zombie, as long as the GROUP_EXEC flag is still present and exec_task is not it, the parent process cannot reclaim it. This provides a protective umbrella for the Exec-Thread to safely perform the PID/TID exchange.
4. Flowchart
sequenceDiagram
participant P as Parent Process
participant OL as Old Leader (PID=100)
participant ET as Exec Thread (PID=101)
participant S as Sibling (PID=102)
Note over P, S: 初始状态:TGID=100. Exec Thread 想要执行 exec
ET->>ET: start_group_exec()<br/>Set GROUP_EXEC flag
ET->>S: Send SIGKILL
ET->>OL: Send SIGKILL
ET->>ET: Wait for siblings & leader death
S->>S: Exit -> Zombie
S-->>ET: Notify (count--)
OL->>OL: Exit -> Zombie
OL-->>ET: Notify (count--)
Note right of P: Parent 调用 wait(PID=100)
P->>OL: Check PID 100 (Zombie)
P->>P: reap_blocked_by_group_exec?<br/>Yes! (Blocked by GROUP_EXEC)
Note right of P: Parent 无法回收 Old Leader,继续等待
ET->>ET: Woken up (All dead)
Note over ET, OL: 开始身份交换 (Identity Theft)
ET->>OL: exchange_tid_and_raw_pids()
Note over ET, OL: ET 变成 PID=100<br/>OL 变成 PID=101
ET->>ET: Inherit Parent/Children
ET->>OL: Release/Reap Old Leader
ET->>OL: OL 彻底销毁
ET->>ET: Finish GROUP_EXEC
Note over ET: exec 成功,ET 以 PID=100 运行新程序
5. References
Linux Kernel Source:
fs/exec.c(de_threadfunction)DragonOS Source:
kernel/src/process/exec.rskernel/src/process/exit.rskernel/src/ipc/sighand.rs