WARNING: this post talks about low level stuff, like named pipes, O_RDONLY, O_NONBLOCK which might sound like gibberish, so read on only if you need to reliably check the status (alive or not) of a given process.
Problem statement: we have a process and we need to programmatically check if the process is running. We want this check to be both reliable and cheap
This problem is pretty common whenever there is a daemon process and some other process (not necessarily a child) needs to check if the daemon is alive or not. Many solutions to this problem have been implemented in the wild, I’ll mention two common ones:
- PID file – the daemon creates a file and writes its process id. Processes interested in its status read the .pid file and check if the process is still running + some extra checks to ensure the PID wasn’t recycled. On graceful exit, daemon cleans up the .pid file
- socket/endpoint – the daemon opens up a local port and listens for connections. Processes interested in the status of the process, connect on the local port and probe the status
The above have either reliability issues and/or are not exactly cheap (at least a few syscalls each).
The proposed solution relies on the following behavior of named pipes:
A process can open a FIFO in nonblocking mode. In this case, opening for read-only succeeds even if no one has opened on the write side yet and opening for write-only fails with ENXIO (no such device or address) unless the other end has already been opened.
At startup, the daemon process creates and opens a named pipe (in a well known location) in read-only and non-blocking mode (O_RDONLY | O_NONBLOCK). This will succeed even if no writer are attached to the pipe.
Cost: 2 syscalls (mkfifo + open)
Processes interested in knowing the status of the daemon attempt to open the same pipe in write-only and non-blocking mode (O_WRONLY | O_NONBLOCK). This will only succeed if some other process/thread (the daemon) has opened the pipe for reading.
Cost: 1 syscalls (open)
The solution is simple, cheap and resilient to the daemon process exiting uncleanly, OS crash/reboot etc. If the daemon process ends up in a deadlock or some other hung state this solutions is not sufficient as another bit of information is needed, ie alive + well.
Enough explaining, checkout the C/C++ code or GO code , feel free to reuse/borrow etc. The concept works equally well on other languages, as long as you have access to mkfifo and non-blocking flags for file open
If you need to the same thing on Windows, you can use similar approach but open a file in exclusive mode (see dwShareMode)