or, how one thread should tell another thread to shut down when it might be doing a blocking call on a socket.
Unfortunately there doesn't seem to be a standard way of doing this that works across all Unix systems. I have investigated the behaviour of our two main Unix platforms, Solaris 2.5 and Digital Unix 3.2. On Digital Unix everything is fine: the obvious method, calling shutdown() on the socket, works for every case covered below. On Solaris, shutdown() can only be used on a connected socket, so we need devious means to get around this limitation. The details are summarised below:
Thread A is in a loop, doing read(sock), processing the data, then going back into the read.
Thread B comes along and wants to shut it down. It can't cancel thread A, since (i) working out how to clean up according to where A is in its loop is a nightmare, and (ii) thread cancellation isn't available in omnithread anyway.
On Solaris 2.5 and Digital Unix 3.2 the following strategy works:
Thread B does shutdown(sock,2).
At this point thread A is either blocked inside read(sock), or is elsewhere in the loop. If the former then read will return 0, indicating that the socket is closed. If the latter then eventually thread A will call read(sock) and then this will return 0. Thread A should close(sock), do any other tidying up, and exit.
If there is another point in the loop that thread A can block then obviously thread B needs to be aware of this and be able to wake it up in the appropriate way from that point.
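The read strategy above can be sketched roughly as follows. This is only an illustration in Python, whose socket module wraps the same read/shutdown/close calls; the names reader_loop and demo are made up for the example, and a socketpair stands in for a real connected socket. It assumes a platform where shutdown wakes a blocked read, as reported above for both systems.

```python
import socket
import threading

def reader_loop(sock, received):
    """Thread A: read until recv() reports end-of-stream, then tidy up."""
    while True:
        data = sock.recv(4096)   # blocks until data arrives or the socket is shut down
        if not data:             # b"" is Python's form of read() returning 0
            break
        received.append(data)    # stand-in for "processing the data"
    sock.close()                 # thread A does the close, not thread B

def demo():
    # Hypothetical driver: a socketpair stands in for a connected socket.
    sock, peer = socket.socketpair()
    received = []
    thread_a = threading.Thread(target=reader_loop, args=(sock, received), daemon=True)
    thread_a.start()
    peer.sendall(b"hello")
    # Thread B: the equivalent of shutdown(sock, 2).
    sock.shutdown(socket.SHUT_RDWR)
    thread_a.join(timeout=5)
    peer.close()
    return thread_a.is_alive(), b"".join(received)
```

Note that thread B only calls shutdown; the close is left to thread A, so the descriptor is never closed while another thread may still be using it.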
Again thread A is in a loop, this time doing an accept on listenSock, dealing with a new connection and going back into accept. Thread B wants to cancel it.
On Digital Unix 3.2 the strategy is identical to that for read:
Thread B does shutdown(listenSock,2). Wherever thread A is in the loop, its current or next call to accept will fail with ECONNABORTED. Thread A should then close(listenSock), tidy up as necessary and exit.
On Solaris 2.5 thread B can't do shutdown(listenSock,2) - this returns ENOTCONN. Instead the following strategy can be used:
First thread B sets some sort of "shutdown flag" associated with listenSock. Then it calls getsockname(listenSock) to find out which port listenSock is on (unless it knows already), creates a socket dummySock, does connect(dummySock, this host, port) and finally does close(dummySock).
Now wherever thread A is in the loop, eventually it will call accept(listenSock). This will return successfully with a new socket, say connSock. Thread A then checks to see if the "shutdown flag" is set. If not, then it's a normal connection. If it is set, then thread A closes listenSock and connSock, tidies up and exits.
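A minimal sketch of the dummy-connection trick, again in Python. The names accept_loop, wake_acceptor and demo are hypothetical, a threading.Event plays the part of the "shutdown flag", and the "ok" reply exists only so the demo can tell that the ordinary connection was handled before the shutdown is requested.

```python
import socket
import threading

def accept_loop(listen_sock, shutdown_flag, handled):
    """Thread A: accept connections until the shutdown flag is seen."""
    while True:
        conn_sock, addr = listen_sock.accept()
        if shutdown_flag.is_set():       # this was the dummy connection
            conn_sock.close()
            listen_sock.close()
            return
        handled.append(addr)             # stand-in for real connection handling
        conn_sock.sendall(b"ok")
        conn_sock.close()

def wake_acceptor(listen_sock, shutdown_flag):
    """Thread B: set the flag, then make a throwaway connection."""
    address = listen_sock.getsockname()  # read the port while the socket is still open
    shutdown_flag.set()                  # must be set before the connect
    dummy_sock = socket.create_connection(address, timeout=5)
    dummy_sock.close()

def demo():
    listen_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listen_sock.bind(("127.0.0.1", 0))   # port 0: let the system pick a free port
    listen_sock.listen(5)
    flag = threading.Event()
    handled = []
    thread_a = threading.Thread(target=accept_loop,
                                args=(listen_sock, flag, handled), daemon=True)
    thread_a.start()
    # An ordinary client first, to show normal connections are unaffected.
    client = socket.create_connection(listen_sock.getsockname(), timeout=5)
    reply = client.recv(2)
    client.close()
    wake_acceptor(listen_sock, flag)
    thread_a.join(timeout=5)
    return thread_a.is_alive(), len(handled), reply
```

Because the flag is only checked after accept returns, a genuine client that connects just after the flag is set will be closed without being served; that is inherent in the strategy rather than a quirk of the sketch.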
Thread A may be blocked in write, or about to enter a potentially blocking write. Thread B wants to shut it down.
On Solaris 2.5:
Thread B does shutdown(sock,2).
If thread A is already in write(sock) then write will fail with ENXIO. If thread A calls write after thread B calls shutdown, write will fail with EIO.
On Digital Unix 3.2:
Thread B does shutdown(sock,2).
If thread A is already in write(sock) then it will return the number of bytes written before it became blocked. A subsequent call to write will then generate SIGPIPE (or EPIPE will be returned if SIGPIPE is ignored by the thread).
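Since the error seen by the writer differs between platforms (ENXIO, EIO, a short count followed by EPIPE), a writer that wants to cope with all of them can treat any error from write as the signal to stop. The following Python sketch shows the idea; writer_loop and demo are invented names, a socketpair whose other end never reads is used to make the writer block, and it assumes a platform where shutdown wakes a blocked writer. Python ignores SIGPIPE by default, so EPIPE surfaces as an exception rather than a signal.

```python
import socket
import threading
import time

def writer_loop(sock, result):
    """Thread A: write until the socket reports an error, then tidy up."""
    chunk = b"x" * 65536
    try:
        while True:
            # May block once the buffers fill; may also return a short count.
            result["bytes"] += sock.send(chunk)
    except OSError as exc:        # EPIPE, EIO, ENXIO, ... depending on platform
        result["error"] = exc
    finally:
        sock.close()

def demo():
    sock, peer = socket.socketpair()    # peer never reads, so the writer soon blocks
    result = {"bytes": 0, "error": None}
    thread_a = threading.Thread(target=writer_loop, args=(sock, result), daemon=True)
    thread_a.start()
    time.sleep(0.2)                     # give thread A time to fill the buffers
    sock.shutdown(socket.SHUT_RDWR)     # thread B: the equivalent of shutdown(sock, 2)
    thread_a.join(timeout=5)
    peer.close()
    return thread_a.is_alive(), result
```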
Thread A may be blocked in connect, or about to enter a potentially blocking connect. Thread B wants to shut it down.
On Digital Unix 3.2:
Thread B does shutdown(sock,2).
If thread A is already in connect(sock) then it will return a successful connection. Subsequent reading or writing will show that the socket has been shut down (i.e. read returns 0, write generates SIGPIPE or returns EPIPE). If thread A calls connect after thread B calls shutdown this will return EINVAL.
On Solaris 2.5:
There is no way to wake up a thread which is blocked in connect. Instead Solaris forces us through a ridiculous procedure whichever way we try it. One way is this:
First thread A creates a pipe in addition to the socket. Instead of shutting down the socket, thread B simply writes a byte to the pipe.
Thread A meanwhile sets the socket non-blocking using fcntl(sock, F_SETFL, O_NONBLOCK). Then it calls connect on the socket, which fails immediately with EINPROGRESS. Then it must call select, waiting for either sock to become writable or the pipe to become readable. If select reports that only sock is writable then the connection attempt has completed; getsockopt(sock, SOL_SOCKET, SO_ERROR) will confirm whether it succeeded, since a failed attempt also makes the socket writable. Thread A then needs to set the socket back to blocking mode using fcntl(sock, F_SETFL, 0). If instead select reports that the pipe is readable, thread A closes the socket, tidies up and exits.
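The pipe-and-select procedure might look something like this in Python. The function name cancellable_connect and its parameters are made up for the sketch; wake_fd is the read end of the pipe that thread B writes a byte to. The SO_ERROR check after select is an extra safety step added here, because a failed connection attempt also marks the socket writable.

```python
import errno
import os
import select
import socket

def cancellable_connect(address, wake_fd):
    """Thread A: return a connected, blocking socket, or None if cancelled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)               # fcntl(sock, F_SETFL, O_NONBLOCK)
    err = sock.connect_ex(address)        # normally EINPROGRESS, sometimes 0
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(err, os.strerror(err))
    # Wait for the connection to finish or for thread B to write to the pipe.
    readable, writable, _ = select.select([wake_fd], [sock], [])
    if wake_fd in readable:               # cancellation takes priority
        sock.close()
        return None
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:                          # writable, but the connect failed
        sock.close()
        raise OSError(err, os.strerror(err))
    sock.setblocking(True)                # fcntl(sock, F_SETFL, 0)
    return sock
```

Thread B would create the pipe with os.pipe(), hand the read end to thread A, and cancel with os.write(write_fd, b"x").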
An alternative method is similar but uses polling instead of the pipe. Thread B just sets a flag, and thread A calls select with a timeout, periodically waking up to see whether the flag has been set.
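A sketch of the polling variant, under the same caveats. The name polling_connect, the stop_flag parameter (a threading.Event standing in for the flag) and the 0.1 second poll interval are all illustrative choices, not anything prescribed above.

```python
import errno
import os
import select
import socket
import threading

def polling_connect(address, stop_flag, poll_interval=0.1):
    """Thread A: connect, checking stop_flag every poll_interval seconds.

    Returns a connected, blocking socket, or None if stop_flag was set.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(err, os.strerror(err))
    while not stop_flag.is_set():
        # Wake up after poll_interval even if nothing has happened.
        _, writable, _ = select.select([], [sock], [], poll_interval)
        if writable:
            err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err != 0:
                sock.close()
                raise OSError(err, os.strerror(err))
            sock.setblocking(True)
            return sock
    sock.close()                          # thread B asked us to stop
    return None
```

The trade-off against the pipe version is latency: thread A may take up to one poll interval to notice the flag, and it wakes up periodically even when nothing is happening.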