pthread_atfork and locks

Discussion:

Taylor R Campbell

2014-06-23 14:09:04 UTC

pthread_atfork(3) says that no pthread_* routines may be used in the
child handler. But this completely defeats the purpose of
pthread_atfork. As the POSIX rationale section explains,

`The expected usage is that the prepare handler acquires all mutex
locks and the other two fork handlers release them.'
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_atfork.html

There is a long comment in src/lib/libc/gen/pthread_atfork.c
explaining why the child does mutex_init instead of mutex_unlock: if
the spin lock inside a mutex is held by another thread, then trying to
unlock the mutex in the child will wait forever because the other
thread won't be there in the child.

That comment was written by nathanw@ in 2003 and is no longer accurate
about the state of affairs, since libpthread has been basically
completely rewritten and there is no internal spin lock involved. I
believe it is still safe to use mutex_init, but that doesn't help
portable code.

So, is it safe now to use mutex_unlock?

Cursory examination of pthread_mutex.c suggests to me that it is, in
spite of the exhortation in the pthread_atfork(3) man page. But
because of that exhortation I'm not sure we really guarantee it to
work. It would be good if we made sure pthread_mutex_unlock reliably
works in a child handler, and documented the fact and stuck to it.

We already do use pthread_mutex_unlock in pthread_atfork child
handlers -- see, e.g., the arc4random(3) code. (We could plausibly
make the arc4random state thread-local to sidestep this issue, but
that's not true of other global databases.)

Greg Troxel

2014-06-23 15:26:09 UTC

Post by Taylor R Campbell
pthread_atfork(3) says that no pthread_* routines may be used in the
child handler. But this completely defeats the purpose of
pthread_atfork. As the POSIX rationale section explains,
`The expected usage is that the prepare handler acquires all mutex
locks and the other two fork handlers release them.'
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_atfork.html

This whole issue is a big mess. The basic issue is the notion of
"async-signal-safe", and which calls may be invoked from a signal
handler. After a fork, only one thread survives, and you basically
can't do much in the child except exec.

As I read POSIX, I don't see the restriction about pthread calls in the
child. But it may be implicit with the requirement not to use calls
which are not documented to be async-signal-safe.

We had a big problem with this where Python (among other things) would
fork and then call malloc in the child. If the malloc lock were taken
by some other thread in the parent, then it would deadlock. Python is
wrong here, as malloc is not specified to be async-signal-safe. But
the workaround (which I'm a bit fuzzy on) was to take the malloc lock
in the before handler and release it in both the parent and child
handlers -- in libc, not in Python.

Post by Taylor R Campbell
There is a long comment in src/lib/libc/gen/pthread_atfork.c
explaining why the child does mutex_init instead of mutex_unlock: if
the spin lock inside a mutex is held by another thread, then trying to
unlock the mutex in the child will wait forever because the other
thread won't be there in the child.

I think this is unsound (calling mutex_init). If the mutex is acquired
in the thread that calls fork, then it should be safe to unlock it.

Are you trying to figure out how to have the child act like a
full-fledged process without doing an exec? I don't think that's
really possible.

Taylor R Campbell

2014-06-23 20:16:02 UTC

Date: Mon, 23 Jun 2014 11:26:09 -0400
From: Greg Troxel <***@ir.bbn.com>

We had a big problem with this where Python (among other things) would
fork and then call malloc in the child. If the malloc lock were taken
by some other thread in the parent, then it would deadlock. Python is
wrong here, as malloc is not specified to be async-signal-safe. But
the workaround (which I'm a bit fuzzy on) was to take the malloc lock
in the before handler and release it in both the parent and child
handlers -- in libc, not in Python.

As an aside, that's probably still an issue with NetBSD libc -- I
don't see any atfork handler in our malloc code.

I think this is unsound (calling mutex_init). If the mutex is acquired
in the thread that calls fork, then it should be safe to unlock it.

Calling mutex_init is safe in our implementation of libpthread. It's
just not portable. The question with mutex_unlock is whether, if some
other thread was trying to lock the mutex at the time of the fork,
some state internal to the mutex might be inconsistent in the child.

Are you trying to figure out how to have the child act like a
full-fledged process without doing an exec? I don't think that's
really possible.

Trying to figure out how to make a library with global state work in
multithreaded programs and in programs that use fork, without randomly
hanging or duplicating invariants or (for arc4random(3)) leaking
secrets to the children or anything horrible like that.

As far as I can tell, the intent in POSIX is that you should be able
to pthread_atfork(/*prepare*/lock, /*parent*/unlock, /*child*/unlock).
But I'm not confident we guarantee that to work in NetBSD, and we
ought to.
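
A minimal sketch of that prepare/parent/child pattern in C. The names
state_lock, state_counter, state_init, state_bump and fork_and_bump are
hypothetical, invented for illustration. Whether the unlock in the child
handler is guaranteed to work on a given libpthread is exactly the open
question in this thread, so read this as the usage POSIX appears to
intend, not as documented NetBSD behaviour.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library state guarded by one global mutex. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int state_counter;

/* prepare: the forking thread takes the lock, so it owns it across fork. */
static void atfork_prepare(void) { pthread_mutex_lock(&state_lock); }
/* parent: release the lock taken in prepare. */
static void atfork_parent(void) { pthread_mutex_unlock(&state_lock); }
/* child: release the copy of the lock inherited from the forking thread. */
static void atfork_child(void) { pthread_mutex_unlock(&state_lock); }

/* Register the handlers. Returns 0 on success, an error number otherwise. */
int
state_init(void)
{
    return pthread_atfork(atfork_prepare, atfork_parent, atfork_child);
}

/* Increment the shared counter under the lock and return the new value. */
int
state_bump(void)
{
    int v;

    pthread_mutex_lock(&state_lock);
    v = ++state_counter;
    pthread_mutex_unlock(&state_lock);
    return v;
}

/*
 * Fork, touch the locked state in the child, and return the child's
 * exit status (0 if the child could take and release the lock), or -1.
 */
int
fork_and_bump(void)
{
    int status;
    pid_t pid, r;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(state_bump() > 0 ? 0 : 1);
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A caller would run state_init() once at library initialization; after
that, fork_and_bump() should return 0 if the child handler's unlock left
the mutex usable.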

Greg Troxel

2014-06-23 20:35:56 UTC

Post by Greg Troxel
We had a big problem with this where Python (among other things) would
fork and then call malloc in the child. If the malloc lock were taken
by some other thread in the parent, then it would deadlock. Python is
wrong here, as malloc is not specified to be async-signal-safe. But
the workaround (which I'm a bit fuzzy on) was to take the malloc lock
in the before handler and release it in both the parent and child
handlers -- in libc, not in Python.

Post by Taylor R Campbell
As an aside, that's probably still an issue with NetBSD libc -- I
don't see any atfork handler in our malloc code.

There is _malloc_prefork() which gets the lock. I am not sure how
that's hooked in, but I remember someone (Christos) adding this from FreeBSD.

Post by Greg Troxel
I think this is unsound (calling mutex_init). If the mutex is acquired
in the thread that calls fork, then it should be safe to unlock it.

Post by Taylor R Campbell
Calling mutex_init is safe in our implementation of libpthread. It's
just not portable. The question with mutex_unlock is whether, if some
other thread was trying to lock the mutex at the time of the fork,
some state internal to the mutex might be inconsistent in the child.

But if this thread had gotten the mutex, and then there was a signal,
we'd be clearing it. That's what I meant by unsound.

If the forking thread has the mutex, then the idea that some other
thread would be trying and somehow causing trouble doesn't make sense.
How is that different from the situation without fork? I think the
scary part about threaded fork is that the other threads just end in
some random place, possibly at a messy moment. (Hopefully fork() waits
until the other threads are descheduled.)

Post by Greg Troxel
Are you trying to figure out how to have the child act like a
full-fledged process without doing an exec? I don't think that's
really possible.

Post by Taylor R Campbell
Trying to figure out how to make a library with global state work in
multithreaded programs and in programs that use fork, without randomly
hanging or duplicating invariants or (for arc4random(3)) leaking
secrets to the children or anything horrible like that.

But POSIX is very clear that in the child after fork from a threaded
program, only calls that are defined to be async-signal-safe may be
called:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html

but I guess defensive programming about this is fine, to avoid
disclosure.

I've come to believe that in a threaded program, the only sane thing
after fork is to immediately exec. But I can see trying to make this
work by taking all the mutexes before fork to make sure that the forking
thread has them.
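
For comparison, the conservative fork-then-exec pattern might look like
the sketch below. spawn_and_wait is a hypothetical helper, not an
existing API. The child calls only execv and _exit, both of which POSIX
lists as async-signal-safe, so it never touches a lock that some
vanished thread might have held.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program argv[0] with arguments argv in a child process and
 * return its exit status, or -1 on failure. The child does nothing
 * between fork and exec.
 */
int
spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid, r;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);     /* reached only if execv failed */
    }
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, calling it with the argument vector { "/bin/sh", "-c",
"exit 7", NULL } should return 7 on a system that has /bin/sh.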

Post by Taylor R Campbell
As far as I can tell, the intent in POSIX is that you should be able
to pthread_atfork(/*prepare*/lock, /*parent*/unlock, /*child*/unlock).
But I'm not confident we guarantee that to work in NetBSD, and we
ought to.

I would think that is safe, just based on the implementation. Do you
find that it isn't?

Taylor R Campbell

2014-06-25 00:27:44 UTC

[FYI, in case it's not clear, in this thread when I have talked about
reinitializing or unlocking a mutex in the child, I assume that the
mutex was already locked by the forking thread in a pthread_atfork
prepare handler. I'm not interested in messing with a mutex that
could have been held by just anyone in the parent.]

Date: Mon, 23 Jun 2014 16:35:56 -0400
From: Greg Troxel <***@ir.bbn.com>

There is _malloc_prefork() which gets the lock. I am not sure how
that's hooked in, but I remember someone (Christos) adding this from FreeBSD.

Evidently it's not hooked in -- PR 47428 is still open.

If the forking thread has the mutex, then the idea that some other
thread would be trying and somehow causing trouble doesn't make sense.

Suppose mutex_lock looked like this:

    void
    mutex_lock(struct mutex *m)
    {

            spin_lock(&m->mtx_spin);
            /* (*) */
            while (m->mtx_owner)
                    cv_wait(&m->mtx_cv, &m->mtx_spin);
            m->mtx_owner = curthread();
            spin_unlock(&m->mtx_spin);
    }

If the actual kernel-level fork happens at the time of (*), then even
if the thread calling fork() owns the mutex, the mutex's internal spin
lock would be held by a nonexistent thread in the child. So if the
child ever tried to unlock the mutex, or do anything requiring taking
the internal spin lock, it would hang.

mutex_lock doesn't look like that, however. My cursory analysis of
our actual mutex code suggests calling mutex_unlock in the child
should be safe, assuming the thread that forked in the parent held the
lock in the first place. But I'm not sure, and the pthread_atfork(3)
man page says pthread_* is unsafe in the child with no exception for
pthread_mutex_unlock, hence my inquiry to the list.

Post by Taylor R Campbell
As far as I can tell, the intent in POSIX is that you should be able
to pthread_atfork(/*prepare*/lock, /*parent*/unlock, /*child*/unlock).
But I'm not confident we guarantee that to work in NetBSD, and we
ought to.

I would think that is safe, just based on the implementation. Do you
find that it isn't?

rmind@ was sceptical when I asked him, but I couldn't find any way
that it could go wrong.