This change adds a double linked list of threads, so that pthread_exit()
will know when it should call exit() from an orphaned child. This change
also improves ftrace and strace logging.
Since we're now on Windows 8, we can have clone() work as advertised on
Windows, where it sends a futex wake to the child tid. It's also likely
we no longer need to work around thread flakes on OpenBSD, in _wait0().
The organization of the source files is now much more rational.
Old experiments that didn't work out are now deleted. Naming of
things like files is now more intuitive.