cosmopolitan/libc/intrin/sched_yield.S
Justine Tunney 791f79fcb3
Make improvements
- We now serialize the file descriptor table when spawning / executing
  processes on Windows. This means you can now inherit more stuff than
  just standard i/o. It's needed by bash, which duplicates the console
  to file descriptor #255. We also now do a better job serializing the
  environment variables, so you're less likely to encounter E2BIG when
  using your bash shell. We also no longer coerce environ to uppercase.

- execve() on Windows now remotely controls its parent process to make
  it spawn a replacement for the caller. The caller can then terminate
  immediately once the spawn succeeds, rather than lingering for the
  replacement's lifetime as a shell process that proxies the exit code.
  When the process worker thread running in the parent sees the child
  die, it's given a handle to the new child, which replaces it in the
  process table.

- execve() and posix_spawn() on Windows will now provide CreateProcess
  an explicit handle list. This allows us to remove handle locks which
  enables better fork/spawn concurrency, with seriously correct thread
  safety. Other codebases like Go use the same technique. On the other
  hand fork() still favors the conventional WIN32 inheritance approach
  which can be a little bit messy, but is *controlled* by guaranteeing
  perfectly clean slates at both the spawning and execution boundaries.

- sigset_t is now 64 bits. Having it be 128 bits was a mistake because
  there's no reason to use that and it's only supported by FreeBSD. By
  using the system word size, signal mask manipulation on Windows goes
  very fast. Furthermore @asyncsignalsafe funcs have been rewritten on
  Windows to take advantage of signal masking, now that it's much more
  pleasant to use.

- All the overlapped i/o code on Windows has been rewritten for pretty
  good signal and cancelation safety. We're now able to ensure overlap
  data structures are cleaned up so long as you don't longjmp() out of
  a signal handler that interrupted an i/o operation. Latencies
  are also improved thanks to the removal of lots of "busy wait" code.
  Waits should be optimal for everything except poll(), which shall be
  the last and final demon we slay in the win32 i/o horror show.

- getrusage() on Windows is now able to report RUSAGE_CHILDREN as well
  as RUSAGE_SELF, thanks to aggregation in the process manager thread.
2023-10-08 08:59:53 -07:00
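
As a rough illustration of the getrusage() item above, the following C
sketch reads CPU time for both RUSAGE_SELF and RUSAGE_CHILDREN. The helper
names timeval_to_micros() and cpu_micros() are hypothetical and do not come
from this commit; only the standard POSIX getrusage() interface is assumed.

```c
#include <assert.h>
#include <sys/resource.h>
#include <sys/time.h>

// Hypothetical helper: converts a struct timeval to microseconds.
static long long timeval_to_micros(struct timeval tv) {
  return (long long)tv.tv_sec * 1000000 + tv.tv_usec;
}

// Hypothetical helper: returns user + system CPU time in microseconds
// for `who`, or -1 if getrusage() fails. Only the two selectors named
// in the commit message are accepted.
long long cpu_micros(int who) {
  struct rusage ru;
  assert(who == RUSAGE_SELF || who == RUSAGE_CHILDREN);
  if (getrusage(who, &ru) == -1) return -1;
  return timeval_to_micros(ru.ru_utime) + timeval_to_micros(ru.ru_stime);
}
```

RUSAGE_CHILDREN only accounts for children that have terminated and been
waited for, so a process that has spawned nothing would be expected to see
zero there.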


/*-*- mode:unix-assembly; indent-tabs-mode:t; tab-width:8; coding:utf-8 -*-│
vi: set et ft=asm ts=8 tw=8 fenc=utf-8 :vi
Copyright 2022 Justine Alexandra Roberts Tunney
Permission to use, copy, modify, and/or distribute this software for
any purpose with or without fee is hereby granted, provided that the
above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
*/
#include "libc/dce.h"
#include "libc/sysv/consts/nr.h"
#include "libc/macros.internal.h"
// Relinquishes scheduled quantum.
//
// @return 0 on success, or -1 w/ errno
.ftrace1
sched_yield:
.ftrace2
#ifdef __x86_64__
push %rbp
mov %rsp,%rbp
xor %eax,%eax
mov __hostos(%rip),%dl
#if SupportsMetal()
testb $_HOSTMETAL,%dl
jnz 9f
#endif
#if SupportsWindows()
// Windows Support
//
// A value of zero, together with the bAlertable parameter set to
// FALSE, causes the thread to relinquish the remainder of its time
// slice to any other thread that is ready to run, if there are no
// pending user APCs on the calling thread. If there are no other
// threads ready to run and no user APCs are queued, the function
// returns immediately, and the thread continues execution.
// Quoth MSDN
testb $_HOSTWINDOWS,%dl
jz 1f
xor %ecx,%ecx
xor %edx,%edx
ntcall __imp_SleepEx
xor %eax,%eax
jmp 9f
1:
#endif
#if SupportsSystemv()
// On XNU we polyfill sched_yield() using sleep(), which is in
// turn polyfilled using select() with a zero timeout. That
// waits zero microseconds and then returns zero, which will
// hopefully give other threads a chance to run too. XNU has
// a special version we use called select_nocancel.
//
// "If the readfds, writefds, and errorfds arguments are
// all null pointers and the timeout argument is not a
// null pointer, the pselect() or select() function shall
// block for the time specified, or until interrupted by
// a signal." Quoth IEEE 1003.1-2017 §functions/select
//
// On other platforms, sched_yield() takes no arguments.
push $0 // timeout.tv_usec
push $0 // timeout.tv_sec
xor %edi,%edi // nfds
xor %esi,%esi // readfds
xor %edx,%edx // writefds
xor %r10d,%r10d // exceptfds
mov %rsp,%r8 // timeout
mov __NR_sched_yield,%eax // ordinal
clc // linux
syscall
// It should not be possible for this to fail so we don't
// bother going through the errno ritual. If this somehow
// fails a positive or negative errno might get returned.
#endif
9: leave
ret
#elif defined(__aarch64__)
stp x29,x30,[sp,-32]!
mov x29,sp
mov x3,0
mov x2,0
add x4,sp,16
mov x1,0
mov w0,0
stp xzr,xzr,[sp,16]
mov x8,#0x7c // sched_yield() for gnu/systemd
mov x16,#0x5d // select(0,0,0,0,&blah) for xnu
svc 0
ldp x29,x30,[sp],32
ret
#else
#error "arch unsupported"
#endif
.endfn sched_yield,globl
.previous
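
The XNU fallback described in the comments above can be approximated in
portable C. This is only a sketch under stated assumptions: yield_via_select()
and check_yield_paths() are hypothetical names, and the code calls the regular
POSIX select() rather than the select_nocancel variant the assembly uses.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

// Hypothetical helper: asks the kernel to wait on no file descriptors
// with a zero timeout, which returns 0 immediately and gives the
// scheduler an opportunity to run another thread.
int yield_via_select(void) {
  struct timeval timeout = {0, 0};  // tv_sec = 0, tv_usec = 0
  return select(0, NULL, NULL, NULL, &timeout);
}

// Hypothetical check: both the polyfill and the real call should
// report success when nothing interrupts them.
void check_yield_paths(void) {
  assert(yield_via_select() == 0);
  assert(sched_yield() == 0);
}
```

Unlike the assembly, this version goes through the libc wrapper, so a failure
would be reported as -1 with errno set instead of a raw kernel return value.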