There is a bug in WIN32 where using CancelIoEx() on an overlapped i/o op
initiated by WSASend() will cause WSAGetOverlappedResult() to report the
operation failed when it actually succeeded. We now work around that, by
having send and sendto initially consult WSAPoll() on O_NONBLOCK sockets
The C standard specifies that, upon handling the a conversion specifier,
the argument is converted to a string in which "there is one hexadecimal
digit (which is nonzero [...]) before the decimal-point character", this
being a requirement which cosmopolitan does not currently always handle,
sometimes printing numbers like "0x0.1p+5", where a correct output would
have been e.g. "0x1.0p+1" (despite both representing the same value, the
first one illegally has a '0' digit before the decimal-point character).
The C standard indicates that when processing the a conversion specifier
"if the precision is zero *and* the # flag is not specified, no decimal-
point character appears.". This means that __fmt needs to ensure that it
prints the decimal-point character not only when the precision is non-0,
but also when the # flag is specified - cosmopolitan currently does not.
This patch fixes this, along with adding a few tests for this behaviour.
The a conversion specifier to printf had some issues w.r.t. rounding, in
particular in edge cases w.r.t. "to nearest, ties to even" rounding (for
instance, "%.1a" with 0x1.78p+4 outputted 0x1.7p+4 instead of 0x1.8p+4).
This patch fixes this and adds several tests w.r.t ties to even rounding
Spawning processes would leak lots of memory, due to a missing free call
in ntspawn(). Our tooling never caught this since ntspawn() must use the
WIN32 memory allocator. It means every time posix_spawn, fork, or execve
got called, we would leak 162kb of memory. I'm proud to say that's fixed
You would think this is an important bug fix, but unfortunately all UNIX
implementations I've evaluated have a bug in read that causes signals to
not be handled atomically. The only exception is the latest iteration of
Cosmopolitan's read/write polyfill on Windows, which is somewhat ironic.
POSIX specifies the <apostrophe> flag character for printf as formatting
decimal conversions with the thousands' grouping characters specified by
the current locale. Given that cosmopolitan currently has no support for
obtaining the locale's grouping character, all that is required (when in
the C/POSIX locale) for supporting this flag is ignoring it, and as it's
already used to indicate quoting (for non-decimal conversions), all that
has to be done is to avoid having it be an alias for the <space> flag so
that decimal conversions don't accidentally behave as though the <space>
flag has also been specified whenever the <apostrophe> flag is utilized.
This patch adds this flag, as described above, along with a test for it.
When reading hexadecimal floats, cosmopolitan would previously sometimes
print a number of warnings relating to undefined behavior on left shift:
third_party/gdtoa/gethex.c:172: ubsan warning: signed left shift changed
sign bit or overflowed 12 'int' 28 'int' is undefined behavior
This is because gdtoa assumes left shifts are safe when overflow happens
even on signed integers - this is false: the C standard considers it UB.
This is easy to fix, by simply casting the shifted value to unsigned, as
doing so does not change the value or the semantics of the left shifting
(except for avoiding the undefined behavior, as the C standard specifies
that unsigned overflow yields wraparound, avoiding undefined behaviour).
This commit does this, and adds a testcase that previously triggered UB.
(this also adds test macros to test for exact float equality, instead of
the existing {EXPECT,ASSERT}_FLOAT_EQ macros which only tests inputs for
being "almost equal" (with a significant epsilon) whereas exact equality
makes more sense for certain things such as reading floats from strings,
and modifies other testcases for sscanf/fscanf of floats to utilize it).
This test would sometimes crash due to the EZBENCH2() macro occasionally
running the first benchmark (BenchMmapPrivate()) less times than it does
the second benchmark (BenchUnmap()) - this would then lead to a crash in
BenchUnmap() because BenchUnmap() expects that BenchMmapPrivate() has to
previously have been called at least as many times as it has itself such
that a region of memory has been mapped, for BenchUnmap() to then unmap.
This commit fixes this by utilizing the newer BENCHMARK() macro (instead
of the EZBENCH2() macro) which runs the benchmark using an count of runs
specified directly by the benchmark itself, which allows us to make sure
that the two benchmark functions get ran the exact same amount of times.
Currently, cosmopolitan's pledge sandbox .so shared object wrongly tries
to use a bunch of UBSAN symbols, which are not defined when outside of a
cosmopolitan-based context (save if the sandboxed binary also happens to
be itself using UBSAN, but that's obviously very commonly not the case).
Fix this by making it such that the sandbox .so shared object traps when
UBSAN is triggered, avoiding any attempt to call into the UBSAN runtime.
While always blocking statx did not lead to particularly bad results for
most cases (most code that uses statx appears to utilize a fallback when
statx is unavailable), it does lead to using usually far less used (thus
far less well tested) code: for example, musl's current fstatat fallback
for statx fails to set any values for stx_rdev_major and stx_rdev_minor,
which the raw syscall wouldn't (I've have sent a patch to musl for this,
but this won't fix older versions of musl and binaries/OSes using them).
Along with the fact that statx extends stat in several useful ways, this
seems to indicate it is far better to simply allow statx whenever pledge
also allows stat-family syscalls, i.e. for both rpath and wpath pledges.
When sem_wait() used its futexes it would always use process shared mode
which can be problematic on platforms like Windows, where that causes it
to use the slow futex polyfill. Now when sem_init() is called in private
mode that'll be passed along so we can use a faster WaitOnAddress() call
Our old code wasn't working with projects like Qt that call connect() in
O_NONBLOCK mode multiple times. This change overhauls connect() to use a
simpler WSAConnect() API and follows the same pattern as cosmo accept().
This change also reduces the binary footprint of read(), which no longer
needs to depend on our enormous clock_gettime() function.
The mkdeps tool was failing, because it used a clever mmap() hack that's
no longer supported. I've also removed tinymalloc.inc from build tooling
because Windows doesn't like the way it uses overcommit memory. Sadly it
means our build tool binaries will be larger. It's less of an issue, now
that we are no longer putting build tool binaries in the git repository.
This change should fix the Windows issues Qt Creator has been having, by
ensuring accept() and accept4() work in O_NONBLOCK mode. I switched away
from AcceptEx() which is buggy, back to using WSAAccept(). This requires
making a tradeoff where we have to accept a busy loop. However it is low
latency in nature, just like our new and improved Windows poll() code. I
was furthermore able to eliminate a bunch of Windows-related test todos.
It appears that GetFileAttributes(u"\\etc\\passwd") can take two seconds
on Windows 10 at unpredictable times for reasons which are mysterious to
me. Let's try avoiding that path entirely and pray to Microsoft it works
This issue probably only impacted the earliest releases of Windows 7 and
we only support Windows 10+ these days, so it's not worth adding 2000 ms
of startup latency to vim when ~/.vim doesn't exist.
Hexadecimal printing of floating-point numbers in cosmopolitan (that is,
using the the conversion specifier) is improved to have correct rounding
of results in rounding modes other than the default one (ie. FE_NEAREST)
This commit fixes that, and adds tests for the change (note that there's
still some rounding issues with the a conversion specifier in general in
relatively rare cases (that is without non-default rounding modes) where
I've left commented-out tests for anyone interested in improving it more
This change fixes a CreateProcess failure when a process is spawned with
no handles inherited. This is due to violating a common design rule in C
that works as follows: when you have (ptr, size) the ptr must be ignored
when size is zero. That's because cosmo's malloc(0) always returns a non
null pointer, which was happening in __describe_fds(), but ntspawn() was
basing its decision off the nullness of the pointer rather than its size
When polling sockets poll() can now let you know about an event in about
10µs rather than 10ms. If you're not polling sockets then poll() reports
console events now in microseconds instead of milliseconds.
Recursive mutexes now go as fast as normal mutexes. The tradeoff is they
are no longer safe to use in signal handlers. However you can still have
signal safe mutexes if you set your mutex to both recursive and pshared.
You can also make functions that use recursive mutexes signal safe using
sigprocmask to ensure recursion doesn't happen due to any signal handler
The impact of this change is that, on Windows, many functions which edit
the file descriptor table rely on recursive mutexes, e.g. open(). If you
develop your app so it uses pread() and pwrite() then your app should go
very fast when performing a heavily multithreaded and contended workload
For example, when scaling to 40+ cores, *NSYNC mutexes can go as much as
1000x faster (in CPU time) than the naive recursive lock implementation.
Now recursive will use *NSYNC under the hood when it's possible to do so
Using callbacks is still problematic with cosmo_dlopen() due to the need
to restore the TLS register. So using callbacks is even more strict than
using signal handlers. We are better off introducing a cosmoaudio_poll()
function. It makes the API more UNIX-like. How bad could the latency be?
Currently, in cosmopolitan, there is no handling of the current rounding
mode for long double conversions, such that round-to-nearest gets always
used, regardless of the current rounding mode. %Le also improperly calls
gdtoa with a too small precision (which led to relatively similar bugs).
This patch fixes these issues, in particular by modifying the FPI object
passed to gdtoa such that it is modifiable (so that __fmt can adjust its
rounding field to correspond to FLT_ROUNDS (note that this is not needed
for dtoa, which checks FLT_ROUNDS directly)) and ors STRTOG_Neg into the
kind field in both of the __fmt_dfpbits and __fmt_ldfpbits functions, as
the gdtoa function also depends on it to be able to accurately round any
negative arguments. The change to kind also requires a few other changes
to make sure kind's upper bits (which include STRTOG_Neg) are masked off
when attempting to only examine the lower bits' value. Furthermore, this
patch also makes exactly one change in gdtoa, which appears to be needed
to fix rounding issues with FE_TOWARDZERO (this seems like a gdtoa bug).
The patch also adds a few tests for these issues, along with also taking
the opportunity to clean up some of the previous tests to do the asserts
in the right order (i.e. with the first argument as the expected result,
and the second one being used as the value that it is compared against).
Before this commit, cosmopolitan had some issues with handling arguments
of 0 and signs, such as returning an incorrect sign when the input value
== -0.0, and incorrectly handling ndigit == 0 on fcvt (ndigit determines
the amount of digits *after* the radix character on fcvt, thus the parts
before it still must be outputted before fcvt's job is completely done).
This patch fixes these issues, and adds tests with corresponding inputs.
This change introduces comsoaudio v1. We're using a new strategy when it
comes to dynamic linking of dso files and building miniaudio device code
which I think will be fast and stable in the long run. You now have your
choice of reading/writing to the internal ring buffer abstraction or you
can specify a device-driven callback function instead. It's now possible
to not open the microphone when you don't need it since touching the mic
causes security popups to happen. The DLL is now built statically, so it
only needs to depend on kernel32. Our NES terminal emulator now uses the
cosmoaudio library and is confirmed to be working on Windows, Mac, Linux
The C Standard specifies that, when a conversion specification specifies
a conversion specifier of n, the type of the passed pointer is specified
by the length modifier (if any), i.e. that e.g. the argument for %hhn is
of type signed char *, but Cosmopolitan currently does not handle this -
instead always simply assuming that the pointer always points to an int.
This patch implements, and tests, length modifiers with the n conversion
specifier, with the tests testing all of the available length modifiers.
This change gets printvideo working on aarch64. Performance improvements
have been introduced for magikarp decimation on aarch64. The last of the
old portable x86 intrinsics library is gone, but it still lives in Blink