Commit graph

957 commits

Author SHA1 Message Date
Gabriel Ravier
675abfa029
Add POSIX's apostrophe flag for printf-based funcs (#1285)
POSIX specifies the <apostrophe> flag character for printf as formatting
decimal conversions with the thousands' grouping characters specified by
the current locale. Given that cosmopolitan currently has no support for
obtaining the locale's grouping character, all that is required (when in
the C/POSIX locale) for supporting this flag is ignoring it, and as it's
already used to indicate quoting (for non-decimal conversions), all that
has to be done is to avoid having it be an alias for the <space> flag so
that decimal conversions don't accidentally behave as though the <space>
flag has also been specified whenever the <apostrophe> flag is utilized.

This patch adds this flag, as described above, along with a test for it.
2024-09-14 17:17:40 -07:00
Gabriel Ravier
e3d28de8a6
Fix UB in gdtoa hexadecimal float scanf and strtod (#1288)
When reading hexadecimal floats, cosmopolitan would previously sometimes
print a number of warnings relating to undefined behavior on left shift:

third_party/gdtoa/gethex.c:172: ubsan warning: signed left shift changed
sign bit or overflowed 12 'int' 28 'int' is undefined behavior

This is because gdtoa assumes left shifts are safe when overflow happens
even on signed integers - this is false: the C standard considers it UB.
This is easy to fix, by simply casting the shifted value to unsigned, as
doing so does not change the value or the semantics of the left shifting
(except for avoiding the undefined behavior, as the C standard specifies
that unsigned overflow yields wraparound, avoiding undefined behaviour).

This commit does this, and adds a testcase that previously triggered UB.
(this also adds test macros to test for exact float equality, instead of
the existing {EXPECT,ASSERT}_FLOAT_EQ macros which only tests inputs for
being "almost equal" (with a significant epsilon) whereas exact equality
makes more sense for certain things such as reading floats from strings,
and modifies other testcases for sscanf/fscanf of floats to utilize it).
2024-09-14 17:11:04 -07:00
Gabriel Ravier
7f21547122
Fix occasional crash in test/libc/intrin/mmap_test (#1289)
This test would sometimes crash due to the EZBENCH2() macro occasionally
running the first benchmark (BenchMmapPrivate()) less times than it does
the second benchmark (BenchUnmap()) - this would then lead to a crash in
BenchUnmap() because BenchUnmap() expects that BenchMmapPrivate() has to
previously have been called at least as many times as it has itself such
that a region of memory has been mapped, for BenchUnmap() to then unmap.

This commit fixes this by utilizing the newer BENCHMARK() macro (instead
of the EZBENCH2() macro) which runs the benchmark using an count of runs
specified directly by the benchmark itself, which allows us to make sure
that the two benchmark functions get ran the exact same amount of times.
2024-09-14 17:07:56 -07:00
Justine Tunney
462ba6909e
Speed up unnamed POSIX semaphores
When sem_wait() used its futexes it would always use process shared mode
which can be problematic on platforms like Windows, where that causes it
to use the slow futex polyfill. Now when sem_init() is called in private
mode that'll be passed along so we can use a faster WaitOnAddress() call
2024-09-13 06:25:27 -07:00
Justine Tunney
6b10f4d0b6
Fix ioctl() and FIONREAD for sockets on Windows
This change fixes an issue where using FIONREAD would cause control flow
to jump to null, due to a _weaken() reference that I refactored long ago
2024-09-13 01:47:33 -07:00
Justine Tunney
1260f9d0ed
Add more tests for strlcpy()
I asked ChatGPT o1 for an optimized version of strlcpy() but it couldn't
produce an implementation that didn't crash. Eventually it just gave up.
2024-09-13 01:14:35 -07:00
Justine Tunney
e142124730
Rewrite Windows connect()
Our old code wasn't working with projects like Qt that call connect() in
O_NONBLOCK mode multiple times. This change overhauls connect() to use a
simpler WSAConnect() API and follows the same pattern as cosmo accept().
This change also reduces the binary footprint of read(), which no longer
needs to depend on our enormous clock_gettime() function.
2024-09-12 23:07:52 -07:00
Justine Tunney
acd6c32184
Rewrite Windows accept()
This change should fix the Windows issues Qt Creator has been having, by
ensuring accept() and accept4() work in O_NONBLOCK mode. I switched away
from AcceptEx() which is buggy, back to using WSAAccept(). This requires
making a tradeoff where we have to accept a busy loop. However it is low
latency in nature, just like our new and improved Windows poll() code. I
was furthermore able to eliminate a bunch of Windows-related test todos.
2024-09-12 04:23:38 -07:00
Justine Tunney
6f868fe1de
Fix polling of files on Windows 2024-09-11 17:13:23 -07:00
Gabriel Ravier
4d05060aac
Partially fix printf hex float numbers/%a rounding (#1286)
Hexadecimal printing of floating-point numbers in cosmopolitan (that is,
using the the conversion specifier) is improved to have correct rounding
of results in rounding modes other than the default one (ie. FE_NEAREST)

This commit fixes that, and adds tests for the change (note that there's
still some rounding issues with the a conversion specifier in general in
relatively rare cases (that is without non-default rounding modes) where
I've left commented-out tests for anyone interested in improving it more
2024-09-10 20:42:52 -07:00
Justine Tunney
fbdf9d028c
Rewrite Windows poll()
We can now await signals, files, pipes, and console simultaneously. This
change also gives a deeper review and testing to changes made yesterday.
2024-09-10 20:04:02 -07:00
Justine Tunney
cceddd21b2
Reduce latency of poll() on Windows
When polling sockets poll() can now let you know about an event in about
10µs rather than 10ms. If you're not polling sockets then poll() reports
console events now in microseconds instead of milliseconds.
2024-09-10 04:12:21 -07:00
Justine Tunney
a0a404a431
Fix issues with previous commit 2024-09-10 01:59:46 -07:00
Justine Tunney
2f48a02b44
Make recursive mutexes faster
Recursive mutexes now go as fast as normal mutexes. The tradeoff is they
are no longer safe to use in signal handlers. However you can still have
signal safe mutexes if you set your mutex to both recursive and pshared.
You can also make functions that use recursive mutexes signal safe using
sigprocmask to ensure recursion doesn't happen due to any signal handler

The impact of this change is that, on Windows, many functions which edit
the file descriptor table rely on recursive mutexes, e.g. open(). If you
develop your app so it uses pread() and pwrite() then your app should go
very fast when performing a heavily multithreaded and contended workload

For example, when scaling to 40+ cores, *NSYNC mutexes can go as much as
1000x faster (in CPU time) than the naive recursive lock implementation.
Now recursive will use *NSYNC under the hood when it's possible to do so
2024-09-10 00:08:59 -07:00
Justine Tunney
95fee8614d
Test recursive mutex code more 2024-09-09 00:19:23 -07:00
Gabriel Ravier
4754f200ee
Fix printf-family long double prec/rounding issues (#1283)
Currently, in cosmopolitan, there is no handling of the current rounding
mode for long double conversions, such that round-to-nearest gets always
used, regardless of the current rounding mode. %Le also improperly calls
gdtoa with a too small precision (which led to relatively similar bugs).

This patch fixes these issues, in particular by modifying the FPI object
passed to gdtoa such that it is modifiable (so that __fmt can adjust its
rounding field to correspond to FLT_ROUNDS (note that this is not needed
for dtoa, which checks FLT_ROUNDS directly)) and ors STRTOG_Neg into the
kind field in both of the __fmt_dfpbits and __fmt_ldfpbits functions, as
the gdtoa function also depends on it to be able to accurately round any
negative arguments. The change to kind also requires a few other changes
to make sure kind's upper bits (which include STRTOG_Neg) are masked off
when attempting to only examine the lower bits' value. Furthermore, this
patch also makes exactly one change in gdtoa, which appears to be needed
to fix rounding issues with FE_TOWARDZERO (this seems like a gdtoa bug).

The patch also adds a few tests for these issues, along with also taking
the opportunity to clean up some of the previous tests to do the asserts
in the right order (i.e. with the first argument as the expected result,
and the second one being used as the value that it is compared against).
2024-09-07 18:26:04 -07:00
Gabriel Ravier
f882887178
Fix ecvt/fcvt issues w.r.t. value==0 and ndigit==0 (#1282)
Before this commit, cosmopolitan had some issues with handling arguments
of 0 and signs, such as returning an incorrect sign when the input value
== -0.0, and incorrectly handling ndigit == 0 on fcvt (ndigit determines
the amount of digits *after* the radix character on fcvt, thus the parts
before it still must be outputted before fcvt's job is completely done).

This patch fixes these issues, and adds tests with corresponding inputs.
2024-09-07 18:08:11 -07:00
Gabriel Ravier
c66abd7260
Implement length modifiers for printf %n conv spec (#1278)
The C Standard specifies that, when a conversion specification specifies
a conversion specifier of n, the type of the passed pointer is specified
by the length modifier (if any), i.e. that e.g. the argument for %hhn is
of type signed char *, but Cosmopolitan currently does not handle this -
instead always simply assuming that the pointer always points to an int.

This patch implements, and tests, length modifiers with the n conversion
specifier, with the tests testing all of the available length modifiers.
2024-09-06 19:34:24 -07:00
Justine Tunney
d1157d471f
Upgrade pl_mpeg
This change gets printvideo working on aarch64. Performance improvements
have been introduced for magikarp decimation on aarch64. The last of the
old portable x86 intrinsics library is gone, but it still lives in Blink
2024-09-06 19:10:34 -07:00
Justine Tunney
dd8544c3bd
Delve into clock rabbit hole
The worst issue I had with consts.sh for clock_gettime is how it defined
too many clocks. So I looked into these clocks all day to figure out how
how they overlap in functionality. I discovered counter-intuitive things
such as how CLOCK_MONOTONIC should be CLOCK_UPTIME on MacOS and BSD, and
that CLOCK_BOOTTIME should be CLOCK_MONOTONIC on MacOS / BSD. Windows 10
also has some incredible new APIs, that let us simplify clock_gettime().

  - Linux CLOCK_REALTIME         -> GetSystemTimePreciseAsFileTime()
  - Linux CLOCK_MONOTONIC        -> QueryUnbiasedInterruptTimePrecise()
  - Linux CLOCK_MONOTONIC_RAW    -> QueryUnbiasedInterruptTimePrecise()
  - Linux CLOCK_REALTIME_COARSE  -> GetSystemTimeAsFileTime()
  - Linux CLOCK_MONOTONIC_COARSE -> QueryUnbiasedInterruptTime()
  - Linux CLOCK_BOOTTIME         -> QueryInterruptTimePrecise()

Documentation on the clock crew has been added to clock_gettime() in the
docstring and in redbean's documentation too. You can read that to learn
interesting facts about eight essential clocks that survived this purge.
This is original research you will not find on Google, OpenAI, or Claude

I've tested this change by porting *NSYNC to become fully clock agnostic
since it has extensive tests for spotting irregularities in time. I have
also included these tests in the default build so they no longer need to
be run manually. Both CLOCK_REALTIME and CLOCK_MONOTONIC are good across
the entire amd64 and arm64 test fleets.
2024-09-04 01:32:46 -07:00
Gabriel Ravier
8f8145105c
Add POSIX's C conversion specifier to printf funcs (#1276)
POSIX specifies the C conversion specifier as being "equivalent to %lc",
i.e. printf("%C", arg) is equivalent in behaviour to printf("%lc", arg).

This patch implements this conversion specifier, and adds a test for it,
alongside another test, which ensures that va_arg uses the correct size,
even though we set signbit to 63 in the code (which one might think will
result in the wrong size of argument being va_arg-ed, but having signbit
set to 63 is in fact what __fmt_stoa expects and is a requirement for it
properly formatting the wchar_t argument - this does not result in wrong
usage of va_arg because the implementation of the c conversion specifier
(which the implementation of the C conversion specifier fallsthrough to)
always calls va_arg with an argument type of int, to avoid the very same
bug occuring with %lc, as the l length modifier also sets signbit to 63)
2024-09-03 00:33:55 -07:00
Justine Tunney
3c61a541bd
Introduce pthread_condattr_setclock()
This is one of the few POSIX APIs that was missing. It lets you choose a
monotonic clock for your condition variables. This might improve perf on
some platforms. It might also grant more flexibility with NTP configs. I
know Qt is one project that believes it needs this. To introduce this, I
needed to change some the *NSYNC APIs, to support passing a clock param.
There's also new benchmarks, demonstrating Cosmopolitan's supremacy over
many libc implementations when it comes to mutex performance. Cygwin has
an alarmingly bad pthread_mutex_t implementation. It is so bad that they
would have been significantly better off if they'd used naive spinlocks.
2024-09-02 23:45:42 -07:00
Justine Tunney
90460ceb3c
Make Cosmo mutexes competitive with Apple Libc
While we have always licked glibc and musl libc on gnu/systemd sadly the
Apple Libc implementation of pthread_mutex_t is better than ours. It may
be due to how the XNU kernel and M2 microprocessor are in league when it
comes to scheduling processes and the NSYNC behavior is being penalized.
We can solve this by leaning more heavily on ulock using Drepper's algo.
It's kind of ironic that Linux's official mutexes work terribly on Linux
but almost as good as Apple Libc if used on MacOS.
2024-09-02 19:03:11 -07:00
Justine Tunney
2ec413b5a9
Fix bugs in poll(), select(), ppoll(), and pselect()
poll() and select() now delegate to ppoll() and pselect() for assurances
that both polyfill implementations are correct and well-tested. Poll now
polyfills XNU and BSD quirks re: the hanndling of POLLNVAL and the other
similar status flags. This change resolves a misunderstanding concerning
how select(exceptfds) is intended to map to POLPRI. We now use E2BIG for
bouncing requests that exceed the 64 handle limit on Windows. With pipes
and consoles on Windows our poll impl will now report POLLHUP correctly.

Issues with Windows path generation have been fixed. For example, it was
problematic on Windows to say: posix_spawn_file_actions_addchdir_np("/")
due to the need to un-UNC paths in some additional places. Calling fstat
on UNC style volume path handles will now work. posix_spawn now supports
simulating the opening of /dev/null and other special paths on Windows.

Cosmopolitan no longer defines epoll(). I think wepoll is a nice project
for using epoll() on Windows socket handles. However we need generalized
file descriptor support to make epoll() for Windows work well enough for
inclusion in a C library. It's also not worth having epoll() if we can't
get it to work on XNU and BSD OSes which provide different abstractions.
Even epoll() on Linux isn't that great of an abstraction since it's full
of footguns. Last time I tried to get it to be useful I had little luck.
Considering how long it took to get poll() and select() to be consistent
across platforms, we really have no business claiming to have epoll too.
While it'd be nice to have fully implemented, the only software that use
epoll() are event i/o libraries used by things like nodejs. Event i/o is
not the best paradigm for handling i/o; threads make so much more sense.
2024-09-02 00:29:52 -07:00
Gabriel Ravier
a089c07ddc
Fix printf funcs on memory pressure with floats (#1275)
Cosmopolitan's printf-family functions will currently crash if one tries
formatting a floating point number with a larger precision (large enough
that gdtoa attempts to allocate memory to format the number) while under
memory pressure (i.e. when malloc fails) because gdtoa fails to check if
malloc fails.

The added tests (which would previously crash under cosmopolitan without
this patch) show how to reproduce the issue.

This patch fixes this, and adds the aforementioned tests.
2024-09-01 14:42:14 -07:00
Steven Dee (Jōshin)
ae57fa2c4e
Fix shared_ptr::owner_before (#1274)
This method is supposed to give equivalence iff two shared pointers both
own the same object, even if they point to different addresses. We can't
control the exact order of the control blocks in memory, so the test can
only check that this equivalence/non-equivalence relationship holds, and
this is in fact all that it should check.
2024-09-01 17:34:39 -04:00
Steven Dee (Jōshin)
48b703b3f6
Minor cleanup/improvements in unique_ptr_test (#1266)
I'd previously introduced a bunch of small wrappers around the class and
functions under test to avoid excessive cpp use, but we can achieve this
more expediently with simple using-declarations. This also cuts out some
over-specified tests (e.g. there's no reason a stateful deleter wouldn't
compile.)
2024-09-01 13:47:30 -07:00
Gabriel Ravier
75e161b27b
Fix printf-family functions on long double inf (#1273)
Cosmopolitan's printf-family functions currently very poorly handle
being passed a long double infinity.

For instance, a program such as:

```cpp
#include <stdio.h>

int main()
{
    printf("%f\n", 1.0 / 0.0);
    printf("%Lf\n", 1.0L / 0.0L);
    printf("%e\n", 1.0 / 0.0);
    printf("%Le\n", 1.0L / 0.0L);
    printf("%g\n", 1.0 / 0.0);
    printf("%Lg\n", 1.0L / 0.0L);
}
```

will currently output the following:

```
inf
0.000000[followed by 32763 more zeros]
inf
N.aN0000e-32769
inf
N.aNe-32769
```

when the correct expected output would be:

```
inf
inf
inf
inf
inf
inf
```

This patch fixes this, and adds tests for the behavior.
2024-09-01 13:10:48 -07:00
Justine Tunney
7c83f4abc8
Make improvements
- wcsstr() is now linearly complex
- strstr16() is now linearly complex
- strstr() is now vectorized on aarch64 (10x)
- strstr() now uses KMP on pathological cases
- memmem() is now vectorized on aarch64 (10x)
- memmem() now uses KMP on pathological cases
- Disable shared_ptr::owner_before until fixed
- Make iswlower(), iswupper() consistent with glibc
- Remove figure space from iswspace() implementation
- Include line and paragraph separator in iswcntrl()
- Use Musl wcwidth(), iswalpha(), iswpunct(), towlower(), towupper()
2024-09-01 01:27:47 -07:00
Steven Dee (Jōshin)
e1528a71e2
Basic CTL shared_ptr implementation (#1267) 2024-08-31 14:00:56 -04:00
Justine Tunney
c9152b6f14
Release Cosmopolitan v3.8.0
This change switches c++ exception handling from sjlj to standard dwarf.
It's needed because clang for aarch64 doesn't support sjlj. It turns out
that libunwind had a bare-metal configuration that made this easy to do.

This change gets the new experimental cosmocc -mclang flag in a state of
working so well that it can now be used to build all of llamafile and it
goes 3x faster in terms of build latency, without trading away any perf.

The int_fast16_t and int_fast32_t types are now always defined as 32-bit
in the interest of having more abi consistency between cosmocc -mgcc and
-mclang mode.
2024-08-30 20:14:07 -07:00
Gabriel Ravier
6baf6cdb10
Fix vfprintf and derived functions badly handling +/` flag conflict (#1269) 2024-08-29 19:07:05 -07:00
Justine Tunney
ebe1cbb1e3
Add crash proofing to ipv4.games server 2024-08-26 12:57:28 -07:00
Justine Tunney
f3ce684aef
Fix getpeername() bug on Windows
The WIN32 getpeername() function returns ENOTCONN when it uses connect()
the SOCK_NONBLOCK way. So we simply store the address, provided earlier.
2024-08-25 11:28:53 -07:00
Justine Tunney
bb06230f1e
Avoid linker conflicts on DescribeFoo symbols
These symbols belong to the user. It caused a confusing error for Blink.
2024-08-24 18:10:22 -07:00
Justine Tunney
60e697f7b2
Move LoadZipArgs() to cosmo.h 2024-08-17 12:06:27 -07:00
Justine Tunney
ca2c30c977
Release redbean v3.0.0 2024-08-17 06:45:35 -07:00
Justine Tunney
8e14b27749
Make fread() more consistent with glibc 2024-08-17 02:57:22 -07:00
Justine Tunney
098638cc6c
Fix pthread_kill_test flake on qemu 2024-08-16 21:18:26 -07:00
Gavin Hayes
914d521090
Fix relative Windows path normalization (#1261)
Fixes #1223
2024-08-16 11:55:49 -07:00
Justine Tunney
11d9fb521d
Make atomics faster on aarch64
This change implements the compiler runtime for ARM v8.1 ISE atomics and
gets rid of the mandatory -mno-outline-atomics flag. It can dramatically
speed things up, on newer ARM CPUs, as indicated by the changed lines in
test/libc/thread/footek_test.c. In llamafile dispatching on hwcap atomic
also shaved microseconds off synchronization barriers.
2024-08-16 11:14:46 -07:00
Justine Tunney
0a79c6961f
Make malloc scalable on all platforms
It turns out sched_getcpu() didn't work on many platforms. So the system
call now has tests and is well documented. We now employ new workarounds
on platforms where it isn't supported in our malloc() implementation. It
was previously the case that malloc() was only scalable on Linux/Windows
for x86-64. Now the other platforms are scalable too.
2024-08-15 23:32:53 -07:00
Justine Tunney
2045e87b7c
Fix build issues 2024-08-15 18:37:33 -07:00
Justine Tunney
31194165d2
Remove .internal from more header filenames 2024-08-04 12:52:25 -07:00
Justine Tunney
3f26dfbb31
Share file offset across execve() on Windows
This is a breaking change. It defines the new environment variable named
_COSMO_FDS_V2 which is used for inheriting non-stdio file descriptors on
execve() or posix_spawn(). No effort has been spent thus far integrating
with the older variable. If a new binary launches the older ones or vice
versa they'll only be able to pass stdin / stdout / stderr to each other
therefore it's important that you upgrade all your cosmo binaries if you
depend on this functionality. You'll be glad you did because inheritance
of file descriptors is more aligned with the POSIX standard than before.
2024-08-03 17:48:00 -07:00
Justine Tunney
761c6ad615
Share file offset across processes
This change ensures that if a file descriptor for an open disk file gets
shared by multiple processes within a process tree, then lseek() changes
will be visible across processes, and read() / write() are synchronized.
Note this only applies to Windows, because UNIX kernels already do this.
2024-08-03 01:39:11 -07:00
Justine Tunney
a80ab3f8fe
Implement bf16 compiler runtime library 2024-08-02 02:04:53 -07:00
Justine Tunney
f8cfc89eba
Allow -c to be specified with -E in cosmocc 2024-07-31 02:09:15 -07:00
Justine Tunney
4ed4a1095a
Improve build latency 2024-07-31 01:21:27 -07:00
Justine Tunney
8d8aecb6d9
Avoid legacy instruction penalties on x86 2024-07-31 01:02:38 -07:00