Commit graph

582 commits

Author SHA1 Message Date
Justine Tunney
b490e23d63
Improve Windows sleep accuracy from 15ms to 15µs 2024-12-06 23:03:57 -08:00
Justine Tunney
5fae582e82
Protect privileged demangler from stack overflow 2024-11-24 06:43:17 -08:00
Justine Tunney
ef00a7d0c2
Fix AFL crashes in C++ demangler
American Fuzzy Lop didn't need to try very hard, to crash our privileged
__demangle() implementation. This change helps ensure our barebones impl
will fail rather than crash when given adversarial input data.
2024-11-23 14:25:09 -08:00
Justine Tunney
746660066f
Release Cosmopolitan v3.9.7 2024-11-22 21:38:09 -08:00
Justine Tunney
fd15b2d7a3
Ensure ^C gets printed to Windows console 2024-11-22 14:56:53 -08:00
Justine Tunney
e228aa3e14
Save rax register in getcontext 2024-11-22 13:32:52 -08:00
Justine Tunney
9ddbfd921e
Introduce cosmo_futex_wait and cosmo_futex_wake
Cosmopolitan Futexes are now exposed as a public API.
2024-11-22 11:25:15 -08:00
Justine Tunney
d3279d3c0d
Fix typo in mmap() Windows implementation 2024-11-01 02:29:58 -07:00
Justine Tunney
913b573661
Fix mmap MT bug on Windows 2024-10-31 23:06:06 -07:00
Justine Tunney
dc1afc968b
Fix fork() crash on Windows
On Windows, sometimes fork() could crash with message likes:

    fork() ViewOrDie(170000) failed with win32 error 487

This is due to a bug in our file descriptor inheritance. We have cursors
which are shared between processes. They let us track the file positions
of read() and write() operations. At startup they were being mmap()ed to
memory addresses that were assigned by WIN32. That's bad because Windows
likes to give us memory addresses beneath the program image in the first
4mb range that are likely to conflict with other assignments. That ended
up causing problems because fork() needs to be able to assume that a map
will be possible to resurrect at the same address. But for one reason or
another, Windows libraries we don't control could sneak allocations into
the memory space that overlap with these mappings. This change solves it
by choosing a random memory address instead when mapping cursor objects.
2024-10-12 15:38:58 -07:00
Justine Tunney
ad11fc32ad
Avoid an --ftrace crash on Windows 2024-10-07 18:39:25 -07:00
Justine Tunney
e4d6eb382a
Make memchr() and memccpy() faster 2024-09-30 05:54:34 -07:00
Justine Tunney
518eabadf5
Further optimize poll() on Windows 2024-09-22 22:28:59 -07:00
Justine Tunney
556a294363
Improve Windows mode bits
We were too zealous about security before by only setting the owner bits
and that would cause issues for projects like redbean that check "other"
bits to determine if it's safe to serve a file. Since that doesn't exist
on Windows, it's better to have things work than not work. So what we'll
do instead is return modes like 0664 for files and 0775 for directories.
2024-09-22 16:51:57 -07:00
Justine Tunney
dd8c4dbd7d
Write more tests for signal handling
There's now a much stronger level of assurance that signaling on Windows
will be atomic, low-latency, low tail latency, and shall never deadlock.
2024-09-21 05:24:56 -07:00
Justine Tunney
0e59afb403
Fix conflicting RTTI related symbol 2024-09-19 20:34:43 -07:00
Justine Tunney
f68fc1f815
Put more thought into new signaling code 2024-09-19 20:21:33 -07:00
Justine Tunney
6107eb38f9
Fix m=tinylinux build 2024-09-19 04:25:34 -07:00
Justine Tunney
d50f4c02f6
Make revision to previous change 2024-09-19 03:29:39 -07:00
Justine Tunney
0d74673213
Introduce interprocess signaling on Windows
This change gets rsync working without any warning or errors. On Windows
we now create a bunch of C:\var\sig\x\y.pid shared memory files, so sigs
can be delivered between processes. WinMain() creates this file when the
process starts. If the program links signaling system calls then we make
a thread at startup too, which allows asynchronous delivery each quantum
and cancelation points can spot these signals potentially faster on wait

See #1240
2024-09-19 03:02:13 -07:00
Justine Tunney
87a6669900
Make more Windows socket fixes and improvements
This change makes send() / sendto() always block on Windows. It's needed
because poll(POLLOUT) doesn't guarantee a socket is immediately writable
on Windows, and it caused rsync to fail because it made that assumption.
The only exception is when a SO_SNDTIMEO is specified which will EAGAIN.

Tests are added confirming MSG_WAITALL and MSG_NOSIGNAL work as expected
on all our supported OSes. Most of the platform-specific MSG_FOO magnums
have been deleted, with the exception of MSG_FASTOPEN. Your --strace log
will now show MSG_FOO flags as symbols rather than numbers.

I've also removed cv_wait_example_test because it's 0.3% flaky with Qemu
under system load since it depends on a process being readily scheduled.
2024-09-18 20:29:42 -07:00
Justine Tunney
bb7942e557
Improve socket option story 2024-09-17 01:17:07 -07:00
Justine Tunney
c3482af66d
Fix file descriptor assignment issues on Windows 2024-09-15 22:16:38 -07:00
Justine Tunney
b5fcb59a85
Implement more bf16/fp16 compiler runtimes
Fixes #1259
2024-09-13 05:06:34 -07:00
Justine Tunney
e142124730
Rewrite Windows connect()
Our old code wasn't working with projects like Qt that call connect() in
O_NONBLOCK mode multiple times. This change overhauls connect() to use a
simpler WSAConnect() API and follows the same pattern as cosmo accept().
This change also reduces the binary footprint of read(), which no longer
needs to depend on our enormous clock_gettime() function.
2024-09-12 23:07:52 -07:00
Justine Tunney
a5c0189bf6
Make vim startup faster
It appears that GetFileAttributes(u"\\etc\\passwd") can take two seconds
on Windows 10 at unpredictable times for reasons which are mysterious to
me. Let's try avoiding that path entirely and pray to Microsoft it works
2024-09-11 00:52:34 -07:00
Justine Tunney
a0a404a431
Fix issues with previous commit 2024-09-10 01:59:46 -07:00
Justine Tunney
2f48a02b44
Make recursive mutexes faster
Recursive mutexes now go as fast as normal mutexes. The tradeoff is they
are no longer safe to use in signal handlers. However you can still have
signal safe mutexes if you set your mutex to both recursive and pshared.
You can also make functions that use recursive mutexes signal safe using
sigprocmask to ensure recursion doesn't happen due to any signal handler

The impact of this change is that, on Windows, many functions which edit
the file descriptor table rely on recursive mutexes, e.g. open(). If you
develop your app so it uses pread() and pwrite() then your app should go
very fast when performing a heavily multithreaded and contended workload

For example, when scaling to 40+ cores, *NSYNC mutexes can go as much as
1000x faster (in CPU time) than the naive recursive lock implementation.
Now recursive will use *NSYNC under the hood when it's possible to do so
2024-09-10 00:08:59 -07:00
Justine Tunney
d1157d471f
Upgrade pl_mpeg
This change gets printvideo working on aarch64. Performance improvements
have been introduced for magikarp decimation on aarch64. The last of the
old portable x86 intrinsics library is gone, but it still lives in Blink
2024-09-06 19:10:34 -07:00
Justine Tunney
03875beadb
Add missing ICANON features 2024-09-05 03:17:19 -07:00
Justine Tunney
dd8544c3bd
Delve into clock rabbit hole
The worst issue I had with consts.sh for clock_gettime is how it defined
too many clocks. So I looked into these clocks all day to figure out how
how they overlap in functionality. I discovered counter-intuitive things
such as how CLOCK_MONOTONIC should be CLOCK_UPTIME on MacOS and BSD, and
that CLOCK_BOOTTIME should be CLOCK_MONOTONIC on MacOS / BSD. Windows 10
also has some incredible new APIs, that let us simplify clock_gettime().

  - Linux CLOCK_REALTIME         -> GetSystemTimePreciseAsFileTime()
  - Linux CLOCK_MONOTONIC        -> QueryUnbiasedInterruptTimePrecise()
  - Linux CLOCK_MONOTONIC_RAW    -> QueryUnbiasedInterruptTimePrecise()
  - Linux CLOCK_REALTIME_COARSE  -> GetSystemTimeAsFileTime()
  - Linux CLOCK_MONOTONIC_COARSE -> QueryUnbiasedInterruptTime()
  - Linux CLOCK_BOOTTIME         -> QueryInterruptTimePrecise()

Documentation on the clock crew has been added to clock_gettime() in the
docstring and in redbean's documentation too. You can read that to learn
interesting facts about eight essential clocks that survived this purge.
This is original research you will not find on Google, OpenAI, or Claude

I've tested this change by porting *NSYNC to become fully clock agnostic
since it has extensive tests for spotting irregularities in time. I have
also included these tests in the default build so they no longer need to
be run manually. Both CLOCK_REALTIME and CLOCK_MONOTONIC are good across
the entire amd64 and arm64 test fleets.
2024-09-04 01:32:46 -07:00
Justine Tunney
3c61a541bd
Introduce pthread_condattr_setclock()
This is one of the few POSIX APIs that was missing. It lets you choose a
monotonic clock for your condition variables. This might improve perf on
some platforms. It might also grant more flexibility with NTP configs. I
know Qt is one project that believes it needs this. To introduce this, I
needed to change some the *NSYNC APIs, to support passing a clock param.
There's also new benchmarks, demonstrating Cosmopolitan's supremacy over
many libc implementations when it comes to mutex performance. Cygwin has
an alarmingly bad pthread_mutex_t implementation. It is so bad that they
would have been significantly better off if they'd used naive spinlocks.
2024-09-02 23:45:42 -07:00
Justine Tunney
90460ceb3c
Make Cosmo mutexes competitive with Apple Libc
While we have always licked glibc and musl libc on gnu/systemd sadly the
Apple Libc implementation of pthread_mutex_t is better than ours. It may
be due to how the XNU kernel and M2 microprocessor are in league when it
comes to scheduling processes and the NSYNC behavior is being penalized.
We can solve this by leaning more heavily on ulock using Drepper's algo.
It's kind of ironic that Linux's official mutexes work terribly on Linux
but almost as good as Apple Libc if used on MacOS.
2024-09-02 19:03:11 -07:00
Justine Tunney
2ec413b5a9
Fix bugs in poll(), select(), ppoll(), and pselect()
poll() and select() now delegate to ppoll() and pselect() for assurances
that both polyfill implementations are correct and well-tested. Poll now
polyfills XNU and BSD quirks re: the hanndling of POLLNVAL and the other
similar status flags. This change resolves a misunderstanding concerning
how select(exceptfds) is intended to map to POLPRI. We now use E2BIG for
bouncing requests that exceed the 64 handle limit on Windows. With pipes
and consoles on Windows our poll impl will now report POLLHUP correctly.

Issues with Windows path generation have been fixed. For example, it was
problematic on Windows to say: posix_spawn_file_actions_addchdir_np("/")
due to the need to un-UNC paths in some additional places. Calling fstat
on UNC style volume path handles will now work. posix_spawn now supports
simulating the opening of /dev/null and other special paths on Windows.

Cosmopolitan no longer defines epoll(). I think wepoll is a nice project
for using epoll() on Windows socket handles. However we need generalized
file descriptor support to make epoll() for Windows work well enough for
inclusion in a C library. It's also not worth having epoll() if we can't
get it to work on XNU and BSD OSes which provide different abstractions.
Even epoll() on Linux isn't that great of an abstraction since it's full
of footguns. Last time I tried to get it to be useful I had little luck.
Considering how long it took to get poll() and select() to be consistent
across platforms, we really have no business claiming to have epoll too.
While it'd be nice to have fully implemented, the only software that use
epoll() are event i/o libraries used by things like nodejs. Event i/o is
not the best paradigm for handling i/o; threads make so much more sense.
2024-09-02 00:29:52 -07:00
Justine Tunney
39e7f24947
Fix handling of paths with dirfd on Windows
This change fixes an issue with all system calls ending with *at(), when
the caller passes `dirfd != AT_FDCWD` and an absolute path. It's because
the old code was turning paths like C:\bin\ls into \\C:\bin\ls\C:\bin\ls
after being converted from paths like /C/bin/ls. I noticed this when the
Emacs dired mode stopped working. It's unclear if it's a regression with
Cosmopolitan Libc or if this was introduced by the Emacs v29 upgrade. It
also impacted posix_spawn() for which a newly minted example now exists.
2024-09-01 17:52:30 -07:00
Justine Tunney
cca0edd62b
Make pthread mutex non-recursive 2024-09-01 02:05:17 -07:00
Justine Tunney
7c83f4abc8
Make improvements
- wcsstr() is now linearly complex
- strstr16() is now linearly complex
- strstr() is now vectorized on aarch64 (10x)
- strstr() now uses KMP on pathological cases
- memmem() is now vectorized on aarch64 (10x)
- memmem() now uses KMP on pathological cases
- Disable shared_ptr::owner_before until fixed
- Make iswlower(), iswupper() consistent with glibc
- Remove figure space from iswspace() implementation
- Include line and paragraph separator in iswcntrl()
- Use Musl wcwidth(), iswalpha(), iswpunct(), towlower(), towupper()
2024-09-01 01:27:47 -07:00
Justine Tunney
c9152b6f14
Release Cosmopolitan v3.8.0
This change switches c++ exception handling from sjlj to standard dwarf.
It's needed because clang for aarch64 doesn't support sjlj. It turns out
that libunwind had a bare-metal configuration that made this easy to do.

This change gets the new experimental cosmocc -mclang flag in a state of
working so well that it can now be used to build all of llamafile and it
goes 3x faster in terms of build latency, without trading away any perf.

The int_fast16_t and int_fast32_t types are now always defined as 32-bit
in the interest of having more abi consistency between cosmocc -mgcc and
-mclang mode.
2024-08-30 20:14:07 -07:00
Justine Tunney
884d89235f
Harden against aba problem 2024-08-26 20:01:55 -07:00
Justine Tunney
610c951f71
Fix the build 2024-08-26 16:44:05 -07:00
Justine Tunney
f3ce684aef
Fix getpeername() bug on Windows
The WIN32 getpeername() function returns ENOTCONN when it uses connect()
the SOCK_NONBLOCK way. So we simply store the address, provided earlier.
2024-08-25 11:28:53 -07:00
Justine Tunney
bb06230f1e
Avoid linker conflicts on DescribeFoo symbols
These symbols belong to the user. It caused a confusing error for Blink.
2024-08-24 18:10:22 -07:00
Justine Tunney
38cc4b3c68
Get rid of some legacy code 2024-08-24 17:53:30 -07:00
Justine Tunney
11d9fb521d
Make atomics faster on aarch64
This change implements the compiler runtime for ARM v8.1 ISE atomics and
gets rid of the mandatory -mno-outline-atomics flag. It can dramatically
speed things up, on newer ARM CPUs, as indicated by the changed lines in
test/libc/thread/footek_test.c. In llamafile dispatching on hwcap atomic
also shaved microseconds off synchronization barriers.
2024-08-16 11:14:46 -07:00
Justine Tunney
0a79c6961f
Make malloc scalable on all platforms
It turns out sched_getcpu() didn't work on many platforms. So the system
call now has tests and is well documented. We now employ new workarounds
on platforms where it isn't supported in our malloc() implementation. It
was previously the case that malloc() was only scalable on Linux/Windows
for x86-64. Now the other platforms are scalable too.
2024-08-15 23:32:53 -07:00
Justine Tunney
2045e87b7c
Fix build issues 2024-08-15 18:37:33 -07:00
Justine Tunney
31194165d2
Remove .internal from more header filenames 2024-08-04 12:52:25 -07:00
Justine Tunney
3f26dfbb31
Share file offset across execve() on Windows
This is a breaking change. It defines the new environment variable named
_COSMO_FDS_V2 which is used for inheriting non-stdio file descriptors on
execve() or posix_spawn(). No effort has been spent thus far integrating
with the older variable. If a new binary launches the older ones or vice
versa they'll only be able to pass stdin / stdout / stderr to each other
therefore it's important that you upgrade all your cosmo binaries if you
depend on this functionality. You'll be glad you did because inheritance
of file descriptors is more aligned with the POSIX standard than before.
2024-08-03 17:48:00 -07:00
Justine Tunney
761c6ad615
Share file offset across processes
This change ensures that if a file descriptor for an open disk file gets
shared by multiple processes within a process tree, then lseek() changes
will be visible across processes, and read() / write() are synchronized.
Note this only applies to Windows, because UNIX kernels already do this.
2024-08-03 01:39:11 -07:00
Justine Tunney
a80ab3f8fe
Implement bf16 compiler runtime library 2024-08-02 02:04:53 -07:00