cosmopolitan

mirror of https://github.com/jart/cosmopolitan.git synced 2025-02-12 01:08:00 +00:00

Author	SHA1	Message	Date
Justine Tunney	538ce338f4	Fix fork thread handle leak on windows	2025-01-02 19:33:14 -08:00
Justine Tunney	a15958edc6	Remove some legacy cruft. Function trace logs will report stack usage accurately. It won't include the argv/environ block. Our clone() polyfill is now simpler and does not use as much stack memory. Function call tracing on x86 is now faster too.	2025-01-02 18:44:07 -08:00
Justine Tunney	fde03f8487	Remove leaf attribute where appropriate. This change fixes a bug where gcc assumed thread synchronization such as pthread_cond_wait() wouldn't alter static variables, because the headers were using __attribute__((__leaf__)) inappropriately.	2025-01-02 08:07:15 -08:00
Justine Tunney	f24c854b28	Write more runtime tests and fix bugs. This change adds tests for the new memory manager code, particularly its Windows support. Function call tracing now works reliably on Silicon, since our function hooker was missing new Apple self-modifying code APIs. Many tests that were disabled a long time ago on aarch64 are reactivated by this change, now that arm support is on equal terms with x86. There have been a lot of places where ftrace could cause deadlocks, which have been hunted down across all platforms thanks to new tests. A bug in Windows's kill() function has been identified.	2025-01-01 22:25:22 -08:00
Justine Tunney	0b3c81dd4e	Make fork() go 30% faster. This change makes fork() go nearly as fast as sys_fork() on UNIX. As for Windows, this change shaves about 4-5ms off fork() + wait() latency. This is accomplished by using WriteProcessMemory() from the parent process to set up the address space of a suspended process; it is better than a pipe.	2025-01-01 04:59:38 -08:00
Justine Tunney	98c5847727	Fix fork waiter leak in nsync. This change fixes a bug where nsync waiter objects would leak. It'd mean that long-running programs like runitd would run out of file descriptors on NetBSD, where waiter objects have ksem file descriptors. On other OSes this bug is mostly harmless, since the worst that can happen with a futex is to leak a little bit of RAM. The bug was caused because tib_nsync was sneaking back in after the finalization code had cleared it. This change refactors the thread exiting code to handle nsync teardown appropriately. In making this change I found another issue: buggy user code that tries to exit without joining joinable threads which haven't been detached would result in a deadlock. That doesn't sound so bad, except the main thread is a joinable thread. So this deadlock would be triggered in ways that put libc at fault. So we now auto-join threads, and libc will log a warning to --strace when that happens for any thread.	2024-12-31 01:30:13 -08:00
Justine Tunney	c7e3d9f7ff	Make recursive mutexes slightly faster	2024-12-30 01:37:14 -08:00
Justine Tunney	9ba5b227d9	Unblock stalled i/o signals on windows	2024-12-29 00:22:41 -08:00
Justine Tunney	aca4214ff6	Simplify memory manager code	2024-12-28 17:09:28 -08:00
Justine Tunney	379cd77078	Improve memory manager and signal handling. On Windows, mmap() now chooses addresses transactionally. It reduces the risk of badness when interacting with the WIN32 memory manager. We don't throw darts anymore. There is also no more retry limit, since we recover from mystery maps more gracefully. The subroutine for combining adjacent maps has been rewritten for clarity. The print maps subroutine is better. This change goes to great lengths to perfect the stack overflow code. On Windows you can now longjmp() out of a crash signal handler. Guard pages previously weren't being restored properly by the signal handler. That's fixed, so on Windows you can now handle a stack overflow multiple times. Great thought has been put into selecting the perfect SIGSTKSZ constants so you can save sigaltstack() memory. You can now use kprintf() with 512 bytes of stack available. The guard pages beneath the main stack are now recorded in the memory manager. This change fixes getcontext() so it works right with the %rax register.	2024-12-27 01:33:00 -08:00
Justine Tunney	36e5861b0c	Reduce stack virtual memory consumption on Linux	2024-12-25 20:58:08 -08:00
Justine Tunney	2de3845b25	Build tool for hunting down flakes	2024-12-24 11:36:16 -08:00
Justine Tunney	93e22c581f	Reduce pthread memory usage	2024-12-24 10:30:59 -08:00
Justine Tunney	55b7aa1632	Allow user to override pthread mutex and cond	2024-12-23 21:57:52 -08:00
Justine Tunney	c8e10eef30	Make bulk_free() go faster	2024-12-23 20:31:57 -08:00
Justine Tunney	624573207e	Make threads faster and more reliable. This change doubles the performance of thread spawning. That's thanks to our new stack manager, which allows us to avoid zeroing stacks. It gives us 15µs spawns rather than 30µs spawns on Linux. Also, pthread_exit() is faster now, since it doesn't need to acquire the pthread GIL. On NetBSD, that helps us avoid allocating too many semaphores. Even if that happens we're now able to survive semaphores running out and even memory running out, when allocating *NSYNC waiter objects. I found a lot more rare bugs in the POSIX threads runtime that could cause things to crash, if you've got dozens of threads all spawning and joining dozens of threads. I want cosmo to be world class production worthy for 2025, so happy holidays all.	2024-12-21 22:13:00 -08:00
Justine Tunney	af7bd80430	Eliminate cyclic locks in runtime. This change introduces a new deadlock detector for Cosmo's POSIX threads implementation. Error check mutexes will now track a DAG of nested locks and report EDEADLK when a deadlock is theoretically possible. These will occur rarely, but it's important for production hardening your code. You don't even need to change your mutexes to use the POSIX error check mode because `cosmocc -mdbg` will enable error checking on mutexes by default globally. When cycles are found, an error message showing your demangled symbols describing the strongly connected component is printed and then SIGTRAP is raised, which means you'll also get a backtrace if you're using ShowCrashReports() too. This new error checker is so low-level and so pure that it's able to verify the relationships of every libc runtime lock, including those locks upon which the mutex implementation depends.	2024-12-16 22:25:12 -08:00
Justine Tunney	26c051c297	Spoof PID across execve() on Windows. It's now possible with cosmo and redbean to deliver a signal to a child process after it has called execve(). However, the executed program needs to be compiled using cosmocc. The cosmo runtime WinMain() implementation now intercepts a _COSMO_PID environment variable that's set by execve(). It ensures the child process will use the same C:\ProgramData\cosmo\sigs file, which is where kill() will place the delivered signal. We are able to do this on Windows even better than NetBSD, which has a bug with this. Fixes #1334	2024-12-14 13:13:08 -08:00
Justine Tunney	b490e23d63	Improve Windows sleep accuracy from 15ms to 15µs	2024-12-06 23:03:57 -08:00
Justine Tunney	5fae582e82	Protect privileged demangler from stack overflow	2024-11-24 06:43:17 -08:00
Justine Tunney	ef00a7d0c2	Fix AFL crashes in C++ demangler. American Fuzzy Lop didn't need to try very hard to crash our privileged __demangle() implementation. This change helps ensure our barebones impl will fail rather than crash when given adversarial input data.	2024-11-23 14:25:09 -08:00
Justine Tunney	746660066f	Release Cosmopolitan v3.9.7	2024-11-22 21:38:09 -08:00
Justine Tunney	fd15b2d7a3	Ensure ^C gets printed to Windows console	2024-11-22 14:56:53 -08:00
Justine Tunney	e228aa3e14	Save rax register in getcontext	2024-11-22 13:32:52 -08:00
Justine Tunney	9ddbfd921e	Introduce cosmo_futex_wait and cosmo_futex_wake. Cosmopolitan Futexes are now exposed as a public API.	2024-11-22 11:25:15 -08:00
Justine Tunney	d3279d3c0d	Fix typo in mmap() Windows implementation	2024-11-01 02:29:58 -07:00
Justine Tunney	913b573661	Fix mmap MT bug on Windows	2024-10-31 23:06:06 -07:00
Justine Tunney	dc1afc968b	Fix fork() crash on Windows. On Windows, sometimes fork() could crash with messages like `fork() ViewOrDie(170000) failed with win32 error 487`. This is due to a bug in our file descriptor inheritance. We have cursors which are shared between processes. They let us track the file positions of read() and write() operations. At startup they were being mmap()ed to memory addresses that were assigned by WIN32. That's bad because Windows likes to give us memory addresses beneath the program image in the first 4mb range that are likely to conflict with other assignments. That ended up causing problems because fork() needs to be able to assume that a map will be possible to resurrect at the same address. But for one reason or another, Windows libraries we don't control could sneak allocations into the memory space that overlap with these mappings. This change solves it by choosing a random memory address instead when mapping cursor objects.	2024-10-12 15:38:58 -07:00
Justine Tunney	ad11fc32ad	Avoid an --ftrace crash on Windows	2024-10-07 18:39:25 -07:00
Justine Tunney	e4d6eb382a	Make memchr() and memccpy() faster	2024-09-30 05:54:34 -07:00
Justine Tunney	518eabadf5	Further optimize poll() on Windows	2024-09-22 22:28:59 -07:00
Justine Tunney	556a294363	Improve Windows mode bits. We were too zealous about security before by only setting the owner bits, and that would cause issues for projects like redbean that check "other" bits to determine if it's safe to serve a file. Since that doesn't exist on Windows, it's better to have things work than not work. So what we'll do instead is return modes like 0664 for files and 0775 for directories.	2024-09-22 16:51:57 -07:00
Justine Tunney	dd8c4dbd7d	Write more tests for signal handling. There's now a much stronger level of assurance that signaling on Windows will be atomic, low-latency, low tail latency, and shall never deadlock.	2024-09-21 05:24:56 -07:00
Justine Tunney	0e59afb403	Fix conflicting RTTI related symbol	2024-09-19 20:34:43 -07:00
Justine Tunney	f68fc1f815	Put more thought into new signaling code	2024-09-19 20:21:33 -07:00
Justine Tunney	6107eb38f9	Fix m=tinylinux build	2024-09-19 04:25:34 -07:00
Justine Tunney	d50f4c02f6	Make revision to previous change	2024-09-19 03:29:39 -07:00
Justine Tunney	0d74673213	Introduce interprocess signaling on Windows. This change gets rsync working without any warnings or errors. On Windows we now create a bunch of C:\var\sig\x\y.pid shared memory files, so sigs can be delivered between processes. WinMain() creates this file when the process starts. If the program links signaling system calls then we make a thread at startup too, which allows asynchronous delivery each quantum, and cancelation points can spot these signals potentially faster on wait. See #1240	2024-09-19 03:02:13 -07:00
Justine Tunney	87a6669900	Make more Windows socket fixes and improvements. This change makes send() / sendto() always block on Windows. It's needed because poll(POLLOUT) doesn't guarantee a socket is immediately writable on Windows, and it caused rsync to fail because it made that assumption. The only exception is when a SO_SNDTIMEO is specified which will EAGAIN. Tests are added confirming MSG_WAITALL and MSG_NOSIGNAL work as expected on all our supported OSes. Most of the platform-specific MSG_FOO magnums have been deleted, with the exception of MSG_FASTOPEN. Your --strace log will now show MSG_FOO flags as symbols rather than numbers. I've also removed cv_wait_example_test because it's 0.3% flaky with Qemu under system load since it depends on a process being readily scheduled.	2024-09-18 20:29:42 -07:00
Justine Tunney	bb7942e557	Improve socket option story	2024-09-17 01:17:07 -07:00
Justine Tunney	c3482af66d	Fix file descriptor assignment issues on Windows	2024-09-15 22:16:38 -07:00
Justine Tunney	b5fcb59a85	Implement more bf16/fp16 compiler runtimes. Fixes #1259	2024-09-13 05:06:34 -07:00
Justine Tunney	e142124730	Rewrite Windows connect(). Our old code wasn't working with projects like Qt that call connect() in O_NONBLOCK mode multiple times. This change overhauls connect() to use a simpler WSAConnect() API and follows the same pattern as cosmo accept(). This change also reduces the binary footprint of read(), which no longer needs to depend on our enormous clock_gettime() function.	2024-09-12 23:07:52 -07:00
Justine Tunney	a5c0189bf6	Make vim startup faster. It appears that GetFileAttributes(u"\\etc\\passwd") can take two seconds on Windows 10 at unpredictable times for reasons which are mysterious to me. Let's try avoiding that path entirely and pray to Microsoft it works.	2024-09-11 00:52:34 -07:00
Justine Tunney	a0a404a431	Fix issues with previous commit	2024-09-10 01:59:46 -07:00
Justine Tunney	2f48a02b44	Make recursive mutexes faster. Recursive mutexes now go as fast as normal mutexes. The tradeoff is they are no longer safe to use in signal handlers. However, you can still have signal safe mutexes if you set your mutex to both recursive and pshared. You can also make functions that use recursive mutexes signal safe using sigprocmask to ensure recursion doesn't happen due to any signal handler. The impact of this change is that, on Windows, many functions which edit the file descriptor table rely on recursive mutexes, e.g. open(). If you develop your app so it uses pread() and pwrite() then your app should go very fast when performing a heavily multithreaded and contended workload. For example, when scaling to 40+ cores, NSYNC mutexes can go as much as 1000x faster (in CPU time) than the naive recursive lock implementation. Now recursive mutexes will use NSYNC under the hood when it's possible to do so.	2024-09-10 00:08:59 -07:00
Justine Tunney	d1157d471f	Upgrade pl_mpeg. This change gets printvideo working on aarch64. Performance improvements have been introduced for magikarp decimation on aarch64. The last of the old portable x86 intrinsics library is gone, but it still lives in Blink.	2024-09-06 19:10:34 -07:00
Justine Tunney	03875beadb	Add missing ICANON features	2024-09-05 03:17:19 -07:00
Justine Tunney	dd8544c3bd	Delve into clock rabbit hole. The worst issue I had with consts.sh for clock_gettime is how it defined too many clocks. So I looked into these clocks all day to figure out how they overlap in functionality. I discovered counter-intuitive things such as how CLOCK_MONOTONIC should be CLOCK_UPTIME on MacOS and BSD, and that CLOCK_BOOTTIME should be CLOCK_MONOTONIC on MacOS / BSD. Windows 10 also has some incredible new APIs that let us simplify clock_gettime(): Linux CLOCK_REALTIME -> GetSystemTimePreciseAsFileTime(); Linux CLOCK_MONOTONIC -> QueryUnbiasedInterruptTimePrecise(); Linux CLOCK_MONOTONIC_RAW -> QueryUnbiasedInterruptTimePrecise(); Linux CLOCK_REALTIME_COARSE -> GetSystemTimeAsFileTime(); Linux CLOCK_MONOTONIC_COARSE -> QueryUnbiasedInterruptTime(); Linux CLOCK_BOOTTIME -> QueryInterruptTimePrecise(). Documentation on the clock crew has been added to clock_gettime() in the docstring and in redbean's documentation too. You can read that to learn interesting facts about eight essential clocks that survived this purge. This is original research you will not find on Google, OpenAI, or Claude. I've tested this change by porting *NSYNC to become fully clock agnostic, since it has extensive tests for spotting irregularities in time. I have also included these tests in the default build so they no longer need to be run manually. Both CLOCK_REALTIME and CLOCK_MONOTONIC are good across the entire amd64 and arm64 test fleets.	2024-09-04 01:32:46 -07:00
Justine Tunney	3c61a541bd	Introduce pthread_condattr_setclock(). This is one of the few POSIX APIs that was missing. It lets you choose a monotonic clock for your condition variables. This might improve perf on some platforms. It might also grant more flexibility with NTP configs. I know Qt is one project that believes it needs this. To introduce this, I needed to change some of the *NSYNC APIs to support passing a clock param. There are also new benchmarks demonstrating Cosmopolitan's supremacy over many libc implementations when it comes to mutex performance. Cygwin has an alarmingly bad pthread_mutex_t implementation. It is so bad that they would have been significantly better off if they'd used naive spinlocks.	2024-09-02 23:45:42 -07:00


600 commits