Commit graph

110 commits

Author SHA1 Message Date
Justine Tunney
c68f6599e5
Fix definition of getpeername on FreeBSD
We were using the COMPAT magic number, which was recently removed.
2024-05-26 17:03:22 -07:00
Justine Tunney
2d93788ce3
Fix --ftrace with cosmo_dlopen()
This change ensures function call logging won't crash the process when
cosmo_dlopen() is called.
2024-01-05 15:13:07 -08:00
Justine Tunney
d8ad34686a
Implement issetugid() on NetBSD
2023-12-30 14:58:16 -08:00
Justine Tunney
83107f78ed
Introduce FreeBSD ARM64 support
It passes 100% of the test fleet. Solid as a rock.
2023-12-29 20:14:02 -08:00
Stephen Gregoratto
cc5c5319bf
Linux: Add cachestat, fchmodat2 syscalls (#958)
2023-11-19 19:01:20 -08:00
Justine Tunney
529cb4817c
Improve dlopen() on Apple Silicon
- Introduce MAP_JIT which is zero on other platforms
- Invent __jit_begin() and __jit_end() which wrap Apple's APIs
- Runtime dispatch to sys_icache_invalidate() in __clear_cache()
2023-11-17 02:33:14 -08:00
Justine Tunney
8f5e516b39
Remove sync_file_range()
After hearing horror stories from a trusted colleague, I don't think
this is the kind of API we want to be supporting. Also SQLite wisdom
regarding fdatasync() has been added to the documentation.
2023-11-15 23:21:22 -08:00
Justine Tunney
85f64f3851
Make futexes 100x better on x86 MacOS
Thanks to @autumnjolitz (in #876), the Cosmopolitan codebase is now
acquainted with Apple's outstanding ulock system calls, which offer
something much closer to futexes than Grand Central Dispatch. GCD
wasn't quite as good, since its wait function can't be interrupted
by signals (therefore necessitating a busy loop) and it also needs
semaphore objects to be created and freed. Even though ulock is an
internal Apple API, strictly speaking, the benefits of futexes are
so great that it's worth the risk for now, especially since we still
have the GCD implementation as a quick escape hatch if it changes.

Here's why this change is important for x86 XNU users. Cosmo has a
suboptimal polyfill for when the operating system doesn't offer an
API that lets us implement futexes properly. Sadly, we had to use
that on x86 XNU until now. The polyfill works by using clock_nanosleep
to poll the futex in a busy loop with exponential backoff. On x86 XNU,
clock_nanosleep suffers from us not being able to use a fast
clock_gettime implementation, which had a compounding effect that made
the polyfill function even more poorly. On x86 XNU we also need to
polyfill sched_yield() using select(), which made things even more
troublesome. Now that we have futexes, we no longer have any busy
loops for either condition variables or thread joining, so optimal
performance is attained. To demonstrate, consider these benchmarks:

Before:

    $ ./lockscale_test.com -b
    consumed 38.8377   seconds real time and
              0.087131 seconds cpu time

After:

    $ ./lockscale_test.com -b
    consumed 0.007955 seconds real time and
             0.011515 seconds cpu time
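
For readers unfamiliar with the polyfill described above, the following is a minimal sketch of polling a futex word with clock_nanosleep() and exponential backoff. The name `poll_wait`, the starting delay, and the cap are assumptions chosen for illustration; this is not Cosmopolitan's actual implementation.

```c
#define _GNU_SOURCE
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

// Hypothetical helper: blocks until *word no longer equals `expect`.
// It polls with clock_nanosleep(), doubling the delay each round up to
// a cap, and returns the number of sleeps it performed.
int poll_wait(atomic_int *word, int expect) {
  long delay_ns = 1000;               // assumed starting delay: 1 microsecond
  const long max_delay_ns = 1000000;  // assumed cap: 1 millisecond
  int sleeps = 0;
  while (atomic_load_explicit(word, memory_order_acquire) == expect) {
    struct timespec ts = {0, delay_ns};
    clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);  // relative sleep
    delay_ns *= 2;
    if (delay_ns > max_delay_ns) delay_ns = max_delay_ns;
    ++sleeps;
  }
  return sleeps;
}
```

A waiter in this scheme only notices a change once its current sleep ends, so wakeup latency grows with the backoff. A real futex wait avoids that by letting the kernel wake the waiter directly.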

Fixes #876
2023-10-03 15:15:43 -07:00
Justine Tunney
ff77f2a6af
Make improvements
- This change fixes a bug that allowed unbuffered printf() output (to
  streams like stderr) to be truncated. This regression was introduced
  some time between now and the last release.

- POSIX specifies all functions as thread safe by default. This change
  works towards cleaning up our use of the @threadsafe / @threadunsafe
  documentation annotations to reflect that. The goal is (1) to use
  @threadunsafe to document functions which POSIX says needn't be thread
  safe, and (2) use @threadsafe to document functions that we chose to
  implement as thread safe even though POSIX didn't mandate it.

- Tidy up the clock_gettime() implementation. We're now trying out a
  cleaner approach to system call support that aims to maintain the
  Linux errno convention as long as possible. This also fixes bugs that
  existed previously, where the vDSO errno wasn't being translated
  properly. The gettimeofday() system call is now a wrapper for
  clock_gettime(), which reduces bloat in apps that use both.

- Bugs in the recently introduced improvements to the execute bit on
  Windows have been fixed. access(X_OK) on a directory on Windows now
  succeeds. fstat() will now perform the MZ/#! ReadFile() operation
  correctly.

- Windows.h is no longer included in libc/isystem/, because it confused
  PCRE's build system into thinking Cosmopolitan is a WIN32 platform.
  Cosmo's Windows.h polyfill was never even really that good, since it
  only defines a subset of the subset of WIN32 APIs that Cosmo defines.

- The setlongerjmp() / longerjmp() APIs are removed. While they're nice
  APIs that are superior to the standardized setjmp / longjmp functions,
  they weren't superior enough to not be dead code in the monorepo. If
  you use these APIs, please file an issue and they'll be restored.

- The .com appending magic has now been removed from APE Loader.
2023-10-03 06:17:16 -07:00
Justine Tunney
ec480f5aa0
Make improvements
- Every unit test now passes on Apple Silicon. The final piece of this
  puzzle was porting our POSIX threads cancelation support, since that
  works differently on ARM64 XNU vs. AMD64. Our semaphore support on
  Apple Silicon is also superior now compared to AMD64, thanks to the
  grand central dispatch library which lets *NSYNC locks go faster.

- The Cosmopolitan runtime is now more stable, particularly on Windows.
  To do this, thread local storage is mandatory at all runtime levels,
  and the innermost packages of the C library are no longer being built
  using ASAN. TLS is being bootstrapped with a 128-byte TIB during the
  process startup phase, and then later on the runtime re-allocates it
  either statically or dynamically to support code using _Thread_local.
  fork() and execve() now do a better job cooperating with threads. We
  can now check how much stack memory is left in the process or thread
  when functions like kprintf() / execve() etc. call alloca(), so that
  they can raise ENOMEM, reduce a buffer size, or just print a warning.

- POSIX signal emulation is now implemented the same way kernels do it
  with pthread_kill() and raise(). Any thread can interrupt any other
  thread, regardless of what it's doing. If it's blocked on read/write
  then the killer thread will cancel its i/o operation so that EINTR can
  be returned in the mark thread immediately. If it's doing a tight CPU
  bound operation, then that's also interrupted by the signal delivery.
  Signal delivery works now by suspending a thread and pushing context
  data structures onto its stack, and redirecting its execution to a
  trampoline function, which calls SetThreadContext(GetCurrentThread())
  when it's done.

- We're now doing a better job managing locks and handles. On NetBSD we
  now close semaphore file descriptors in forked children. Semaphores on
  Windows can now be canceled immediately, which means mutexes/condition
  variables will now go faster. Apple Silicon semaphores can be canceled
  too. We're now using Apple's pthread_yield() function. Apple _nocancel
  syscalls are now used on XNU when appropriate to ensure pthread_cancel
  requests aren't lost. The MbedTLS library has been updated to support
  POSIX thread cancelations. See tool/build/runitd.c for an example of
  how it can be used for production multi-threaded tls servers. Handles
  on Windows now leak less often across processes. All i/o operations on
  Windows are now overlapped, which means file pointers can no longer be
  inherited across dup() and fork() for the time being.

- We now spawn a thread on Windows to deliver SIGCHLD and wakeup wait4()
  which means, for example, that posix_spawn() now goes 3x faster. POSIX
  spawn is also now more correct. Like Musl, it's now able to report the
  failure code of execve() via a pipe although our approach favors using
  shared memory to do that on systems that have a true vfork() function.

- We now spawn a thread to deliver SIGALRM to threads when setitimer()
  is used. This enables the most precise wakeups the OS makes possible.

- The Cosmopolitan runtime now uses less memory. On NetBSD for example,
  it turned out the kernel would actually commit the PT_GNU_STACK size
  which caused RSS to be 6mb for every process. Now it's down to ~4kb.
  On Apple Silicon, we reduce the mandatory upstream thread size to the
  smallest possible size to reduce the memory overhead of Cosmo threads.
  The examples directory has a program called greenbean which can spawn
  a web server on Linux with 10,000 worker threads and have the memory
  usage of the process be ~77mb. The 1024 byte overhead of POSIX-style
  thread-local storage is now optional; it won't be allocated until the
  pthread_setspecific/getspecific functions are called. On Windows, the
  threads that get spawned which are internal to the libc implementation
  use reserve rather than commit memory, which shaves a few hundred kb.

- sigaltstack() is now supported on Windows, however it's currently not
  able to be used to handle stack overflows, since crash signals are
  still generated by WIN32. However the crash handler will still switch
  to the alt stack, which is helpful in environments with tiny threads.

- Test binaries are now smaller. Many of the mandatory dependencies of
  the test runner have been removed. This ensures many programs can do a
  better job only linking the thing they're testing. This caused the
  test binaries for LIBC_FMT, for example, to decrease from 200kb to 50kb.

- long double is no longer used in the implementation details of libc,
  except in the APIs that define it. The old code that used long double
  for time (instead of struct timespec) has now been thoroughly removed.

- ShowCrashReports() is now much tinier in MODE=tiny. Instead of doing
  backtraces itself, it'll just print a command you can run on the shell
  using our new `cosmoaddr2line` program to view the backtrace.

- Crash report signal handling now works in a much better way. Instead
  of terminating the process, it now relies on SA_RESETHAND so that the
  default SIG_DFL behavior can terminate the process if necessary.

- Our pledge() functionality has now been fully ported to AARCH64 Linux.
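
The posix_spawn() item above mentions reporting the execve() failure code through a pipe. The following is a minimal sketch of that general technique using fork() and a close-on-exec pipe. The name `spawn_report` and its signature are hypothetical, and the sketch does not show the shared-memory vfork() variant that the commit says Cosmopolitan prefers.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: starts `path` in a child process. Returns 0 and
// stores the child's pid if execve() succeeded, otherwise returns the
// errno value from pipe(), fork(), or the child's execve().
int spawn_report(const char *path, char *const argv[], char *const envp[],
                 pid_t *out_pid) {
  int fds[2];
  if (pipe(fds) == -1) return errno;
  fcntl(fds[0], F_SETFD, FD_CLOEXEC);
  fcntl(fds[1], F_SETFD, FD_CLOEXEC);
  pid_t pid = fork();
  if (pid == -1) {
    int e = errno;
    close(fds[0]);
    close(fds[1]);
    return e;
  }
  if (pid == 0) {
    // Child: a successful execve() closes fds[1] through FD_CLOEXEC, so
    // the parent reads EOF. On failure, errno is sent through the pipe.
    close(fds[0]);
    execve(path, argv, envp);
    int e = errno;
    ssize_t rc = write(fds[1], &e, sizeof(e));
    (void)rc;
    _exit(127);
  }
  close(fds[1]);
  int child_errno = 0;
  ssize_t n;
  do {
    n = read(fds[0], &child_errno, sizeof(child_errno));
  } while (n == -1 && errno == EINTR);
  close(fds[0]);
  if (n == (ssize_t)sizeof(child_errno)) {
    waitpid(pid, NULL, 0);  // reap the child that failed to exec
    return child_errno;
  }
  *out_pid = pid;  // exec succeeded; the caller should waitpid() later
  return 0;
}
```

The caller gets a precise error such as ENOENT immediately, rather than having to infer failure from an exit status of 127.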
2023-09-18 21:04:47 -07:00
Justine Tunney
032b1f3449
Implement thread cancellation for aarch64
2023-09-07 08:48:38 -07:00
Justine Tunney
c776a32f75
Replace COSMO define with _COSMO_SOURCE
This change might cause ABI breakages for /opt/cosmos. It's needed to
help us better conform to header declaration practices.
2023-08-13 20:55:04 -07:00
Justine Tunney
40eb3b9d5d
Fully support OpenBSD 7.3
This change (1) upgrades to OpenBSD's newer kernel ABIs, and (2)
modifies APE to have a read-only data segment. Doing this required
creating APE Loader v1.1, which is backwards and forwards compatible
with the previous version.

If you've run the following commands in the past to install your APE
Loader systemwide, then you need to run them again. Ad-hoc installations
shouldn't be impacted. It's also recommended that APE binaries be remade
after upgrading, since they embed old versions of the APE Loader.

    ape/apeuninstall.sh
    ape/apeinstall.sh

This change does more than just fix OpenBSD. The new loader is smarter
and more reliable. We're now able to create much tinier ELF and Mach-O data
structures than we could before. Both APE Loader and execvpe() will now
normalize ambiguous argv[0] resolution the same way as the UNIX shell.
Badness with TLS linkage has been solved.

Fixes #826
2023-07-01 18:14:27 -07:00
Justine Tunney
0409096658
Get us closer to building busybox
This change undefines __linux__ and adds APIs like clock_settime(). The
gosh darned getopt_long() API has been reintroduced, thanks to OpenBSD.
2023-06-18 04:13:45 -07:00
Justine Tunney
23e235b7a5
Fix bugs in cosmocc toolchain
This change integrates e58abc1110b335a3341e8ad5821ad8e3880d9bb2 from
https://github.com/ahgamut/musl-cross-make/ which fixes the issues we
were having with our C language extension for symbolic constants. This
change also performs some code cleanup and bug fixes to getaddrinfo().
It's now possible to compile projects like ncurses, readline and python
without needing to patch anything upstream, except maybe a line or two.
Pretty soon it should be possible to build a Linux distro on Cosmo.
2023-06-08 23:44:03 -07:00
Justine Tunney
8f522cb702
Make improvements
This change progresses our AARCH64 support:

- The AARCH64 build and tests are now passing
- Add 128-bit floating-point support to printf()
- Fix clone() so it initializes cosmo's x28 TLS register
- Fix TLS memory layout issue with aarch64 _Alignas vars
- Revamp microbenchmarking tools so they work on aarch64
- Make some subtle improvements to aarch64 crash reporting
- Make kisdangerous() memory checks more accurate on aarch64
- Remove sys_open() since it's not available on Linux AARCH64

This change makes general improvements to Cosmo and Redbean:

- Introduce GetHostIsa() function in Redbean
- You can now feature check using pledge(0, 0)
- You can now feature check using unveil("",0)
- Refactor some more x86-specific asm comments
- Refactor and write docs for some libm functions
- Make the mmap() API behave more similarly to Linux
- Fix WIFSIGNALED() which wrongly returned true for zero
- Rename some obscure cosmo keywords from noFOO to dontFOO
2023-06-03 08:12:22 -07:00
Justine Tunney
1422e96b4e
Introduce native support for MacOS ARM64
There's a new program named ape/ape-m1.c which will be used to build an
embeddable binary that can load ape and elf executables. The support is
mostly working so far, but we're still chasing down ABI issues.
2023-05-20 04:17:03 -07:00
Justine Tunney
550b52abf6
Port a lot more code to AARCH64
- Introduce epoll_pwait()
- Rewrite -ftrapv and ffs() libraries in C code
- Use more FreeBSD code in math function library
- Get significantly more tests passing on qemu-aarch64
- Fix many Musl long double functions that were broken on AARCH64
2023-05-14 09:37:26 -07:00
Justine Tunney
fd34ef732d
Make considerably more progress on AARCH64
- Utilities like pledge.com now build
- kprintf() will no longer balk at 48-bit addresses
- There's a new aarch64-dbg build mode that should work
- gc() and defer() are mostly pacified; avoid using them on aarch64
- THIRD_PARTY_STB now has Arm Neon intrinsics for fast image handling
2023-05-12 22:42:57 -07:00
Justine Tunney
1f2a5a8fc1
Implement crash reporting for AARCH64
The ShowCrashReports() feature for aarch64 should work even better than
the x86 crash reports. Thanks to the benefit of hindsight these reports
should be rock solid reliable and beautiful to read.

This change also improves the syscall polyfills for aarch64. Some of the
sys_foo() functions have been removed, usually because they're legacy or
downright footguns not worth building.
2023-05-12 05:47:54 -07:00
Justine Tunney
5a455eaa0b
Work on magic numbers for aarch64
2023-05-10 04:20:48 -07:00
Justine Tunney
86d9323a43
Remove sys_getrandom() on NetBSD
This fixes an apparent regression caused by
3f0bcdc3ef where getrandom() on NetBSD 9.2
doesn't appear to work; ktrace oddly reports:

    1446      1 .ape     CALL  #91 (unimplemented getdopt)
    1446      1 .ape     RET   #91 (unimplemented getdopt) -1 errno 78
    Function not implemented
    1446      1 .ape     PSIG  SIGSYS SIG_DFL: code=SI_NOINFO
2023-05-10 04:20:47 -07:00
Justine Tunney
e5e3cdf447
Get LIBC_RUNTIME and LIBC_CALLS building on aarch64
2023-05-10 04:20:47 -07:00
Justine Tunney
2b73e72d59
Make more code aarch64 friendly
2023-05-10 04:20:46 -07:00
Justine Tunney
b407327972
Make fixes and improvements
- clock_nanosleep() is now much faster on OpenBSD and NetBSD
- Thread joining is now much faster on NetBSD
- FreeBSD timestamps are now more accurate
- Thread spawning now goes faster on XNU
- Clean up the clone() code
2022-11-08 10:11:46 -08:00
Justine Tunney
c995838e5c
Make improvements
- Clean up sigaction() code
- Add a port scanner example
- Introduce a ParseCidr() API
- Clean up our futex abstraction code
- Fix a harmless integer overflow in ParseIp()
- Use kernel semaphores on NetBSD to make threads much faster
2022-11-07 02:26:06 -08:00
Justine Tunney
3f0bcdc3ef
Improve cancellations, randomness, and time
- Exhaustively document cancellation points
- Rename SIGCANCEL to SIGTHR just like BSDs
- Further improve POSIX thread cancellations
- Ensure asynchronous cancellations work correctly
- Elevate the quality of getrandom() and getentropy()
- Make futexes cancel correctly on OpenBSD 6.x and 7.x
- Add reboot.com and shutdown.com to examples directory
- Remove underscore prefix from awesome timespec_*() APIs
- Create assertions that help verify our cancellation points
- Remove bad timespec APIs (cmp generalizes eq/ne/gt/gte/lt/lte)
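
To illustrate how a single comparison function can stand in for the eq/ne/gt/gte/lt/lte family mentioned in the last item, here is a minimal sketch of a three-way comparison for struct timespec. The name `ts_cmp` is hypothetical and is not claimed to match the signature of the library's own function.

```c
#include <time.h>

// Hypothetical three-way comparison: returns -1, 0, or 1 when `a` is
// earlier than, equal to, or later than `b`. Assumes both values are
// normalized, i.e. 0 <= tv_nsec < 1000000000.
int ts_cmp(struct timespec a, struct timespec b) {
  if (a.tv_sec != b.tv_sec) return a.tv_sec < b.tv_sec ? -1 : 1;
  if (a.tv_nsec != b.tv_nsec) return a.tv_nsec < b.tv_nsec ? -1 : 1;
  return 0;
}
```

Each removed predicate then becomes a comparison against zero: `ts_cmp(a, b) == 0` for eq, `ts_cmp(a, b) >= 0` for gte, `ts_cmp(a, b) < 0` for lt, and so on.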
2022-11-05 23:45:32 -07:00
Justine Tunney
022536cab6
Make futexes cancellable by pthreads
2022-11-04 18:36:34 -07:00
Justine Tunney
2278327eba
Implement support for POSIX thread cancellations
This change makes some miracle modifications to the System Five system
call support, which lets us have safe, correct, and atomic handling of
thread cancellations. It all turned out to be cheaper than anticipated
because it wasn't necessary to modify the system call veneers. We were
able to encode the cancellability of each system call into the magnums
found in libc/sysv/syscalls.sh. Since cancellations are so waq, we are
also supporting a lovely Musl Libc mask feature for raising ECANCELED.
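
For context, here is a minimal sketch of what standard POSIX deferred cancellation looks like from the caller's side. It uses only the portable pthread API; the names `worker`, `on_cancel`, and `cancel_demo` are hypothetical, and the sketch shows neither the system call internals nor the masked ECANCELED mode described above.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

static atomic_int cleanup_ran;

// Cleanup handler: runs when the thread is canceled inside the
// push/pop region below.
static void on_cancel(void *arg) {
  (void)arg;
  atomic_store(&cleanup_ran, 1);
}

static void *worker(void *arg) {
  (void)arg;
  pthread_cleanup_push(on_cancel, NULL);
  for (;;) pause();  // pause() is a cancellation point
  pthread_cleanup_pop(0);
  return NULL;
}

// Returns 1 if the worker was canceled and its cleanup handler ran.
int cancel_demo(void) {
  pthread_t th;
  void *result = NULL;
  atomic_store(&cleanup_ran, 0);
  if (pthread_create(&th, NULL, worker, NULL)) return 0;
  pthread_cancel(th);  // the request stays pending until a cancellation point
  if (pthread_join(th, &result)) return 0;
  return result == PTHREAD_CANCELED && atomic_load(&cleanup_ran);
}
```

Because cancellation is deferred by default, the request is only acted on when the thread reaches a cancellation point such as pause(), which is why the set of cancelable system calls matters.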
2022-11-04 01:04:43 -07:00
Justine Tunney
37d40e087f
Ignore SIGSYS on BSD by default
2022-11-03 09:32:12 -07:00
Justine Tunney
e522aa3a07
Make more threading improvements
- ASAN memory morgue is now lockless
- Make C11 atomics header more portable
- Rewrote pthread keys support to be lockless
- Simplify Python's unicode table unpacking code
- Make crash report write(2) closer to being atomic
- Make it possible to strace/ftrace a single thread
- ASAN now checks nul-terminated strings fast and properly
- Windows fork() now restores TLS memory of calling thread
2022-11-01 23:28:26 -07:00
Justine Tunney
f7ff77d865
Make fixes and improvements
- Invent iso8601us() for faster timestamps
- Improve --strace descriptions of sigset_t
- Rebuild the Landlock Make bootstrap binary
- Introduce MODE=sysv for non-Windows builds
- Permit OFD fcntl() locks under pledge(flock)
- redbean can now protect your kernel from ddos
- Have vfork() fallback to sys_fork() not fork()
- Change kmalloc() to not die when out of memory
- Improve documentation for some termios functions
- Rewrite putenv() and friends to conform to POSIX
- Fix linenoise + strace verbosity issue on Windows
- Fix regressions in our ability to show backtraces
- Change redbean SetHeader() to no-op if value is nil
- Improve fcntl() so SQLite locks work in non-WAL mode
- Remove some unnecessary work during fork() on Windows
- Create redbean-based SSL reverse proxy for IPv4 TurfWar
- Fix ape/apeinstall.sh warning when using non-bash shells
- Add ProgramTrustedIp() and IsTrustedIp() APIs to redbean
- Support $PWD, $UID, $GID, and $EUID in command interpreter
- Introduce experimental JTqFpD APE prefix for non-Windows builds
- Invent blackhole daemon for firewalling IP addresses via UNIX named socket
- Add ProgramTokenBucket(), AcquireToken(), and CountTokens() APIs to redbean
2022-10-19 07:19:19 -07:00
Justine Tunney
467a332e38
Introduce sigtimedwait() and sigwaitinfo()
This change also invents sigcountset() and strsignal_r() and improves
the quality of siginfo_t handling.
2022-10-10 07:39:44 -07:00
Justine Tunney
59ac141e49
Improve the affinity system calls
2022-10-06 15:08:29 -07:00
Justine Tunney
7822917fc2
Add shared memory apis to redbean
You can now do things like implement mutexes using futexes in your
redbean lua code. This provides the fastest possible inter-process
communication for your production systems when SQLite alone as ipc
or things like pipes aren't sufficient.
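
To show what "implementing mutexes using futexes" means, here is a minimal C sketch of the well-known three-state futex mutex. It assumes Linux and calls the raw futex system call directly; the names `futex_op`, `fmutex_lock`, and `fmutex_unlock` are hypothetical, and this is not redbean's Lua API.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

// Lock word states: 0 = unlocked, 1 = locked, 2 = locked with waiters.

// Thin wrapper around the raw Linux futex system call (assumes Linux).
static long futex_op(atomic_int *addr, int op, int val) {
  return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

void fmutex_lock(atomic_int *m) {
  int c = 0;
  if (atomic_compare_exchange_strong(m, &c, 1)) return;  // uncontended path
  if (c != 2) c = atomic_exchange(m, 2);  // announce that we will wait
  while (c != 0) {
    futex_op(m, FUTEX_WAIT, 2);  // sleeps only while the word is still 2
    c = atomic_exchange(m, 2);
  }
}

void fmutex_unlock(atomic_int *m) {
  if (atomic_fetch_sub(m, 1) != 1) {  // word was 2: there may be waiters
    atomic_store(m, 0);
    futex_op(m, FUTEX_WAKE, 1);  // wake one waiter
  }
}
```

For inter-process use, the lock word would need to live in memory shared between the processes, for example a MAP_SHARED mapping. The uncontended lock and unlock paths never enter the kernel, which is where the speed comes from.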
2022-10-06 04:55:26 -07:00
Justine Tunney
7549a5755e
Support futexes on FreeBSD
2022-10-02 11:57:13 -07:00
Justine Tunney
acd8900071
Add fexecve() and map O_EXEC to O_PATH on Linux
2022-10-02 09:15:46 -07:00
Justine Tunney
c7a8cd21e9
Improve system call wrappers
This change improves copy_file_range(), sendfile(), splice(), openpty(),
closefrom(), close_range(), fadvise() and posix_fadvise() in addition to
writing tests that confirm things like errno and seeking behavior across
platforms. We now less aggressively polyfill behavior with some of these
functions when the platform support isn't available. Please see:

https://justine.lol/cosmopolitan/functions.html
2022-09-19 15:06:25 -07:00
Gavin Hayes
4c40c500b8
Add getgroups and setgroups (#619)
2022-09-18 02:48:53 -07:00
Justine Tunney
aab4ee4072
Add sys_ prefix to unwrapped system calls
This change also implements getlogin() and getlogin_r().
2022-09-13 11:20:35 -07:00
Justine Tunney
2d17ab016c
Perform more low-level code cleanup
2022-09-09 04:07:08 -07:00
Gavin Hayes
a849a63771
Implement sigpending for sysv and nt (#597)
2022-09-07 05:38:12 -07:00
Justine Tunney
c5c4dfcd21
Improve quality of raise(), abort(), and tkill()
This change fixes a nasty bug where SIG_IGN and SIG_DFL weren't working
as advertised on BSDs. This change also fixes the tkill() definition on
MacOS so it maps to __pthread_kill().
2022-09-03 20:17:54 -07:00
Justine Tunney
35203c0551
Do some string library work
2022-08-20 22:17:14 -07:00
Justine Tunney
c2211c9e63
Polyfill statfs() and fstatfs() on Windows
2022-08-17 19:01:51 -07:00
Justine Tunney
f7ee9d7d99
Polyfill statfs() and fstatfs() on BSD distros
2022-08-17 14:54:03 -07:00
Justine Tunney
7cf66bc161
Prevent Make from talking to public Internet
This change introduces the nointernet() function which may be called to
prevent a process and its descendants from communicating with publicly
routable Internet addresses. GNU Make has been modified to always call
this function. In the future Landlock Make will have a way to whitelist
subnets to override this behavior, or disable it entirely. Support is
available for Linux only. Our firewall does not require root access.

Calling nointernet() will return control to the caller inside a new
process that has a SECCOMP BPF filter installed, which traps network
related system calls. Your original process then becomes a permanent
ptrace() supervisor that monitors all processes and threads descending
from the returned child. Whenever a networking system call happens the
kernel will stop the process and wake up the monitor, which then peeks
into the child memory to read the sockaddr_in to determine if it's ok.

The downside to doing this is that there can be only one supervisor at a
time using ptrace() on a process. So this firewall won't be enabled if
you run make under strace or inside gdb. It also makes testing tricky.
2022-08-12 21:51:39 -07:00
Justine Tunney
0277d7d6e9
Rewrite Linux pledge() code so it can be a payload
It's now possible to build our pledge() polyfill as a dynamic shared
object that can be injected into a glibc executable using LD_PRELOAD
2022-08-08 11:41:08 -07:00
Justine Tunney
5546559034
Improve pledge() usability and consistency
- We now kill the program on violations like OpenBSD
- We now print a message explaining which promise is needed
- This change also fixes a linkage bug with thread local storage
- Your sigaction() handlers should now be more thread safe

A new `__pledge_mode` global has been introduced to make pledge() more
customizable on Linux. For example:

    __attribute__((__constructor__)) static void init(void) {
      __pledge_mode = SECCOMP_RET_ERRNO | EPERM;
    }

Can be used to restore our old permissive pledge() behavior.
2022-08-07 16:18:33 -07:00
Justine Tunney
13c1c45075
Make some last minute improvements to make.com
2022-08-07 05:59:53 -07:00