Commit graph

834 commits

Author SHA1 Message Date
Justine Tunney
3609f65de3
Make malloc() go 200x faster
If pthread_create() is linked into the binary, then the cosmo runtime
will create an independent dlmalloc arena for each core. Whenever the
malloc() function is used it will index `g_heaps[sched_getcpu() / 2]`
to find the arena with the greatest hyperthread / numa locality. This
may be configured via an environment variable. For example if you say
`export COSMOPOLITAN_HEAP_COUNT=1` then you can restore the old ways.
Your process may be configured to have anywhere from 1 to 128 heaps.

We need this revision because it makes multithreaded C++ applications
faster. For example, an HTTP server I'm working on that makes extreme
use of the STL went from 16k to 2000k requests per second, after this
change was made. To understand why, try out the malloc_test benchmark
which calls malloc() + realloc() in a loop across many threads, which
sees a 250x improvement in process clock time and 200x on wall time

The tradeoff is this adds ~25ns of latency to individual malloc calls
compared to MODE=tiny, once the cosmo runtime has transitioned into a
fully multi-threaded state. If you don't need malloc() to be scalable
then cosmo provides many options for you. For starters the heap count
variable above can be set to put the process back in single heap mode
plus you can go even faster still, if you include tinymalloc.inc like
many of the programs in tool/build/.. are already doing since that'll
shave tens of kb off your binary footprint too. There's also MODE=tiny
which is configured to use just 1 plain old dlmalloc arena by default

Another tradeoff is we need more memory now (except in MODE=tiny), to
track the provenance of memory allocation. This is so allocations can
be freely shared across threads, and because OSes can reschedule code
to different CPUs at any time.
2024-06-05 02:02:14 -07:00
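The arena selection described in 3609f65de3 can be sketched roughly as follows. This is a minimal illustration, not the cosmo runtime's code. Only the `sched_getcpu() / 2` indexing, the `COSMOPOLITAN_HEAP_COUNT` variable, and the 1 to 128 range come from the commit message. The helper names `heap_count_from_env` and `heap_index`, the modulo wraparound, and the fallback behavior are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_HEAPS 128  // upper bound stated in the commit message

// Hypothetical helper: turn the environment variable's text into a heap
// count clamped to [1, MAX_HEAPS]. Falls back when the value is unusable.
static int heap_count_from_env(const char *value, int fallback) {
  if (!value || !*value) return fallback;
  char *end;
  long n = strtol(value, &end, 10);
  if (*end) return fallback;  // trailing junk, so not a plain number
  if (n < 1) n = 1;
  if (n > MAX_HEAPS) n = MAX_HEAPS;
  return (int)n;
}

// Hypothetical helper: map a CPU number to an arena index so that two
// sibling hyperthreads (cpu 2k and 2k+1) share one arena.
static int heap_index(int cpu, int heap_count) {
  assert(heap_count >= 1);
  if (cpu < 0) cpu = 0;           // sched_getcpu() can report failure as -1
  return (cpu / 2) % heap_count;  // assumption: wrap when cores > heaps
}
```

A caller would pass `getenv("COSMOPOLITAN_HEAP_COUNT")` as `value` and the result of `sched_getcpu()` as `cpu`. With a heap count of 1 every CPU maps to arena 0, which matches the single heap mode the message describes.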
Justine Tunney
9aa353d88b
Document __demangle() and fix a const func ptr bug 2024-06-02 04:15:48 -07:00
Justine Tunney
ea081b262c
Add some noexcept annotations 2024-06-01 03:19:53 -07:00
Justine Tunney
fae1c32267
Encode ±INFINITY as ±1e5000
The V8 behavior of encoding infinity as null doesn't make sense to me.
Using ±1e5000 is better, because JSON.parse decodes it as INFINITY and
the information is preserved. This could be a breaking change for some
2024-06-01 03:19:50 -07:00
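The encoding in fae1c32267 can be illustrated with a small sketch. The function names are hypothetical and the NaN branch is an assumption, since the commit message only discusses infinity. The decoder relies on C's `strtod()` returning `HUGE_VAL` on overflow, which is infinity on IEEE 754 platforms, mirroring what the message says about JSON.parse.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical encoder: write a JSON number for x into buf.
static void json_encode_double(char *buf, size_t size, double x) {
  assert(size > 0);
  if (isinf(x)) {
    // 1e5000 overflows every IEEE 754 double, so parsers read it back
    // as infinity instead of losing the value.
    snprintf(buf, size, "%s", x < 0 ? "-1e5000" : "1e5000");
  } else if (isnan(x)) {
    snprintf(buf, size, "null");  // assumption: not covered by the commit
  } else {
    snprintf(buf, size, "%.17g", x);  // 17 digits round-trip a double
  }
}

// Hypothetical decoder: overflow yields +/-HUGE_VAL, i.e. +/-infinity.
static double json_decode_double(const char *text) {
  return strtod(text, NULL);
}
```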
Justine Tunney
e4d25d68e4
Drop support for Windows 8
Microsoft caused some very gentle breakages for Cosmopolitan. They
removed the version information from the PEB which caused uname to
report WINDOWS 0.0.0. We should have called GetVersionExW but that
doesn't really exist anymore either. Windows policy is now to give
whatever version we used in ape/ape.S. Windows 8 has been EOL since
2023-01-10 so let's avoid our modern executables being relegated to
legacy infrastructure. Requiring Windows 10+ going forward lets us
remove runtime compatibility bloat from the codebase. Further note
Cosmopolitan maintains a Windows Vista branch on GitHub, so anyone
preferring the older versions, can still have a future with Cosmo.

Another neat thing this fixes is UTF-8 support in the console. The
changes Microsoft made broke the if statement that enabled UTF8 in
terminals. This explains why bug reports had broken arrows. In the
future this should be less of an issue, since the PEB code is gone
which means we more strictly conform to only Microsoft's WIN32 API
2024-05-29 19:37:47 -07:00
Justine Tunney
f31a98d50a
Fix bug with realpath() on Windows 2024-05-29 18:47:01 -07:00
Justine Tunney
a05ce3ad9d
Support avx512f + vpclmulqdq crc32() acceleration
Cosmo's _Cz_crc32() function now goes 73 GiB/s on Threadripper. This
will significantly improve the performance of the PKZIP file format.
This algorithm is also used by apelink, to create deterministic ids.
2024-05-29 10:13:37 -07:00
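For orientation, the checksum that a05ce3ad9d accelerates can be written as a slow bit-at-a-time reference. This sketch is not the avx512f + vpclmulqdq code and says nothing about how `_Cz_crc32()` is implemented. It assumes the standard reflected CRC-32 polynomial used by zlib and PKZIP, and the name `crc32_reference` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by PKZIP).
// Pass 0 as crc for the first chunk, then feed the result back in to
// continue over later chunks.
static uint32_t crc32_reference(uint32_t crc, const void *data, size_t size) {
  assert(data != NULL || size == 0);
  const unsigned char *p = data;
  crc = ~crc;
  for (size_t i = 0; i < size; ++i) {
    crc ^= p[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0u);
  }
  return ~crc;
}
```

A reference like this is mainly useful for checking a vectorized implementation against known answers, for example the standard check string "123456789".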
Justine Tunney
b74b974cfd
Introduce #include <tinygetopt.h>
The normal getopt() function is bloated because it links printf(). This
change exports the original authentic bsd getopt function, that cosmo's
always used internally so cosmocc users don't need to include internals
2024-05-29 10:11:17 -07:00
Justine Tunney
07cef612c3
Make dlmalloc 2.4x faster for multithreading
This change adds a TLS freelist for small dynamic memory allocations.
Cosmopolitan's TIB is now 512 bytes in size. Single-threaded malloc()
performance isn't impacted by this, until pthread_create() is called.
Single-threaded programs may also want to consider using:

    #include "libc/mem/tinymalloc.inc"

Which will shave 30k off the executable size and sometimes go faster.
2024-05-28 11:18:34 -07:00
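The idea behind a TLS freelist, as mentioned in 07cef612c3, can be sketched in a few lines. This is a generic illustration under stated assumptions, not dlmalloc's or Cosmopolitan's code: the single 64-byte size class and the names `small_alloc` and `small_free` are hypothetical, and the sketch ignores blocks freed by a different thread as well as cleanup at thread exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SMALL_SIZE 64  // hypothetical size class served by the freelist

struct free_node {
  struct free_node *next;
};

// Each thread gets its own list head, so pushes and pops need no lock.
static _Thread_local struct free_node *tls_freelist;

// Pop a cached block if this thread has one, else fall back to malloc().
static void *small_alloc(void) {
  struct free_node *node = tls_freelist;
  if (node) {
    tls_freelist = node->next;
    return node;
  }
  return malloc(SMALL_SIZE);
}

// Push the block onto the calling thread's list instead of freeing it.
static void small_free(void *block) {
  assert(block != NULL);
  struct free_node *node = block;
  node->next = tls_freelist;
  tls_freelist = node;
}
```

The fast path touches only thread-local state, which is why this kind of cache helps once several threads allocate concurrently.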
Justine Tunney
deaef81463
Favor siginfo_t over struct siginfo 2024-05-28 02:34:17 -07:00
Justine Tunney
c638eabfe0
Fix compiler warning 2024-05-27 02:23:24 -07:00
Justine Tunney
8e68384e15
Upgrade to 2022-era LLVM LIBCXX 2024-05-27 02:12:27 -07:00
Justine Tunney
086d7006da
Improve crash handler on XNU
This avoids an issue where a crash signal could cause the MacOS process
to freeze and consume all CPU rather than dying as it rightfully should
2024-05-26 18:42:09 -07:00
Justine Tunney
c2db3b703a
Introduce --timelog=FILE flag to GNU Make 2024-05-25 14:50:20 -07:00
Justine Tunney
1df4296208
Fix stdio for character device regression
Caused by ed93fc3dd7
2024-05-25 05:58:09 -07:00
Justine Tunney
f029375d39
Introduce MAP_HUGETLB 2024-05-24 11:44:44 -07:00
Jōshin
0768807935
Rename python -> python3 (closes #1144) (#1187)
When we removed the com suffix from ape binaries, we broke the build for
ape's python for any case-insensitive file system, i.e. Windows and XNU,
because there is a third_party/python/Python that gets mirrored in the o
directory with the python object files and clashes with the binary name.
This patch hacks around this by renaming the binary to "python3" so that
it no longer clashes with that directory.
2024-05-24 10:56:33 -07:00
Jōshin
65c9b28e99
Fix buffer overflow in os.tmpname (#1180)
At least on macOS, `strlen(getenv("TMPDIR"))` is 50. We now allow a /tmp
that takes up to 120 or so bytes to spell. Instead of overflowing, we do
a bounds check and the function fails successfully on even longer /tmps.

Fixes #1108 (os.tmpname crashes redbean)
2024-05-20 03:46:27 -04:00
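The kind of bounds check described in 65c9b28e99 might look like the sketch below. It is an illustration only: the function name, the `lua_XXXXXX` suffix, and the error convention are assumptions, and the actual patch is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: build "<tmpdir>/lua_XXXXXX" in buf.
// Returns 0 on success, or -1 if the result would not fit.
static int make_tmp_template(char *buf, size_t size, const char *tmpdir) {
  assert(buf != NULL && tmpdir != NULL);
  int n = snprintf(buf, size, "%s/lua_XXXXXX", tmpdir);
  // snprintf() reports the length it wanted to write, so a value that
  // reaches size means the output was truncated.
  if (n < 0 || (size_t)n >= size) return -1;
  return 0;
}
```

Failing cleanly on an overlong directory is the point: the caller gets an error instead of a write past the end of the buffer.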
Justine Tunney
2f3c6e7cc3
Revert "Remove zlib namespacing (#1142)"
This reverts commit 5488f0b2ca which was a
good experiment to try, that didn't work out due to #1176

Fixes #1176
2024-05-14 20:45:23 -07:00
Jōshin
317c8bc312
Update MODE=tiny time zone list (#1167)
I took one canonical IANA zone ID from each of the different colored
regions in this article, except those that do not observe DST and do
not have a Google office. See the "Time in Europe" Wikipedia article.

As to which canonical ID to use, this was somewhat arbitrary. Brussels
was obvious, as the de facto capital of the EU. For the rest, I mostly
just went with lexicographic ordering of the most recognizable options.

I've sorted the American zones. This keeps the U.S. ones together but
does everything alphabetically otherwise. I've added the remaining
Canadian zones. These have DST (and Newfoundland is off by a half-hour
from a UTC interval) so they cannot use Etc/. The Pacific/ zones
are sort of sorted. The Chatham Islands have been added. This is the
last of the zones, I believe, with a non-integer hour offset from UTC.
2024-05-06 16:48:49 -07:00
Justine Tunney
9ea64725b6
Fix build error 2024-05-04 23:23:56 -07:00
Justine Tunney
f9fc7eb49f
Fix MODE=dbg build errors 2024-05-04 23:20:12 -07:00
Justine Tunney
b0df6c1fce
Implement proper time zone support
Cosmopolitan now supports 104 time zones. They're embedded inside any
binary that links the localtime() function. Doing so adds about 100kb
to the binary size. This change also gets time zones working properly
on Windows for the first time. It's not needed to have /etc/localtime
exist on Windows, since we can get this information from WIN32. We're
also now updated to the latest version of Paul Eggert's TZ library.
2024-05-04 23:06:37 -07:00
Justine Tunney
8a44f913ae
Delete flaky tests
Signals are extremely difficult to unit test reliably. This is why
functions like sigsuspend() exist. When testing something else and
portably it becomes impossible without access to kernel internals.

OpenMP flakes in QEMU on one of my workstations. I don't think the
support is production worthy, because there have also been issues on
MacOS. It works great for every experiment I've used it for
though. However a flaky test is worse than no test at all. So it's
removed until someone takes an interest in productionizing it.
2024-05-03 09:11:04 -07:00
Gautham
5488f0b2ca
Remove zlib namespacing (#1142)
We have an optimized version of zlib from the Chromium project.
We need it for a lot of our libc services. It would be nice to export
this to user applications if we can, since projects like llamafile are
already depending on it under the private namespace, to avoid
needing to link zlib twice.
2024-05-03 08:07:25 -07:00
Justine Tunney
181cd4cbe8
Add sysctlbyname() for MacOS 2024-05-02 23:21:43 -07:00
Justine Tunney
0eef971494
Add much of C11 threads.h API 2024-04-28 07:04:08 -07:00
Jōshin
342d0c81e5
vim spells the c++ filetype 'cpp' 2024-04-24 13:56:37 -07:00
Alexey Izbyshev
f8c0186221
Fix calling __dns_parse with potentially too large rlen
__res_send returns the full answer length even if it didn't fit the
buffer, but __dns_parse expects the length of the filled part of the
buffer.

Analogous to Musl commit 77327ed064bd57b0e1865cd0e0364057ff4a53b4 which
fixed the only other __dns_parse call site.
2024-04-23 09:36:07 -07:00
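The fix in f8c0186221 comes down to clamping the reported answer length to the buffer size before parsing. A minimal sketch of that step, with a hypothetical helper name and without the real `__res_send` or `__dns_parse` signatures (which are not shown in this log):

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: how many answer bytes are really in the buffer.
// reported_len is the full answer length, which may exceed buf_size when
// the answer did not fit.
static size_t filled_length(int reported_len, size_t buf_size) {
  assert(buf_size > 0);
  if (reported_len < 0) return 0;  // the query failed, nothing to parse
  if ((size_t)reported_len > buf_size) return buf_size;  // truncated answer
  return (size_t)reported_len;
}
```

The parser would then be handed `filled_length(rlen, sizeof buffer)` rather than the raw return value, so it never reads past the filled part of the buffer.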
Quentin Rameau
6992d8c195
Remove arbitrary limit from DNS result parsing
The name resolution would abort when getting more than 63 records per
request, due to what seems to be a left-over from the original code.
This check was non-breaking but spurious prior to TCP fallback
support, since any 512-byte packet with more than 63 records was
necessarily malformed. But now, it wrongly rejects valid results.

Reported by Daniel Stefanik in Alpine Linux aports issue 15320.
2024-04-23 09:33:02 -07:00
Justine Tunney
1a6b4ab627
Import mntent bug fixes from Musl Libc
commit f314e133929b6379eccc632bef32eaebb66a7335
Author: Rich Felker <dalias@aerifal.cx>
Date:   Thu Nov 16 12:55:21 2023 -0500

    mntent: fields are delimited only by tabs or spaces, not general whitespace

    this matters because the kernel-provided mtab only escapes tabs,
    spaces, newlines, and backslashes. it leaves carriage returns, form
    feeds, and vertical tabs literal.

commit ee1d39bc1573c1ae49ee6b658938b56bbef95a6c
Author: q66 <q66@chimera-linux.org>
Date:   Thu Nov 9 20:48:44 2023 +0100

    mntent: unescape octal sequences

    As entries in mtab are delimited by spaces, whitespace characters
    are escaped as octal sequences. When reading them out, we have to
    unescape these sequences to get the proper string.
2024-04-23 09:29:56 -07:00
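The octal unescaping described in the second imported Musl commit can be sketched as below. This is a simplified illustration, not Musl's code: it assumes every escape is exactly a backslash followed by three octal digits (for example `\040` for a space), and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static int is_octal(char c) {
  return c >= '0' && c <= '7';
}

// Hypothetical helper: rewrite s in place, turning each "\ooo" sequence
// into the byte it names. The first digit is limited to 0..3 so the value
// fits in one byte. Anything else is copied through unchanged.
static void unescape_octal(char *s) {
  assert(s != NULL);
  char *out = s;
  while (*s) {
    if (s[0] == '\\' && s[1] >= '0' && s[1] <= '3' &&
        is_octal(s[2]) && is_octal(s[3])) {
      *out++ = (char)(((s[1] - '0') << 6) | ((s[2] - '0') << 3) |
                      (s[3] - '0'));
      s += 4;
    } else {
      *out++ = *s++;
    }
  }
  *out = '\0';
}
```

Because the output is never longer than the input, the rewrite can safely happen in the same buffer.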
Justine Tunney
9e848abad9
Add missing Musl license headers 2024-04-23 09:29:46 -07:00
Justine Tunney
cf9a1f7f33
Fix MODE=optlinux build 2024-04-06 21:02:19 -07:00
Justine Tunney
49a32136f8
Upgrade the One True Awk 2024-04-06 19:21:48 -07:00
Justine Tunney
8bfd56b59e
Rename _bsr/_bsf to bsr/bsf
Now that these functions are behind _COSMO_SOURCE there's no reason for
having the ugly underscore anymore. To use these functions, you need to
pass -mcosmo to cosmocc.
2024-03-04 17:33:26 -08:00
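For readers unfamiliar with the names in 8bfd56b59e: BSR and BSF are the x86 "bit scan reverse" and "bit scan forward" instructions, which find the highest and lowest set bit. The sketch below shows equivalent behavior using GCC/Clang builtins. The `_sketch` names are hypothetical, and the actual cosmo signatures behind `_COSMO_SOURCE` are not shown in this log.

```c
#include <assert.h>

// Index of the highest set bit in x. The result is undefined for x == 0,
// so that case is rejected up front.
static int bsr_sketch(unsigned x) {
  assert(x != 0);
  return (int)(sizeof(unsigned) * 8 - 1) - __builtin_clz(x);
}

// Index of the lowest set bit in x. Also undefined for x == 0.
static int bsf_sketch(unsigned x) {
  assert(x != 0);
  return __builtin_ctz(x);
}
```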
Justine Tunney
a6baba1b07
Stop using .com extension in monorepo
The WIN32 CreateProcess() function does not require an .exe or .com
suffix in order to spawn an executable. Now that we have Cosmo bash
we're no longer so dependent on the cmd.exe prompt.
2024-03-03 03:12:19 -08:00
Justine Tunney
64a9e6fe56
Fix compiler runtime for _Float16 type 2024-02-27 09:06:23 -08:00
Justine Tunney
e72a88ea70
Make fixups for libcrypt 2024-02-23 07:39:44 -08:00
Justine Tunney
4d018306b3
Fix MODE=optlinux build 2024-02-20 14:24:56 -08:00
Justine Tunney
b3bb93d1d9
Fix MODE=opt build 2024-02-20 14:12:12 -08:00
Justine Tunney
957c61cbbf
Release Cosmopolitan v3.3
This change upgrades to GCC 12.3 and GNU binutils 2.42. The GNU linker
appears to have changed things so that only a single de-duplicated str
table is present in the binary, and it gets placed wherever the linker
wants, regardless of what the linker script says. To cope with that we
need to stop using .ident to embed licenses. As such, this change does
significant work to revamp how third party licenses are defined in the
codebase, using `.section .notice,"aR",@progbits`.

This new GCC 12.3 toolchain has support for GNU indirect functions. It
lets us support __target_clones__ for the first time. This is used for
optimizing the performance of libc string functions such as strlen and
friends so far on x86, by ensuring AVX systems favor a second codepath
that uses VEX encoding. It shaves some latency off certain operations.
It's a useful feature to have for scientific computing for the reasons
explained by the test/libcxx/openmp_test.cc example which compiles for
fifteen different microarchitectures. Thanks to the upgrades, it's now
also possible to use newer instruction sets, such as AVX512FP16, VNNI.

Cosmo now uses the %gs register on x86 by default for TLS. Doing it is
helpful for any program that links `cosmo_dlopen()`. Such programs had
to recompile their binaries at startup to change the TLS instructions.
That's not great, since it means every page in the executable needs to
be faulted. The work of rewriting TLS-related x86 opcodes, is moved to
fixupobj.com instead. This is great news for MacOS x86 users, since we
previously needed to morph the binary every time for that platform but
now that's no longer necessary. The only platforms where we need fixup
of TLS x86 opcodes at runtime are now Windows, OpenBSD, and NetBSD. On
Windows we morph TLS to point deeper into the TIB, based on a TlsAlloc
assignment, and on OpenBSD/NetBSD we morph %gs back into %fs since the
kernels do not allow us to specify a value for the %gs register.

OpenBSD users are now required to use APE Loader to run Cosmo binaries
and assimilation is no longer possible. OpenBSD kernel needs to change
to allow programs to specify a value for the %gs register, or it needs
to stop marking executable pages loaded by the kernel as mimmutable().

This release fixes __constructor__, .ctor, .init_array, and lastly the
.preinit_array so they behave the exact same way as glibc.

We no longer use hex constants to define math.h symbols like M_PI.
2024-02-20 13:27:59 -08:00
Justine Tunney
2ab9e9f7fd
Make improvements
- Introduce portable sched_getcpu() api
- Support GCC's __target_clones__ feature
- Make fma() go faster on x86 in default mode
- Remove some asan checks from core libraries
- WinMain() now ensures $HOME and $USER are defined
2024-02-12 10:23:00 -08:00
Justine Tunney
616717fa82
Fine tune OpenMP some more 2024-01-30 06:30:24 -08:00
Justine Tunney
369aebfc48
Make improvements
- Let OpenMP be usable via cosmocc
- Let libunwind be usable via cosmocc
- Make X86_HAVE(AVXVNNI) work correctly
- Avoid using MAP_GROWSDOWN on qemu-aarch64
- Introduce in6addr_any and in6addr_loopback
- Have thread stacks use MAP_GROWSDOWN by default
- Ask OpenMP to not use filesystem to manage threads
- Make NI_MAXHOST and NI_MAXSERV available w/o _GNU_SOURCE
2024-01-29 16:31:58 -08:00
Justine Tunney
5f8e9f14c1
Add OpenMP support 2024-01-28 22:39:02 -08:00
Justine Tunney
c1e18e7903
Restore MODE=dbg support
We recently broke MODE=dbg support when we added C++ exception support.
This change adds the missing UBSAN interfaces, needed to get it working
again. Some of the ASAN checking in the SJLJ guts needed to be disabled
since I doubt anyone's combined the two features until now.
2024-01-26 23:07:18 -08:00
Justine Tunney
8ab3a545c6
Increase build memory quota
If you install qemu-user from apt then glibc links a lot of address
space bloat that causes pthread_create() to ENOMEM (a.k.a. EAGAIN).
Boosting the virtual memory quota from 512m to 2048m will hopefully
future-proof the build as Linux distros get fatter.
Please note this only applies to MODE=aarch64 on x86_64 builds when
you're using QEMU from Debian/Ubuntu rather than installing the one
cosmo provides in third_party/qemu/qemu-aarch64.gz. This change may
also be useful to people who are using the host compiler toolchain.
2024-01-22 10:02:30 -08:00
Trung Nguyen
8834dde0c2
libcxx: Add missing implementation source files (#1089)
Added the implementation for `std::bad_any_cast` from upstream
`any.cpp`, and `std::bad_variant_access` from upstream `variant.cpp`.

This fixes missing `vtable` and `typeinfo` symbols when trying to link
code referencing these exception types.
2024-01-18 08:20:25 -08:00
Justine Tunney
1ef63eb206
Retire third_party/quickjs/
QuickJS cosmocc binaries are now being distributed on
https://bellard.org/quickjs/
2024-01-17 12:35:46 -08:00
Justine Tunney
fd75fd1467
Make std::pair trivial 2024-01-09 09:51:59 -08:00