Commit graph

852 commits

Author SHA1 Message Date
Steven Dee (Jōshin)
7e780e57d4
More ctl::string optimization (#1232)
Moves some isbig checks into string.h, enabling smarter optimizations to
be made on small strings. Also we no longer zero out our string prior to
calling the various constructors, buying back the performance we lost on
big strings when we made the small-string optimization. We further add a
little optimization to the big_string copy constructor: if the string is
using half or more of its capacity, then we don’t recompute capacity and
just take the old string’s. As well, the copy constructor always makes a
small string when it will fit, even if copied from a big string that got
truncated.

This also reworks the test to follow the idiom adopted elsewhere re stl,
and adds a helper function to tell if a string is small based on data().
2024-06-20 14:52:12 -04:00
Steven Dee (Jōshin)
9a5a13854d
CTL: utility.h, use ctl::swap in string (#1227)
* Add ctl utility.h

Implements forward, move, swap, and declval. This commit also adds a def
for nullptr_t to cxx.inc. We need it now because the CTL headers stopped
including anything from libc++, so we no longer get their basic types.

* Use ctl::swap in string

The STL spec says that swap is located in the string_view header anyawy.
Performance-wise this is a noop, but it’s slightly cleaner.
2024-06-19 01:00:59 -04:00
Steven Dee (Jōshin)
f9dd5683a4
Implement ctl::unique_ptr (#1216)
The way unique_ptr is supposed to work is as a purely compile-time check
that your raw pointers are getting deleted when they go out of scope. It
should ideally emit the same exact machine code as if you were using raw
pointers with manual deletes.

Part of what this means is that under normal circumstances, a unique_ptr
shouldn’t take up more space than a raw pointer - in other words, sizeof
unique_ptr<T> should == sizeof(T*).

The present PR doesn’t bother with the specialization for array types. I
also left a couple other parts of the STL API unimplemented. I’d love to
see someone else implement these, or I’ll get to them at some point.
2024-06-15 22:54:52 -04:00
Jōshin
89fc95fefd
Rerun clang-format on the repo (#1217)
🚨 clang-format changes output per version!

This is with version 19.0.0. The modifications seem to be fixing the old
version’s errors - mainly involving omitted whitespace around binary ops
and inserted whitespace between goto labels and colons (if followed by a
curly brace.)

Also fixes a few mistakes made by e.g. someone (ahem) forgetting to pass
his ctl/string.h modifications through it.

We should add this to .git-blame-ignore-revs once we have its final hash
on master.
2024-06-15 16:34:48 -04:00
Jōshin
d9b4f647d8
Uncomment swap test (#1210) 2024-06-10 21:51:19 -07:00
Jōshin
0dde3a0e70
Get rid of preprocessor stuff in test (#1202)
Also it's a bit more idiomatic to say s.npos rather than string::npos.
2024-06-10 07:00:37 -07:00
Jōshin
118db71121
Provide a minimal new.h for CTL (#1205)
This replaces the STL <new> header. Mainly, it defines a global operator
new and operator delete, as well as the placement versions of these. The
placement versions are required to not get compile errors when trying to
write a placement new statement.

Each of these operators is defined with many, many different variants. A
glance at new.cc is recommended followed by a chaser of the Alexandrescu
talk "std::allocator is to Allocation as std::vector is to Vexation". We
must provide a global-namespace source-level definition of each operator
and it is illegal for any of them to be marked inline, so here we are.

The upshot is that we no longer need to include <new>, and our optional/
vector headers are self-contained.
2024-06-08 15:05:38 -07:00
Alkis Evlogimenos
d44a7dc603
Fix bugs in in ctl::optional (#1203)
Manually manage the lifetime of `value_` by using an anonymous
`union`. This fixes a bunch of double-frees and double-constructs.

Additionally move the `present_` flag last. When `T` has padding
`present_` will be placed there saving `alignof(T)` bytes from
`sizeof(optional<T>)`.
2024-06-07 20:47:24 -04:00
Jōshin
2ba6b0158f
Fix some memory issues with ctl::string (#1201)
There were a few errors in how capacity and memory was being handled for
small strings. The capacity errors meant that small strings would become
big strings too soon, and the memory error introduced undefined behavior
that was caught by CheckMemoryLeaks in our test file but only sometimes.

The crucial change is in reserve: we only copy n bytes into p2, and then
we manually set the null terminator instead of expecting it to have been
there already. (E.g. it might not be there for an empty small string.)

We also fix one other doozy in append when we were exactly at the small-
to-big string boundary: we set the last byte (i.e., the remainder field)
to 0, then decremented it, giving us size_t max. Whoops. We boneheadedly
fix this by setting the 0 byte after we've fixed up the remainder, so it
is at worst a no-op.

Otherwise, capacity now works the same for small strings as it does with
big strings: it's the amount of space available including the null byte.

We test all of this with a new test that only gets included if our class
under test is not std::string (presumably meaning it's ctl::string.) The
test manually verifies that the small string optimization behaves how we
expect.

Since this test checks against std::string, we go ahead and include that
other header from the STL.

Also modifies the new test we introduced to also run on std::string, but
it just does the append without expecting anything about how its data is
stored. We also check that the string has the right value afterwards.
2024-06-07 01:15:37 -04:00
Jōshin
8b3e368e9a
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.

The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.

Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.

This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".

We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.

There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-06 20:50:51 -04:00
Jōshin
2c5e7ec547
Add terminating :vi on some modelines
Noticed because the settings they specified weren't getting picked up by
editor sessions in those files.
2024-06-05 20:36:55 -07:00
Jōshin
04c6bc478e
vim C++ filetype is still spelled "cpp" 2024-06-05 16:34:47 -07:00
Justine Tunney
3609f65de3
Make malloc() go 200x faster
If pthread_create() is linked into the binary, then the cosmo runtime
will create an independent dlmalloc arena for each core. Whenever the
malloc() function is used it will index `g_heaps[sched_getcpu() / 2]`
to find the arena with the greatest hyperthread / numa locality. This
may be configured via an environment variable. For example if you say
`export COSMOPOLITAN_HEAP_COUNT=1` then you can restore the old ways.
Your process may be configured to have anywhere between 1 - 128 heaps

We need this revision because it makes multithreaded C++ applications
faster. For example, an HTTP server I'm working on that makes extreme
use of the STL went from 16k to 2000k requests per second, after this
change was made. To understand why, try out the malloc_test benchmark
which calls malloc() + realloc() in a loop across many threads, which
sees a a 250x improvement in process clock time and 200x on wall time

The tradeoff is this adds ~25ns of latency to individual malloc calls
compared to MODE=tiny, once the cosmo runtime has transitioned into a
fully multi-threaded state. If you don't need malloc() to be scalable
then cosmo provides many options for you. For starters the heap count
variable above can be set to put the process back in single heap mode
plus you can go even faster still, if you include tinymalloc.inc like
many of the programs in tool/build/.. are already doing since that'll
shave tens of kb off your binary footprint too. Theres also MODE=tiny
which is configured to use just 1 plain old dlmalloc arena by default

Another tradeoff is we need more memory now (except in MODE=tiny), to
track the provenance of memory allocation. This is so allocations can
be freely shared across threads, and because OSes can reschedule code
to different CPUs at any time.
2024-06-05 02:02:14 -07:00
Justine Tunney
9906f299bb
Refactor and improve CTL and other code 2024-06-04 05:45:48 -07:00
Justine Tunney
1d8f37a2f0
Fix the MODE=tiny builds 2024-06-03 10:36:38 -07:00
Justine Tunney
4937843f70
Introduce Cosmopolitan Templates Library (CTL) 2024-06-03 09:21:59 -07:00
Justine Tunney
2ca491dc56
Write more __demangle() tests 2024-06-02 07:37:15 -07:00
Justine Tunney
9aa353d88b
Document __demangle() and fix a const func ptr bug 2024-06-02 04:15:48 -07:00
Justine Tunney
165c6b37e2
Add C++ demangling to privileged runtime
Cosmo will now print C++ symbols correctly in --ftrace logs and
backtraces. Doing this required reducing the memory requirement
of the __demangle() function by 3x. This was accomplished using
16-bit indices and 16-bit malloc granularity. That puts a limit
on the longest symbol we can successfully decode, which I think
would be around 6553 characters long, given a 65536-byte buffer
2024-06-01 20:10:58 -07:00
Jōshin
f032b5570b
Run clang-format (#1197) 2024-06-01 16:30:43 -04:00
Justine Tunney
ea081b262c
Add some noexcept annotations 2024-06-01 03:19:53 -07:00
Justine Tunney
fae1c32267
Encode ±INFINITY as ±1e5000
The V8 behavior of encoding infinity as null doesn't make sense to me.
Using ±1e5000 is better, because JSON.parse decodes it as INFINITY and
the information is preserved. This could be a breaking change for some
2024-06-01 03:19:50 -07:00
Justine Tunney
500a47bc2f
Fix undefined behavior in unit test
Fixes #1194
2024-05-29 20:31:46 -07:00
Justine Tunney
e4d25d68e4
Drop support for Windows 8
Microsoft caused some very gentle breakages for Cosmopolitan. They
removed the version information from the PEB which caused uname to
report WINDOWS 0.0.0. We should have called GetVersionExW but that
doesn't really exist anymore either. Windows policy is now to give
whatever version we used in ape/ape.S. Windows8 has been EOL since
2023-01-10 so lets avoid our modern executables being relegated to
legacy infrastructure. Requiring Windows 10+ going forward lets us
remove runtime compatibility bloat from the codebase. Further note
Cosmopolitan maintains a Windows Vista branch on GitHub, so anyone
preferring the older versions, can still have a future with Cosmo.

Another neat thing this fixes is UTF-8 support in the console. The
changes Microsoft made broke the if statement that enabled UTF8 in
terminals. This explains why bug reports had broken arrows. In the
future this should be less of an issue, since the PEB code is gone
which means we more strictly conform to only Microsoft's WIN32 API
2024-05-29 19:37:47 -07:00
Justine Tunney
f31a98d50a
Fix bug with realpath() on Windows 2024-05-29 18:47:01 -07:00
Justine Tunney
a05ce3ad9d
Support avx512f + vpclmulqdq crc32() acceleration
Cosmo's _Cz_crc32() function now goes 73 GiB/s on Threadripper. This
will significantly improve the performance of the PKZIP file format.
This algorithm is also used by apelink, to create deterministic ids.
2024-05-29 10:13:37 -07:00
Justine Tunney
07cef612c3
Make dlmalloc 2.4x faster for multithreading
This change adds a TLS freelist for small dynamic memory allocations.
Cosmopolitan's TIB is now 512 bytes in size. Single-threaded malloc()
performance isn't impacted by this, until pthread_create() is called.
Single-threaded programs may also want to consider using:

    #include "libc/mem/tinymalloc.inc"

Which will shave 30k off the executable size and sometimes go faster.
2024-05-28 11:18:34 -07:00
Justine Tunney
deaef81463
Favor siginfo_t over struct siginfo 2024-05-28 02:34:17 -07:00
Justine Tunney
8e68384e15
Upgrade to 2022-era LLVM LIBCXX 2024-05-27 02:12:27 -07:00
Justine Tunney
086d7006da
Improve crash handler on XNU
This avoids an issue where a crash signal could cause the MacOS process
to freeze and consume all CPU rather than dying as it rightfully should
2024-05-26 18:42:09 -07:00
Justine Tunney
af3f62a71a
Ensure io requests are always capped at 0x7ffff000
This gives us the Linux behavior across platforms.

Fixes #1189
2024-05-26 16:53:13 -07:00
Justine Tunney
ce9aeb2aed
Release Cosmopolitan v3.3.7 2024-05-24 19:37:21 -07:00
Justine Tunney
ed93fc3dd7
Fix fread() with 2gb+ sizes 2024-05-24 19:28:23 -07:00
Jōshin
787b04f752
Run all BLAKE2B256 test vectors (#1185) 2024-05-24 10:59:23 -07:00
Justine Tunney
cf70a44756
Support shebang on Windows
Fixes #1010
2024-05-20 22:11:42 -07:00
Gavin Hayes
624119ea38
Fix NT accept/connect not initializing with SO_UPDATE_*_CONTEXT (#1164) 2024-05-17 02:45:30 -07:00
Justine Tunney
2f3c6e7cc3
Revert "Remove zlib namespacing (#1142)"
This reverts commit 5488f0b2ca which was a
good experiment to try, that didn't work out due to #1176

Fixes #1176
2024-05-14 20:45:23 -07:00
Justine Tunney
19c81863a3
Improve crash backtrace reliability
We're now able to pretty print a C++ backtrace upon crashing in pretty
much any runtime execution scenario. The default pledge sandbox policy
on Linux is now to return EPERM. If you call pledge and have debugging
functions linked (e.g. GetSymbolTable) then the symbol table shall get
loaded before any security policy is put in place. This change updates
build/bootstrap/fixupobj too and fixes some other sneaky build errors.
2024-05-07 18:10:28 -07:00
Justine Tunney
57c0b065c8
Make old C++ demangler asynchronous signal safe
It's now possible to safely print C++ backtraces from signal handlers.
This symbol demangler doesn't need malloc, tls, or even static memory.
Additionally, this change makes it 2x faster and adds test cases. It's
almost as performant and accurate as the libcxxabi implementation now.
2024-05-07 03:41:33 -07:00
Justine Tunney
b0df6c1fce
Implement proper time zone support
Cosmopolitan now supports 104 time zones. They're embedded inside any
binary that links the localtime() function. Doing so adds about 100kb
to the binary size. This change also gets time zones working properly
on Windows for the first time. It's not needed to have /etc/localtime
exist on Windows, since we can get this information from WIN32. We're
also now updated to the latest version of Paul Eggert's TZ library.
2024-05-04 23:06:37 -07:00
Justine Tunney
8a44f913ae
Delete flaky tests
Signals are extremely difficult to unit test reliably. This is why
functions like sigsuspend() exist. When testing something else and
portably it becomes impossible without access to kernel internals.

OpenMP flakes in QEMU on one of my workstations. I don't think the
support is production worthy, because there's been issues on MacOS
additionally. It works great for every experiment I've used it for
though. However a flaky test is worse than no test at all. So it's
removed until someone takes an interest in productionizing it.
2024-05-03 09:11:04 -07:00
Gautham
5488f0b2ca
Remove zlib namespacing (#1142)
We have an optimized version of zlib from the Chromium project.
We need it for a lot of our libc services. It would be nice to export
this to user applications if we can, since projects like llamafile are
already depending on it under the private namespace, to avoid
needing to link zlib twice.
2024-05-03 08:07:25 -07:00
Gavin Hayes
deff138e7e
recvfrom: don't convert address if addrsize is 0 (#1153) 2024-05-03 08:03:57 -07:00
Gavin Hayes
b6e40a3a58
Add /dev/(u)random on NT (#1163) 2024-05-03 07:59:51 -07:00
Cadence Ember
8f6bc9dabc
Let signals interrupt fgets unless SA_RESTART set (#1152) 2024-05-03 07:49:41 -07:00
Justine Tunney
5c6877b02b
Introduce support for trapping math
The feenableexcept() and fedisableexcept() APIs are now provided which
let you detect when NaNs appear the moment it happens from anywhere in
your program. Tests have also been added for the mission critical math
functions expf() and erff(), whose perfect operation has been assured.
See examples/trapping.c to see how to use this powerful functionality.
2024-04-30 13:38:43 -07:00
Jōshin
6e6fc38935
Apply clang-format update to repo (#1154)
Commit bc6c183 introduced a bunch of discrepancies between what files
look like in the repo and what clang-format says they should look like.
However, there were already a few discrepancies prior to that. Most of
these discrepancies seemed to be unintentional, but a few of them were
load-bearing (e.g., a #include that violated header ordering needing
something to have been #defined by a 'later' #include.)

I opted to take what I hope is a relatively smooth-brained approach: I
reverted the .clang-format change, ran clang-format on the whole repo,
reapplied the .clang-format change, reran clang-format again, and then
reverted the commit that contained the first run. Thus the full effect
of this PR should only be to apply the changed formatting rules to the
repo, and from skimming the results, this seems to be the case.

My work can be checked by applying the short, manual commits, and then
rerunning the command listed in the autogenerated commits (those whose
messages I have prefixed auto:) and seeing if your results agree.

It might be that the other diffs should be fixed at some point but I'm
leaving that aside for now.

fd '\.c(c|pp)?$' --print0| xargs -0 clang-format -i
2024-04-25 10:38:00 -07:00
Jōshin
342d0c81e5
vim spells the c++ filetype 'cpp' 2024-04-24 13:56:37 -07:00
Jōshin
06839ab301
Change loop bound on uuidv4 test (#1143)
A half second (on my machine) is too long for a unit test. 1000
iterations is probably still overkill, but 0.01 seconds is fine.

Also made `y` local.
2024-04-12 12:22:34 -04:00
BONNAURE Olivier
39dde41516
[Redbean] Add UuidV4 method (#1140) 2024-04-12 11:10:27 -04:00