2310 commits

Author SHA1 Message Date
Jōshin
32643e9fa7
Decouple swap from std (#1211)
This allows you to implement your own swap function without it having to
be part of the std namespace. std::swap is still used if it's available.
2024-06-10 03:40:17 -07:00
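One common way to get this behaviour is an unqualified swap call with std::swap kept as a fallback, so that argument-dependent lookup can find a swap declared beside the type. The sketch below assumes that idiom; the `demo` namespace, `Buffer`, `exchange_values` and `swap_usage` are hypothetical names for illustration and are not taken from CTL.

```cpp
#include <cassert>
#include <utility>  // std::swap, kept only as the fallback

namespace demo {  // hypothetical namespace, not part of CTL

struct Buffer {
    int* data;
    int size;
};

int custom_swap_calls = 0;  // lets the example observe which swap ran

// A swap declared next to its type instead of inside namespace std.
void swap(Buffer& a, Buffer& b) noexcept {
    Buffer tmp = a;
    a = b;
    b = tmp;
    ++custom_swap_calls;
}

}  // namespace demo

// Generic code calls swap unqualified; std::swap is only a fallback.
template <class T>
void exchange_values(T& a, T& b) {
    using std::swap;
    swap(a, b);
}

// Usage: one call site serves both kinds of type.
void swap_usage() {
    demo::Buffer x{nullptr, 1}, y{nullptr, 2};
    exchange_values(x, y);  // resolves to demo::swap via ADL
    assert(x.size == 2 && y.size == 1);
    int i = 1, j = 2;
    exchange_values(i, j);  // falls back to std::swap
    assert(i == 2 && j == 1);
}
```

The non-template `demo::swap` is preferred over the `std::swap` template during overload resolution, which is why the counter only moves for `Buffer`.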
Jōshin
0a92c78035
Remove erroneous ctl:: prefixes
2024-06-08 15:07:39 -07:00
Jōshin
118db71121
Provide a minimal new.h for CTL (#1205)
This replaces the STL <new> header. Mainly, it defines a global operator
new and operator delete, as well as the placement versions of these. The
placement versions are required to not get compile errors when trying to
write a placement new statement.

Each of these operators is defined with many, many different variants. A
glance at new.cc is recommended followed by a chaser of the Alexandrescu
talk "std::allocator is to Allocation as std::vector is to Vexation". We
must provide a global-namespace source-level definition of each operator
and it is illegal for any of them to be marked inline, so here we are.

The upshot is that we no longer need to include <new>, and our optional/
vector headers are self-contained.
2024-06-08 15:05:38 -07:00
Alkis Evlogimenos
a0410f0170
Make big_string pod (#1204)
`big_string` is not pod which means it needs to be properly constructed
and destroyed. Instead make it POD and destroy it manually in `string`
destructor.
2024-06-08 10:02:33 -07:00
Alkis Evlogimenos
d44a7dc603
Fix bugs in ctl::optional (#1203)
Manually manage the lifetime of `value_` by using an anonymous
`union`. This fixes a bunch of double-frees and double-constructs.

Additionally move the `present_` flag last. When `T` has padding
`present_` will be placed there saving `alignof(T)` bytes from
`sizeof(optional<T>)`.
2024-06-07 20:47:24 -04:00
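A minimal sketch of the technique described above, assuming a simplified interface: the value lives in an anonymous union so the compiler never constructs or destroys it implicitly, and the class does so by hand exactly once. `Opt` and `Tracker` are hypothetical names; this is not the ctl::optional source, and it makes no claim about the resulting object size.

```cpp
#include <cassert>
#include <new>  // placement new

// Hypothetical sketch of an optional with a manually managed value.
template <class T>
class Opt {
  public:
    Opt() noexcept : present_(false) {}
    explicit Opt(const T& v) : present_(false) { construct(v); }
    Opt(const Opt& o) : present_(false) {
        if (o.present_) construct(o.value_);
    }
    Opt& operator=(const Opt& o) {
        if (this != &o) {
            reset();
            if (o.present_) construct(o.value_);
        }
        return *this;
    }
    ~Opt() { reset(); }

    // Destroys the value only if one exists, so it never runs twice.
    void reset() noexcept {
        if (present_) {
            value_.~T();
            present_ = false;
        }
    }
    bool has_value() const noexcept { return present_; }
    T& value() {
        assert(present_);
        return value_;
    }

  private:
    // Only called while empty, so a value is never constructed twice.
    void construct(const T& v) {
        new (&value_) T(v);
        present_ = true;
    }
    union {
        T value_;  // anonymous union: no implicit constructor or destructor
    };
    bool present_;  // declared last, following the layout in the commit
};

// Test helper that counts live objects.
struct Tracker {
    static int live;
    Tracker() { ++live; }
    Tracker(const Tracker&) { ++live; }
    ~Tracker() { --live; }
};
int Tracker::live = 0;
```

With a counting type such as `Tracker`, a double-free or double-construct shows up as the live count drifting away from zero once every `Opt` has gone out of scope.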
Jōshin
2ba6b0158f
Fix some memory issues with ctl::string (#1201)
There were a few errors in how capacity and memory were being handled for
small strings. The capacity errors meant that small strings would become
big strings too soon, and the memory error introduced undefined behavior
that was caught by CheckMemoryLeaks in our test file but only sometimes.

The crucial change is in reserve: we only copy n bytes into p2, and then
we manually set the null terminator instead of expecting it to have been
there already. (E.g. it might not be there for an empty small string.)

We also fix one other doozy in append when we were exactly at the small-
to-big string boundary: we set the last byte (i.e., the remainder field)
to 0, then decremented it, giving us size_t max. Whoops. We boneheadedly
fix this by setting the 0 byte after we've fixed up the remainder, so it
is at worst a no-op.

Otherwise, capacity now works the same for small strings as it does with
big strings: it's the amount of space available including the null byte.

We test all of this with a new test that only gets included if our class
under test is not std::string (presumably meaning it's ctl::string.) The
test manually verifies that the small string optimization behaves how we
expect.

Since this test checks against std::string, we go ahead and include that
other header from the STL.

Also modifies the new test we introduced to also run on std::string, but
it just does the append without expecting anything about how its data is
stored. We also check that the string has the right value afterwards.
2024-06-07 01:15:37 -04:00
Jōshin
f3effcb703
One more SSO erratum from #1199
Making a string_view from a string appears to take about 1.3ns no matter
what. 100% definitely no point deviating from the STL API over that.
2024-06-06 18:01:26 -07:00
Jōshin
03b476f943
Minor small-string errata from #1199
These commits were sitting on a local branch that I neglected to push
before merging. :(

* Use memcpy for string::reserve

* Remove fence comments
2024-06-06 17:56:30 -07:00
Jōshin
8b3e368e9a
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.

The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.

Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.

This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".

We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.

There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-06 20:50:51 -04:00
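The small-string encoding described in this commit can be sketched as follows. This models only the inline representation, assuming a 24-byte object; `SmallStr` and its members are hypothetical names, and a real implementation would switch to heap storage where this sketch simply refuses to grow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch of the inline (small) representation only.
struct SmallStr {
    static constexpr std::size_t kBlob = 24;             // assumed object size
    static constexpr std::size_t kMaxSmall = kBlob - 1;  // 23 characters
    unsigned char blob[kBlob];

    SmallStr() noexcept {
        blob[0] = 0;  // empty C string
        // Last byte holds the remaining capacity; its high bit is the
        // big/small flag and stays clear for a small string.
        blob[kBlob - 1] = static_cast<unsigned char>(kMaxSmall);
    }
    bool is_big() const noexcept { return (blob[kBlob - 1] & 0x80) != 0; }
    std::size_t size() const noexcept {
        assert(!is_big());
        return kMaxSmall - blob[kBlob - 1];
    }
    const char* c_str() const noexcept {
        return reinterpret_cast<const char*>(blob);
    }
    // Appends if the result still fits inline; returns false otherwise.
    bool append(const char* s, std::size_t n) noexcept {
        std::size_t old = size();
        if (n > kMaxSmall - old) return false;
        std::memcpy(blob + old, s, n);
        std::size_t now = old + n;
        // Update the remainder first, then write the terminator. At the
        // maximum size both writes hit the same byte with the value 0.
        blob[kBlob - 1] = static_cast<unsigned char>(kMaxSmall - now);
        blob[now] = 0;
        return true;
    }
};
```

At 23 characters the remainder byte reads zero, so the same byte serves as the null terminator and `strlen` on `c_str()` still sees exactly 23 characters.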
Brian
df6b384e31
github: add labeler action (#1196)
Bit easier to do this as everything seems to be sorted into logical
folders. You may need to add new labels to support this however.
2024-06-06 05:51:36 -07:00
Brian
280bdec817
github: add issue template (#1195)
Copied and adapted from llamafile, with an added research template for
the more technical research tickets
2024-06-06 05:51:06 -07:00
Jōshin
2c5e7ec547
Add terminating :vi on some modelines
Noticed because the settings they specified weren't getting picked up by
editor sessions in those files.
2024-06-05 20:36:55 -07:00
Jōshin
fdcb8b2f7e
add formatting commit
2024-06-05 16:36:34 -07:00
Jōshin
04c6bc478e
vim C++ filetype is still spelled "cpp"
2024-06-05 16:34:47 -07:00
Justine Tunney
cc2c1893c5
Fix some nits
2024-06-05 04:05:49 -07:00
Justine Tunney
3093f0e467
Release Cosmopolitan v3.4.0
2024-06-05 03:07:03 -07:00
Justine Tunney
3609f65de3
Make malloc() go 200x faster
If pthread_create() is linked into the binary, then the cosmo runtime
will create an independent dlmalloc arena for each core. Whenever the
malloc() function is used it will index `g_heaps[sched_getcpu() / 2]`
to find the arena with the greatest hyperthread / numa locality. This
may be configured via an environment variable. For example if you say
`export COSMOPOLITAN_HEAP_COUNT=1` then you can restore the old ways.
Your process may be configured to have anywhere between 1 - 128 heaps

We need this revision because it makes multithreaded C++ applications
faster. For example, an HTTP server I'm working on that makes extreme
use of the STL went from 16k to 2000k requests per second, after this
change was made. To understand why, try out the malloc_test benchmark
which calls malloc() + realloc() in a loop across many threads, which
sees a 250x improvement in process clock time and 200x on wall time

The tradeoff is this adds ~25ns of latency to individual malloc calls
compared to MODE=tiny, once the cosmo runtime has transitioned into a
fully multi-threaded state. If you don't need malloc() to be scalable
then cosmo provides many options for you. For starters the heap count
variable above can be set to put the process back in single heap mode
plus you can go even faster still, if you include tinymalloc.inc like
many of the programs in tool/build/.. are already doing since that'll
shave tens of kb off your binary footprint too. There's also MODE=tiny
which is configured to use just 1 plain old dlmalloc arena by default

Another tradeoff is we need more memory now (except in MODE=tiny), to
track the provenance of memory allocation. This is so allocations can
be freely shared across threads, and because OSes can reschedule code
to different CPUs at any time.
2024-06-05 02:02:14 -07:00
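The arena selection described above can be sketched as pure arithmetic. Only the `cpu / 2` indexing, the 1 to 128 range and the COSMOPOLITAN_HEAP_COUNT variable come from the commit message; the helper names, the modulo wrap and the fallback policy below are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>

constexpr int kMaxHeaps = 128;  // upper bound quoted in the commit message

// Hypothetical helper: parses a heap-count setting, returning `fallback`
// when the text is missing, malformed, or outside 1..128.
int parse_heap_count(const char* text, int fallback) {
    if (!text || !*text) return fallback;
    char* end = nullptr;
    long v = std::strtol(text, &end, 10);
    if (*end != '\0' || v < 1 || v > kMaxHeaps) return fallback;
    return static_cast<int>(v);
}

// Hypothetical helper: two adjacent CPU numbers (typically hyperthread
// siblings) share one arena. The modulo is an assumption that keeps the
// index in range when fewer heaps than CPU pairs are configured.
int heap_index(int cpu, int heap_count) {
    assert(cpu >= 0 && heap_count >= 1);
    return (cpu / 2) % heap_count;
}
```

A caller might combine the two as `heap_index(cpu, parse_heap_count(std::getenv("COSMOPOLITAN_HEAP_COUNT"), default_count))`, where `cpu` would come from a call such as `sched_getcpu()` and `default_count` is whatever default the runtime picks.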
Justine Tunney
9906f299bb
Refactor and improve CTL and other code
2024-06-04 05:45:48 -07:00
Justine Tunney
1d8f37a2f0
Fix the MODE=tiny builds
2024-06-03 10:36:38 -07:00
Justine Tunney
e677460d14
Delete .vscode folder
It hasn't been maintained in years. I'm tired of the root level of our
project having an advertisement for Microsoft Visual Studio Code. Your
preferred editor should be Emacs or Vim.
2024-06-03 09:40:45 -07:00
Justine Tunney
4937843f70
Introduce Cosmopolitan Templates Library (CTL)
2024-06-03 09:21:59 -07:00
Justine Tunney
b003888696
Make __demangle() heap 10% more compact
2024-06-02 16:18:55 -07:00
Justine Tunney
2ca491dc56
Write more __demangle() tests
2024-06-02 07:37:15 -07:00
Justine Tunney
9aa353d88b
Document __demangle() and fix a const func ptr bug
2024-06-02 04:15:48 -07:00
Justine Tunney
c67faf61df
Delete some unintentional code
2024-06-01 20:36:58 -07:00
Justine Tunney
165c6b37e2
Add C++ demangling to privileged runtime
Cosmo will now print C++ symbols correctly in --ftrace logs and
backtraces. Doing this required reducing the memory requirement
of the __demangle() function by 3x. This was accomplished using
16-bit indices and 16-bit malloc granularity. That puts a limit
on the longest symbol we can successfully decode, which I think
would be around 6553 characters long, given a 65536-byte buffer
2024-06-01 20:10:58 -07:00
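To illustrate why 16-bit indices shrink the working set, the sketch below compares a node that links through native pointers with one that links through 16-bit offsets into a single 65536-byte buffer. `Node16`, `NodePtr` and `Arena16` are hypothetical and do not mirror the __demangle() internals; the 2-byte rounding is an assumption chosen to keep 16-bit fields aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical node whose links are 16-bit offsets into one buffer.
struct Node16 {
    std::uint16_t next;
    std::uint16_t text;
    std::uint16_t len;
};

// The same node expressed with native pointers, for size comparison.
struct NodePtr {
    NodePtr* next;
    const char* text;
    std::size_t len;
};

// Hypothetical bump allocator that hands out 16-bit offsets.
class Arena16 {
  public:
    static constexpr std::size_t kSize = 65536;
    static constexpr std::uint16_t kNull = 0;  // offset 0 is reserved

    // Returns the offset of n fresh bytes, or kNull when they do not fit.
    std::uint16_t alloc(std::size_t n) {
        n = (n + 1) & ~std::size_t{1};  // round up to a multiple of 2
        if (n == 0 || n > kSize - used_) return kNull;
        std::uint16_t off = static_cast<std::uint16_t>(used_);
        used_ += n;
        return off;
    }
    void* at(std::uint16_t off) {
        assert(off != kNull);
        return buf_ + off;
    }

  private:
    unsigned char buf_[kSize];
    std::size_t used_ = 2;  // skip the reserved null offset
};
```

Because every offset must fit in 16 bits, nothing larger than the buffer can ever be addressed, which is the kind of hard ceiling on input size that the commit message mentions.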
Jōshin
dcd626edf8
Add .git-blame-ignore-revs
If you follow the directions in that file then git blame will ignore the
listed commits. A commit should only go in that file if its only changes
were to formatting, particularly on a large part of the codebase (like a
change to .clang-format getting applied to the repo.)

Cribbed from here:

https://www.stefanjudis.com/today-i-learned/how-to-exclude-commits-from-git-blame/
2024-06-01 20:10:48 -07:00
Jōshin
f032b5570b
Run clang-format (#1197)
2024-06-01 16:30:43 -04:00
Justine Tunney
ea081b262c
Add some noexcept annotations
2024-06-01 03:19:53 -07:00
Justine Tunney
fae1c32267
Encode ±INFINITY as ±1e5000
The V8 behavior of encoding infinity as null doesn't make sense to me.
Using ±1e5000 is better, because JSON.parse decodes it as INFINITY and
the information is preserved. This could be a breaking change for some
2024-06-01 03:19:50 -07:00
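A rough sketch of the encoding rule, assuming a standalone helper rather than the project's actual serializer: infinities are written as an overflowing decimal literal, and a decoder based on strtod() turns that literal back into infinity on IEEE 754 platforms. `encode_json_number` and `decode_json_number` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical encoder; only the infinity handling follows the commit.
std::string encode_json_number(double x) {
    assert(!std::isnan(x));  // NaN handling is outside this sketch
    if (std::isinf(x)) return x < 0 ? "-1e5000" : "1e5000";
    char buf[32];
    std::snprintf(buf, sizeof buf, "%.17g", x);  // 17 digits round-trip
    return buf;
}

// Hypothetical decoder: strtod() returns +/-HUGE_VAL on overflow, which
// is infinity wherever double is an IEEE 754 binary64.
double decode_json_number(const std::string& s) {
    return std::strtod(s.c_str(), nullptr);
}
```

The point of the round trip is that the sign and the fact of being infinite both survive, whereas encoding infinity as null discards them.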
Justine Tunney
9b6718ac99
Improve backtraces
We're now able to rewind the instruction pointer in x86 backtraces. This
helps ensure addr2line cannot print information about unrelated adjacent
code. I've restored -fno-schedule-insns2 in most cases because it really
does cause unpredictable breakage for backtraces.
2024-05-30 15:23:11 -07:00
Justine Tunney
cd672e251f
Improve crash signal reporting on Windows
This change fixes a bug where exiting a crash signal handler on Windows
after adding the signal to uc_sigmask, but not correcting the CPU state
would cause the signal handler to loop infinitely, causing process hang

Another issue is that very tiny programs, that don't link posix signals
would not have their SIGILL / SIGSEGV / etc. status reported to Cosmo's
bash shell when terminating on crash. That's fixed by a tiny handler in
WinMain() that knows how to map WIN32 crash codes to the POSIX flavors.
2024-05-30 14:04:10 -07:00
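The mapping from WIN32 crash codes to POSIX signals might look roughly like the sketch below. The numeric values are the documented NTSTATUS codes, but the selection of codes, the function name and the SIGABRT default are assumptions for illustration, not the table used in WinMain().

```cpp
#include <cassert>
#include <csignal>
#include <cstdint>

// Documented NTSTATUS values, redefined here so the sketch is portable.
constexpr std::uint32_t kAccessViolation = 0xC0000005u;     // STATUS_ACCESS_VIOLATION
constexpr std::uint32_t kIllegalInstruction = 0xC000001Du;  // STATUS_ILLEGAL_INSTRUCTION
constexpr std::uint32_t kIntDivideByZero = 0xC0000094u;     // STATUS_INTEGER_DIVIDE_BY_ZERO
constexpr std::uint32_t kStackOverflow = 0xC00000FDu;       // STATUS_STACK_OVERFLOW

// Hypothetical translation from a crash code to a POSIX signal number.
int crash_code_to_signal(std::uint32_t code) {
    switch (code) {
        case kAccessViolation:
        case kStackOverflow:
            return SIGSEGV;
        case kIllegalInstruction:
            return SIGILL;
        case kIntDivideByZero:
            return SIGFPE;
        default:
            return SIGABRT;  // assumed catch-all for unlisted codes
    }
}
```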
Justine Tunney
500a47bc2f
Fix undefined behavior in unit test
Fixes #1194
2024-05-29 20:31:46 -07:00
Justine Tunney
e4d25d68e4
Drop support for Windows 8
Microsoft caused some very gentle breakages for Cosmopolitan. They
removed the version information from the PEB which caused uname to
report WINDOWS 0.0.0. We should have called GetVersionExW but that
doesn't really exist anymore either. Windows policy is now to give
whatever version we used in ape/ape.S. Windows 8 has been EOL since
2023-01-10 so let's avoid our modern executables being relegated to
legacy infrastructure. Requiring Windows 10+ going forward lets us
remove runtime compatibility bloat from the codebase. Further note
Cosmopolitan maintains a Windows Vista branch on GitHub, so anyone
preferring the older versions, can still have a future with Cosmo.

Another neat thing this fixes is UTF-8 support in the console. The
changes Microsoft made broke the if statement that enabled UTF8 in
terminals. This explains why bug reports had broken arrows. In the
future this should be less of an issue, since the PEB code is gone
which means we more strictly conform to only Microsoft's WIN32 API
2024-05-29 19:37:47 -07:00
Justine Tunney
f31a98d50a
Fix bug with realpath() on Windows
2024-05-29 18:47:01 -07:00
Justine Tunney
2816df59b2
Increase tinymalloc granularity
2024-05-29 18:26:01 -07:00
Justine Tunney
a05ce3ad9d
Support avx512f + vpclmulqdq crc32() acceleration
Cosmo's _Cz_crc32() function now goes 73 GiB/s on Threadripper. This
will significantly improve the performance of the PKZIP file format.
This algorithm is also used by apelink, to create deterministic ids.
2024-05-29 10:13:37 -07:00
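For reference, the checksum itself can be written as a plain bit-at-a-time loop. This is deliberately the slow textbook form of CRC-32 (reflected polynomial 0xEDB88320, the variant PKZIP uses), not the AVX-512 and VPCLMULQDQ code path from the commit; `crc32_reference` is a hypothetical name, and a routine like it is mainly useful as an oracle when testing an accelerated version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32. Pass 0 as `crc` to start, or a previous result to
// continue a running checksum over more data.
std::uint32_t crc32_reference(std::uint32_t crc, const void* data,
                              std::size_t n) {
    assert(data != nullptr || n == 0);
    const unsigned char* p = static_cast<const unsigned char*>(data);
    crc = ~crc;
    for (std::size_t i = 0; i < n; ++i) {
        crc ^= p[i];
        for (int k = 0; k < 8; ++k) {
            // Shift right; fold in the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

The standard check value for the ASCII string "123456789" is 0xCBF43926, and feeding the same bytes in two chunks yields the same result.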
Justine Tunney
7c8df05042
Improve -march=native micro-architecture detection
2024-05-29 10:12:49 -07:00
Justine Tunney
4c77acdfcf
Add LoadZipArgs() to <cosmo.h>
2024-05-29 10:12:20 -07:00
Justine Tunney
b74b974cfd
Introduce #include <tinygetopt.h>
The normal getopt() function is bloated because it links printf(). This
change exports the original authentic bsd getopt function, that cosmo's
always used internally so cosmocc users don't need to include internals
2024-05-29 10:11:17 -07:00
Justine Tunney
07cef612c3
Make dlmalloc 2.4x faster for multithreading
This change adds a TLS freelist for small dynamic memory allocations.
Cosmopolitan's TIB is now 512 bytes in size. Single-threaded malloc()
performance isn't impacted by this, until pthread_create() is called.
Single-threaded programs may also want to consider using:

    #include "libc/mem/tinymalloc.inc"

Which will shave 30k off the executable size and sometimes go faster.
2024-05-28 11:18:34 -07:00
Justine Tunney
deaef81463
Favor siginfo_t over struct siginfo
2024-05-28 02:34:17 -07:00
Justine Tunney
c638eabfe0
Fix compiler warning
2024-05-27 02:23:24 -07:00
Justine Tunney
8e68384e15
Upgrade to 2022-era LLVM LIBCXX
2024-05-27 02:12:27 -07:00
Justine Tunney
2f4ca71f26
Release Cosmopolitan v3.3.10
2024-05-26 22:13:45 -07:00
Justine Tunney
07004ebf04
Upgrade to superconfigure z0.0.42
2024-05-26 22:12:25 -07:00
Justine Tunney
086d7006da
Improve crash handler on XNU
This avoids an issue where a crash signal could cause the MacOS process
to freeze and consume all CPU rather than dying as it rightfully should
2024-05-26 18:42:09 -07:00
Gavin Hayes
0a51241f7a
ntspawn: fix initializing NtStartupInfoEx (#1190)
2024-05-26 20:54:09 -04:00
Justine Tunney
c68f6599e5
Fix definition of getpeername on FreeBSD
We were using the COMPAT magic number, which was recently removed.
2024-05-26 17:03:22 -07:00
Justine Tunney
af3f62a71a
Ensure io requests are always capped at 0x7ffff000
This gives us the Linux behavior across platforms.

Fixes #1189
2024-05-26 16:53:13 -07:00
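A small sketch of what capping means for callers, assuming hypothetical helper names: each request is clamped to 0x7ffff000 bytes, so code that moves more than that has to loop over short transfers. `clamp_io_size` and `transfer_all` are illustrative only, and `io` stands in for any read- or write-like callable that returns the number of bytes it handled.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxIo = 0x7ffff000;  // limit named in the commit

// Hypothetical helper: the size actually passed down for one request.
constexpr std::size_t clamp_io_size(std::size_t n) {
    return n < kMaxIo ? n : kMaxIo;
}

// Hypothetical loop: keeps issuing capped requests until `total` bytes
// are handled or the callable reports no progress.
template <class Io>
std::size_t transfer_all(Io io, std::size_t total) {
    std::size_t done = 0;
    while (done < total) {
        std::size_t want = clamp_io_size(total - done);
        std::size_t got = io(want);
        assert(got <= want);
        if (got == 0) break;  // no progress: stop instead of spinning
        done += got;
    }
    return done;
}
```

A transfer of 0x90000000 bytes, for instance, is split into one request of 0x7ffff000 bytes followed by one for the remaining 0x10001000.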