Commit graph

61 commits

Author SHA1 Message Date
Justine Tunney
53b9f83e1c Make redbean SSL more tunable
This change enables SSL compression. It significantly reduces the
network load of the testing infrastructure, for free, since this
revision didn't need to change any runit protocol code. However we
turn it off by default in redbean since no browsers support it.

It turns out that some TLSv1.0 clients (e.g. curl command on RHEL5) will
send an SSLv2-style ClientHello. These types of clients are usually ten+
years old and were designed to interop with servers ten years older than
them. Your redbean is now able to interop with these clients even though
redbean doesn't actually support SSLv2 or SSLv3. Please note that the -B
flag may be passed to disable this along with TLSv1.0, TLSv1.1, 3DES, &c

The following Lua APIs have been added to redbean:

  - ProgramSslCompression(bool)
  - ProgramSslCiphersuite(name:str)
  - ProgramSslPresharedKey(key:str,identity:str)

Lastly the DHE ciphersuites have been enabled. IANA recommends DHE and
with old clients like RHEL5 it's the only perfect forward secrecy they
implement.
2021-08-09 07:38:57 -07:00
Justine Tunney
ea83cc0ad0 Make stronger crypto nearly as fast
One of the disadvantages of x25519 and ℘256 is it only provides 126 bits
of security, so that seems like a weak link in the chain, if we're using
ECDHE-ECDSA-AES256-GCM-SHA384. The U.S. government wants classified data
to be encrypted using a curve at least as strong as ℘384, which provides
192 bits of security, but if you read the consensus of stack exchange it
would give you the impression that ℘384 is three times slower.

This change (as well as the previous one) makes ℘384 three times as fast
by tuning its modulus and multiplication subroutines with new tests that
should convincingly show: the optimized code behaves the same way as the
old code. Some of the diff noise from the previous change is now removed
too, so that our vendored fork can be more easily compared with upstream
sources. So you can now have stronger cryptography without compromises.

℘384 modulus Justine                        l:         28𝑐          9𝑛𝑠
℘384 modulus MbedTLS NIST                   l:        127𝑐         41𝑛𝑠
℘384 modulus MbedTLS MPI                    l:      1,850𝑐        597𝑛𝑠

The benchmarks above show the improvements made by secp384r1() which is
an important function since it needs to be called 13,000 times whenever
someone establishes a connection to your web server. The same's true of
Mul6x6Adx() which is able to multiply 384-bit numbers in 73 cycles, but
only if your CPU was purchased after 2014 when Broadwell was introduced
2021-07-26 16:19:45 -07:00
Justine Tunney
398f0c16fb Add SNI support to redbean and improve SSL perf
This change makes SSL virtual hosting possible. You can now load
multiple certificates for multiple domains and redbean will just
figure out which one to use, even if you only have 1 ip address.
You can also use a jumbo certificate that lists all your domains
in the the subject alternative names.

This change also makes performance improvements to MbedTLS. Here
are some benchmarks vs. cc1920749e

                                   BEFORE    AFTER   (microsecs)
suite_ssl.com                     2512881   191738 13.11x faster
suite_pkparse.com                   36291     3295 11.01x faster
suite_x509parse.com                854669   120293  7.10x faster
suite_pkwrite.com                    6549     1265  5.18x faster
suite_ecdsa.com                     53347    18778  2.84x faster
suite_pk.com                        49051    18717  2.62x faster
suite_ecdh.com                      19535     9502  2.06x faster
suite_shax.com                      15848     7965  1.99x faster
suite_rsa.com                      353257   184828  1.91x faster
suite_x509write.com                162646    85733  1.90x faster
suite_ecp.com                       20503    11050  1.86x faster
suite_hmac_drbg.no_reseed.com       19528    11417  1.71x faster
suite_hmac_drbg.nopr.com            12460     8010  1.56x faster
suite_mpi.com                      687124   442661  1.55x faster
suite_hmac_drbg.pr.com              11890     7752  1.53x faster

There aren't any special tricks to the performance imporvements.
It's mostly due to code cleanup, assembly and intel instructions
like mulx, adox, and adcx.
2021-07-23 13:56:13 -07:00
Justine Tunney
f3e28aa192 Make SSL handshakes much faster
This change boosts SSL handshake performance from 2,627 to ~10,000 per
second which is the same level of performance as NGINX at establishing
secure connections. That's impressive if we consider that redbean is a
forking frontend application server. This was accomplished by:

  1. Enabling either SSL session caching or SSL tickets. We choose to
     use tickets since they reduce network round trips too and that's
     a more important metric than wrk'ing localhost.

  2. Fixing mbedtls_mpi_sub_abs() which is the most frequently called
     function. It's called about 12,000 times during an SSL handshake
     since it's the basis of most arithmetic operations like addition
     and for some strange reason it was designed to make two needless
     copies in addition to calling malloc and free. That's now fixed.

  3. Improving TLS output buffering during the SSL handshake only, so
     that only a single is write and read system call is needed until
     blocking on the ping pong.

redbean will now do a better job wiping sensitive memory from a child
process as soon as it's not needed. The nice thing about fork is it's
much faster than reverse proxying so the goal is to use the different
address spaces along with setuid() to minimize the risk that a server
key will be compromised in the event that application code is hacked.
2021-07-11 23:17:47 -07:00
Justine Tunney
f8b9bd2b47 Attempt to make LLD happy
Things are a little better. The LLD that comes with Linux seems to work.
Old versions like LLVM 8 haven't been supported since Cosmopolitan v0.2.
Running Clang on Windows with --target=x86_64-pc-linux-gnu doesn't seem
to work. It has something to do with the recently added .zip section in
the linker script. But even if that's removed, LLD on Windows thinks it
is building an EFI application for some reason. Linker scripts are such
a brittle house of cards, even for just ld.bfd alone..

We should just find a way to run our one true musl-cross-make linux gcc
toolchain under Blinkenlights on non-Linux because GCC and Clang are so
nondeterministic, inconsistent, and unreproducible when built for other
operating systems. We need an actually portable compiler/linker that'll
always behave the same way no matter what.

See #180
2021-07-05 19:10:06 -07:00
Justine Tunney
74200a0ea0 Make redbean ssl handshake go a little faster 2021-07-03 05:51:04 -07:00
Justine Tunney
2d79ab6c15 Make sha1 / sha256 / sha512 go faster 2021-06-26 00:11:12 -07:00
Justine Tunney
cc1920749e Add SSL to redbean
Your redbean can now interoperate with clients that require TLS crypto.
This is accomplished using a protocol polyglot that lets us distinguish
between HTTP and HTTPS regardless of the port number. Certificates will
be generated automatically, if none are supplied by the user. Footprint
increases by only a few hundred kb so redbean in MODY=tiny is now 1.0mb

- Add lseek() polyfills for ZIP executable
- Automatically polyfill /tmp/FOO paths on NT
- Fix readdir() / ftw() / nftw() bugs on Windows
- Introduce -B flag for slower SSL that's stronger
- Remove mbedtls features Cosmopolitan doesn't need
- Have base64 decoder support the uri-safe alternative
- Remove Truncated HMAC because it's forbidden by the IETF
- Add all the mbedtls test suites and make them go 3x faster
- Support opendir() / readdir() / closedir() on ZIP executable
- Use Everest for ECDHE-ECDSA because it's so good it's so good
- Add tinier implementation of sha1 since it's not worth the rom
- Add chi-square monte-carlo mean correlation tests for getrandom()
- Source entropy on Windows from the proper interface everyone uses

We're continuing to outperform NGINX and other servers on raw message
throughput. Using SSL means that instead of 1,000,000 qps you can get
around 300,000 qps. However redbean isn't as fast as NGINX yet at SSL
handshakes, since redbean can do 2,627 per second and NGINX does 4.3k

Right now, the SSL UX story works best if you give your redbean a key
signing key since that can be easily generated by openssl using a one
liner then redbean will do all the things that are impossibly hard to
do like signing ecdsa and rsa certificates that'll work in chrome. We
should integrate the let's encrypt acme protocol in the future.

Live Demo: https://redbean.justine.lol/
Root Cert: https://redbean.justine.lol/redbean1.crt
2021-06-24 13:20:50 -07:00
Justine Tunney
87d7010495 Improve performance of bitscanning intrinsics
This change helps spectre more intelligently plan execution, by working
around false output dependencies, impacting ops like popcnt bsr and bsf
2021-06-15 06:29:51 -07:00
Justine Tunney
dc6d11a031 Improve performance of printf functions 2021-04-24 13:58:50 -07:00
Justine Tunney
b107d2709f Add /statusz page to redbean plus other enhancements
redbean improvements:

- Explicitly disable corking
- Simulate Python regex API for Lua
- Send warmup requests in main process on startup
- Add Class-A granular IPv4 network classification
- Add /statusz page so you can monitor your redbean's health
- Fix regressions on OpenBSD/NetBSD caused by recent changes
- Plug Authorization header into Lua GetUser and GetPass APIs
- Recognize X-Forwarded-{For,Host} from local reverse proxies
- Add many additional functions to redbean Lua server page API
- Report resource usage of child processes on `/` listing page
- Introduce `-a` flag for logging child process resource usage
- Introduce `-t MILLIS` flag and `ProgramTimeout(ms)` init API
- Introduce `-H "Header: value"` flag and `ProgramHeader(k,v)` API

Cosmopolitan Libc improvements:

- Make strerror() simpler
- Make inet_pton() not depend on sscanf()
- Fix OpenExecutable() which broke .data section earlier
- Fix stdio in cases where it overflows kernel tty buffer
- Fix bugs in crash reporting w/o .com.dbg binary present
- Add polyfills for SO_LINGER, SO_RCVTIMEO, and SO_SNDTIMEO
- Polyfill TCP_CORK on BSD and XNU using TCP_NOPUSH magnums

New netcat clone in examples/nc.c:

While testing some of the failure conditions for redbean, I noticed that
BusyBox's `nc` command is pretty busted, if you use it as an interactive
tool, rather than having it be part of a pipeline. Unfortunately this'll
only work on UNIX since Windows doesn't let us poll on stdio and sockets
at the same time because I don't think they want tools like this running
on their platform. So if you want forbidden fruit, it's here so enjoy it
2021-04-23 18:53:57 -07:00
Justine Tunney
807706a099 Perform minor fixups
One of those fixups is making sure that AF_LOCAL is equal to AF_UNIX on
the New Technology. See #122
2021-03-13 19:40:04 -08:00
Justine Tunney
33e8fc8687 Expose public garbage collector API for C language
You can now do epic things like this:

    puts(_gc(xasprintf("%d", 123)));

The _gc() API is shorthand for _defer() which works like Go's keyword:

    const char *s = xasprintf("%d", 123);
    _defer(free, s);
    puts(s);

Be sure to always use -fno-omit-frame-pointer which makes code fast too.

Enjoy! See also #114
2021-03-08 10:59:34 -08:00
Justine Tunney
11ec99931b Add Musl multibyte functions
These are standard functions that are needed to help support the Skull
language. Note that normally this codebase uses libc/str/thompike.h

See #105
2021-03-06 09:53:16 -08:00
Justine Tunney
e26bdbec52 Make examples folder somewhat more focused 2021-03-05 06:09:12 -08:00
Justine Tunney
5141d00992 Perform some minor code cleanup 2021-03-04 13:22:32 -08:00
Justine Tunney
3e19b96ab8 Add more POSIX function stubs
Cosmopolitan currently doesn't support threads and it doesn't do
anything fancy in longjmp/setjmp so this change was simple to do

- localeconv
- _setjmp (same as setjmp)
- _longjmp (same as longjmp)
- strcoll (same as strcmp)
- flockfile (does nothing)
- funlockfile (does nothing)
- ftrylockfile (does nothing)

See #61
2021-03-02 03:27:55 -08:00
Justine Tunney
f4298f10c2 Generate ZIP files the same way as Windows 2021-03-01 06:24:11 -08:00
Justine Tunney
d932948fb4 Remove more nonstandard stuff from cosmopolitan.h
Fixes #61
2021-03-01 00:18:23 -08:00
Justine Tunney
94afa982c3 Fix zip executables on MacOS
Here's why we got those `Killed: 11` failures on MacOS after modifying
the contentns of the redbean.com executable. If you were inserting a
small file, such as a HelloWorld.html file, then InfoZIP might have
decreased the size of the executable to less than what the Mach-O
section had been expecting.

That's because when zipobj.com put things like time zone data in the
executable, it aligned each zip file entry on a 64-byte boundary, simply
for the sake of readability in binary dumps. But when InfoZIP edited the
file it would rewrite every entry using ZIP's usual 2-byte alignment.
Thus causing shrinkage.

The solution was to reconfigure the linker script so that zip file bits
that get put into the executable at link-time, such as timezone data,
aren't officially part of the executable image, i.e. we don't want the
operating system to load that part.

The original decision to put the linked zip files into the .data section
was mostly made so that when the executable was run in its .com.dbg form
it would still have the zip entries be accessible, even though there was
tons of GNU debug data following the central directory. We're not going
to be able to do that. The .com executable should be the canonical
executable. We have really good tools for automatically attaching and
configuring GDB correctly with debug symbols even when the .com is run.
We'll have to rely on those in cases where zip embedding is used.

See #53
See #54
See #68
2021-02-27 18:16:59 -08:00
Justine Tunney
40291c9db3 Improve signal handling and math
- Polyfill ucontext_t on FreeBSD/OpenBSD/NetBSD
- Add tests confirming signals can edit CPU state
- Work towards supporting ZIP filesystem on bare metal
- Add more tinymath unit tests for POSIX conformance
- Add X87 and SSE status flags to crash report
- Fix some bugs in blinkenlights
- Fix llvm build breakage
2021-02-25 18:33:33 -08:00
Justine Tunney
cdc54ea1fd Use unsigned leb128 for magnums 2021-02-24 04:00:38 -08:00
Justine Tunney
edd9297eba Support malloc() on bare metal
Your Actually Portable Executables now contains a simple virtual memory
that works similarly to the Linux Kernel in the sense that it maps your
physical memory to negative addresses. This is needed to support mmap()
and malloc(). This functionality has zero code size impact. For example
the MODE=tiny LIFE.COM executable is still only 12KB in size.

The APE bootloader code has also been simplified to improve readibility
and further elevate the elegance by which we're able to support so many
platforms thereby enhancing verifiability so that we may engender trust
in this bootloading process.
2021-02-24 00:53:24 -08:00
Justine Tunney
537c21338b Add UEFI support
This is mutually exclusive with Windows support. Documentation for how
to use it has been written in libc/runtime/efimain.c
2021-02-21 21:33:04 -08:00
Justine Tunney
e75ffde09e Get codebase completely working with LLVM
You can now build Cosmopolitan with Clang:

    make -j8 MODE=llvm
    o/llvm/examples/hello.com

The assembler and linker code is now friendly to LLVM too.
So it's not needed to configure Clang to use binutils under
the hood. If you love LLVM then you can now use pure LLVM.
2021-02-09 02:57:32 -08:00
Justine Tunney
0e36cb3ac4 Improve dead code elimination 2021-02-08 04:04:42 -08:00
Justine Tunney
d7733579d3 Fix Clang support
The amalgamated release is now confirmed to be working with Clang,
including its integrated assembler.

Fixes #41
2021-02-06 00:29:09 -08:00
Justine Tunney
46085797b6 Make ANSI mode closer to being ANSI 2021-02-03 17:14:17 -08:00
Justine Tunney
c843243322 Implement more security stuff
- Support deterministic stacks on OpenBSD
- Support OpenBSD system call origin verification
- Fix overrun by one in chibicc string token allocator
- Get all chibicc tests passing under Address Sanitizer
2021-02-02 20:21:06 -08:00
Justine Tunney
cbfd4ccd1e Make more functions friendly to Address Sanitizer 2021-02-02 03:45:31 -08:00
Justine Tunney
1ff9ab95ac Make C memory safe like Rust
This change enables Address Sanitizer systemically w/ `make MODE=dbg`.
Our version of Rust's `unsafe` keyword is named `noasan` which is used
for two functions that do aligned memory chunking, like `strcpy.c` and
we need to fix the tiny DEFLATE code, but that's it everything else is
fabulous you can have all the fischer price security blankets you need

Best of all is we're now able to use the ASAN data in Blinkenlights to
colorize the memory dumps. See the screenshot below of a test program:

  https://justine.lol/blinkenlights/asan.png

Which is operating on float arrays stored on the stack, with red areas
indicating poisoned memory, and the green areas indicate valid memory.
2021-02-01 03:58:46 -08:00
Justine Tunney
45b72485ad Fix XNU / FreeBSD / OpenBSD / RHEL5 / NT bugs
For the first time ever, all tests in this codebase now pass, when
run automatically on macos, freebsd, openbsd, rhel5, rhel7, alpine
and windows via the network using the runit and runitd build tools

- Fix vfork exec path etc.
- Add XNU opendir() support
- Add OpenBSD opendir() support
- Add Linux history to syscalls.sh
- Use copy_file_range on FreeBSD 13+
- Fix system calls with 7+ arguments
- Fix Windows with greater than 16 FDs
- Fix RUNIT.COM and RUNITD.COM flakiness
- Fix OpenBSD munmap() when files are mapped
- Fix long double so it's actually long on Windows
- Fix OpenBSD truncate() and ftruncate() thunk typo
- Let Windows fcntl() be used on socket files descriptors
- Fix Windows fstat() which had an accidental printf statement
- Fix RHEL5 CLOCK_MONOTONIC by not aliasing to CLOCK_MONOTONIC_RAW

This is wonderful. I never could have dreamed it would be possible
to get it working so well on so many platforms with tiny binaries.

Fixes #31
Fixes #25
Fixes #14
2021-01-25 18:31:17 -08:00
Justine Tunney
92b794002b Fix pmovmskb mnemonic in memeqmask
This change unbreaks test/libc/nexgen32e/memeqmask_test.c which appears
to currently be the only reference to this api. It was originally meant
to be used for speeding up terminal video rendering. We might delete it

Fixes #26
2021-01-16 21:52:14 -08:00
Justine Tunney
8fa47acecc Fix memcpy on sandybridge and ivybridge
This bug impacts folks who purchased Intel chips made in 2011-2012.
We're now using `vxorps` instead of `vpxor` which is great since it
means we do not need to change `X86_HAVE(AVX)` to `X86_HAVE(AVX2)`,
because AVX2 is only available on Haswell and later.

Fixes #16
2021-01-16 21:36:55 -08:00
Justine Tunney
9f68d6eee9 Fix link order in cosmopolitan.a
It turned out that the linker was doing the wrong with the amalgamation
library concerning weak stubs. A regression test has been added and new
binaries have been uploaded to https://justine.lol/cosmopolitan/

Ideally this should be fixed by building a tool that turns multiple .a
files into a single .a file with deduplication. As a workaround for now
the cosmopolitan.a build is restructured to not include LIBC_STUBS which
meant technical debt needed to be paid off where non-stub interfaces
were moved to LIBC_INTRIN and LIBC_NEXGEN32E.

Thank @PerfectProductions in #31 for the report!
2021-01-16 12:05:41 -08:00
Justine Tunney
04f1d89f84 Replace .pushsection directives (#30) 2021-01-10 13:36:31 -08:00
Justine Tunney
37a4c70c36 Change license 2020-12-27 17:18:44 -08:00
Justine Tunney
548dcb9f08 Further refine documentation 2020-12-27 17:05:03 -08:00
Justine Tunney
1bc3a25505 Improve documentation
The Cosmo API documentation page is pretty good now
https://justine.lol/cosmopolitan/documentation.html
2020-12-27 07:02:35 -08:00
Justine Tunney
13437dd19b Auto-generate some documentation 2020-12-26 02:09:07 -08:00
Justine Tunney
95b142e4e5 Make minor improvements 2020-12-23 23:42:56 -08:00
Justine Tunney
9df2cef4c4 Enhance chibicc 2020-12-09 04:00:48 -08:00
Justine Tunney
8da931a7f6 Add chibicc
This program popped up on Hacker News recently. It's the only modern
compiler I've ever seen that doesn't have dependencies and is easily
modified. So I added all of the missing GNU extensions I like to use
which means it might be possible soon to build on non-Linux and have
third party not vendor gcc binaries.
2020-12-06 16:20:21 -08:00
Justine Tunney
e44a0cf6f8 Make improvements 2020-12-01 03:43:40 -08:00
Justine Tunney
3e4fd4b0ad Add epoll and do more release readiness changes
This change also pays off some of the remaining technical debt with
stdio, file descriptors, and memory managemnt polyfills.
2020-11-28 12:01:51 -08:00
Justine Tunney
ea0b5d9d1c Get Cosmopolitan into releasable state
A new rollup tool now exists for flattening out the headers in a way
that works better for our purposes than cpp. A lot of the API clutter
has been removed. APIs that aren't a sure thing in terms of general
recommendation are now marked internal.

There's now a smoke test for the amalgamation archive and gigantic
header file. So we can now guarantee you can use this project on the
easiest difficulty setting without the gigantic repository.

A website is being created, which is currently a work in progress:
https://justine.storage.googleapis.com/cosmopolitan/index.html
2020-11-25 08:19:00 -08:00
Justine Tunney
dba7552c1e Add Conway's Game of Life 2020-11-18 08:26:03 -08:00
Justine Tunney
aea89fe832 Make minor improvements
- Work towards simplifying ape.S startup process
- Rewrote ar because it took minutes to build cosmopolitan.a
2020-11-09 15:41:11 -08:00
Justine Tunney
2d80bbc802 Get binaries closer to running without an o/s
blinkenlights now does a pretty good job emulating what happens when
binaries boot from BIOS into long mode. So it's been much easier to
debug the bare metal process and wrinkle out many issues.
2020-11-02 19:12:47 -08:00
Justine Tunney
feed0d2b0e Add minor improvements and cleanup 2020-10-27 03:39:46 -07:00