Commit graph

51 commits

Author SHA1 Message Date
Justine Tunney
17aea99bb3 Fold LIBC_ALG into LIBC_MEM 2022-08-13 08:32:34 -07:00
Justine Tunney
7cf66bc161 Prevent Make from talking to public Internet
This change introduces the nointernet() function which may be called to
prevent a process and its descendants from communicating with publicly
routable Internet addresses. GNU Make has been modified to always call
this function. In the future Landlock Make will have a way to whitelist
subnets to override this behavior, or disable it entirely. Support is
available for Linux only. Our firewall does not require root access.

Calling nointernet() will return control to the caller inside a new
process that has a SECCOMP BPF filter installed, which traps network
related system calls. Your original process then becomes a permanent
ptrace() supervisor that monitors all processes and threads descending
from the returned child. Whenever a networking system call happens the
kernel will stop the process and wakes up the monitor, which then peeks
into the child memory to read the sockaddr_in to determine if it's ok.

The downside to doing this is that there can be only one supervisor at a
time using ptrace() on a process. So this firewall won't be enabled if
you run make under strace or inside gdb. It also makes testing tricky.
2022-08-12 21:51:39 -07:00
Justine Tunney
8a0a2c0c36 Fold LIBC_RAND into LIBC_STDIO/TINYMATH/INTRIN 2022-08-11 12:32:00 -07:00
Justine Tunney
05b8f82371 Fold LIBC_BITS into LIBC_INTRIN 2022-08-11 12:13:18 -07:00
Justine Tunney
10fd8bdb70 Unbloat the build
This change resurrects ae5d06dc53
2022-08-11 00:15:29 -07:00
Justine Tunney
c1d99676c4 Revert "Unbloat build config"
This reverts commit ae5d06dc53.
2022-08-10 12:44:56 -07:00
Justine Tunney
ae5d06dc53 Unbloat build config
- 10.5% reduction of o//depend dependency graph
- 8.8% reduction in latency of make command
- Fix issue with temporary file cleanup

There's a new -w option in compile.com that turns off the recent
Landlock output path workaround for "good commands" which do not
unlink() the output file like GNU tooling does.

Our new GNU Make unveil sandboxing appears to have zero overhead
in the grand scheme of things. Full builds are pretty fast since
the only thing that's actually slowed us down is probably libcxx

    make -j16 MODE=rel
    RL: took 85,732,063µs wall time
    RL: ballooned to 323,612kb in size
    RL: needed 828,560,521µs cpu (11% kernel)
    RL: caused 39,080,670 page faults (99% memcpy)
    RL: 350,073 context switches (72% consensual)
    RL: performed 0 reads and 11,494,960 write i/o operations

pledge() and unveil() no longer consider ENOSYS to be an error.
These functions have also been added to Python's cosmo module.

This change also removes some WIN32 APIs and System Five magnums
which we're not using and it's doubtful anyone else would be too
2022-08-10 04:43:09 -07:00
Justine Tunney
cf93ecbbb2 Prove that Makefile is fully defined
The whole repository is now buildable with GNU Make Landlock sandboxing.
This proves that no Makefile targets exist which touch files other than
their declared prerequisites. In order to do this, we had to:

  1. Stop code morphing GCC output in package.com and instead run a
     newly introduced FIXUPOBJ.COM command after GCC invocations.

  2. Disable all the crumby Python unit tests that do things like create
     files in the current directory, or rename() files between folders.
     This ended up being a lot of tests, but most of them are still ok.

  3. Introduce an .UNSANDBOXED variable to GNU Make to disable Landlock.
     We currently only do this for things like `make tags`.

  4. This change deletes some GNU Make code that was preventing the
     execve() optimization from working. This means it should no longer
     be necessary in most cases for command invocations to be indirected
     through the cocmd interpreter.

  5. Missing dependencies had to be declared in certain places, in cases
     where they couldn't be automatically determined by MKDEPS.COM

  6. The libcxx header situation has finally been tamed. One of the
     things that makes this difficult is MKDEPS.COM only wants to
     consider the first 64kb of a file, in order to go fast. But libcxx
     likes to have #include lines buried after huge documentation.

  7. An .UNVEIL variable has been introduced to GNU Make just in case
     we ever wish to explicitly specify additional things that need to
     be whitelisted which aren't strictly prerequisites. This works in
     a manner similar to the recently introduced .EXTRA_PREREQS feature.

There's now a new build/bootstrap/make.com prebuilt binary available. It
should no longer be possible to write invalid Makefile code.
2022-08-06 04:05:08 -07:00
Justine Tunney
853b6c3864 Improve system calls
- Wrap clock_getres()
- Wrap sched_setscheduler()
- Make sleep() api conformant
- Polyfill sleep() using select()
- Improve clock_gettime() polyfill
- Make nanosleep() POSIX conformant
- Slightly improve some DNS functions
- Further strengthen pledge() sandboxing
- Improve rounding of timeval / timespec
- Allow layering of pledge() calls on Linux
- Polyfill sched_yield() using select() on XNU
- Delete more system constants we probably don't need
2022-07-08 06:42:03 -07:00
Justine Tunney
17cbe73411 Add finger demo to redbean and fix regression
This change fixes a regression in unix.connect() caused by the recent
addition of UNIX domain sockets. The BSD finger command has been added
to third_party for fun and profit. A new demo has been added to redbean
showing how a protocol as simple as finger can be implemented.
2022-06-23 03:42:05 -07:00
Justine Tunney
91953dd308 Add some more necessary locks 2022-06-12 22:20:59 -07:00
Justine Tunney
99e67c348b Reduce Makefile dependencies by 4% 2022-05-23 15:07:01 -07:00
Justine Tunney
47b3274665 Make improvements
- Add rusage to redbean Lua API
- Add more redbean documentation
- Add pledge() to redbean Lua API
- Polyfill OpenBSD pledge() for Linux
- Increase PATH_MAX limit to 1024 characters
- Untrack sibling processes after fork() on Windows
2022-04-28 09:57:07 -07:00
Justine Tunney
2046c0d2ae Make improvements
- Expand redbean UNIX module
- Expand redbean documentation
- Ensure Lua copyright is embedded in binary
- Increase the PATH_MAX limit especially on NT
- Use column major sorting for linenoise completions
- Fix some suboptimalities in redbean's new UNIX API
- Figured out right flags for Multics newline in raw mode
2022-04-24 10:06:05 -07:00
Justine Tunney
2f56ebfe78 Do code cleanup use duff device linenoise i/o 2022-04-22 18:56:52 -07:00
Gautham
6f658f058b
Change noinline to dontinline (#312)
We defined `noinline` as an abbreviation for the longer version
`__attribute__((__noinline__))` which caused name clashes since
third party codebases often write it as `__attribute__((noinline))`.
2021-11-12 15:12:18 -08:00
Justine Tunney
226aaf3547 Improve memory safety
This commit makes numerous refinements to cosmopolitan memory handling.

The default stack size has been reduced from 2mb to 128kb. A new macro
is now provided so you can easily reconfigure the stack size to be any
value you want. Work around the breaking change by adding to your main:

    STATIC_STACK_SIZE(0x00200000);  // 2mb stack

If you're not sure how much stack you need, then you can use:

    STATIC_YOINK("stack_usage_logging");

After which you can `sort -nr o/$MODE/stack.log`. Based on the unit test
suite, nothing in the Cosmopolitan repository (except for Python) needs
a stack size greater than 30kb. There are also new macros for detecting
the size and address of the stack at runtime, e.g. GetStackAddr(). We
also now support sigaltstack() so if you want to see nice looking crash
reports whenever a stack overflow happens, you can put this in main():

    ShowCrashReports();

Under `make MODE=dbg` and `make MODE=asan` the unit testing framework
will now automatically print backtraces of memory allocations when
things like memory leaks happen. Bugs are now fixed in ASAN global
variable overrun detection. The memtrack and asan runtimes also handle
edge cases now. The new tools helped to identify a few memory leaks,
which are fixed by this change.

This change should fix an issue reported in #288 with ARG_MAX limits.
Fixing this doubled the performance of MKDEPS.COM and AR.COM yet again.
2021-10-13 17:27:13 -07:00
Justine Tunney
39bf41f4eb Make numerous improvements
- Python static hello world now 1.8mb
- Python static fully loaded now 10mb
- Python HTTPS client now uses MbedTLS
- Python REPL now completes import stmts
- Increase stack size for Python for now
- Begin synthesizing posixpath and ntpath
- Restore Python \N{UNICODE NAME} support
- Restore Python NFKD symbol normalization
- Add optimized code path for Intel SHA-NI
- Get more Python unit tests passing faster
- Get Python help() pagination working on NT
- Python hashlib now supports MbedTLS PBKDF2
- Make memcpy/memmove/memcmp/bcmp/etc. faster
- Add Mersenne Twister and Vigna to LIBC_RAND
- Provide privileged __printf() for error code
- Fix zipos opendir() so that it reports ENOTDIR
- Add basic chmod() implementation for Windows NT
- Add Cosmo's best functions to Python cosmo module
- Pin function trace indent depth to that of caller
- Show memory diagram on invalid access in MODE=dbg
- Differentiate stack overflow on crash in MODE=dbg
- Add stb_truetype and tools for analyzing font files
- Upgrade to UNICODE 13 and reduce its binary footprint
- COMPILE.COM now logs resource usage of build commands
- Start implementing basic poll() support on bare metal
- Set getauxval(AT_EXECFN) to GetModuleFileName() on NT
- Add descriptions to strerror() in non-TINY build modes
- Add COUNTBRANCH() macro to help with micro-optimizations
- Make error / backtrace / asan / memory code more unbreakable
- Add fast perfect C implementation of μ-Law and a-Law audio codecs
- Make strtol() functions consistent with other libc implementations
- Improve Linenoise implementation (see also github.com/jart/bestline)
- COMPILE.COM now suppresses stdout/stderr of successful build commands
2021-09-28 01:52:34 -07:00
Justine Tunney
34b68f1945 Make mappings unlimited on NT
This change might also fix fork() in certain cases on NT.
2021-09-04 13:20:47 -07:00
Justine Tunney
00611e9b06 Improve ZIP filesystem and change its prefix
The ZIP filesystem has a breaking change. You now need to use /zip/ to
open() / opendir() / etc. assets within the ZIP structure of your APE
binary, instead of the previous convention of using zip: or zip! URIs.
This is needed because Python likes to use absolute paths, and having
ZIP paths encoded like URIs simply broke too many things.

Many more system calls have been updated to be able to operate on ZIP
files and file descriptors. In particular fcntl() and ioctl() since
Python would do things like ask if a ZIP file is a terminal and get
confused when the old implementation mistakenly said yes, because the
fastest way to guarantee native file descriptors is to dup(2). This
change also improves the async signal safety of zipos and ensures it
doesn't maintain any open file descriptors beyond that which the user
has opened.

This change makes a lot of progress towards adding magic numbers that
are specific to platforms other than Linux. The philosophy here is that,
if you use an operating system like FreeBSD, then you should be able to
take advantage of FreeBSD exclusive features, even if we don't polyfill
them on other platforms. For example, you can now open() a file with the
O_VERIFY flag. If your program runs on other platforms, then Cosmo will
automatically set O_VERIFY to zero. This lets you safely use it without
the need for #ifdef or ifstatements which detract from readability.

One of the blindspots of the ASAN memory hardening we use to offer Rust
like assurances has always been that memory passed to the kernel via
system calls (e.g. writev) can't be checked automatically since the
kernel wasn't built with MODE=asan. This change makes more progress
ensuring that each system call will verify the soundness of memory
before it's passed to the kernel. The code for doing these checks is
fast, particularly for buffers, where it can verify 64 bytes a cycle.

- Correct O_LOOP definition on NT
- Introduce program_executable_name
- Add ASAN guards to more system calls
- Improve termios compatibility with BSDs
- Fix bug in Windows auxiliary value encoding
- Add BSD and XNU specific errnos and open flags
- Add check to ensure build doesn't talk to internet
2021-08-22 01:11:53 -07:00
Justine Tunney
9b29358511 Make whitespace changes
Status lines for Emacs and Vim have been added to Python sources so
they'll be easier to edit using Python's preferred coding style.

Some DNS helper functions have been broken up into multiple files. It's
nice to have one function per file whenever possible, since that way we
don't need -ffunction-sections.  Another reason it's good to have small
source files, is because the build will be enforcing resource limits on
compilation and testing soon.
2021-08-13 03:20:45 -07:00
Gautham
e99a4dcc8c
Add protoent and netent (#209)
The implementations of the getproto* functions follow from the getserv*
functions: same static name allocation, same type of internal function
that opens a file to search, aliases are not written to the struct, same
type of error handling/returns.

This changes also fixes a getaddrinfo AI_PASSIVE memory error. When
getaddrinfo is passed name = NULL and AI_PASSIVE in hints->ai_flags, it was
setting the s_addr value to INADDR_ANY but *not* returning the addrinfo
pointer via *res = ai. This caused a free(NULL) memory error when the caller
tried to free res, because the caller expects res to be a valid pointer to a
struct addrinfo.

Our non-standard API parseport() has been updated to use strtoimax.
strtoimax has an extra parameter endptr to store where the parsing was
terminated. endptr is used in parseport to check if the provided string
was valid.
2021-07-10 12:36:35 -07:00
Gautham
c0bec24fa2
Improve getservbyname and getservbyport (#207)
- support aliases in /etc/services
- use case insensitive comparisons
- add tests
2021-07-05 12:25:26 -07:00
Gautham
0ea2907730
Add getservbyname and getservbyport (#204)
For every call of getservbyname/getservbyport the lookup of
/etc/services is done by opening the file and parsing each line
one-by-one. This is slow, but the implementation is simple.

This change also adds fixes for the gethostbyname function.
2021-07-04 16:31:42 -07:00
Gautham
3fe7b95fd0
Add gethostbyname and gethostbyaddr (#200)
gethostbyname, gethostbyaddr follow simple implementations: they
internally call getaddrinfo and getnameinfo respectively, and fill out
the minimum details. remaining functions are stubs.
2021-07-01 07:55:11 -07:00
Justine Tunney
a68cc690ff Merge HTTP request / response parsing code
This change also fixes a bug so that DNS lookups work correctly when the
first answer is a CNAME record.
2021-06-27 17:04:32 -07:00
Justine Tunney
5144c22189 Add test for ioctl(SIOCGIFCONF) and polyfill on BSDs
- Use nullness checks when calling weakly linked functions.

- Avoid typedef for reasons described in Linux Kernel style guide.

- Avoid enum in in Windows headers. Earlier in Cosmo's history all one
  hundred files in libc/nt/enum/ used to be enums and it resulted in
  gigabytes of DWARF data almost as large as everything else in the
  codebase combined.

- Bitfields aren't our friends. They have frequent ABI breakages,
  inconsistent arithmetic across compilers, and different endianness
  between cpus. Compiler authors also haven't invested much roi into
  making bit fields go fast so they produce poor assembly.

- Use memccpy() instead of strncpy() or snprintf() for length-bounded
  copying of C strings. strncpy() is a misunderstood function and
  snprintf() is awesome but memccpy() deserves more love.
2021-06-25 18:44:04 -07:00
Justine Tunney
cc1920749e Add SSL to redbean
Your redbean can now interoperate with clients that require TLS crypto.
This is accomplished using a protocol polyglot that lets us distinguish
between HTTP and HTTPS regardless of the port number. Certificates will
be generated automatically, if none are supplied by the user. Footprint
increases by only a few hundred kb so redbean in MODY=tiny is now 1.0mb

- Add lseek() polyfills for ZIP executable
- Automatically polyfill /tmp/FOO paths on NT
- Fix readdir() / ftw() / nftw() bugs on Windows
- Introduce -B flag for slower SSL that's stronger
- Remove mbedtls features Cosmopolitan doesn't need
- Have base64 decoder support the uri-safe alternative
- Remove Truncated HMAC because it's forbidden by the IETF
- Add all the mbedtls test suites and make them go 3x faster
- Support opendir() / readdir() / closedir() on ZIP executable
- Use Everest for ECDHE-ECDSA because it's so good it's so good
- Add tinier implementation of sha1 since it's not worth the rom
- Add chi-square monte-carlo mean correlation tests for getrandom()
- Source entropy on Windows from the proper interface everyone uses

We're continuing to outperform NGINX and other servers on raw message
throughput. Using SSL means that instead of 1,000,000 qps you can get
around 300,000 qps. However redbean isn't as fast as NGINX yet at SSL
handshakes, since redbean can do 2,627 per second and NGINX does 4.3k

Right now, the SSL UX story works best if you give your redbean a key
signing key since that can be easily generated by openssl using a one
liner then redbean will do all the things that are impossibly hard to
do like signing ecdsa and rsa certificates that'll work in chrome. We
should integrate the let's encrypt acme protocol in the future.

Live Demo: https://redbean.justine.lol/
Root Cert: https://redbean.justine.lol/redbean1.crt
2021-06-24 13:20:50 -07:00
Gautham
98c53ae526
Simplify getnameinfo (#196)
The getnameinfo implementation requires an address -> name lookup on the
hosts file (ie struct HostsTxt) and the previous implementation used
flags to check whether HostsTxt was sorted according to address or name,
and then re-sorted it if necessary. Now getnameinfo lookup does not
require sorting, it does a simple linear lookup, and so the related code
was simplified

See #172 for discussion.
2021-06-22 12:35:58 -07:00
Gautham
248c6d54bb
Added getnameinfo with only name lookup (#172)
Added necessary constants (DNS_TYPE_PTR, NI_NUMERICHOST etc.).
Implementation of getnameinfo is similar to getaddrinfo, with internal
functions:

* ResolveDnsReverse: performs rDNS query and parses the PTR record
* ResolveHostsReverse: reads /etc/hosts to map hostname to
  address

Earlier, the HOSTS.txt would only need to be sorted at loading time,
because the only kind of lookup was name -> address. Now since address
-> name lookups are also possible, so the HostsTxt struct, the sorting
method (and the related tests) was changed to reflect this.
2021-06-09 19:35:44 -07:00
Justine Tunney
4864565198 Make minor improvements 2021-05-15 21:53:26 -07:00
Justine Tunney
690be544da Make redbean StoreAsset() work better
- Better UBSAN error messages
- POSIX Advisory Locks polyfills
- Move redbean manual to /.help.txt
- System call memory safety in ASAN mode
- Character classification now does UNICODE
2021-05-14 05:44:37 -07:00
Justine Tunney
da36e7e256 Make major improvements to stdio
Buffering now has optimal performance, bugs have been fixed, and some
missing apis have been introduced. This implementation is also now more
production worthy since it's less brittle now in terms of system errors.
That's going to help redbean since lua i/o is all based on stdio.

See #97
2021-03-26 22:31:41 -07:00
fabriziobertocci
5af25f687a
Rename eai2str to gai_strerror (#131) 2021-03-20 20:48:40 -07:00
Justine Tunney
d932948fb4 Remove more nonstandard stuff from cosmopolitan.h
Fixes #61
2021-03-01 00:18:23 -08:00
Justine Tunney
2c15efc249 Add curl example (#42)
make -j8 o//examples/curl.com
    o//examples/curl.com http://justine.lol/ape.html
2021-02-18 19:20:41 -08:00
Justine Tunney
e75ffde09e Get codebase completely working with LLVM
You can now build Cosmopolitan with Clang:

    make -j8 MODE=llvm
    o/llvm/examples/hello.com

The assembler and linker code is now friendly to LLVM too.
So it's not needed to configure Clang to use binutils under
the hood. If you love LLVM then you can now use pure LLVM.
2021-02-09 02:57:32 -08:00
Justine Tunney
46085797b6 Make ANSI mode closer to being ANSI 2021-02-03 17:14:17 -08:00
Justine Tunney
9f68d6eee9 Fix link order in cosmopolitan.a
It turned out that the linker was doing the wrong with the amalgamation
library concerning weak stubs. A regression test has been added and new
binaries have been uploaded to https://justine.lol/cosmopolitan/

Ideally this should be fixed by building a tool that turns multiple .a
files into a single .a file with deduplication. As a workaround for now
the cosmopolitan.a build is restructured to not include LIBC_STUBS which
meant technical debt needed to be paid off where non-stub interfaces
were moved to LIBC_INTRIN and LIBC_NEXGEN32E.

Thank @PerfectProductions in #31 for the report!
2021-01-16 12:05:41 -08:00
Justine Tunney
0e85b136ae Fix Windows 7 support (#19)
This change pays off technical debt with the function -> DLL mappings in
libc/nt/master.sh, which was originally defined based on binary analysis
on Windows 10. It's now been updated so the kernel32/kernelbase/advapi32
imports should be exactly as they are written, on the MSDN documentation
and that wouldn't have been easy without Geoff Chappell's work thank him

https://www.geoffchappell.com/studies/windows/win32/index.htm
2020-12-28 13:52:02 -08:00
Justine Tunney
37a4c70c36 Change license 2020-12-27 17:18:44 -08:00
Justine Tunney
1bc3a25505 Improve documentation
The Cosmo API documentation page is pretty good now
https://justine.lol/cosmopolitan/documentation.html
2020-12-27 07:02:35 -08:00
Justine Tunney
1fc91f3580 Fold conv package into fmt
Both packages had nearly identical dependency requirements, so merging
them should help reduce the complexity of the build graph.
2020-12-09 16:52:00 -08:00
Justine Tunney
e44a0cf6f8 Make improvements 2020-12-01 03:43:40 -08:00
Justine Tunney
3e4fd4b0ad Add epoll and do more release readiness changes
This change also pays off some of the remaining technical debt with
stdio, file descriptors, and memory managemnt polyfills.
2020-11-28 12:01:51 -08:00
Justine Tunney
ea0b5d9d1c Get Cosmopolitan into releasable state
A new rollup tool now exists for flattening out the headers in a way
that works better for our purposes than cpp. A lot of the API clutter
has been removed. APIs that aren't a sure thing in terms of general
recommendation are now marked internal.

There's now a smoke test for the amalgamation archive and gigantic
header file. So we can now guarantee you can use this project on the
easiest difficulty setting without the gigantic repository.

A website is being created, which is currently a work in progress:
https://justine.storage.googleapis.com/cosmopolitan/index.html
2020-11-25 08:19:00 -08:00
Justine Tunney
feed0d2b0e Add minor improvements and cleanup 2020-10-27 03:39:46 -07:00
Justine Tunney
7327c345f9 Get address sanitizer mostly working 2020-09-03 05:44:37 -07:00
Justine Tunney
bd29223891 Fix bugs and have emulator emulate itself 2020-08-31 05:17:31 -07:00
Justine Tunney
f4f4caab0e Add x86_64-linux-gnu emulator
I wanted a tiny scriptable meltdown proof way to run userspace programs
and visualize how program execution impacts memory. It helps to explain
how things like Actually Portable Executable works. It can show you how
the GCC generated code is going about manipulating matrices and more. I
didn't feel fully comfortable with Qemu and Bochs because I'm not smart
enough to understand them. I wanted something like gVisor but with much
stronger levels of assurances. I wanted a single binary that'll run, on
all major operating systems with an embedded GPL barrier ZIP filesystem
that is tiny enough to transpile to JavaScript and run in browsers too.

https://justine.storage.googleapis.com/emulator625.mp4
2020-08-25 04:43:42 -07:00