cosmopolitan/libc/stdio/nftw.c

191 lines
5.9 KiB
C
Raw Normal View History

Improve ZIP filesystem and change its prefix The ZIP filesystem has a breaking change. You now need to use /zip/ to open() / opendir() / etc. assets within the ZIP structure of your APE binary, instead of the previous convention of using zip: or zip! URIs. This is needed because Python likes to use absolute paths, and having ZIP paths encoded like URIs simply broke too many things. Many more system calls have been updated to be able to operate on ZIP files and file descriptors. In particular fcntl() and ioctl() since Python would do things like ask if a ZIP file is a terminal and get confused when the old implementation mistakenly said yes, because the fastest way to guarantee native file descriptors is to dup(2). This change also improves the async signal safety of zipos and ensures it doesn't maintain any open file descriptors beyond that which the user has opened. This change makes a lot of progress towards adding magic numbers that are specific to platforms other than Linux. The philosophy here is that, if you use an operating system like FreeBSD, then you should be able to take advantage of FreeBSD exclusive features, even if we don't polyfill them on other platforms. For example, you can now open() a file with the O_VERIFY flag. If your program runs on other platforms, then Cosmo will automatically set O_VERIFY to zero. This lets you safely use it without the need for #ifdef or ifstatements which detract from readability. One of the blindspots of the ASAN memory hardening we use to offer Rust like assurances has always been that memory passed to the kernel via system calls (e.g. writev) can't be checked automatically since the kernel wasn't built with MODE=asan. This change makes more progress ensuring that each system call will verify the soundness of memory before it's passed to the kernel. The code for doing these checks is fast, particularly for buffers, where it can verify 64 bytes a cycle. - Correct O_LOOP definition on NT - Introduce program_executable_name - Add ASAN guards to more system calls - Improve termios compatibility with BSDs - Fix bug in Windows auxiliary value encoding - Add BSD and XNU specific errnos and open flags - Add check to ensure build doesn't talk to internet
2021-08-22 08:04:18 +00:00
/*-*- mode:c;indent-tabs-mode:t;c-basic-offset:8;tab-width:8;coding:utf-8 -*-│
vi: set et ft=c ts=8 tw=8 fenc=utf-8 :vi
Musl Libc
Copyright © 2005-2014 Rich Felker, et al.
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
#include "libc/calls/calls.h"
#include "libc/calls/struct/dirent.h"
#include "libc/calls/weirdtypes.h"
#include "libc/errno.h"
Make improvements - We now serialize the file descriptor table when spawning / executing processes on Windows. This means you can now inherit more stuff than just standard i/o. It's needed by bash, which duplicates the console to file descriptor #255. We also now do a better job serializing the environment variables, so you're less likely to encounter E2BIG when using your bash shell. We also no longer coerce environ to uppercase - execve() on Windows now remotely controls its parent process to make them spawn a replacement for itself. Then it'll be able to terminate immediately once the spawn succeeds, without having to linger around for the lifetime as a shell process for proxying the exit code. When process worker thread running in the parent sees the child die, it's given a handle to the new child, to replace it in the process table. - execve() and posix_spawn() on Windows will now provide CreateProcess an explicit handle list. This allows us to remove handle locks which enables better fork/spawn concurrency, with seriously correct thread safety. Other codebases like Go use the same technique. On the other hand fork() still favors the conventional WIN32 inheritence approach which can be a little bit messy, but is *controlled* by guaranteeing perfectly clean slates at both the spawning and execution boundaries - sigset_t is now 64 bits. Having it be 128 bits was a mistake because there's no reason to use that and it's only supported by FreeBSD. By using the system word size, signal mask manipulation on Windows goes very fast. Furthermore @asyncsignalsafe funcs have been rewritten on Windows to take advantage of signal masking, now that it's much more pleasant to use. - All the overlapped i/o code on Windows has been rewritten for pretty good signal and cancelation safety. We're now able to ensure overlap data structures are cleaned up so long as you don't longjmp() out of out of a signal handler that interrupted an i/o operation. Latencies are also improved thanks to the removal of lots of "busy wait" code. Waits should be optimal for everything except poll(), which shall be the last and final demon we slay in the win32 i/o horror show. - getrusage() on Windows is now able to report RUSAGE_CHILDREN as well as RUSAGE_SELF, thanks to aggregation in the process manager thread.
2023-10-08 12:36:18 +00:00
#include "libc/limits.h"
Make improvements - Every unit test now passes on Apple Silicon. The final piece of this puzzle was porting our POSIX threads cancelation support, since that works differently on ARM64 XNU vs. AMD64. Our semaphore support on Apple Silicon is also superior now compared to AMD64, thanks to the grand central dispatch library which lets *NSYNC locks go faster. - The Cosmopolitan runtime is now more stable, particularly on Windows. To do this, thread local storage is mandatory at all runtime levels, and the innermost packages of the C library is no longer being built using ASAN. TLS is being bootstrapped with a 128-byte TIB during the process startup phase, and then later on the runtime re-allocates it either statically or dynamically to support code using _Thread_local. fork() and execve() now do a better job cooperating with threads. We can now check how much stack memory is left in the process or thread when functions like kprintf() / execve() etc. call alloca(), so that ENOMEM can be raised, reduce a buffer size, or just print a warning. - POSIX signal emulation is now implemented the same way kernels do it with pthread_kill() and raise(). Any thread can interrupt any other thread, regardless of what it's doing. If it's blocked on read/write then the killer thread will cancel its i/o operation so that EINTR can be returned in the mark thread immediately. If it's doing a tight CPU bound operation, then that's also interrupted by the signal delivery. Signal delivery works now by suspending a thread and pushing context data structures onto its stack, and redirecting its execution to a trampoline function, which calls SetThreadContext(GetCurrentThread()) when it's done. - We're now doing a better job managing locks and handles. On NetBSD we now close semaphore file descriptors in forked children. Semaphores on Windows can now be canceled immediately, which means mutexes/condition variables will now go faster. Apple Silicon semaphores can be canceled too. We're now using Apple's pthread_yield() funciton. Apple _nocancel syscalls are now used on XNU when appropriate to ensure pthread_cancel requests aren't lost. The MbedTLS library has been updated to support POSIX thread cancelations. See tool/build/runitd.c for an example of how it can be used for production multi-threaded tls servers. Handles on Windows now leak less often across processes. All i/o operations on Windows are now overlapped, which means file pointers can no longer be inherited across dup() and fork() for the time being. - We now spawn a thread on Windows to deliver SIGCHLD and wakeup wait4() which means, for example, that posix_spawn() now goes 3x faster. POSIX spawn is also now more correct. Like Musl, it's now able to report the failure code of execve() via a pipe although our approach favors using shared memory to do that on systems that have a true vfork() function. - We now spawn a thread to deliver SIGALRM to threads when setitimer() is used. This enables the most precise wakeups the OS makes possible. - The Cosmopolitan runtime now uses less memory. On NetBSD for example, it turned out the kernel would actually commit the PT_GNU_STACK size which caused RSS to be 6mb for every process. Now it's down to ~4kb. On Apple Silicon, we reduce the mandatory upstream thread size to the smallest possible size to reduce the memory overhead of Cosmo threads. The examples directory has a program called greenbean which can spawn a web server on Linux with 10,000 worker threads and have the memory usage of the process be ~77mb. The 1024 byte overhead of POSIX-style thread-local storage is now optional; it won't be allocated until the pthread_setspecific/getspecific functions are called. On Windows, the threads that get spawned which are internal to the libc implementation use reserve rather than commit memory, which shaves a few hundred kb. - sigaltstack() is now supported on Windows, however it's currently not able to be used to handle stack overflows, since crash signals are still generated by WIN32. However the crash handler will still switch to the alt stack, which is helpful in environments with tiny threads. - Test binaries are now smaller. Many of the mandatory dependencies of the test runner have been removed. This ensures many programs can do a better job only linking the the thing they're testing. This caused the test binaries for LIBC_FMT for example, to decrease from 200kb to 50kb - long double is no longer used in the implementation details of libc, except in the APIs that define it. The old code that used long double for time (instead of struct timespec) has now been thoroughly removed. - ShowCrashReports() is now much tinier in MODE=tiny. Instead of doing backtraces itself, it'll just print a command you can run on the shell using our new `cosmoaddr2line` program to view the backtrace. - Crash report signal handling now works in a much better way. Instead of terminating the process, it now relies on SA_RESETHAND so that the default SIG_IGN behavior can terminate the process if necessary. - Our pledge() functionality has now been fully ported to AARCH64 Linux.
2023-09-19 03:44:45 +00:00
#include "libc/runtime/stack.h"
#include "libc/stdio/ftw.h"
#include "libc/str/str.h"
#include "libc/sysv/consts/o.h"
#include "libc/sysv/consts/s.h"
2022-11-05 01:19:05 +00:00
#include "libc/thread/thread.h"
asm(".ident\t\"\\n\\n\
Musl libc (MIT License)\\n\
Copyright 2005-2014 Rich Felker, et. al.\"");
asm(".include \"libc/disclaimer.inc\"");
// clang-format off
struct history
{
struct history *chain;
dev_t dev;
ino_t ino;
int level;
int base;
};
static int do_nftw(char *path,
int fn(const char *, const struct stat *, int, struct FTW *),
int fd_limit,
int flags,
struct history *h)
{
size_t l = strlen(path), j = l && path[l-1]=='/' ? l-1 : l;
struct stat st;
struct history new;
int type;
int r;
int dfd=-1;
int err=0;
struct FTW lev;
if ((flags & FTW_PHYS) ? lstat(path, &st) : stat(path, &st) < 0) {
if (!(flags & FTW_PHYS) && errno==ENOENT && !lstat(path, &st))
type = FTW_SLN;
else if (errno != EACCES) return -1;
else type = FTW_NS;
} else if (S_ISDIR(st.st_mode)) {
if (flags & FTW_DEPTH) type = FTW_DP;
else type = FTW_D;
} else if (S_ISLNK(st.st_mode)) {
if (flags & FTW_PHYS) type = FTW_SL;
else type = FTW_SLN;
} else {
type = FTW_F;
}
if ((flags & FTW_MOUNT) && h && st.st_dev != h->dev)
return 0;
new.chain = h;
new.dev = st.st_dev;
new.ino = st.st_ino;
new.level = h ? h->level+1 : 0;
new.base = j+1;
lev.level = new.level;
if (h) {
lev.base = h->base;
} else {
size_t k;
for (k=j; k && path[k]=='/'; k--);
for (; k && path[k-1]!='/'; k--);
lev.base = k;
}
if (type == FTW_D || type == FTW_DP) {
Add SSL to redbean Your redbean can now interoperate with clients that require TLS crypto. This is accomplished using a protocol polyglot that lets us distinguish between HTTP and HTTPS regardless of the port number. Certificates will be generated automatically, if none are supplied by the user. Footprint increases by only a few hundred kb so redbean in MODY=tiny is now 1.0mb - Add lseek() polyfills for ZIP executable - Automatically polyfill /tmp/FOO paths on NT - Fix readdir() / ftw() / nftw() bugs on Windows - Introduce -B flag for slower SSL that's stronger - Remove mbedtls features Cosmopolitan doesn't need - Have base64 decoder support the uri-safe alternative - Remove Truncated HMAC because it's forbidden by the IETF - Add all the mbedtls test suites and make them go 3x faster - Support opendir() / readdir() / closedir() on ZIP executable - Use Everest for ECDHE-ECDSA because it's so good it's so good - Add tinier implementation of sha1 since it's not worth the rom - Add chi-square monte-carlo mean correlation tests for getrandom() - Source entropy on Windows from the proper interface everyone uses We're continuing to outperform NGINX and other servers on raw message throughput. Using SSL means that instead of 1,000,000 qps you can get around 300,000 qps. However redbean isn't as fast as NGINX yet at SSL handshakes, since redbean can do 2,627 per second and NGINX does 4.3k Right now, the SSL UX story works best if you give your redbean a key signing key since that can be easily generated by openssl using a one liner then redbean will do all the things that are impossibly hard to do like signing ecdsa and rsa certificates that'll work in chrome. We should integrate the let's encrypt acme protocol in the future. Live Demo: https://redbean.justine.lol/ Root Cert: https://redbean.justine.lol/redbean1.crt
2021-06-24 19:31:26 +00:00
dfd = open(path, O_RDONLY | O_DIRECTORY);
err = errno;
if (dfd < 0 && err == EACCES) type = FTW_DNR;
if (!fd_limit) close(dfd);
}
if (!(flags & FTW_DEPTH) && (r=fn(path, &st, type, &lev)))
return r;
for (; h; h = h->chain)
if (h->dev == st.st_dev && h->ino == st.st_ino)
return 0;
if ((type == FTW_D || type == FTW_DP) && fd_limit) {
if (dfd < 0) {
errno = err;
return -1;
}
DIR *d = fdopendir(dfd);
if (d) {
struct dirent *de;
while ((de = readdir(d))) {
if (de->d_name[0] == '.'
&& (!de->d_name[1]
|| (de->d_name[1]=='.'
&& !de->d_name[2]))) continue;
Make improvements - We now serialize the file descriptor table when spawning / executing processes on Windows. This means you can now inherit more stuff than just standard i/o. It's needed by bash, which duplicates the console to file descriptor #255. We also now do a better job serializing the environment variables, so you're less likely to encounter E2BIG when using your bash shell. We also no longer coerce environ to uppercase - execve() on Windows now remotely controls its parent process to make them spawn a replacement for itself. Then it'll be able to terminate immediately once the spawn succeeds, without having to linger around for the lifetime as a shell process for proxying the exit code. When process worker thread running in the parent sees the child die, it's given a handle to the new child, to replace it in the process table. - execve() and posix_spawn() on Windows will now provide CreateProcess an explicit handle list. This allows us to remove handle locks which enables better fork/spawn concurrency, with seriously correct thread safety. Other codebases like Go use the same technique. On the other hand fork() still favors the conventional WIN32 inheritence approach which can be a little bit messy, but is *controlled* by guaranteeing perfectly clean slates at both the spawning and execution boundaries - sigset_t is now 64 bits. Having it be 128 bits was a mistake because there's no reason to use that and it's only supported by FreeBSD. By using the system word size, signal mask manipulation on Windows goes very fast. Furthermore @asyncsignalsafe funcs have been rewritten on Windows to take advantage of signal masking, now that it's much more pleasant to use. - All the overlapped i/o code on Windows has been rewritten for pretty good signal and cancelation safety. We're now able to ensure overlap data structures are cleaned up so long as you don't longjmp() out of out of a signal handler that interrupted an i/o operation. Latencies are also improved thanks to the removal of lots of "busy wait" code. Waits should be optimal for everything except poll(), which shall be the last and final demon we slay in the win32 i/o horror show. - getrusage() on Windows is now able to report RUSAGE_CHILDREN as well as RUSAGE_SELF, thanks to aggregation in the process manager thread.
2023-10-08 12:36:18 +00:00
if (strlen(de->d_name) >= PATH_MAX-l) {
errno = ENAMETOOLONG;
closedir(d);
return -1;
}
path[j]='/';
strcpy(path+j+1, de->d_name);
if ((r=do_nftw(path, fn, fd_limit-1, flags, &new))) {
closedir(d);
return r;
}
}
closedir(d);
} else {
close(dfd);
return -1;
}
}
path[l] = 0;
if ((flags & FTW_DEPTH) && (r=fn(path, &st, type, &lev)))
return r;
return 0;
}
/**
* Walks file tree.
*
* @return 0 on success, -1 on error, or non-zero `fn` result
* @see examples/walk.c for example
Make improvements - This change fixes a bug that allowed unbuffered printf() output (to streams like stderr) to be truncated. This regression was introduced some time between now and the last release. - POSIX specifies all functions as thread safe by default. This change works towards cleaning up our use of the @threadsafe / @threadunsafe documentation annotations to reflect that. The goal is (1) to use @threadunsafe to document functions which POSIX say needn't be thread safe, and (2) use @threadsafe to document functions that we chose to implement as thread safe even though POSIX didn't mandate it. - Tidy up the clock_gettime() implementation. We're now trying out a cleaner approach to system call support that aims to maintain the Linux errno convention as long as possible. This also fixes bugs that existed previously, where the vDSO errno wasn't being translated properly. The gettimeofday() system call is now a wrapper for clock_gettime(), which reduces bloat in apps that use both. - The recently-introduced improvements to the execute bit on Windows has had bugs fixed. access(X_OK) on a directory on Windows now succeeds. fstat() will now perform the MZ/#! ReadFile() operation correctly. - Windows.h is no longer included in libc/isystem/, because it confused PCRE's build system into thinking Cosmopolitan is a WIN32 platform. Cosmo's Windows.h polyfill was never even really that good, since it only defines a subset of the subset of WIN32 APIs that Cosmo defines. - The setlongerjmp() / longerjmp() APIs are removed. While they're nice APIs that are superior to the standardized setjmp / longjmp functions, they weren't superior enough to not be dead code in the monorepo. If you use these APIs, please file an issue and they'll be restored. - The .com appending magic has now been removed from APE Loader.
2023-10-03 02:25:19 +00:00
* @threadsafe
*/
int nftw(const char *dirpath,
int fn(const char *fpath,
const struct stat *st,
int typeflag,
struct FTW *ftwbuf),
int fd_limit,
int flags)
{
Make improvements - We now serialize the file descriptor table when spawning / executing processes on Windows. This means you can now inherit more stuff than just standard i/o. It's needed by bash, which duplicates the console to file descriptor #255. We also now do a better job serializing the environment variables, so you're less likely to encounter E2BIG when using your bash shell. We also no longer coerce environ to uppercase - execve() on Windows now remotely controls its parent process to make them spawn a replacement for itself. Then it'll be able to terminate immediately once the spawn succeeds, without having to linger around for the lifetime as a shell process for proxying the exit code. When process worker thread running in the parent sees the child die, it's given a handle to the new child, to replace it in the process table. - execve() and posix_spawn() on Windows will now provide CreateProcess an explicit handle list. This allows us to remove handle locks which enables better fork/spawn concurrency, with seriously correct thread safety. Other codebases like Go use the same technique. On the other hand fork() still favors the conventional WIN32 inheritence approach which can be a little bit messy, but is *controlled* by guaranteeing perfectly clean slates at both the spawning and execution boundaries - sigset_t is now 64 bits. Having it be 128 bits was a mistake because there's no reason to use that and it's only supported by FreeBSD. By using the system word size, signal mask manipulation on Windows goes very fast. Furthermore @asyncsignalsafe funcs have been rewritten on Windows to take advantage of signal masking, now that it's much more pleasant to use. - All the overlapped i/o code on Windows has been rewritten for pretty good signal and cancelation safety. We're now able to ensure overlap data structures are cleaned up so long as you don't longjmp() out of out of a signal handler that interrupted an i/o operation. Latencies are also improved thanks to the removal of lots of "busy wait" code. Waits should be optimal for everything except poll(), which shall be the last and final demon we slay in the win32 i/o horror show. - getrusage() on Windows is now able to report RUSAGE_CHILDREN as well as RUSAGE_SELF, thanks to aggregation in the process manager thread.
2023-10-08 12:36:18 +00:00
char pathbuf[PATH_MAX+1];
int r, cs;
size_t l;
if (fd_limit <= 0) return 0;
l = strlen(dirpath);
Make improvements - We now serialize the file descriptor table when spawning / executing processes on Windows. This means you can now inherit more stuff than just standard i/o. It's needed by bash, which duplicates the console to file descriptor #255. We also now do a better job serializing the environment variables, so you're less likely to encounter E2BIG when using your bash shell. We also no longer coerce environ to uppercase - execve() on Windows now remotely controls its parent process to make them spawn a replacement for itself. Then it'll be able to terminate immediately once the spawn succeeds, without having to linger around for the lifetime as a shell process for proxying the exit code. When process worker thread running in the parent sees the child die, it's given a handle to the new child, to replace it in the process table. - execve() and posix_spawn() on Windows will now provide CreateProcess an explicit handle list. This allows us to remove handle locks which enables better fork/spawn concurrency, with seriously correct thread safety. Other codebases like Go use the same technique. On the other hand fork() still favors the conventional WIN32 inheritence approach which can be a little bit messy, but is *controlled* by guaranteeing perfectly clean slates at both the spawning and execution boundaries - sigset_t is now 64 bits. Having it be 128 bits was a mistake because there's no reason to use that and it's only supported by FreeBSD. By using the system word size, signal mask manipulation on Windows goes very fast. Furthermore @asyncsignalsafe funcs have been rewritten on Windows to take advantage of signal masking, now that it's much more pleasant to use. - All the overlapped i/o code on Windows has been rewritten for pretty good signal and cancelation safety. We're now able to ensure overlap data structures are cleaned up so long as you don't longjmp() out of out of a signal handler that interrupted an i/o operation. Latencies are also improved thanks to the removal of lots of "busy wait" code. Waits should be optimal for everything except poll(), which shall be the last and final demon we slay in the win32 i/o horror show. - getrusage() on Windows is now able to report RUSAGE_CHILDREN as well as RUSAGE_SELF, thanks to aggregation in the process manager thread.
2023-10-08 12:36:18 +00:00
if (l > PATH_MAX) {
errno = ENAMETOOLONG;
return -1;
}
memcpy(pathbuf, dirpath, l+1);
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
r = do_nftw(pathbuf, fn, fd_limit, flags, NULL);
pthread_setcancelstate(cs, 0);
return r;
}