Commit graph

54 commits

Author SHA1 Message Date
Gautham
d7ff346b52
Add some Python 3.7 backports (#306)
* make.com now uses stack size of 2mb
* optimize _PyCFunction_FastCallKeywords
* backport python@cpython/7fc252adfbedece75f2330bcfdadbf84dee7836f
2021-10-29 22:54:14 -07:00
Justine Tunney
903cc38c37 Revert "Make MODE=tiny not depend on default build"
This reverts commit 30cd28b1f8.
2021-10-26 17:14:38 -07:00
Justine Tunney
30cd28b1f8 Make MODE=tiny not depend on default build
This issue was spotted by @ahgamut in #292.
2021-10-25 16:26:22 -07:00
Gautham
49db877fbe
Minimize Python startup imports (#292)
* get_exports_list should return list
* remove unintentional `CC=clang` in makefile
* avoid importing sysconfig during startup

site.py requires only a couple of functions from sysconfig, but needs to
load the entirety of sysconfig to get those functions. This commit
makes it such that sysconfig is imported only when sys.platform is darwin.

* remove redundant constants from stat module

The constants are only there in case the C implementation (ie the _stat
module) is not available. With Cosmopolitan the _stat module is always
available. The entire Lib/stat.py file can be removed if the Windows-based
constants can be moved into the Modules/_stat.c.

* minimal changes to os.py

python checks os-based assumptions at startup, some of  which can be
bypassed since this is Cosmopolitan Python.
2021-10-25 14:04:04 -07:00
Justine Tunney
67b5200a0b Add MODE=optlinux build mode (#141) 2021-10-14 19:36:49 -07:00
Justine Tunney
226aaf3547 Improve memory safety
This commit makes numerous refinements to cosmopolitan memory handling.

The default stack size has been reduced from 2mb to 128kb. A new macro
is now provided so you can easily reconfigure the stack size to be any
value you want. Work around the breaking change by adding to your main:

    STATIC_STACK_SIZE(0x00200000);  // 2mb stack

If you're not sure how much stack you need, then you can use:

    STATIC_YOINK("stack_usage_logging");

After which you can `sort -nr o/$MODE/stack.log`. Based on the unit test
suite, nothing in the Cosmopolitan repository (except for Python) needs
a stack size greater than 30kb. There are also new macros for detecting
the size and address of the stack at runtime, e.g. GetStackAddr(). We
also now support sigaltstack() so if you want to see nice looking crash
reports whenever a stack overflow happens, you can put this in main():

    ShowCrashReports();

Under `make MODE=dbg` and `make MODE=asan` the unit testing framework
will now automatically print backtraces of memory allocations when
things like memory leaks happen. Bugs are now fixed in ASAN global
variable overrun detection. The memtrack and asan runtimes also handle
edge cases now. The new tools helped to identify a few memory leaks,
which are fixed by this change.

This change should fix an issue reported in #288 with ARG_MAX limits.
Fixing this doubled the performance of MKDEPS.COM and AR.COM yet again.
2021-10-13 17:27:13 -07:00
Gautham
d852640a1e
Add Python ftrace contextmanager (#285) 2021-10-13 11:00:25 -07:00
Justine Tunney
7061c79c22 Make fixes, improvements, and chibicc python bindings
- python now mixes audio 10x faster
- python octal notation is restored
- chibicc now builds code 3x faster
- chibicc now has help documentation
- chibicc can now generate basic python bindings
- linenoise now supports some paredit-like features

See #141
2021-10-08 08:41:57 -07:00
Justine Tunney
28997f3acb Make mkdeps.com go faster
This program usually runs once at the begininng of each GNU Make
invocation. It generates an o//depend file with 170,000 lines of
Makefile code to define source -> headers relationships.

This change makes that take 650 milliseconds rather than 1,100ms
by improving the performance of strstr(), using longsort(), plus
migrating to the new append library.
2021-10-04 06:46:46 -07:00
Justine Tunney
725f4d79f6 Apply fixes and speedups 2021-10-04 03:23:31 -07:00
Justine Tunney
7521bf9e73 Add stack overflow checking to Python 2021-10-02 10:50:41 -07:00
Justine Tunney
47a53e143b Productionize new APE loader and more
The APE_NO_MODIFY_SELF loader payload has been moved out of the examples
folder and improved so that it works on BSD systems, and permits general
elf program headers. This brings its quality up enough that it should be
acceptable to use by default for many programs, e.g. Python, Lua, SQLite
and Python. It's the responsibility of the user to define an appropriate
TMPDIR if /tmp is considered an adversarial environment. Mac OS shall be
supported by APE_NO_MODIFY_SELF soon.

Fixes and improvements have been made to program_executable_name as it's
now the one true way to get the absolute path of the executing image.

This change fixes a memory leak in linenoise history loading, introduced
by performance optimizations in 51904e2687
This change fixes a longstanding regression with Mach system calls, that
23ae9dfceb back in February which impacted
our sched_yield() implementation, which is why no one noticed until now.

The Blinkenlights PC emulator has been improved. We now fix rendering on
XNU and BSD by not making the assumption that the kernel terminal driver
understands UTF8 since that seems to break its internal modeling of \r\n
which is now being addressed by using \e[𝑦H instead. The paneling is now
more compact in real mode so you won't need to make your font as tiny if
you're only emulating an 8086 program. The CLMUL ISA is now emulated too

This change also makes improvement to time. CLOCK_MONOTONIC now does the
right thing on Windows NT. The nanosecond time module functions added in
Python 3.7 have been backported.

This change doubles the performance of Argon2 password stretching simply
by not using its copy_block and xor_block helper functions, as they were
trivial to inline thus resulting in us needing to iterate over each 1024
byte block four fewer times.

This change makes code size improvements. _PyUnicode_ToNumeric() was 64k
in size and now it's 10k. The CJK codec lookup tables now use lazy delta
zigzag deflate (δzd) encoding which reduces their size from 600k to 200k
plus the code bloat caused by macro abuse in _decimal.c is now addressed
so our fully-loaded statically-linked hermetically-sealed Python virtual
interpreter container is now 9.4 megs in the default build mode and 5.5m
in MODE=tiny which leaves plenty of room for chibicc.

The pydoc web server now accommodates the use case of people who work by
SSH'ing into a different machine w/ python.com -m pydoc -p8080 -h0.0.0.0

Finally Python Capsulae delenda est and won't be supported in the future
2021-10-02 08:27:03 -07:00
Justine Tunney
9cb54218ab Add error checks to Python objectifier (#281)
PYOBJ.COM was failing when statically analyzing _pyio.py in MODE=dbg
because co_consts contained a big number, which dirtied the interpreter
exception state. We now do comprehensive error checking w/ Python API.

The -DSTACK_FRAME_UNLIMITED CPPFLAG has been removed from DES since its
self test function has been fixed to use heap memory rather than making
aggressive use of the stack.

This change also fixes a regression with function tracing (the --ftrace
flag a.k.a. ftrace_install() a.k.a. cosmo.ftrace) in ASAN build modes.
Lastly, the _tracemalloc module should now always be available for use
in MODE=dbg.
2021-10-02 06:17:17 -07:00
Gautham
57f0eed382
Fix Pyston speedups (#281)
We remove (i.e. hide behind a debug ifdef) the recursion checking methods,
and the memory hooks and memory allocator methods. ASAN mode has no
PYMALLOC, so we need a macro. Fix build break with des.c stack allocation.
2021-10-02 01:28:51 -07:00
Gautham
2fe8571010
Add zipimport hook at the end just in case (#280)
In Python, the zipimport path hook is usually the first entry in
sys.path_hooks, so that any zip files in sys.path can be handled
correctly. In the APE, the zipimport hook was removed because it was
relatively slow compared to Cosmopolitan when it came to handling
imports from the APE's internal zip store.

However, some python scripts (for example when pip installs some
packages) modify sys.path to consider a local zip file, and then attempt
to import from it. This change prevents potential "unable to import"
errors in such cases, so that Actually Portable Python can be more of a
drop-in improved replacement.
2021-09-29 01:50:34 -07:00
Justine Tunney
39bf41f4eb Make numerous improvements
- Python static hello world now 1.8mb
- Python static fully loaded now 10mb
- Python HTTPS client now uses MbedTLS
- Python REPL now completes import stmts
- Increase stack size for Python for now
- Begin synthesizing posixpath and ntpath
- Restore Python \N{UNICODE NAME} support
- Restore Python NFKD symbol normalization
- Add optimized code path for Intel SHA-NI
- Get more Python unit tests passing faster
- Get Python help() pagination working on NT
- Python hashlib now supports MbedTLS PBKDF2
- Make memcpy/memmove/memcmp/bcmp/etc. faster
- Add Mersenne Twister and Vigna to LIBC_RAND
- Provide privileged __printf() for error code
- Fix zipos opendir() so that it reports ENOTDIR
- Add basic chmod() implementation for Windows NT
- Add Cosmo's best functions to Python cosmo module
- Pin function trace indent depth to that of caller
- Show memory diagram on invalid access in MODE=dbg
- Differentiate stack overflow on crash in MODE=dbg
- Add stb_truetype and tools for analyzing font files
- Upgrade to UNICODE 13 and reduce its binary footprint
- COMPILE.COM now logs resource usage of build commands
- Start implementing basic poll() support on bare metal
- Set getauxval(AT_EXECFN) to GetModuleFileName() on NT
- Add descriptions to strerror() in non-TINY build modes
- Add COUNTBRANCH() macro to help with micro-optimizations
- Make error / backtrace / asan / memory code more unbreakable
- Add fast perfect C implementation of μ-Law and a-Law audio codecs
- Make strtol() functions consistent with other libc implementations
- Improve Linenoise implementation (see also github.com/jart/bestline)
- COMPILE.COM now suppresses stdout/stderr of successful build commands
2021-09-28 01:52:34 -07:00
Justine Tunney
b5f743cdc3 Begin incorporating Python unit tests into build
We now build a separate APE binary for each test so they can run in
parallel. We've got 148 tests running fast and stable so far.
2021-09-12 21:04:44 -07:00
Justine Tunney
51904e2687 Improve Python and Linenoise
This change reinvents all the GNU Readline features I discovered that I
couldn't live without, e.g. UTF-8, CTRL-R search and CTRL-Y yanking. It
now feels just as good in terms of user interface from the subconscious
workflow perspective. It's real nice to finally have an embeddable line
reader that's actually good with a 30 kb footprint and a bsd-2 license.

This change adds a directory to the examples folder, explaining how the
new Python compiler may be used.  Some of the bugs with Python binaries
have been addressed but overall it's still a work in progress.
2021-09-11 22:30:37 -07:00
Justine Tunney
559b024e1d Decentralize Python native module linkage
We can now link even smaller Python binaries. For example, the hello.com
program in the Python build directory is a compiled linked executable of
hello.py which just prints hello world. Using decentralized sections, we
can make that binary 1.9mb in size (noting that python.com is 6.3 megs!)

This works for nontrivial programs too. For example, say we want an APE
binary that's equivalent to python.com -m http.server. Our makefile now
builds such a binary using the new launcher and it's only 3.2mb in size
since Python sources get turned into ELF objects, which tell our linker
that we need things like native hashing algorithm code.
2021-09-07 11:40:11 -07:00
Justine Tunney
dfa0359b50 Exclude .py files in MODE=rel / tiny 2021-09-06 19:34:57 -07:00
Justine Tunney
4f41f2184d Improve Python tree-shaking 2021-09-06 19:24:10 -07:00
Justine Tunney
44c87b83ff Implement tree-shaking for Python sources 2021-09-05 01:20:03 -07:00
Justine Tunney
81287b7ec0 Introduce Python objectifier (#259) 2021-09-04 15:44:00 -07:00
Justine Tunney
a81192e0b9 Fix some build breaks 2021-09-04 02:29:57 -07:00
Gautham
27f7ffd4fd
Add speedups from pyston (#264)
This should make Python go 30% faster. It does that by trading
away some debuggability, like _tracemalloc. It can be re-enabled
using `make MODE=dbg`.
2021-09-04 02:21:37 -07:00
Justine Tunney
5b60e5a37d Fix termios struct on Linux
The termios::c_cc field turned out to be incorrectly defined on Linux
due to some confusion between the glibc and kernel definitions. We'll
be using the kernel definition, since it has the strongest consensus.

Fields have been have been added to struct stat for BSD compatibility
such as st_birthtim, plus the GLIBC compatibility of isystem/sys/stat
has been improved.
2021-09-03 22:19:41 -07:00
Justine Tunney
50937be752 Fix select() on Windows for timeout (#141) 2021-08-26 15:59:55 -07:00
Justine Tunney
3085ac7837 Improve system call support 2021-08-25 21:36:17 -07:00
Gautham
63b867bd2f
Added _multiprocessing to Python (#259)
Also changed some PYTHON_YOINKs so that http.server would work in
MODE=tiny.
2021-08-25 19:45:59 -07:00
Justine Tunney
7d25fb0090 Import some Lua documentation
I personally find it easier to read the documentation in Emacs
using JavaDoc style comments.
2021-08-22 15:03:04 -07:00
Justine Tunney
00611e9b06 Improve ZIP filesystem and change its prefix
The ZIP filesystem has a breaking change. You now need to use /zip/ to
open() / opendir() / etc. assets within the ZIP structure of your APE
binary, instead of the previous convention of using zip: or zip! URIs.
This is needed because Python likes to use absolute paths, and having
ZIP paths encoded like URIs simply broke too many things.

Many more system calls have been updated to be able to operate on ZIP
files and file descriptors. In particular fcntl() and ioctl() since
Python would do things like ask if a ZIP file is a terminal and get
confused when the old implementation mistakenly said yes, because the
fastest way to guarantee native file descriptors is to dup(2). This
change also improves the async signal safety of zipos and ensures it
doesn't maintain any open file descriptors beyond that which the user
has opened.

This change makes a lot of progress towards adding magic numbers that
are specific to platforms other than Linux. The philosophy here is that,
if you use an operating system like FreeBSD, then you should be able to
take advantage of FreeBSD exclusive features, even if we don't polyfill
them on other platforms. For example, you can now open() a file with the
O_VERIFY flag. If your program runs on other platforms, then Cosmo will
automatically set O_VERIFY to zero. This lets you safely use it without
the need for #ifdef or ifstatements which detract from readability.

One of the blindspots of the ASAN memory hardening we use to offer Rust
like assurances has always been that memory passed to the kernel via
system calls (e.g. writev) can't be checked automatically since the
kernel wasn't built with MODE=asan. This change makes more progress
ensuring that each system call will verify the soundness of memory
before it's passed to the kernel. The code for doing these checks is
fast, particularly for buffers, where it can verify 64 bytes a cycle.

- Correct O_LOOP definition on NT
- Introduce program_executable_name
- Add ASAN guards to more system calls
- Improve termios compatibility with BSDs
- Fix bug in Windows auxiliary value encoding
- Add BSD and XNU specific errnos and open flags
- Add check to ensure build doesn't talk to internet
2021-08-22 01:11:53 -07:00
Justine Tunney
8af197560e Improve Libc by making Python work even better
Actually Portable Python is now outperforming the Python binaries
that come bundled with Linux distros, at things like HTTP serving.
You can now have a fully featured Python install in just one .com
file that runs on six operating systems and is about 10mb in size.
With tuning, the tiniest is ~1mb. We've got most of the libraries
working, including pysqlite, and the repl now feels very pleasant.
The things you can't do quite yet are: threads and shared objects
but that can happen in the future, if the community falls in love
with this project and wants to see it developed further. Changes:

- Add siginterrupt()
- Add sqlite3 to Python
- Add issymlink() helper
- Make GetZipCdir() faster
- Add tgamma() and finite()
- Add legacy function lutimes()
- Add readlink() and realpath()
- Use heap allocations when appropriate
- Reorganize Python into two-stage build
- Save Lua / Python shell history to dotfile
- Integrate Python Lib embedding into linkage
- Make isregularfile() and isdirectory() go faster
- Make Python shell auto-completion work perfectly
- Make crash reports work better if changed directory
- Fix Python+NT open() / access() flag overflow error
- Disable Python tests relating to \N{LONG NAME} syntax
- Have Python REPL copyright() show all notice embeddings

The biggest technical challenge at the moment is working around
when Python tries to be too clever about filenames.
2021-08-18 22:16:23 -07:00
Gautham
98ccbf44b1 Tell _frozen_importlib to consider bytecode first (#248) 2021-08-18 21:58:05 -07:00
Justine Tunney
ebb8c85496 Experiment with making Python go faster
The goal is to put the compiled pyc files in the APE ZIP.
2021-08-18 21:57:11 -07:00
Justine Tunney
bc464a8898 Fix a few more Python tests 2021-08-16 23:47:47 -07:00
Justine Tunney
59e1c245d1 Get more Python tests passing (#141) 2021-08-16 15:26:31 -07:00
Justine Tunney
5029e20bef Improve linenoise and get it working on Windows
Some progress has been made on introducing completion but there's been
difficulties using the Python C API to get local shell variables.
2021-08-15 14:34:05 -07:00
Justine Tunney
228fb7428b Improve isystem includes and magic numbers 2021-08-14 23:36:36 -07:00
Justine Tunney
1e5bd4d23e Ues linenoise in Lua, Python, and SQLite 2021-08-14 11:26:23 -07:00
Justine Tunney
e963d9c8e3 Add cpu / mem / fsz limits to build system
Thanks to all the refactorings we now have the ability to enforce
reasonable limitations on the amount of resources any individual
compile or test can consume. Those limits are currently:

- `-C 8` seconds of 3.1ghz CPU time
- `-M 256mebibytes` of virtual memory
- `-F 100megabyte` limit on file size

Only one file currently needs to exceed these limits:

    o/$(MODE)/third_party/python/Objects/unicodeobject.o: \
        QUOTA += -C16  # overrides cpu limit to 16 seconds

This change introduces a new sizetol() function to LIBC_FMT for parsing
byte or bit size strings with Si unit suffixes. Functions like atoi()
have been rewritten too.
2021-08-13 23:40:53 -07:00
Justine Tunney
9b29358511 Make whitespace changes
Status lines for Emacs and Vim have been added to Python sources so
they'll be easier to edit using Python's preferred coding style.

Some DNS helper functions have been broken up into multiple files. It's
nice to have one function per file whenever possible, since that way we
don't need -ffunction-sections.  Another reason it's good to have small
source files, is because the build will be enforcing resource limits on
compilation and testing soon.
2021-08-13 03:20:45 -07:00
Gautham
1aa0df696c
Test changes to Actually Portable Python (#240)
- Add missing `os.pipe` and `os.getuid`
- Commented out _dummy_thread from Lib/threading.py so tests
  don't simulate multi-threading and waste time/error out
- Revert test_hashlib to avoid blake2
2021-08-13 02:24:43 -07:00
Justine Tunney
b420ed8248 Undiamond Python headers
This change gets the Python codebase into a state where it conforms to
the conventions of this codebase. It's now possible to include headers
from Python, without worrying about ordering. Python has traditionally
solved that problem by "diamonding" everything in Python.h, but that's
problematic since it means any change to any Python header invalidates
all the build artifacts. Lastly it makes tooling not work. Since it is
hard to explain to Emacs when I press C-c C-h to add an import line it
shouldn't add the header that actually defines the symbol, and instead
do follow the nonstandard Python convention.

Progress has been made on letting Python load source code from the zip
executable structure via the standard C library APIs. System calss now
recognizes zip!FILENAME alternative URIs as equivalent to zip:FILENAME
since Python uses colon as its delimiter.

Some progress has been made on embedding the notice license terms into
the Python object code. This is easier said than done since Python has
an extremely complicated ownership story.

- Some termios APIs have been added
- Implement rewinddir() dirstream API
- GetCpuCount() API added to Cosmopolitan Libc
- More bugs in Cosmopolitan Libc have been fixed
- zipobj.com now has flags for mangling the path
- Fixed bug a priori with sendfile() on certain BSDs
- Polyfill F_DUPFD and F_DUPFD_CLOEXEC across platforms
- FIOCLEX / FIONCLEX now polyfilled for fast O_CLOEXEC changes
- APE now supports a hybrid solution to no-self-modify for builds
- Many BSD-only magnums added, e.g. O_SEARCH, O_SHLOCK, SF_NODISKIO
2021-08-12 14:07:40 -07:00
Gautham
9454788223
Fix sysconfig check for build vars (#238) 2021-08-11 23:21:54 -07:00
Justine Tunney
d26d7ae0e4 Perform build and magnum tuning
Building o//third_party/python now takes 5 seconds on my PC

This change works towards modifying Python to use runtime dispatching
when appropriate. For example, when loading the magnums in the socket
module, it's a good idea to check if the magnum is zero, because that
means the local system platform doesn't support it.
2021-08-10 10:26:13 -07:00
Justine Tunney
b703eee96e Fix obvious Python performance suboptimality 2021-08-09 10:41:06 -07:00
Justine Tunney
5a441ea57f Remove Python gitignore 2021-08-09 09:15:23 -07:00
Justine Tunney
10aade69e3 Remove wildcard from Python build config
It's important for build performance to use := rather than = notation so
that $(wildcard foo/*) isn't a lazily evaluated lambda. In the case of
Python where we need a lot of tuning and excludes, it should help to
spell things out a bit more to just not use wildcard for now.
2021-08-09 08:59:18 -07:00
Justine Tunney
798d542f15 Fix build and delete superfluous files
- Make Python make formatting pristine
- Add missing `#pragma weak` to Python source
- Fix Clang script flake due to missing directory
2021-08-09 06:57:14 -07:00
ahgamut
295b3d6ca5 removed unnecessary libs from zip store 2021-08-09 05:39:42 -07:00