This change reinvents all the GNU Readline features I discovered that I
couldn't live without, e.g. UTF-8, CTRL-R search and CTRL-Y yanking. It
now feels just as good in terms of user interface from the subconscious
workflow perspective. It's real nice to finally have an embeddable line
reader that's actually good with a 30 kb footprint and a bsd-2 license.
This change adds a directory to the examples folder, explaining how the
new Python compiler may be used. Some of the bugs with Python binaries
have been addressed but overall it's still a work in progress.
We can now link even smaller Python binaries. For example, the hello.com
program in the Python build directory is a compiled linked executable of
hello.py which just prints hello world. Using decentralized sections, we
can make that binary 1.9mb in size (noting that python.com is 6.3 megs!)
This works for nontrivial programs too. For example, say we want an APE
binary that's equivalent to python.com -m http.server. Our makefile now
builds such a binary using the new launcher and it's only 3.2mb in size
since Python sources get turned into ELF objects, which tell our linker
that we need things like native hashing algorithm code.
The termios::c_cc field turned out to be incorrectly defined on Linux
due to some confusion between the glibc and kernel definitions. We'll
be using the kernel definition, since it has the strongest consensus.
Fields have been have been added to struct stat for BSD compatibility
such as st_birthtim, plus the GLIBC compatibility of isystem/sys/stat
has been improved.
The ZIP filesystem has a breaking change. You now need to use /zip/ to
open() / opendir() / etc. assets within the ZIP structure of your APE
binary, instead of the previous convention of using zip: or zip! URIs.
This is needed because Python likes to use absolute paths, and having
ZIP paths encoded like URIs simply broke too many things.
Many more system calls have been updated to be able to operate on ZIP
files and file descriptors. In particular fcntl() and ioctl() since
Python would do things like ask if a ZIP file is a terminal and get
confused when the old implementation mistakenly said yes, because the
fastest way to guarantee native file descriptors is to dup(2). This
change also improves the async signal safety of zipos and ensures it
doesn't maintain any open file descriptors beyond that which the user
has opened.
This change makes a lot of progress towards adding magic numbers that
are specific to platforms other than Linux. The philosophy here is that,
if you use an operating system like FreeBSD, then you should be able to
take advantage of FreeBSD exclusive features, even if we don't polyfill
them on other platforms. For example, you can now open() a file with the
O_VERIFY flag. If your program runs on other platforms, then Cosmo will
automatically set O_VERIFY to zero. This lets you safely use it without
the need for #ifdef or ifstatements which detract from readability.
One of the blindspots of the ASAN memory hardening we use to offer Rust
like assurances has always been that memory passed to the kernel via
system calls (e.g. writev) can't be checked automatically since the
kernel wasn't built with MODE=asan. This change makes more progress
ensuring that each system call will verify the soundness of memory
before it's passed to the kernel. The code for doing these checks is
fast, particularly for buffers, where it can verify 64 bytes a cycle.
- Correct O_LOOP definition on NT
- Introduce program_executable_name
- Add ASAN guards to more system calls
- Improve termios compatibility with BSDs
- Fix bug in Windows auxiliary value encoding
- Add BSD and XNU specific errnos and open flags
- Add check to ensure build doesn't talk to internet
Actually Portable Python is now outperforming the Python binaries
that come bundled with Linux distros, at things like HTTP serving.
You can now have a fully featured Python install in just one .com
file that runs on six operating systems and is about 10mb in size.
With tuning, the tiniest is ~1mb. We've got most of the libraries
working, including pysqlite, and the repl now feels very pleasant.
The things you can't do quite yet are: threads and shared objects
but that can happen in the future, if the community falls in love
with this project and wants to see it developed further. Changes:
- Add siginterrupt()
- Add sqlite3 to Python
- Add issymlink() helper
- Make GetZipCdir() faster
- Add tgamma() and finite()
- Add legacy function lutimes()
- Add readlink() and realpath()
- Use heap allocations when appropriate
- Reorganize Python into two-stage build
- Save Lua / Python shell history to dotfile
- Integrate Python Lib embedding into linkage
- Make isregularfile() and isdirectory() go faster
- Make Python shell auto-completion work perfectly
- Make crash reports work better if changed directory
- Fix Python+NT open() / access() flag overflow error
- Disable Python tests relating to \N{LONG NAME} syntax
- Have Python REPL copyright() show all notice embeddings
The biggest technical challenge at the moment is working around
when Python tries to be too clever about filenames.
Thanks to all the refactorings we now have the ability to enforce
reasonable limitations on the amount of resources any individual
compile or test can consume. Those limits are currently:
- `-C 8` seconds of 3.1ghz CPU time
- `-M 256mebibytes` of virtual memory
- `-F 100megabyte` limit on file size
Only one file currently needs to exceed these limits:
o/$(MODE)/third_party/python/Objects/unicodeobject.o: \
QUOTA += -C16 # overrides cpu limit to 16 seconds
This change introduces a new sizetol() function to LIBC_FMT for parsing
byte or bit size strings with Si unit suffixes. Functions like atoi()
have been rewritten too.
Status lines for Emacs and Vim have been added to Python sources so
they'll be easier to edit using Python's preferred coding style.
Some DNS helper functions have been broken up into multiple files. It's
nice to have one function per file whenever possible, since that way we
don't need -ffunction-sections. Another reason it's good to have small
source files, is because the build will be enforcing resource limits on
compilation and testing soon.
- Add missing `os.pipe` and `os.getuid`
- Commented out _dummy_thread from Lib/threading.py so tests
don't simulate multi-threading and waste time/error out
- Revert test_hashlib to avoid blake2
This change gets the Python codebase into a state where it conforms to
the conventions of this codebase. It's now possible to include headers
from Python, without worrying about ordering. Python has traditionally
solved that problem by "diamonding" everything in Python.h, but that's
problematic since it means any change to any Python header invalidates
all the build artifacts. Lastly it makes tooling not work. Since it is
hard to explain to Emacs when I press C-c C-h to add an import line it
shouldn't add the header that actually defines the symbol, and instead
do follow the nonstandard Python convention.
Progress has been made on letting Python load source code from the zip
executable structure via the standard C library APIs. System calss now
recognizes zip!FILENAME alternative URIs as equivalent to zip:FILENAME
since Python uses colon as its delimiter.
Some progress has been made on embedding the notice license terms into
the Python object code. This is easier said than done since Python has
an extremely complicated ownership story.
- Some termios APIs have been added
- Implement rewinddir() dirstream API
- GetCpuCount() API added to Cosmopolitan Libc
- More bugs in Cosmopolitan Libc have been fixed
- zipobj.com now has flags for mangling the path
- Fixed bug a priori with sendfile() on certain BSDs
- Polyfill F_DUPFD and F_DUPFD_CLOEXEC across platforms
- FIOCLEX / FIONCLEX now polyfilled for fast O_CLOEXEC changes
- APE now supports a hybrid solution to no-self-modify for builds
- Many BSD-only magnums added, e.g. O_SEARCH, O_SHLOCK, SF_NODISKIO
Building o//third_party/python now takes 5 seconds on my PC
This change works towards modifying Python to use runtime dispatching
when appropriate. For example, when loading the magnums in the socket
module, it's a good idea to check if the magnum is zero, because that
means the local system platform doesn't support it.
It's important for build performance to use := rather than = notation so
that $(wildcard foo/*) isn't a lazily evaluated lambda. In the case of
Python where we need a lot of tuning and excludes, it should help to
spell things out a bit more to just not use wildcard for now.
Modules/Setup and Modules/Setup.local contain the build recipes for
various extensions, wrote a custom script to translate them for
python.mk. Modules/config.c needs to be changed if any extensions are
removed or added.
Most of the source modifications are for missing headers or compile time
build vars like ABIFLAGS.
Created separate mk files for the C extensions and the Python stdlib.
Can use find for adding the python files to the APE ZIP store, but right
now necessary files are just hardcoded.
python.com loads but some build configs are still missing (showing 1 Jan
1970 as time of compilation).
These are the commits from
https://github.com/ahgamut/cpython/tree/cosmo_py36 squashed for
simplicity.
Also included is the pyconfig.h used for compilation. The pyconfig.h has
to be changed manually in case Cosmopolitan gets new features.