Initial import

This commit is contained in:
Justine Tunney 2020-06-15 07:18:57 -07:00
commit c91b3c5006
14915 changed files with 590219 additions and 0 deletions

26
third_party/avir/LICENSE vendored Normal file

@@ -0,0 +1,26 @@
AVIR License Agreement
The MIT License (MIT)
AVIR Copyright (c) 2015-2019 Aleksey Vaneev
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Please credit the author of this library in your documentation in the
following way: "AVIR image resizing algorithm designed by Aleksey Vaneev"

5
third_party/avir/README.cosmo vendored Normal file

@@ -0,0 +1,5 @@
commit 7dd9515ef6aed6fb6d565ee12754703bdc46b3b0
Author: Aleksey Vaneev <aleksey.vaneev@gmail.com>
Date: Mon Jul 29 07:43:23 2019 +0300
Version 2.4 release.

367
third_party/avir/README.md vendored Normal file

@@ -0,0 +1,367 @@
# AVIR #
## Introduction ##
Keywords: image resize, image resizer, image resizing, image scaling,
image scaler, image resize c++, image resizer c++
Please consider supporting the author on [Patreon](https://www.patreon.com/aleksey_vaneev).
I, Aleksey Vaneev, am happy to offer you an open-source image resizing /
scaling library which has reached a production level of quality, and is
ready to be incorporated into any project. This library features routines
for both down- and upsizing of 8- and 16-bit, 1- to 4-channel images. The
image resizing routines are implemented in multi-platform C++ code and are
highly optimized. Besides resizing, this library offers a sub-pixel shift
operation. Built-in sRGB gamma correction is available.
The resizing algorithm first produces a 2X upsized image (relative to the
source image size, or relative to the destination image size if downsizing is
performed) and then performs interpolation using a bank of sinc-function-based
fractional delay filters. At the last stage a correction filter is applied
which fixes the smoothing introduced in the previous steps.
The resizing algorithm was designed to provide the best visual quality. The
author even believes this algorithm provides the "ultimate" level of quality
(for an orthogonal resizing) which cannot be increased further: no math exists
that provides a better frequency response and better anti-aliasing quality
while at the same time producing fewer ringing artifacts. These are the 3
elements that define any resizing algorithm's quality; in AVIR practice these
elements correlate highly with each other, so they can be represented by a
single parameter (AVIR offers several parameter sets with varying quality).
The algorithm's time performance turned out to be very good as well (for the
"ultimate" image quality).
An important element utilized by this algorithm is the so-called Peaked Cosine
window function, which is applied over the sinc function in all filters. Please
consult the documentation for more details.
Note that since AVIR implements orthogonal resizing, it may exhibit diagonal
aliasing artifacts. These artifacts are usually suppressed by EWA or radial
filtering techniques. An EWA-like technique is not implemented in AVIR because
it requires considerably more computing resources and may produce a blurred
image.
As a bonus, a faster `LANCIR` image resizing algorithm is also offered as a
part of this library. But the main focus of this documentation is the original
AVIR image resizing algorithm.
AVIR does not offer affine and non-linear image transformations "out of the
box". Since upsizing is a relatively fast operation in AVIR (the required time
scales linearly with the output image area), affine and non-linear
transformations can be implemented in steps: 4- to 8-times upsizing,
transformation via bilinear interpolation, then downsizing (linear proportional
affine transformations can probably skip the downsizing step). This should not
compromise the transformation quality much, as bilinear interpolation's
problems will mostly reside in the spectral area without useful signal, with a
maximum of 0.7 dB high-frequency attenuation for 4-times upsizing, and 0.17 dB
attenuation for 8-times upsizing. This approach is probably as time-efficient
as performing a high-quality transform over the input image directly (the only
serious drawback is the increased memory requirement). Note that affine
transformations that change image proportions should apply the proportion
change during the upsizing step.
*AVIR is devoted to women. Your digital photos can look good at any size!*
## Requirements ##
C++ compiler and system with efficient "float" floating point (24-bit
mantissa) type support. This library can also internally use the "double" and
SIMD floating point types during resizing if needed. This library does not
have dependencies besides the standard C library.
## Links ##
* [Documentation](https://www.voxengo.com/public/avir/Documentation/)
## Usage Information ##
The image resizer is represented by the `avir::CImageResizer<>` class, which
is a single front-end class for the whole library. Basically, you do not need
to use or understand any other classes besides this one.
The code of the library resides in the "avir" C++ namespace, effectively
isolating it from all other code. The code is thread-safe. You need just
a single resizer object per running application, at any time, even when
resizing images concurrently.
To resize images in your application, simply add 3 lines of code:
#include "avir.h"
avir :: CImageResizer<> ImageResizer( 8 );
ImageResizer.resizeImage( InBuf, 640, 480, 0, OutBuf, 1024, 768, 3, 0 );
(multi-threaded operation requires additional coding, see the documentation)
For low-ringing performance:
    avir :: CImageResizer<> ImageResizer( 8, 0, avir :: CImageResizerParamsLR() );
To use the built-in gamma correction, an object of the
`avir::CImageResizerVars` class with its variable `UseSRGBGamma` set to "true"
should be supplied to the `resizeImage()` function. Note that the gamma
correction is applied to all channels (e.g. alpha-channel) in the current
implementation.
    avir :: CImageResizerVars Vars;
    Vars.UseSRGBGamma = true;
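The variables object is then passed to `resizeImage()`; a minimal sketch,
assuming the function accepts a pointer to it as an optional trailing argument
(buffer names and sizes are the placeholders from the first example above):

    ImageResizer.resizeImage( InBuf, 640, 480, 0, OutBuf, 1024, 768, 3, 0,
        &Vars ); // sRGB gamma is applied to all channels, including alpha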
Dithering (error-diffusion dither which is perceptually good) can be enabled
this way:
    typedef avir :: fpclass_def< float, float,
        avir :: CImageResizerDithererErrdINL< float > > fpclass_dith;
    avir :: CImageResizer< fpclass_dith > ImageResizer( 8 );
The library is able to process images of any bit depth: this includes 8-bit,
16-bit, float and double types. Larger integer and signed integer types are
not supported. Supported source and destination image sizes are only limited
by the available system memory.
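For example, here is a minimal sketch of a 16-bit resize; the buffer names and
sizes are placeholders, and it assumes (by analogy with the 8-bit example
above) that the constructor's first argument selects the output bit resolution
while the element type is deduced from the buffer pointers:

    uint16_t* InBuf16;  // 640x480, 3-channel source (assumed allocated)
    uint16_t* OutBuf16; // 1024x768, 3-channel destination (assumed allocated)
    avir :: CImageResizer<> ImageResizer16( 16 );
    ImageResizer16.resizeImage( InBuf16, 640, 480, 0, OutBuf16, 1024, 768,
        3, 0 );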
The code of this library was commented in the [Doxygen](http://www.doxygen.org/)
style. To generate the documentation locally you may run the
`doxygen ./other/avirdoxy.txt` command from the library's directory. Note that
the code is documented in enough detail to allow you to make modifications and
to gain a full understanding of the algorithm.
Preliminary tests show that this library (compiled with Intel C++ Compiler
18.2 with AVX2 instructions enabled, without explicit SIMD resizing code) can
resize an 8-bit RGB 5184x3456 (17.9 Mpixel) 3-channel image down to a 1920x1280
(2.5 Mpixel) image in 245 milliseconds, utilizing a single thread, on an Intel
Core i7-7700K processor-based system without overclocking. This scales down to
74 milliseconds if 8 threads are utilized.
Multi-threaded operation is not provided by this library "out of the box".
The multi-threaded (horizontally-threaded) infrastructure is available, but
requires additional system-specific interfacing code for engagement.
## SIMD Usage Information ##
This library is capable of using SIMD floating point types for internal
variables. This means that up to 4 color channels can be processed in
parallel. Since the default interleaved processing algorithm itself remains
non-SIMD, the use of SIMD internal types is not practical for 1- and 2-channel
image resizing (due to overhead). A SIMD internal type can be used this way:
#include "avir_float4_sse.h"
avir :: CImageResizer< avir :: fpclass_float4 > ImageResizer( 8 );
For 1-channel and 2-channel image resizing, when AVX instructions are allowed,
it may be reasonable to utilize the de-interleaved SIMD processing algorithm.
While it gives no performance benefit if the "float4" SSE processing type is
used, it offers some performance boost if the "float8" AVX processing type is
used (provided dithering is not performed; otherwise performance is reduced at
the dithering stage since recursive dithering cannot be parallelized). The
internal type remains non-SIMD "float". The de-interleaved algorithm can be
used this way:
#include "avir_float8_avx.h"
avir :: CImageResizer< avir :: fpclass_float8_dil > ImageResizer( 8 );
It is important to note that on the latest Intel processors (i7-7700K and
probably later) the use of the aforementioned SIMD-specific resizing code may
not be justifiable, or may even be counter-productive, due to many factors: the
memory bandwidth bottleneck, increased efficiency of the processor's circuitry
utilization and out-of-order execution, and automatic SIMD optimizations
performed by the compiler. This is at least true when compiling 64-bit code
with Intel C++ Compiler 18.2 with /QxSSE4.2, or especially with the
/QxCORE-AVX2 option. SSE-specific resizing code may still be a little more
efficient for 4-channel image resizing.
## Notes ##
This library was tested for compatibility with [GNU C++](http://gcc.gnu.org/),
[Microsoft Visual C++](http://www.microsoft.com/visualstudio/eng/products/visual-studio-express-products)
and [Intel C++](http://software.intel.com/en-us/c-compilers) compilers, on 32-
and 64-bit Windows, macOS and CentOS Linux. The code was also tested with
Dr.Memory/Win32 for the absence of uninitialized or unaddressable memory
accesses.
All code is fully "inline", without the need to compile any source files. The
memory footprint of the library itself is very modest, except that the size of
the temporary image buffers depends on the input and output image sizes, and
is proportionally large.
The "heart" of resizing algorithm's quality resides in the parameters defined
via the `avir::CImageResizerParams` structure. While the default set of
parameters that offers a good quality was already provided, there is
(probably) still a place for improvement exists, and the default parameters
may change in a future update. If you need to recall an exact set of
parameters, simply save them locally for a later use.
When the algorithm is run with no resizing applied (k=1), the result will not
be an exact copy of the source image, but a very close one. The reason for this
inexactness is that the image is always low-pass filtered first to reduce
aliasing during subsequent resizing, and filtered by a correction filter at the
end. This approach allows the algorithm to maintain a stable level of quality
regardless of the resizing "k" factor used.
This library includes a binary command line tool "imageresize" for major
desktop platforms. This tool was designed to serve as a demonstration of the
library's performance and as a reference; it is multi-threaded (the `-t`
switch can be used to control the number of threads utilized). This tool uses
plain "float" processing (no explicit SIMD) and relies on automatic compiler
optimization (with the Win64 binary being the "main" binary, as it was compiled
with the best ICC optimization options available at the time). This tool uses
the following libraries:
* turbojpeg Copyright (c) 2009-2013 D. R. Commander
* libpng Copyright (c) 1998-2013 Glenn Randers-Pehrson
* zlib Copyright (c) 1995-2013 Jean-loup Gailly and Mark Adler
Note that you can enable gamma-correction with the `-g` switch. However,
sometimes gamma-correction produces "greenish/reddish/bluish haze" since
low-amplitude oscillations produced by resizing at object boundaries are
amplified by gamma correction. This can also have the effect of reduced
contrast.
## Interpolation Discussion ##
The use of certain low-pass filters and 2X upsampling in this library is
hardly debatable, because they are needed to attain a certain anti-aliasing
effect and keep ringing artifacts low. But the use of a sinc-function-based
interpolation filter that is 18 taps long (it may be longer, up to 36 taps in
practice) can be questioned, because even in the 0th-order case such an
interpolation filter requires 18 multiply-add operations. Comparatively, an
optimal Hermite or cubic interpolation spline requires 8 multiply and 11 add
operations.
One of the reasons the 18-tap filter is preferred is that, due to memory
bandwidth limitations, using a lower-order filter does not provide any
significant performance increase (e.g. a 14-tap filter is less than 5% more
efficient overall). At the same time, in comparison to a cubic spline, the
18-tap filter embeds a low-pass filter that rejects signal above 0.5\*pi (it
provides additional anti-aliasing filtering), and this filter has a consistent
shape at all fractional offsets. Splines have a varying low-pass filter shape
at different fractional offsets (e.g. no low-pass filtering at the 0.0 offset,
and maximal low-pass filtering at the 0.5 offset). The 18-tap filter also
offers superior stop-band attenuation which almost guarantees the absence of
artifacts if the image is considerably sharpened afterwards.
## Why 2X upsizing in AVIR? ##
Classic approaches to image resizing do not perform an additional 2X upsizing.
So why is such upsizing needed at all in AVIR? Indeed, image resizing can be
implemented using a single interpolation filter which is applied to the source
image directly. However, such an approach has limitations:
First of all, speaking about non-2X-upsized resizing, during upsizing the
interpolation filter has to be tuned to a frequency close to pi (Nyquist) in
order to reduce high-frequency smoothing: this reduces the space left for
filter optimization. Besides that, during downsizing, a filter that performs
well and predictably when tuned to frequencies close to the Nyquist frequency
may become distorted in its spectral shape when it is tuned to lower
frequencies. That is why it is usually a good idea to have the filter's
stop-band begin below Nyquist so that the transition band's shape remains
stable at any lower-frequency setting. At the same time, this requirement
complicates further corrective filtering, because the correction filter may
become too steep at the point where the stop-band begins.
Secondly, speaking about non-2X-upsized resizing, the filter has to be very
short (with a base length of 5-7 taps, further multiplied by the resizing
factor), or otherwise the ringing artifacts will be very strong: it is a
general rule that the steeper the filter is around the signal frequencies being
removed, the stronger the ringing artifacts are. That is why it is preferable
to move steep transitions into the spectral area with a quieter signal. A short
filter also means it cannot provide strong "beyond-Nyquist" stop-band
attenuation, so an interpolated image will look a bit edgy or not very clean
due to stop-band artifacts.
To sum up, only an additional controlled 2X upsizing provides enough spectral
space to design an interpolation filter without visible ringing artifacts that
still provides strong stop-band attenuation and stable spectral characteristics
(good at any resizing "k" factor). Moreover, 2X upsizing becomes very important
in maintaining good resizing quality when downsizing or upsizing by small "k"
factors, in the range 0.5 to 2: resizing approaches that do not perform 2X
upsizing usually cannot design a good interpolation filter for such factors
simply because there is not enough spectral space available.
## Why Peaked Cosine in AVIR? ##
First of all, AVIR is a general solution to the image resizing problem. That
is why it should not be directly compared to "spline interpolation" or
"Lanczos resampling", because the latter two are only means of designing
interpolation filters, and they can be implemented in a variety of ways, even
in sub-optimal ways. Secondly, with only minimal effort AVIR can be changed to
use any existing interpolation formula and any window function, but this is
just not needed.
An effort was made to compare Peaked Cosine to the Lanczos window function,
and here is the author's opinion. Peaked Cosine has two degrees of freedom
whereas Lanczos has one. While both functions can be used with acceptable
results, the Peaked Cosine window function used in automatic parameter
optimization really pushes the limits of frequency response linearity,
anti-aliasing strength (stop-band attenuation) and low-ringing performance,
which Lanczos cannot usually achieve. This is true at least when using a
general-purpose downhill simplex optimization method. The Lanczos window has
good (but not better) characteristics in several special cases (certain "k"
factors), which makes it of limited use in a general solution such as AVIR.
Among other window functions (Kaiser, Gaussian, Cauchy, Poisson, generalized
cosine windows) there are no better candidates either. It looks like the Peaked
Cosine function's scalability (it retains stable, almost continuously-variable
spectral characteristics at any window parameter values) and its ability to
create "desirable" pass-band ripple in the frequency response near the cutoff
point contribute to its better overall quality. Somehow Peaked Cosine window
function optimization manages to converge to reasonable states in most cases
(that is why the AVIR library comes with a set of equally robust, but
distinctive parameter sets), whereas all other window functions tend to produce
unpredictable optimization results.
The only disadvantage of the Peaked Cosine window function is that usable
filters windowed by this function tend to be longer than "usual" (with the
Kaiser window being the "gold standard" for filter length per decibel of
stop-band attenuation). This is the price that has to be paid for stable
spectral characteristics.
## LANCIR ##
As a part of the AVIR library, the `CLancIR` class is also offered, which is
an optimized implementation of the *Lanczos* image resizing filter. This class
has a programmatic interface similar to AVIR's, but it is not thread-safe: each
executing thread should have its own `CLancIR` object. This class was designed
for batch processing of same-sized frames, as in video encoding.
LANCIR offers up to 200% faster image resizing in comparison to AVIR. The
quality difference is, however, debatable. Note that while LANCIR can take
8- and 16-bit and float image buffers, its precision is limited to 8-bit
resizing.
LANCIR should be seen as a bonus and as a kind of quality comparison. LANCIR
uses a Lanczos filter "a" parameter equal to 3, which is similar to AVIR's
default setting.
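A minimal usage sketch, mirroring the `CLancIR::resizeImage()` call made by
`third_party/avir/lanczos.cc` in this commit; buffer names, sizes and the
include path are placeholders, and the source scanline size is passed
explicitly here:

    #include "lancir.h"
    avir :: CLancIR LanczosResizer; // one object per thread; not thread-safe
    LanczosResizer.resizeImage( InBuf, 640, 480, 640 * 3, OutBuf,
        1024, 768, 3 );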
## Change log ##
Version 2.4:
* Removed outdated `_mm_reset()` function calls from the SIMD code.
* Changed `float4 round()` to use SSE2 rounding features, avoiding use of
64-bit registers.
Version 2.3:
* Implemented CLancIR image resizing algorithm.
* Fixed a minor image offset on image upsizing.
Version 2.2:
* Released AVIR under a permissive MIT license agreement.
Version 2.1:
* Fixed error-diffusion dither problems introduced in the previous version.
* Added the `-1` switch to `imageresize` to enable 1-bit output for
dither quality evaluation (use together with the `-d` switch).
* Added the `--algparams=` switch to `imageresize` to control resizing
quality (replaces the `--low-ring` switch).
* Added `avir :: CImageResizerParamsULR` parameter set for lowest-ringing
performance possible (not considerably different to
`avir :: CImageResizerParamsLR`, but a bit lower ringing).
Version 2.0:
* Minor inner loop optimizations.
* Lifted the supported image size constraint by switching buffer addressing
from `int` to `size_t`; image size is now limited only by the available system
memory.
* Added several useful switches to the `imageresize` utility.
* Now `imageresize` does not apply gamma-correction by default.
* Fixed scaling of bit depth-reduction operation.
* Improved error-diffusion dither's signal-to-noise ratio.
* Compiled binaries with AVX2 instruction set (SSE4 for macOS).
## Users ##
This library is used by:
* [Contaware.com](http://www.contaware.com/)
Please drop me a note at aleksey.vaneev@gmail.com and I will include a link to
your software product in the list of users. This list is important for
maintaining confidence in this library among interested parties.

17065
third_party/avir/avir.h vendored Normal file

File diff suppressed because it is too large

71
third_party/avir/avir.mk vendored Normal file

@@ -0,0 +1,71 @@
#-*-mode:makefile-gmake;indent-tabs-mode:t;tab-width:8;coding:utf-8-*-┐
#───vi: set et ft=make ts=8 tw=8 fenc=utf-8 :vi───────────────────────┘
PKGS += THIRD_PARTY_AVIR
THIRD_PARTY_AVIR_ARTIFACTS += THIRD_PARTY_AVIR_A
THIRD_PARTY_AVIR = $(THIRD_PARTY_AVIR_A_DEPS) $(THIRD_PARTY_AVIR_A)
THIRD_PARTY_AVIR_A = o/$(MODE)/third_party/avir/avir.a
THIRD_PARTY_AVIR_A_CHECKS = $(THIRD_PARTY_AVIR_A).pkg
THIRD_PARTY_AVIR_A_FILES := $(wildcard third_party/avir/*)
THIRD_PARTY_AVIR_A_SRCS_S = $(filter %.S,$(THIRD_PARTY_AVIR_A_FILES))
THIRD_PARTY_AVIR_A_SRCS_C = $(filter %.c,$(THIRD_PARTY_AVIR_A_FILES))
THIRD_PARTY_AVIR_A_SRCS_X = $(filter %.cc,$(THIRD_PARTY_AVIR_A_FILES))
THIRD_PARTY_AVIR_A_HDRS = \
$(filter %.h,$(THIRD_PARTY_AVIR_A_FILES)) \
$(filter %.hpp,$(THIRD_PARTY_AVIR_A_FILES))
THIRD_PARTY_AVIR_A_SRCS = \
$(THIRD_PARTY_AVIR_A_SRCS_S) \
$(THIRD_PARTY_AVIR_A_SRCS_C) \
$(THIRD_PARTY_AVIR_A_SRCS_X)
THIRD_PARTY_AVIR_A_OBJS = \
$(THIRD_PARTY_AVIR_A_SRCS:%=o/$(MODE)/%.zip.o) \
$(THIRD_PARTY_AVIR_A_SRCS_S:%.S=o/$(MODE)/%.o) \
$(THIRD_PARTY_AVIR_A_SRCS_C:%.c=o/$(MODE)/%.o) \
$(THIRD_PARTY_AVIR_A_SRCS_X:%.cc=o/$(MODE)/%.o)
THIRD_PARTY_AVIR_A_DIRECTDEPS = \
DSP_CORE \
LIBC_NEXGEN32E \
LIBC_BITS \
LIBC_MEM \
LIBC_CALLS \
LIBC_STUBS \
LIBC_SYSV \
LIBC_FMT \
LIBC_UNICODE \
LIBC_LOG \
LIBC_TINYMATH
$(THIRD_PARTY_AVIR_A).pkg: \
$(THIRD_PARTY_AVIR_A_OBJS) \
$(foreach x,$(THIRD_PARTY_AVIR_A_DIRECTDEPS),$($(x)_A).pkg)
$(THIRD_PARTY_AVIR_A): \
third_party/avir/ \
$(THIRD_PARTY_AVIR_A).pkg \
$(THIRD_PARTY_AVIR_A_OBJS)
#o/$(MODE)/third_party/avir/lanczos1b.o: \
CXX = clang++-10
o/$(MODE)/third_party/avir/lanczos1b.o \
o/$(MODE)/third_party/avir/lanczos.o: \
OVERRIDE_CXXFLAGS += \
$(MATHEMATICAL)
THIRD_PARTY_AVIR_A_DEPS := \
$(call uniq,$(foreach x,$(THIRD_PARTY_AVIR_A_DIRECTDEPS),$($(x))))
THIRD_PARTY_AVIR_LIBS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)))
THIRD_PARTY_AVIR_SRCS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_SRCS))
THIRD_PARTY_AVIR_HDRS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_HDRS))
THIRD_PARTY_AVIR_CHECKS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_CHECKS))
THIRD_PARTY_AVIR_OBJS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_OBJS))
THIRD_PARTY_AVIR_TESTS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_TESTS))
.PHONY: o/$(MODE)/third_party/avir
o/$(MODE)/third_party/avir: $(THIRD_PARTY_AVIR_A_CHECKS)

18
third_party/avir/avir1.h vendored Normal file

@@ -0,0 +1,18 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_AVIR1_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_AVIR1_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_
struct avir1 {
void *p;
};
void avir1init(struct avir1 *self);
void avir1free(struct avir1 *self);
void avir1(struct avir1 *resizer, size_t dyn, size_t dxn, void *dst,
size_t dstsize, size_t syn, size_t sxn, size_t ssw, const void *src,
size_t srcsize);
COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_AVIR1_H_ */

1013
third_party/avir/avir_dil.h vendored Normal file

File diff suppressed because it is too large

323
third_party/avir/avir_float4_sse.h vendored Normal file

@@ -0,0 +1,323 @@
/* clang-format off */
//$ nobt
//$ nocpp
/**
* @file avir_float4_sse.h
*
* @brief Inclusion file for the "float4" type.
*
* This file includes the "float4" SSE-based type used for SIMD variable
* storage and processing.
*
* AVIR Copyright (c) 2015-2019 Aleksey Vaneev
*/
#ifndef AVIR_FLOAT4_SSE_INCLUDED
#define AVIR_FLOAT4_SSE_INCLUDED
#include "third_party/avir/avir.h"
#include "libc/bits/mmintrin.h"
#include "libc/bits/xmmintrin.h"
#include "libc/bits/xmmintrin.h"
#include "libc/bits/emmintrin.h"
namespace avir {
/**
* @brief SIMD packed 4-float type.
*
* This class implements a packed 4-float type that can be used to perform
* parallel computation using SIMD instructions on SSE-enabled processors.
* This class can be used as the "fptype" argument of the avir::fpclass_def
* class.
*/
class float4
{
public:
float4()
{
}
float4( const float4& s )
: value( s.value )
{
}
float4( const __m128 s )
: value( s )
{
}
float4( const float s )
: value( _mm_set1_ps( s ))
{
}
float4& operator = ( const float4& s )
{
value = s.value;
return( *this );
}
float4& operator = ( const __m128 s )
{
value = s;
return( *this );
}
float4& operator = ( const float s )
{
value = _mm_set1_ps( s );
return( *this );
}
operator float () const
{
return( _mm_cvtss_f32( value ));
}
/**
* @param p Pointer to memory from where the value should be loaded,
* should be 16-byte aligned.
* @return float4 value loaded from the specified memory location.
*/
static float4 load( const float* const p )
{
return( _mm_load_ps( p ));
}
/**
* @param p Pointer to memory from where the value should be loaded,
* may have any alignment.
* @return float4 value loaded from the specified memory location.
*/
static float4 loadu( const float* const p )
{
return( _mm_loadu_ps( p ));
}
/**
* @param p Pointer to memory from where the value should be loaded,
* may have any alignment.
* @param lim The maximum number of elements to load, >0.
* @return float4 value loaded from the specified memory location, with
* elements beyond "lim" set to 0.
*/
static float4 loadu( const float* const p, int lim )
{
if( lim > 2 )
{
if( lim > 3 )
{
return( _mm_loadu_ps( p ));
}
else
{
return( _mm_set_ps( 0.0f, p[ 2 ], p[ 1 ], p[ 0 ]));
}
}
else
{
if( lim == 2 )
{
return( _mm_set_ps( 0.0f, 0.0f, p[ 1 ], p[ 0 ]));
}
else
{
return( _mm_load_ss( p ));
}
}
}
/**
* Function stores *this value to the specified memory location.
*
* @param[out] p Output memory location, should be 16-byte aligned.
*/
void store( float* const p ) const
{
_mm_store_ps( p, value );
}
/**
* Function stores *this value to the specified memory location.
*
* @param[out] p Output memory location, may have any alignment.
*/
void storeu( float* const p ) const
{
_mm_storeu_ps( p, value );
}
/**
* Function stores "lim" lower elements of *this value to the specified
* memory location.
*
* @param[out] p Output memory location, may have any alignment.
* @param lim The number of lower elements to store, >0.
*/
void storeu( float* const p, int lim ) const
{
if( lim > 2 )
{
if( lim > 3 )
{
_mm_storeu_ps( p, value );
}
else
{
_mm_storel_pi( (__m64*) p, value );
_mm_store_ss( p + 2, _mm_movehl_ps( value, value ));
}
}
else
{
if( lim == 2 )
{
_mm_storel_pi( (__m64*) p, value );
}
else
{
_mm_store_ss( p, value );
}
}
}
float4& operator += ( const float4& s )
{
value = _mm_add_ps( value, s.value );
return( *this );
}
float4& operator -= ( const float4& s )
{
value = _mm_sub_ps( value, s.value );
return( *this );
}
float4& operator *= ( const float4& s )
{
value = _mm_mul_ps( value, s.value );
return( *this );
}
float4& operator /= ( const float4& s )
{
value = _mm_div_ps( value, s.value );
return( *this );
}
float4 operator + ( const float4& s ) const
{
return( _mm_add_ps( value, s.value ));
}
float4 operator - ( const float4& s ) const
{
return( _mm_sub_ps( value, s.value ));
}
float4 operator * ( const float4& s ) const
{
return( _mm_mul_ps( value, s.value ));
}
float4 operator / ( const float4& s ) const
{
return( _mm_div_ps( value, s.value ));
}
/**
* @return Horizontal sum of elements.
*/
float hadd() const
{
const __m128 v = _mm_add_ps( value, _mm_movehl_ps( value, value ));
const __m128 res = _mm_add_ss( v, _mm_shuffle_ps( v, v, 1 ));
return( _mm_cvtss_f32( res ));
}
/**
* Function performs in-place addition of a value located in memory and
* the specified value.
*
* @param p Pointer to value where addition happens. May be unaligned.
* @param v Value to add.
*/
static void addu( float* const p, const float4& v )
{
( loadu( p ) + v ).storeu( p );
}
/**
* Function performs in-place addition of a value located in memory and
* the specified value. Limited to the specified number of elements.
*
* @param p Pointer to value where addition happens. May be unaligned.
* @param v Value to add.
* @param lim The element number limit, >0.
*/
static void addu( float* const p, const float4& v, const int lim )
{
( loadu( p, lim ) + v ).storeu( p, lim );
}
__m128 value; ///< Packed value of 4 floats.
///<
};
/**
* SIMD rounding function, exact result.
*
* @param v Value to round.
* @return Rounded SIMD value.
*/
inline float4 round( const float4& v )
{
unsigned int prevrm = _MM_GET_ROUNDING_MODE();
_MM_SET_ROUNDING_MODE( _MM_ROUND_NEAREST );
const __m128 res = _mm_cvtepi32_ps( _mm_cvtps_epi32( v.value ));
_MM_SET_ROUNDING_MODE( prevrm );
return( res );
}
/**
* SIMD function "clamps" (clips) the specified packed values so that they are
* not lesser than "minv", and not greater than "maxv".
*
* @param Value Value to clamp.
* @param minv Minimal allowed value.
* @param maxv Maximal allowed value.
* @return The clamped value.
*/
inline float4 clamp( const float4& Value, const float4& minv,
const float4& maxv )
{
return( _mm_min_ps( _mm_max_ps( Value.value, minv.value ), maxv.value ));
}
typedef fpclass_def< avir :: float4, float > fpclass_float4; ///<
///< Class that can be used as the "fpclass" template parameter of the
///< avir::CImageResizer class to perform calculation using default
///< interleaved algorithm, using SIMD float4 type.
///<
} // namespace avir
#endif // AVIR_FLOAT4_SSE_INCLUDED

365
third_party/avir/avir_float8_avx.h vendored Normal file

@@ -0,0 +1,365 @@
/* clang-format off */
//$ nobt
//$ nocpp
/**
* @file avir_float8_avx.h
*
* @brief Inclusion file for the "float8" type.
*
* This file includes the "float8" AVX-based type used for SIMD variable
* storage and processing.
*
* AVIR Copyright (c) 2015-2019 Aleksey Vaneev
*/
#ifndef AVIR_FLOAT8_AVX_INCLUDED
#define AVIR_FLOAT8_AVX_INCLUDED
#include "libc/bits/mmintrin.h"
#include "libc/bits/avxintrin.h"
#include "libc/bits/smmintrin.h"
#include "libc/bits/pmmintrin.h"
#include "libc/bits/avx2intrin.h"
#include "libc/bits/xmmintrin.h"
#include "third_party/avir/avir_dil.h"
namespace avir {
/**
* @brief SIMD packed 8-float type.
*
* This class implements a packed 8-float type that can be used to perform
* parallel computation using SIMD instructions on AVX-enabled processors.
* This class can be used as the "fptype" argument of the avir::fpclass_def
* or avir::fpclass_def_dil class.
*/
class float8
{
public:
float8()
{
}
float8( const float8& s )
: value( s.value )
{
}
float8( const __m256 s )
: value( s )
{
}
float8( const float s )
: value( _mm256_set1_ps( s ))
{
}
float8& operator = ( const float8& s )
{
value = s.value;
return( *this );
}
float8& operator = ( const __m256 s )
{
value = s;
return( *this );
}
float8& operator = ( const float s )
{
value = _mm256_set1_ps( s );
return( *this );
}
operator float () const
{
return( _mm_cvtss_f32( _mm256_extractf128_ps( value, 0 )));
}
/**
* @param p Pointer to memory from where the value should be loaded,
* should be 32-byte aligned.
* @return float8 value loaded from the specified memory location.
*/
static float8 load( const float* const p )
{
return( _mm256_load_ps( p ));
}
/**
* @param p Pointer to memory from where the value should be loaded,
* may have any alignment.
* @return float8 value loaded from the specified memory location.
*/
static float8 loadu( const float* const p )
{
return( _mm256_loadu_ps( p ));
}
/**
* @param p Pointer to memory from where the value should be loaded,
* may have any alignment.
* @param lim The maximum number of elements to load, >0.
* @return float8 value loaded from the specified memory location, with
* elements beyond "lim" set to 0.
*/
static float8 loadu( const float* const p, const int lim )
{
__m128 lo;
__m128 hi;
if( lim > 4 )
{
lo = _mm_loadu_ps( p );
hi = loadu4( p + 4, lim - 4 );
}
else
{
lo = loadu4( p, lim );
hi = _mm_setzero_ps();
}
return( _mm256_insertf128_ps( _mm256_castps128_ps256( lo ), hi, 1 ));
}
/**
* Function stores *this value to the specified memory location.
*
* @param[out] p Output memory location, should be 32-byte aligned.
*/
void store( float* const p ) const
{
_mm256_store_ps( p, value );
}
/**
* Function stores *this value to the specified memory location.
*
* @param[out] p Output memory location, may have any alignment.
*/
void storeu( float* const p ) const
{
_mm256_storeu_ps( p, value );
}
/**
* Function stores "lim" lower elements of *this value to the specified
* memory location.
*
* @param[out] p Output memory location, may have any alignment.
* @param lim The number of lower elements to store, >0.
*/
void storeu( float* p, int lim ) const
{
__m128 v;
if( lim > 4 )
{
_mm_storeu_ps( p, _mm256_extractf128_ps( value, 0 ));
v = _mm256_extractf128_ps( value, 1 );
p += 4;
lim -= 4;
}
else
{
v = _mm256_extractf128_ps( value, 0 );
}
if( lim > 2 )
{
if( lim > 3 )
{
_mm_storeu_ps( p, v );
}
else
{
_mm_storel_pi( (__m64*) p, v );
_mm_store_ss( p + 2, _mm_movehl_ps( v, v ));
}
}
else
{
if( lim == 2 )
{
_mm_storel_pi( (__m64*) p, v );
}
else
{
_mm_store_ss( p, v );
}
}
}
float8& operator += ( const float8& s )
{
value = _mm256_add_ps( value, s.value );
return( *this );
}
float8& operator -= ( const float8& s )
{
value = _mm256_sub_ps( value, s.value );
return( *this );
}
float8& operator *= ( const float8& s )
{
value = _mm256_mul_ps( value, s.value );
return( *this );
}
float8& operator /= ( const float8& s )
{
value = _mm256_div_ps( value, s.value );
return( *this );
}
float8 operator + ( const float8& s ) const
{
return( _mm256_add_ps( value, s.value ));
}
float8 operator - ( const float8& s ) const
{
return( _mm256_sub_ps( value, s.value ));
}
float8 operator * ( const float8& s ) const
{
return( _mm256_mul_ps( value, s.value ));
}
float8 operator / ( const float8& s ) const
{
return( _mm256_div_ps( value, s.value ));
}
/**
* @return Horizontal sum of elements.
*/
float hadd() const
{
__m128 v = _mm_add_ps( _mm256_extractf128_ps( value, 0 ),
_mm256_extractf128_ps( value, 1 ));
v = _mm_hadd_ps( v, v );
v = _mm_hadd_ps( v, v );
return( _mm_cvtss_f32( v ));
}
/**
* Function performs in-place addition of a value located in memory and
* the specified value.
*
* @param p Pointer to value where addition happens. May be unaligned.
* @param v Value to add.
*/
static void addu( float* const p, const float8& v )
{
( loadu( p ) + v ).storeu( p );
}
/**
* Function performs in-place addition of a value located in memory and
* the specified value. Limited to the specified number of elements.
*
* @param p Pointer to value where addition happens. May be unaligned.
* @param v Value to add.
* @param lim The element number limit, >0.
*/
static void addu( float* const p, const float8& v, const int lim )
{
( loadu( p, lim ) + v ).storeu( p, lim );
}
__m256 value; ///< Packed value of 8 floats.
///<
private:
/**
* @param p Pointer to memory from where the value should be loaded,
* may have any alignment.
* @param lim The maximum number of elements to load, >0.
* @return __m128 value loaded from the specified memory location, with
* elements beyond "lim" set to 0.
*/
static __m128 loadu4( const float* const p, const int lim )
{
if( lim > 2 )
{
if( lim > 3 )
{
return( _mm_loadu_ps( p ));
}
else
{
return( _mm_set_ps( 0.0f, p[ 2 ], p[ 1 ], p[ 0 ]));
}
}
else
{
if( lim == 2 )
{
return( _mm_set_ps( 0.0f, 0.0f, p[ 1 ], p[ 0 ]));
}
else
{
return( _mm_load_ss( p ));
}
}
}
};
/**
* SIMD rounding function, exact result.
*
* @param v Value to round.
* @return Rounded SIMD value.
*/
inline float8 round( const float8& v )
{
return( _mm256_round_ps( v.value,
( _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC )));
}
/**
* SIMD function "clamps" (clips) the specified packed values so that they are
* not lesser than "minv", and not greater than "maxv".
*
* @param Value Value to clamp.
* @param minv Minimal allowed value.
* @param maxv Maximal allowed value.
* @return The clamped value.
*/
inline float8 clamp( const float8& Value, const float8& minv,
const float8& maxv )
{
return( _mm256_min_ps( _mm256_max_ps( Value.value, minv.value ),
maxv.value ));
}
typedef fpclass_def_dil< float, avir :: float8 > fpclass_float8_dil; ///<
///< Class that can be used as the "fpclass" template parameter of the
///< avir::CImageResizer class to perform calculation using
///< de-interleaved SIMD algorithm, using SIMD float8 type.
///<
} // namespace avir
#endif // AVIR_FLOAT8_AVX_INCLUDED

1494
third_party/avir/lancir.h vendored Normal file

File diff suppressed because it is too large

40
third_party/avir/lanczos.cc vendored Normal file

@@ -0,0 +1,40 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8 :vi
Copyright 2020 Justine Alexandra Roberts Tunney
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
*/
#include "libc/limits.h"
#include "libc/log/check.h"
#include "libc/log/log.h"
#include "third_party/avir/lanczos.h"
namespace {
#include "third_party/avir/lancir.h"
} // namespace
/**
* Does Lanczos interpolation.
* @note computers w/o AVX2+FMA need to call BilinearScale()
*/
void lanczos(unsigned dyn, unsigned dxn, void *restrict dst, unsigned syn,
unsigned sxn, const void *restrict src, unsigned sw) {
avir::CLancIR lanczos;
DCHECK_ALIGNED(64, dst);
DCHECK_ALIGNED(64, src);
LOGF("%10s%5zux×%-5zu→%5zu×%-5zu", "lanczos", sxn, syn, dxn, dyn);
lanczos.resizeImage((const float *)src, sxn, syn, sw, (float *)dst, dxn, dyn,
4);
}

13
third_party/avir/lanczos.h vendored Normal file

@@ -0,0 +1,13 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_
void lanczos(unsigned, unsigned, void *, unsigned, unsigned, const void *,
unsigned);
void lanczos3(unsigned, unsigned, void *, unsigned, unsigned, const void *,
unsigned);
COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS_H_ */

77
third_party/avir/lanczos1.cc vendored Normal file

@@ -0,0 +1,77 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8 :vi
Copyright 2020 Justine Alexandra Roberts Tunney
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
*/
#include "libc/bits/xmmintrin.h"
#include "libc/limits.h"
#include "libc/log/log.h"
#include "libc/runtime/runtime.h"
#include "third_party/avir/lanczos1.h"
namespace {
#include "third_party/avir/lanczos1.hpp"
} // namespace
void lanczos1init(struct lanczos1 *resizer) {
lanczos1free(resizer);
resizer->p = new Lanczos1Impl;
}
void lanczos1free(struct lanczos1 *resizer) {
Lanczos1Impl *impl;
if (!resizer->p) return;
impl = (Lanczos1Impl *)resizer->p;
delete impl;
resizer->p = nullptr;
}
/**
* Resizes image plane w/ Lanczos interpolation, e.g.
*
* struct lanczos1 scaler = {0};
* lanczos1init(&scaler);
* lanczos1(&scaler, ...);
* lanczos1free(&scaler);
*
* @param dyn is destination height
* @param dxn is destination width
* @param dst is destination unsigned char array
* @param dstsize is number of bytes in dst
* @param syn is source height
* @param sxn is source width
* @param ssw is number of unsigned chars per scanline in src
* @param src is source unsigned char array
* @param srcsize is number of bytes in src
*/
void lanczos1(struct lanczos1 *resizer, size_t dyn, size_t dxn, void *dst,
size_t dstsize, size_t syn, size_t sxn, size_t ssw,
const void *src, size_t srcsize) {
Lanczos1Impl *impl;
unsigned int roundhouse;
LOGF("%10s%5zux×%-5zu→%5zu×%-5zu", "lanczos1", sxn, syn, dxn, dyn);
CHECK_LE(dstsize, INT_MAX);
CHECK_LE(srcsize, INT_MAX);
CHECK_LE(sizeof(unsigned char) * 1 * dyn * dxn, dstsize);
CHECK_LE(sizeof(unsigned char) * 1 * syn * sxn, srcsize);
CHECK_LE(sizeof(unsigned char) * syn * ssw, srcsize);
roundhouse = _MM_GET_ROUNDING_MODE();
_MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
impl = (Lanczos1Impl *)resizer->p;
impl->lanczos.resizeImage((const unsigned char *)src, sxn, syn, ssw,
(unsigned char *)dst, dxn, dyn, 1);
_MM_SET_ROUNDING_MODE(roundhouse);
}

18
third_party/avir/lanczos1.h vendored Normal file

@@ -0,0 +1,18 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_
struct lanczos1 {
void *p;
};
void lanczos1init(struct lanczos1 *self);
void lanczos1free(struct lanczos1 *self);
void lanczos1(struct lanczos1 *self, size_t dyn, size_t dxn, void *dst,
size_t dstsize, size_t syn, size_t sxn, size_t ssw,
const void *src, size_t srcsize) paramsnonnull();
COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_H_ */

11
third_party/avir/lanczos1.hpp vendored Normal file

@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_HPP_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_HPP_
#include "third_party/avir/lancir.h"
struct Lanczos1Impl {
Lanczos1Impl() : lanczos{} {
}
avir::CLancIR lanczos;
};
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_HPP_ */

31
third_party/avir/lanczos1b.cc vendored Normal file

@@ -0,0 +1,31 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8 :vi
Copyright 2020 Justine Alexandra Roberts Tunney
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
*/
#include "libc/bits/bits.h"
#include "third_party/avir/lanczos1b.h"
namespace {
#include "third_party/avir/lancir.h"
} // namespace
void lanczos1b(size_t dyn, size_t dxn, unsigned char *restrict dst, size_t syn,
size_t sxn, const unsigned char *restrict src) {
avir::CLancIR lanczos;
LOGF("%10s%5zux×%-5zu→%5zu×%-5zu", "lanczos1b", sxn, syn, dxn, dyn);
lanczos.resizeImage(src, sxn, syn, roundup2pow(sxn) * 4, dst, dxn, dyn, 4);
}

11
third_party/avir/lanczos1b.h vendored Normal file

@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1B_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1B_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_
void lanczos1b(size_t dyn, size_t dxn, unsigned char *restrict dst, size_t syn,
size_t sxn, const unsigned char *restrict src);
COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1B_H_ */

63
third_party/avir/lanczos1f.cc vendored Normal file

@@ -0,0 +1,63 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8 :vi
Copyright 2020 Justine Alexandra Roberts Tunney
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
*/
#include "libc/bits/xmmintrin.h"
#include "libc/runtime/runtime.h"
#include "third_party/avir/lanczos1f.h"
namespace {
#include "third_party/avir/lanczos1f.hpp"
} // namespace
void lanczos1finit(struct lanczos1f *resizer) {
lanczos1ffree(resizer);
resizer->p = new Lanczos1fImpl;
}
void lanczos1ffree(struct lanczos1f *resizer) {
Lanczos1fImpl *impl;
if (!resizer->p) return;
impl = (Lanczos1fImpl *)resizer->p;
delete impl;
resizer->p = nullptr;
}
/**
* Resizes image plane w/ Lanczos interpolation, e.g.
*
* struct lanczos1f scaler = {0};
* lanczos1finit(&scaler);
* lanczos1f(&scaler, ...);
* lanczos1ffree(&scaler);
*
* @param dyn is destination height
* @param dxn is destination width
* @param dst is destination unsigned char array
* @param syn is source height
* @param sxn is source width
* @param ssw is number of unsigned chars per scanline in src
* @param src is source unsigned char array
*/
void lanczos1f(struct lanczos1f *resizer, size_t dyn, size_t dxn, void *dst,
size_t syn, size_t sxn, size_t ssw, const void *src, double ky0,
double kx0, double oy, double ox) {
Lanczos1fImpl *impl;
impl = (Lanczos1fImpl *)resizer->p;
impl->lanczos.resizeImage((const float *)src, sxn, syn, ssw, (float *)dst,
dxn, dyn, 1, kx0, ky0, ox, oy);
}

18
third_party/avir/lanczos1f.h vendored Normal file

@@ -0,0 +1,18 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_
struct lanczos1f {
void *p;
};
void lanczos1finit(struct lanczos1f *);
void lanczos1ffree(struct lanczos1f *);
void lanczos1f(struct lanczos1f *, size_t, size_t, void *, size_t, size_t,
size_t, const void *, double, double, double, double)
paramsnonnull();
COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_H_ */

11
third_party/avir/lanczos1f.hpp vendored Normal file

@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_HPP_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_HPP_
#include "third_party/avir/lancir.h"
struct Lanczos1fImpl {
Lanczos1fImpl() : lanczos{} {
}
avir::CLancIR lanczos;
};
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_HPP_ */

30
third_party/avir/lanczos3.cc vendored Normal file

@@ -0,0 +1,30 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8 :vi
Copyright 2020 Justine Alexandra Roberts Tunney
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
*/
#include "third_party/avir/lanczos.h"
namespace {
#include "third_party/avir/lancir.h"
}
void lanczos3(unsigned dyn, unsigned dxn, void *dst, unsigned syn, unsigned sxn,
const void *src, unsigned sw) {
avir::CLancIR lanczos;
lanczos.resizeImage((const float *)src, sxn, syn, sw, (float *)dst, dxn, dyn,
3, -1, -2);
}

11
third_party/avir/notice.h vendored Normal file

@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_NOTICE_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_NOTICE_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
asm(".ident\t\"\\n\\n\
AVIR (MIT License)\\n\
Copyright 2015-2019 Aleksey Vaneev\"");
asm(".include \"libc/disclaimer.inc\"");
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_NOTICE_H_ */

48
third_party/avir/resize.cc vendored Normal file

@@ -0,0 +1,48 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8 :vi
Copyright 2020 Justine Alexandra Roberts Tunney
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA
*/
#include "third_party/avir/resize.h"
namespace {
#include "third_party/avir/avir_float4_sse.h"
} // namespace
struct ResizerImpl {
ResizerImpl() : resizer{8, 8, avir::CImageResizerParamsULR()} {}
avir::CImageResizer<avir::fpclass_float4> resizer;
};
void NewResizer(Resizer *resizer, int aResBitDepth, int aSrcBitDepth) {
FreeResizer(resizer);
resizer->p = new ResizerImpl();
}
void FreeResizer(Resizer *resizer) {
if (!resizer->p) return;
delete (ResizerImpl *)resizer->p;
resizer->p = nullptr;
}
void ResizeImage(Resizer *resizer, float *Dest, int DestHeight, int DestWidth,
const float *Src, int SrcHeight, int SrcWidth) {
ResizerImpl *impl = (ResizerImpl *)resizer->p;
int SrcScanLineSize = 0;
double ResizingStep = 0;
impl->resizer.resizeImage(Src, SrcWidth, SrcHeight, SrcScanLineSize, Dest,
DestWidth, DestHeight, 4, ResizingStep);
}

17
third_party/avir/resize.h vendored Normal file

@@ -0,0 +1,17 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_RESIZE_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_RESIZE_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_
struct Resizer {
void *p;
};
void FreeResizer(struct Resizer *) paramsnonnull();
void NewResizer(struct Resizer *, int, int) paramsnonnull();
void ResizeImage(struct Resizer *, float *, int, int, const float *, int, int)
paramsnonnull();
COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_RESIZE_H_ */

13
third_party/blas/blas.h vendored Normal file

@@ -0,0 +1,13 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_BLAS_BLAS_H_
#define COSMOPOLITAN_THIRD_PARTY_BLAS_BLAS_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_
int dgemm_(char *transa, char *transb, long *m, long *n, long *k, double *alpha,
double *A /*['N'?k:m][1≤m≤lda]*/, long *lda,
double *B /*['N'?k:n][1≤n≤ldb]*/, long *ldb, double *beta,
double *C /*[n][1≤m≤ldc]*/, long *ldc);
COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_BLAS_BLAS_H_ */

59
third_party/blas/blas.mk vendored Normal file

@@ -0,0 +1,59 @@
#-*-mode:makefile-gmake;indent-tabs-mode:t;tab-width:8;coding:utf-8-*-┐
#───vi: set et ft=make ts=8 tw=8 fenc=utf-8 :vi───────────────────────┘
PKGS += THIRD_PARTY_BLAS
THIRD_PARTY_BLAS_ARTIFACTS += THIRD_PARTY_BLAS_A
THIRD_PARTY_BLAS = $(THIRD_PARTY_BLAS_A_DEPS) $(THIRD_PARTY_BLAS_A)
THIRD_PARTY_BLAS_A = o/$(MODE)/third_party/blas/blas.a
THIRD_PARTY_BLAS_A_FILES := $(wildcard third_party/blas/*)
THIRD_PARTY_BLAS_A_HDRS = $(filter %.h,$(THIRD_PARTY_BLAS_A_FILES))
THIRD_PARTY_BLAS_A_SRCS_S = $(filter %.S,$(THIRD_PARTY_BLAS_A_FILES))
THIRD_PARTY_BLAS_A_SRCS_C = $(filter %.c,$(THIRD_PARTY_BLAS_A_FILES))
THIRD_PARTY_BLAS_A_SRCS = \
$(THIRD_PARTY_BLAS_A_SRCS_S) \
$(THIRD_PARTY_BLAS_A_SRCS_C)
THIRD_PARTY_BLAS_A_OBJS = \
$(THIRD_PARTY_BLAS_A_SRCS:%=o/$(MODE)/%.zip.o) \
$(THIRD_PARTY_BLAS_A_SRCS_S:%.S=o/$(MODE)/%.o) \
$(THIRD_PARTY_BLAS_A_SRCS_C:%.c=o/$(MODE)/%.o)
THIRD_PARTY_BLAS_A_CHECKS = \
$(THIRD_PARTY_BLAS_A).pkg \
$(THIRD_PARTY_BLAS_A_HDRS:%=o/$(MODE)/%.ok)
THIRD_PARTY_BLAS_A_DIRECTDEPS = \
LIBC_STUBS \
LIBC_NEXGEN32E \
THIRD_PARTY_F2C
THIRD_PARTY_BLAS_A_DEPS := \
$(call uniq,$(foreach x,$(THIRD_PARTY_BLAS_A_DIRECTDEPS),$($(x))))
$(THIRD_PARTY_BLAS_A_OBJS): \
OVERRIDE_CFLAGS += \
-O3 #$(MATHEMATICAL)
#$(THIRD_PARTY_BLAS_A_OBJS): \
CC = $(CLANG)
$(THIRD_PARTY_BLAS_A): \
third_party/blas/ \
$(THIRD_PARTY_BLAS_A).pkg \
$(THIRD_PARTY_BLAS_A_OBJS)
$(THIRD_PARTY_BLAS_A).pkg: \
$(THIRD_PARTY_BLAS_A_OBJS) \
$(foreach x,$(THIRD_PARTY_BLAS_A_DIRECTDEPS),$($(x)_A).pkg)
THIRD_PARTY_BLAS_LIBS = $(foreach x,$(THIRD_PARTY_BLAS_ARTIFACTS),$($(x)))
THIRD_PARTY_BLAS_SRCS = $(foreach x,$(THIRD_PARTY_BLAS_ARTIFACTS),$($(x)_SRCS))
THIRD_PARTY_BLAS_HDRS = $(foreach x,$(THIRD_PARTY_BLAS_ARTIFACTS),$($(x)_HDRS))
THIRD_PARTY_BLAS_CHECKS = $(foreach x,$(THIRD_PARTY_BLAS_ARTIFACTS),$($(x)_CHECKS))
THIRD_PARTY_BLAS_OBJS = $(foreach x,$(THIRD_PARTY_BLAS_ARTIFACTS),$($(x)_OBJS))
$(THIRD_PARTY_BLAS_OBJS): $(BUILD_FILES) third_party/blas/blas.mk
.PHONY: o/$(MODE)/third_party/blas
o/$(MODE)/third_party/blas: $(THIRD_PARTY_BLAS_CHECKS)

384
third_party/blas/dgemm.f vendored Normal file

@@ -0,0 +1,384 @@
*> \brief \b DGEMM
*
* =========== DOCUMENTATION ===========
*
* Online html documentation available at
* http://www.netlib.org/lapack/explore-html/
*
* Definition:
* ===========
*
* SUBROUTINE DGEMM(TRANSA,TRANSB,M,N,K,ALPHA,A,LDA,B,LDB,BETA,C,LDC)
*
* .. Scalar Arguments ..
* DOUBLE PRECISION ALPHA,BETA
* INTEGER K,LDA,LDB,LDC,M,N
* CHARACTER TRANSA,TRANSB
* ..
* .. Array Arguments ..
* DOUBLE PRECISION A(LDA,*),B(LDB,*),C(LDC,*)
* ..
*
*
*> \par Purpose:
* =============
*>
*> \verbatim
*>
*> DGEMM performs one of the matrix-matrix operations
*>
*> C := alpha*op( A )*op( B ) + beta*C,
*>
*> where op( X ) is one of
*>
*> op( X ) = X or op( X ) = X**T,
*>
*> alpha and beta are scalars, and A, B and C are matrices, with op( A )
*> an m by k matrix, op( B ) a k by n matrix and C an m by n matrix.
*> \endverbatim
*
* Arguments:
* ==========
*
*> \param[in] TRANSA
*> \verbatim
*> TRANSA is CHARACTER*1
*> On entry, TRANSA specifies the form of op( A ) to be used in
*> the matrix multiplication as follows:
*>
*> TRANSA = 'N' or 'n', op( A ) = A.
*>
*> TRANSA = 'T' or 't', op( A ) = A**T.
*>
*> TRANSA = 'C' or 'c', op( A ) = A**T.
*> \endverbatim
*>
*> \param[in] TRANSB
*> \verbatim
*> TRANSB is CHARACTER*1
*> On entry, TRANSB specifies the form of op( B ) to be used in
*> the matrix multiplication as follows:
*>
*> TRANSB = 'N' or 'n', op( B ) = B.
*>
*> TRANSB = 'T' or 't', op( B ) = B**T.
*>
*> TRANSB = 'C' or 'c', op( B ) = B**T.
*> \endverbatim
*>
*> \param[in] M
*> \verbatim
*> M is INTEGER
*> On entry, M specifies the number of rows of the matrix
*> op( A ) and of the matrix C. M must be at least zero.
*> \endverbatim
*>
*> \param[in] N
*> \verbatim
*> N is INTEGER
*> On entry, N specifies the number of columns of the matrix
*> op( B ) and the number of columns of the matrix C. N must be
*> at least zero.
*> \endverbatim
*>
*> \param[in] K
*> \verbatim
*> K is INTEGER
*> On entry, K specifies the number of columns of the matrix
*> op( A ) and the number of rows of the matrix op( B ). K must
*> be at least zero.
*> \endverbatim
*>
*> \param[in] ALPHA
*> \verbatim
*> ALPHA is DOUBLE PRECISION.
*> On entry, ALPHA specifies the scalar alpha.
*> \endverbatim
*>
*> \param[in] A
*> \verbatim
*> A is DOUBLE PRECISION array, dimension ( LDA, ka ), where ka is
*> k when TRANSA = 'N' or 'n', and is m otherwise.
*> Before entry with TRANSA = 'N' or 'n', the leading m by k
*> part of the array A must contain the matrix A, otherwise
*> the leading k by m part of the array A must contain the
*> matrix A.
*> \endverbatim
*>
*> \param[in] LDA
*> \verbatim
*> LDA is INTEGER
*> On entry, LDA specifies the first dimension of A as declared
*> in the calling (sub) program. When TRANSA = 'N' or 'n' then
*> LDA must be at least max( 1, m ), otherwise LDA must be at
*> least max( 1, k ).
*> \endverbatim
*>
*> \param[in] B
*> \verbatim
*> B is DOUBLE PRECISION array, dimension ( LDB, kb ), where kb is
*> n when TRANSB = 'N' or 'n', and is k otherwise.
*> Before entry with TRANSB = 'N' or 'n', the leading k by n
*> part of the array B must contain the matrix B, otherwise
*> the leading n by k part of the array B must contain the
*> matrix B.
*> \endverbatim
*>
*> \param[in] LDB
*> \verbatim
*> LDB is INTEGER
*> On entry, LDB specifies the first dimension of B as declared
*> in the calling (sub) program. When TRANSB = 'N' or 'n' then
*> LDB must be at least max( 1, k ), otherwise LDB must be at
*> least max( 1, n ).
*> \endverbatim
*>
*> \param[in] BETA
*> \verbatim
*> BETA is DOUBLE PRECISION.
*> On entry, BETA specifies the scalar beta. When BETA is
*> supplied as zero then C need not be set on input.
*> \endverbatim
*>
*> \param[in,out] C
*> \verbatim
*> C is DOUBLE PRECISION array, dimension ( LDC, N )
*> Before entry, the leading m by n part of the array C must
*> contain the matrix C, except when beta is zero, in which
*> case C need not be set on entry.
*> On exit, the array C is overwritten by the m by n matrix
*> ( alpha*op( A )*op( B ) + beta*C ).
*> \endverbatim
*>
*> \param[in] LDC
*> \verbatim
*> LDC is INTEGER
*> On entry, LDC specifies the first dimension of C as declared
*> in the calling (sub) program. LDC must be at least
*> max( 1, m ).
*> \endverbatim
*
* Authors:
* ========
*
*> \author Univ. of Tennessee
*> \author Univ. of California Berkeley
*> \author Univ. of Colorado Denver
*> \author NAG Ltd.
*
*> \date December 2016
*
*> \ingroup double_blas_level3
*
*> \par Further Details:
* =====================
*>
*> \verbatim
*>
*> Level 3 Blas routine.
*>
*> -- Written on 8-February-1989.
*> Jack Dongarra, Argonne National Laboratory.
*> Iain Duff, AERE Harwell.
*> Jeremy Du Croz, Numerical Algorithms Group Ltd.
*> Sven Hammarling, Numerical Algorithms Group Ltd.
*> \endverbatim
*>
* =====================================================================
SUBROUTINE DGEMM(TRANSA,TRANSB,M,N,K,ALPHA,A,LDA,B,LDB,BETA,C,LDC)
*
* -- Reference BLAS level3 routine (version 3.7.0) --
* -- Reference BLAS is a software package provided by Univ. of Tennessee, --
* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
* December 2016
*
* .. Scalar Arguments ..
DOUBLE PRECISION ALPHA,BETA
INTEGER K,LDA,LDB,LDC,M,N
CHARACTER TRANSA,TRANSB
* ..
* .. Array Arguments ..
DOUBLE PRECISION A(LDA,*),B(LDB,*),C(LDC,*)
* ..
*
* =====================================================================
*
* .. External Functions ..
LOGICAL LSAME
EXTERNAL LSAME
* ..
* .. External Subroutines ..
EXTERNAL XERBLA
* ..
* .. Intrinsic Functions ..
INTRINSIC MAX
* ..
* .. Local Scalars ..
DOUBLE PRECISION TEMP
INTEGER I,INFO,J,L,NCOLA,NROWA,NROWB
LOGICAL NOTA,NOTB
* ..
* .. Parameters ..
DOUBLE PRECISION ONE,ZERO
PARAMETER (ONE=1.0D+0,ZERO=0.0D+0)
* ..
*
* Set NOTA and NOTB as true if A and B respectively are not
* transposed and set NROWA, NCOLA and NROWB as the number of rows
* and columns of A and the number of rows of B respectively.
*
NOTA = LSAME(TRANSA,'N')
NOTB = LSAME(TRANSB,'N')
IF (NOTA) THEN
NROWA = M
NCOLA = K
ELSE
NROWA = K
NCOLA = M
END IF
IF (NOTB) THEN
NROWB = K
ELSE
NROWB = N
END IF
*
* Test the input parameters.
*
INFO = 0
IF ((.NOT.NOTA) .AND. (.NOT.LSAME(TRANSA,'C')) .AND.
+ (.NOT.LSAME(TRANSA,'T'))) THEN
INFO = 1
ELSE IF ((.NOT.NOTB) .AND. (.NOT.LSAME(TRANSB,'C')) .AND.
+ (.NOT.LSAME(TRANSB,'T'))) THEN
INFO = 2
ELSE IF (M.LT.0) THEN
INFO = 3
ELSE IF (N.LT.0) THEN
INFO = 4
ELSE IF (K.LT.0) THEN
INFO = 5
ELSE IF (LDA.LT.MAX(1,NROWA)) THEN
INFO = 8
ELSE IF (LDB.LT.MAX(1,NROWB)) THEN
INFO = 10
ELSE IF (LDC.LT.MAX(1,M)) THEN
INFO = 13
END IF
IF (INFO.NE.0) THEN
CALL XERBLA('DGEMM ',INFO)
RETURN
END IF
*
* Quick return if possible.
*
IF ((M.EQ.0) .OR. (N.EQ.0) .OR.
+ (((ALPHA.EQ.ZERO).OR. (K.EQ.0)).AND. (BETA.EQ.ONE))) RETURN
*
* And if alpha.eq.zero.
*
IF (ALPHA.EQ.ZERO) THEN
IF (BETA.EQ.ZERO) THEN
DO 20 J = 1,N
DO 10 I = 1,M
C(I,J) = ZERO
10 CONTINUE
20 CONTINUE
ELSE
DO 40 J = 1,N
DO 30 I = 1,M
C(I,J) = BETA*C(I,J)
30 CONTINUE
40 CONTINUE
END IF
RETURN
END IF
*
* Start the operations.
*
IF (NOTB) THEN
IF (NOTA) THEN
*
* Form C := alpha*A*B + beta*C.
*
DO 90 J = 1,N
IF (BETA.EQ.ZERO) THEN
DO 50 I = 1,M
C(I,J) = ZERO
50 CONTINUE
ELSE IF (BETA.NE.ONE) THEN
DO 60 I = 1,M
C(I,J) = BETA*C(I,J)
60 CONTINUE
END IF
DO 80 L = 1,K
TEMP = ALPHA*B(L,J)
DO 70 I = 1,M
C(I,J) = C(I,J) + TEMP*A(I,L)
70 CONTINUE
80 CONTINUE
90 CONTINUE
ELSE
*
* Form C := alpha*A**T*B + beta*C
*
DO 120 J = 1,N
DO 110 I = 1,M
TEMP = ZERO
DO 100 L = 1,K
TEMP = TEMP + A(L,I)*B(L,J)
100 CONTINUE
IF (BETA.EQ.ZERO) THEN
C(I,J) = ALPHA*TEMP
ELSE
C(I,J) = ALPHA*TEMP + BETA*C(I,J)
END IF
110 CONTINUE
120 CONTINUE
END IF
ELSE
IF (NOTA) THEN
*
* Form C := alpha*A*B**T + beta*C
*
DO 170 J = 1,N
IF (BETA.EQ.ZERO) THEN
DO 130 I = 1,M
C(I,J) = ZERO
130 CONTINUE
ELSE IF (BETA.NE.ONE) THEN
DO 140 I = 1,M
C(I,J) = BETA*C(I,J)
140 CONTINUE
END IF
DO 160 L = 1,K
TEMP = ALPHA*B(J,L)
DO 150 I = 1,M
C(I,J) = C(I,J) + TEMP*A(I,L)
150 CONTINUE
160 CONTINUE
170 CONTINUE
ELSE
*
* Form C := alpha*A**T*B**T + beta*C
*
DO 200 J = 1,N
DO 190 I = 1,M
TEMP = ZERO
DO 180 L = 1,K
TEMP = TEMP + A(L,I)*B(J,L)
180 CONTINUE
IF (BETA.EQ.ZERO) THEN
C(I,J) = ALPHA*TEMP
ELSE
C(I,J) = ALPHA*TEMP + BETA*C(I,J)
END IF
190 CONTINUE
200 CONTINUE
END IF
END IF
*
RETURN
*
* End of DGEMM .
*
END

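The Fortran documentation above defines the operation C := alpha*op( A )*op( B ) + beta*C. As an editorial illustration (not part of the vendored sources), the plain-C sketch below spells out the no-transpose case with column-major indexing; the function name and loop order are assumptions of this sketch, not anything the reference DGEMM prescribes.

/* Illustrative only: naive C equivalent of C := alpha*A*B + beta*C for
   column-major m-by-k A, k-by-n B, m-by-n C (TRANSA = TRANSB = 'N').
   Unlike the real routine, this sketch always reads C, so passing
   uninitialized C with beta == 0 is not supported here. */
static void naive_dgemm_nn(int m, int n, int k, double alpha,
                           const double *A, int lda,
                           const double *B, int ldb, double beta,
                           double *C, int ldc) {
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      double acc = 0;
      for (int l = 0; l < k; ++l) {
        acc += A[i + l * lda] * B[l + j * ldb]; /* column-major indexing */
      }
      C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc];
    }
  }
}
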
1102
third_party/blas/dgemm_.S vendored Normal file

File diff suppressed because it is too large

155
third_party/blas/lsame.c vendored Normal file
View file

@@ -0,0 +1,155 @@
/* lsame.f -- translated by f2c (version 20191129).
You must link the resulting object file with libf2c:
on Microsoft Windows system, link with libf2c.lib;
on Linux or Unix systems, link with .../path/to/libf2c.a -lm
or, if you install libf2c.a in a standard place, with -lf2c -lm
-- in that order, at the end of the command line, as in
cc *.o -lf2c -lm
Source for libf2c is in /netlib/f2c/libf2c.zip, e.g.,
http://www.netlib.org/f2c/libf2c.zip
*/
#include "third_party/f2c/f2c.h"
/* > \brief \b LSAME */
/* =========== DOCUMENTATION =========== */
/* Online html documentation available at */
/* http://www.netlib.org/lapack/explore-html/ */
/* Definition: */
/* =========== */
/* LOGICAL FUNCTION LSAME(CA,CB) */
/* .. Scalar Arguments .. */
/* CHARACTER CA,CB */
/* .. */
/* > \par Purpose: */
/* ============= */
/* > */
/* > \verbatim */
/* > */
/* > LSAME returns .TRUE. if CA is the same letter as CB regardless of */
/* > case. */
/* > \endverbatim */
/* Arguments: */
/* ========== */
/* > \param[in] CA */
/* > \verbatim */
/* > CA is CHARACTER*1 */
/* > \endverbatim */
/* > */
/* > \param[in] CB */
/* > \verbatim */
/* > CB is CHARACTER*1 */
/* > CA and CB specify the single characters to be compared. */
/* > \endverbatim */
/* Authors: */
/* ======== */
/* > \author Univ. of Tennessee */
/* > \author Univ. of California Berkeley */
/* > \author Univ. of Colorado Denver */
/* > \author NAG Ltd. */
/* > \date December 2016 */
/* > \ingroup aux_blas */
/* ===================================================================== */
logical lsame_(char *ca, char *cb) {
/* System generated locals */
logical ret_val;
/* Local variables */
static integer inta, intb, zcode;
/* -- Reference BLAS level1 routine (version 3.1) -- */
/* -- Reference BLAS is a software package provided by Univ. of Tennessee,
* -- */
/* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
*/
/* December 2016 */
/* .. Scalar Arguments .. */
/* .. */
/* ===================================================================== */
/* .. Intrinsic Functions .. */
/* .. */
/* .. Local Scalars .. */
/* .. */
/* Test if the characters are equal */
ret_val = *(unsigned char *)ca == *(unsigned char *)cb;
if (ret_val) {
return ret_val;
}
/* Now test for equivalence if both characters are alphabetic. */
zcode = 'Z';
/* Use 'Z' rather than 'A' so that ASCII can be detected on Prime */
/* machines, on which ICHAR returns a value with bit 8 set. */
/* ICHAR('A') on Prime machines returns 193 which is the same as */
/* ICHAR('A') on an EBCDIC machine. */
inta = *(unsigned char *)ca;
intb = *(unsigned char *)cb;
if (zcode == 90 || zcode == 122) {
/* ASCII is assumed - ZCODE is the ASCII code of either lower or */
/* upper case 'Z'. */
if (inta >= 97 && inta <= 122) {
inta += -32;
}
if (intb >= 97 && intb <= 122) {
intb += -32;
}
} else if (zcode == 233 || zcode == 169) {
/* EBCDIC is assumed - ZCODE is the EBCDIC code of either lower or */
/* upper case 'Z'. */
if (inta >= 129 && inta <= 137 || inta >= 145 && inta <= 153 ||
inta >= 162 && inta <= 169) {
inta += 64;
}
if (intb >= 129 && intb <= 137 || intb >= 145 && intb <= 153 ||
intb >= 162 && intb <= 169) {
intb += 64;
}
} else if (zcode == 218 || zcode == 250) {
/* ASCII is assumed, on Prime machines - ZCODE is the ASCII code */
/* plus 128 of either lower or upper case 'Z'. */
if (inta >= 225 && inta <= 250) {
inta += -32;
}
if (intb >= 225 && intb <= 250) {
intb += -32;
}
}
ret_val = inta == intb;
/* RETURN */
/* End of LSAME */
return ret_val;
} /* lsame_ */

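A minimal calling sketch for the translated routine above, assuming the f2c typedefs from third_party/f2c/f2c.h that the file itself includes; the wrapper function here is hypothetical and only shows how the case-insensitive comparison is meant to be used.

#include "third_party/f2c/f2c.h"

logical lsame_(char *ca, char *cb); /* defined in lsame.c above */

/* Hypothetical driver: does a BLAS transpose flag request a transpose? */
int wants_transpose(char flag) {
  char t = 'T', c = 'C';
  /* 't', 'T', 'c', and 'C' all match, since lsame_ ignores case. */
  return lsame_(&flag, &t) || lsame_(&flag, &c);
}
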
125
third_party/blas/lsame.f vendored Normal file
View file

@@ -0,0 +1,125 @@
*> \brief \b LSAME
*
* =========== DOCUMENTATION ===========
*
* Online html documentation available at
* http://www.netlib.org/lapack/explore-html/
*
* Definition:
* ===========
*
* LOGICAL FUNCTION LSAME(CA,CB)
*
* .. Scalar Arguments ..
* CHARACTER CA,CB
* ..
*
*
*> \par Purpose:
* =============
*>
*> \verbatim
*>
*> LSAME returns .TRUE. if CA is the same letter as CB regardless of
*> case.
*> \endverbatim
*
* Arguments:
* ==========
*
*> \param[in] CA
*> \verbatim
*> CA is CHARACTER*1
*> \endverbatim
*>
*> \param[in] CB
*> \verbatim
*> CB is CHARACTER*1
*> CA and CB specify the single characters to be compared.
*> \endverbatim
*
* Authors:
* ========
*
*> \author Univ. of Tennessee
*> \author Univ. of California Berkeley
*> \author Univ. of Colorado Denver
*> \author NAG Ltd.
*
*> \date December 2016
*
*> \ingroup aux_blas
*
* =====================================================================
LOGICAL FUNCTION LSAME(CA,CB)
*
* -- Reference BLAS level1 routine (version 3.1) --
* -- Reference BLAS is a software package provided by Univ. of Tennessee, --
* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
* December 2016
*
* .. Scalar Arguments ..
CHARACTER CA,CB
* ..
*
* =====================================================================
*
* .. Intrinsic Functions ..
INTRINSIC ICHAR
* ..
* .. Local Scalars ..
INTEGER INTA,INTB,ZCODE
* ..
*
* Test if the characters are equal
*
LSAME = CA .EQ. CB
IF (LSAME) RETURN
*
* Now test for equivalence if both characters are alphabetic.
*
ZCODE = ICHAR('Z')
*
* Use 'Z' rather than 'A' so that ASCII can be detected on Prime
* machines, on which ICHAR returns a value with bit 8 set.
* ICHAR('A') on Prime machines returns 193 which is the same as
* ICHAR('A') on an EBCDIC machine.
*
INTA = ICHAR(CA)
INTB = ICHAR(CB)
*
IF (ZCODE.EQ.90 .OR. ZCODE.EQ.122) THEN
*
* ASCII is assumed - ZCODE is the ASCII code of either lower or
* upper case 'Z'.
*
IF (INTA.GE.97 .AND. INTA.LE.122) INTA = INTA - 32
IF (INTB.GE.97 .AND. INTB.LE.122) INTB = INTB - 32
*
ELSE IF (ZCODE.EQ.233 .OR. ZCODE.EQ.169) THEN
*
* EBCDIC is assumed - ZCODE is the EBCDIC code of either lower or
* upper case 'Z'.
*
IF (INTA.GE.129 .AND. INTA.LE.137 .OR.
+ INTA.GE.145 .AND. INTA.LE.153 .OR.
+ INTA.GE.162 .AND. INTA.LE.169) INTA = INTA + 64
IF (INTB.GE.129 .AND. INTB.LE.137 .OR.
+ INTB.GE.145 .AND. INTB.LE.153 .OR.
+ INTB.GE.162 .AND. INTB.LE.169) INTB = INTB + 64
*
ELSE IF (ZCODE.EQ.218 .OR. ZCODE.EQ.250) THEN
*
* ASCII is assumed, on Prime machines - ZCODE is the ASCII code
* plus 128 of either lower or upper case 'Z'.
*
IF (INTA.GE.225 .AND. INTA.LE.250) INTA = INTA - 32
IF (INTB.GE.225 .AND. INTB.LE.250) INTB = INTB - 32
END IF
LSAME = INTA .EQ. INTB
*
* RETURN
*
* End of LSAME
*
END

121
third_party/blas/xerbla.c vendored Normal file
View file

@@ -0,0 +1,121 @@
/* xerbla.f -- translated by f2c (version 20191129).
You must link the resulting object file with libf2c:
on Microsoft Windows system, link with libf2c.lib;
on Linux or Unix systems, link with .../path/to/libf2c.a -lm
or, if you install libf2c.a in a standard place, with -lf2c -lm
-- in that order, at the end of the command line, as in
cc *.o -lf2c -lm
Source for libf2c is in /netlib/f2c/libf2c.zip, e.g.,
http://www.netlib.org/f2c/libf2c.zip
*/
#include "third_party/f2c/f2c.h"
/* Table of constant values */
static integer c__1 = 1;
/* > \brief \b XERBLA */
/* =========== DOCUMENTATION =========== */
/* Online html documentation available at */
/* http://www.netlib.org/lapack/explore-html/ */
/* Definition: */
/* =========== */
/* SUBROUTINE XERBLA( SRNAME, INFO ) */
/* .. Scalar Arguments .. */
/* CHARACTER*(*) SRNAME */
/* INTEGER INFO */
/* .. */
/* > \par Purpose: */
/* ============= */
/* > */
/* > \verbatim */
/* > */
/* > XERBLA is an error handler for the LAPACK routines. */
/* > It is called by an LAPACK routine if an input parameter has an */
/* > invalid value. A message is printed and execution stops. */
/* > */
/* > Installers may consider modifying the STOP statement in order to */
/* > call system-specific exception-handling facilities. */
/* > \endverbatim */
/* Arguments: */
/* ========== */
/* > \param[in] SRNAME */
/* > \verbatim */
/* > SRNAME is CHARACTER*(*) */
/* > The name of the routine which called XERBLA. */
/* > \endverbatim */
/* > */
/* > \param[in] INFO */
/* > \verbatim */
/* > INFO is INTEGER */
/* > The position of the invalid parameter in the parameter list */
/* > of the calling routine. */
/* > \endverbatim */
/* Authors: */
/* ======== */
/* > \author Univ. of Tennessee */
/* > \author Univ. of California Berkeley */
/* > \author Univ. of Colorado Denver */
/* > \author NAG Ltd. */
/* > \date December 2016 */
/* > \ingroup aux_blas */
/* ===================================================================== */
/* Subroutine */ int xerbla_(char *srname, integer *info, ftnlen srname_len) {
/* Format strings */
static char fmt_9999[] =
"(\002 ** On entry to \002,a,\002 parameter num"
"ber \002,i2,\002 had \002,\002an illegal value\002)";
/* Builtin functions */
integer s_wsfe(cilist *), do_fio(integer *, char *, ftnlen), e_wsfe(void);
/* Subroutine */ int s_stop(char *, ftnlen);
/* Local variables */
extern doublereal trmlen_(char *, ftnlen);
/* Fortran I/O blocks */
static cilist io___1 = {0, 6, 0, fmt_9999, 0};
/* -- Reference BLAS level1 routine (version 3.7.0) -- */
/* -- Reference BLAS is a software package provided by Univ. of Tennessee,
* -- */
/* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
*/
/* December 2016 */
/* .. Scalar Arguments .. */
/* .. */
/* ===================================================================== */
/* .. Intrinsic Functions .. */
/* INTRINSIC LEN_TRIM */
/* .. */
/* .. Executable Statements .. */
s_wsfe(&io___1);
do_fio(&c__1, srname, (integer)trmlen_(srname, srname_len));
do_fio(&c__1, (char *)&(*info), (ftnlen)sizeof(integer));
e_wsfe();
s_stop("", (ftnlen)0);
/* End of XERBLA */
return 0;
} /* xerbla_ */

89
third_party/blas/xerbla.f vendored Normal file
View file

@@ -0,0 +1,89 @@
*> \brief \b XERBLA
*
* =========== DOCUMENTATION ===========
*
* Online html documentation available at
* http://www.netlib.org/lapack/explore-html/
*
* Definition:
* ===========
*
* SUBROUTINE XERBLA( SRNAME, INFO )
*
* .. Scalar Arguments ..
* CHARACTER*(*) SRNAME
* INTEGER INFO
* ..
*
*
*> \par Purpose:
* =============
*>
*> \verbatim
*>
*> XERBLA is an error handler for the LAPACK routines.
*> It is called by an LAPACK routine if an input parameter has an
*> invalid value. A message is printed and execution stops.
*>
*> Installers may consider modifying the STOP statement in order to
*> call system-specific exception-handling facilities.
*> \endverbatim
*
* Arguments:
* ==========
*
*> \param[in] SRNAME
*> \verbatim
*> SRNAME is CHARACTER*(*)
*> The name of the routine which called XERBLA.
*> \endverbatim
*>
*> \param[in] INFO
*> \verbatim
*> INFO is INTEGER
*> The position of the invalid parameter in the parameter list
*> of the calling routine.
*> \endverbatim
*
* Authors:
* ========
*
*> \author Univ. of Tennessee
*> \author Univ. of California Berkeley
*> \author Univ. of Colorado Denver
*> \author NAG Ltd.
*
*> \date December 2016
*
*> \ingroup aux_blas
*
* =====================================================================
SUBROUTINE XERBLA( SRNAME, INFO )
*
* -- Reference BLAS level1 routine (version 3.7.0) --
* -- Reference BLAS is a software package provided by Univ. of Tennessee, --
* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
* December 2016
*
* .. Scalar Arguments ..
CHARACTER*(*) SRNAME
INTEGER INFO
* ..
*
* =====================================================================
*
* .. Intrinsic Functions ..
* INTRINSIC LEN_TRIM
* ..
* .. Executable Statements ..
*
WRITE( *, FMT = 9999 )SRNAME( 1:TRMLEN( SRNAME ) ), INFO
*
STOP
*
9999 FORMAT( ' ** On entry to ', A, ' parameter number ', I2, ' had ',
$ 'an illegal value' )
*
* End of XERBLA
*
END

125
third_party/bzip2/.clang-format vendored Normal file
View file

@@ -0,0 +1,125 @@
---
Language: Cpp
# BasedOnStyle: WebKit
AccessModifierOffset: -4
AlignAfterOpenBracket: DontAlign
AlignConsecutiveAssignments: false
AlignConsecutiveDeclarations: false
AlignEscapedNewlines: Right
AlignOperands: false
AlignTrailingComments: false
AllowAllArgumentsOnNextLine: true
AllowAllConstructorInitializersOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: true
AllowShortBlocksOnASingleLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: All
AllowShortLambdasOnASingleLine: All
AllowShortIfStatementsOnASingleLine: Never
AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: MultiLine
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
AfterCaseLabel: false
AfterClass: false
AfterControlStatement: false
AfterEnum: false
AfterFunction: true
AfterNamespace: false
AfterObjCDeclaration: false
AfterStruct: false
AfterUnion: false
AfterExternBlock: false
BeforeCatch: false
BeforeElse: false
IndentBraces: false
SplitEmptyFunction: true
SplitEmptyRecord: true
SplitEmptyNamespace: true
BreakBeforeBinaryOperators: All
BreakBeforeBraces: WebKit
BreakBeforeInheritanceComma: false
BreakInheritanceList: BeforeColon
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
BreakConstructorInitializers: BeforeComma
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
ColumnLimit: 80
CommentPragmas: '^ IWYU pragma:'
CompactNamespaces: false
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: false
DerivePointerAlignment: false
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
FixNamespaceComments: false
ForEachMacros:
- foreach
- Q_FOREACH
- BOOST_FOREACH
IncludeBlocks: Preserve
IncludeCategories:
- Regex: '^"(llvm|llvm-c|clang|clang-c)/'
Priority: 2
- Regex: '^(<|"(gtest|gmock|isl|json)/)'
Priority: 3
- Regex: '.*'
Priority: 1
IncludeIsMainRegex: '(Test)?$'
IndentCaseLabels: false
IndentPPDirectives: None
IndentWidth: 4
IndentWrappedFunctionNames: false
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: true
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: Inner
ObjCBinPackProtocolList: Auto
ObjCBlockIndentWidth: 4
ObjCSpaceAfterProperty: true
ObjCSpaceBeforeProtocolList: true
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 19
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyBreakTemplateDeclaration: 10
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 60
PointerAlignment: Right
ReflowComments: true
SortIncludes: true
SortUsingDeclarations: true
SpaceAfterCStyleCast: false
SpaceAfterLogicalNot: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeCpp11BracedList: true
SpaceBeforeCtorInitializerColon: true
SpaceBeforeInheritanceColon: true
SpaceBeforeParens: ControlStatements
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: false
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
Standard: Cpp11
StatementMacros:
- Q_UNUSED
- QT_REQUIRE_VERSION
TabWidth: 4
UseTab: Always
...

35
third_party/bzip2/bzip2.mk vendored Normal file
View file

@@ -0,0 +1,35 @@
#-*-mode:makefile-gmake;indent-tabs-mode:t;tab-width:8;coding:utf-8-*-┐
#───vi: set et ft=make ts=8 tw=8 fenc=utf-8 :vi───────────────────────┘
# Description:
# bzip2 is a compression format.
PKGS += THIRD_PARTY_BZIP2
THIRD_PARTY_BZIP2_BINS = \
o/$(MODE)/third_party/bzip2/µbunzip2.com \
o/$(MODE)/third_party/bzip2/µbunzip2.com.dbg
THIRD_PARTY_BZIP2_OBJS = \
o/$(MODE)/third_party/bzip2/µbunzip2.o
THIRD_PARTY_BZIP2_DEPS := $(call uniq, \
$(LIBC_STR) \
$(LIBC_STDIO))
$(THIRD_PARTY_BZIP2_OBJS): \
DEFAULT_CPPFLAGS += \
-DHAVE_CONFIG_H
o/$(MODE)/third_party/bzip2/µbunzip2.com.dbg: \
$(THIRD_PARTY_BZIP2_DEPS) \
$(THIRD_PARTY_BZIP2_OBJS) \
$(CRT) \
$(APE)
@$(APELINK)
$(THIRD_PARTY_BZIP2_OBJS): \
$(BUILD_FILES) \
third_party/bzip2/bzip2.mk
.PHONY: o/$(MODE)/third_party/bzip2
o/$(MODE)/third_party/bzip2: $(THIRD_PARTY_BZIP2_BINS)

569
third_party/bzip2/µbunzip2.c vendored Normal file
View file

@@ -0,0 +1,569 @@
/* micro-bunzip, a small, simple bzip2 decompression implementation.
Copyright 2003 by Rob Landley (rob@landley.net).
Based on bzip2 decompression code by Julian R Seward (jseward@acm.org),
which also acknowledges contributions by Mike Burrows, David Wheeler,
Peter Fenwick, Alistair Moffat, Radford Neal, Ian H. Witten,
Robert Sedgewick, and Jon L. Bentley.
I hereby release this code under the GNU Library General Public License
(LGPL) version 2, available at http://www.gnu.org/copyleft/lgpl.html
*/
#include "libc/calls/calls.h"
#include "libc/mem/mem.h"
#include "libc/runtime/runtime.h"
#include "libc/stdio/stdio.h"
#include "libc/str/str.h"
#include "libc/sysv/consts/fileno.h"
/* Constants for huffman coding */
#define MAX_GROUPS 6
#define GROUP_SIZE 50 /* 64 would have been more efficient */
#define MAX_HUFCODE_BITS 20 /* Longest huffman code allowed */
#define MAX_SYMBOLS 258 /* 256 literals + RUNA + RUNB */
#define SYMBOL_RUNA 0
#define SYMBOL_RUNB 1
/* Status return values */
#define RETVAL_OK 0
#define RETVAL_LAST_BLOCK (-1)
#define RETVAL_NOT_BZIP_DATA (-2)
#define RETVAL_UNEXPECTED_INPUT_EOF (-3)
#define RETVAL_UNEXPECTED_OUTPUT_EOF (-4)
#define RETVAL_DATA_ERROR (-5)
#define RETVAL_OUT_OF_MEMORY (-6)
#define RETVAL_OBSOLETE_INPUT (-7)
/* Other housekeeping constants */
#define IOBUF_SIZE 4096
char *bunzip_errors[] = { NULL, "Bad file checksum", "Not bzip data",
"Unexpected input EOF", "Unexpected output EOF", "Data error",
"Out of memory", "Obsolete (pre 0.9.5) bzip format not supported." };
/* This is what we know about each huffman coding group */
struct group_data {
int limit[MAX_HUFCODE_BITS], base[MAX_HUFCODE_BITS], permute[MAX_SYMBOLS];
char minLen, maxLen;
};
/* Structure holding all the housekeeping data, including IO buffers and
memory that persists between calls to bunzip */
typedef struct {
/* For I/O error handling */
jmp_buf jmpbuf;
/* Input stream, input buffer, input bit buffer */
int64_t in_fd, inbufCount, inbufPos;
unsigned char *inbuf;
unsigned int inbufBitCount, inbufBits;
/* Output buffer */
char outbuf[IOBUF_SIZE];
int outbufPos;
/* The CRC values stored in the block header and calculated from the data */
unsigned int crc32Table[256], headerCRC, dataCRC, totalCRC;
/* Intermediate buffer and its size (in bytes) */
unsigned int *dbuf, dbufSize;
/* State for interrupting output loop */
int writePos, writeRun, writeCount, writeCurrent;
/* These things are a bit too big to go on the stack */
unsigned char selectors[32768]; /* nSelectors=15 bits */
struct group_data groups[MAX_GROUPS]; /* huffman coding tables */
} bunzip_data;
/* Return the next nnn bits of input. All reads from the compressed
input are done through this function. All reads are big endian */
static unsigned int get_bits(bunzip_data *bd, char bits_wanted)
{
unsigned int bits = 0;
/* If we need to get more data from the byte buffer, do so. (Loop
getting one byte at a time to enforce endianness and avoid
unaligned access.) */
while (bd->inbufBitCount < bits_wanted) {
/* If we need to read more data from file into byte buffer, do so */
if (bd->inbufPos == bd->inbufCount) {
if (!(bd->inbufCount = read(bd->in_fd, bd->inbuf, IOBUF_SIZE)))
longjmp(bd->jmpbuf, RETVAL_UNEXPECTED_INPUT_EOF);
bd->inbufPos = 0;
}
/* Avoid 32-bit overflow (dump bit buffer to top of output) */
if (bd->inbufBitCount >= 24) {
bits = bd->inbufBits & ((1u << bd->inbufBitCount) - 1);
bits_wanted -= bd->inbufBitCount;
bits <<= bits_wanted;
bd->inbufBitCount = 0;
}
/* Grab next 8 bits of input from buffer. */
bd->inbufBits = (bd->inbufBits << 8) | bd->inbuf[bd->inbufPos++];
bd->inbufBitCount += 8;
}
/* Calculate result */
bd->inbufBitCount -= bits_wanted;
bits |= (bd->inbufBits >> bd->inbufBitCount) & ((1u << bits_wanted) - 1);
return bits;
}
/* Decompress a block of text into the intermediate buffer */
static int read_bunzip_data(bunzip_data *bd)
{
struct group_data *hufGroup;
int dbufCount, nextSym, dbufSize, origPtr, groupCount, *base, *limit,
selector, i, j, k, t, runPos, symCount, symTotal, nSelectors,
byteCount[256];
unsigned char uc, symToByte[256], mtfSymbol[256], *selectors;
unsigned int *dbuf;
/* Read in header signature (borrowing mtfSymbol for temp space). */
for (i = 0; i < 6; i++)
mtfSymbol[i] = get_bits(bd, 8);
mtfSymbol[6] = 0;
/* Read CRC (which is stored big endian). */
bd->headerCRC = get_bits(bd, 32);
/* Is this the last block (with CRC for file)? */
if (!strcmp((char *)mtfSymbol, "\x17\x72\x45\x38\x50\x90"))
return RETVAL_LAST_BLOCK;
/* If it's not a valid data block, barf. */
if (strcmp((char *)mtfSymbol, "\x31\x41\x59\x26\x53\x59"))
return RETVAL_NOT_BZIP_DATA;
dbuf = bd->dbuf;
dbufSize = bd->dbufSize;
selectors = bd->selectors;
/* We can add support for blockRandomised if anybody complains.
There was some code for this in busybox 1.0.0-pre3, but nobody
ever noticed that it didn't actually work. */
if (get_bits(bd, 1))
return RETVAL_OBSOLETE_INPUT;
if ((origPtr = get_bits(bd, 24)) > dbufSize)
return RETVAL_DATA_ERROR;
/* mapping table: if some byte values are never used (encoding
things like ascii text), the compression code removes the gaps to
have fewer symbols to deal with, and writes a sparse bitfield
indicating which values were present. We make a translation table
to convert the symbols back to the corresponding bytes. */
t = get_bits(bd, 16);
memset(symToByte, 0, 256);
symTotal = 0;
for (i = 0; i < 16; i++) {
if (t & (1u << (15 - i))) {
k = get_bits(bd, 16);
for (j = 0; j < 16; j++)
if (k & (1u << (15 - j)))
symToByte[symTotal++] = (16 * i) + j;
}
}
/* How many different huffman coding groups does this block use? */
groupCount = get_bits(bd, 3);
if (groupCount < 2 || groupCount > MAX_GROUPS)
return RETVAL_DATA_ERROR;
/* nSelectors: Every GROUP_SIZE many symbols we select a new huffman
coding group. Read in the group selector list, which is stored as
MTF encoded bit runs. */
if (!(nSelectors = get_bits(bd, 15)))
return RETVAL_DATA_ERROR;
for (i = 0; i < groupCount; i++)
mtfSymbol[i] = i;
for (i = 0; i < nSelectors; i++) {
/* Get next value */
for (j = 0; get_bits(bd, 1); j++)
if (j >= groupCount)
return RETVAL_DATA_ERROR;
/* Decode MTF to get the next selector */
uc = mtfSymbol[j];
memmove(mtfSymbol + 1, mtfSymbol, j);
mtfSymbol[0] = selectors[i] = uc;
}
/* Read the huffman coding tables for each group, which code for symTotal
literal symbols, plus two run symbols (RUNA, RUNB) */
symCount = symTotal + 2;
for (j = 0; j < groupCount; j++) {
unsigned char length[MAX_SYMBOLS], temp[MAX_HUFCODE_BITS + 1];
int minLen, maxLen, pp;
/* Read lengths */
t = get_bits(bd, 5);
for (i = 0; i < symCount; i++) {
for (;;) {
if (t < 1 || t > MAX_HUFCODE_BITS)
return RETVAL_DATA_ERROR;
if (!get_bits(bd, 1))
break;
if (!get_bits(bd, 1))
t++;
else
t--;
}
length[i] = t;
}
/* Find largest and smallest lengths in this group */
minLen = maxLen = length[0];
for (i = 1; i < symCount; i++) {
if (length[i] > maxLen)
maxLen = length[i];
else if (length[i] < minLen)
minLen = length[i];
}
/* Calculate permute[], base[], and limit[] tables from length[].
*
* permute[] is the lookup table for converting huffman coded symbols
* into decoded symbols. base[] is the amount to subtract from the
* value of a huffman symbol of a given length when using permute[].
*
* limit[] indicates the largest numerical value a symbol with a given
* number of bits can have. It lets us know when to stop reading.
*
* To use these, keep reading bits until value<=limit[bitcount] or
* you've read over 20 bits (error). Then the decoded symbol
* equals permute[hufcode_value-base[hufcode_bitcount]].
*/
hufGroup = bd->groups + j;
hufGroup->minLen = minLen;
hufGroup->maxLen = maxLen;
/* Note that minLen can't be smaller than 1, so we adjust the
base and limit array pointers so we're not always wasting the
first entry. We do this again when using them (during symbol
decoding).*/
base = hufGroup->base - 1;
limit = hufGroup->limit - 1;
/* Calculate permute[] */
pp = 0;
for (i = minLen; i <= maxLen; i++)
for (t = 0; t < symCount; t++)
if (length[t] == i)
hufGroup->permute[pp++] = t;
/* Count cumulative symbols coded for at each bit length */
for (i = minLen; i <= maxLen; i++)
temp[i] = limit[i] = 0;
for (i = 0; i < symCount; i++)
temp[length[i]]++;
/* Calculate limit[] (the largest symbol-coding value at each
bit length, which is (previous limit<<1)+symbols at this
level), and base[] (number of symbols to ignore at each bit
length, which is limit-cumulative count of symbols coded for
already). */
pp = t = 0;
for (i = minLen; i < maxLen; i++) {
pp += temp[i];
limit[i] = pp - 1;
pp <<= 1;
base[i + 1] = pp - (t += temp[i]);
}
limit[maxLen] = pp + temp[maxLen] - 1;
base[minLen] = 0;
}
/* We've finished reading and digesting the block header. Now read this
block's huffman coded symbols from the file and undo the huffman
coding and run length encoding, saving the result into
dbuf[dbufCount++]=uc */
/* Initialize symbol occurrence counters and symbol mtf table */
memset(byteCount, 0, 256 * sizeof(int));
for (i = 0; i < 256; i++)
mtfSymbol[i] = (unsigned char)i;
/* Loop through compressed symbols */
runPos = dbufCount = symCount = selector = 0;
for (;;) {
/* Determine which huffman coding group to use. */
if (!(symCount--)) {
symCount = GROUP_SIZE - 1;
if (selector >= nSelectors)
return RETVAL_DATA_ERROR;
hufGroup = bd->groups + selectors[selector++];
base = hufGroup->base - 1;
limit = hufGroup->limit - 1;
}
/* Read next huffman-coded symbol */
i = hufGroup->minLen;
j = get_bits(bd, i);
for (;;) {
if (i > hufGroup->maxLen)
return RETVAL_DATA_ERROR;
if (j <= limit[i])
break;
i++;
j = (j << 1) | get_bits(bd, 1);
}
/* Huffman decode nextSym (with bounds checking) */
j -= base[i];
if (j < 0 || j >= MAX_SYMBOLS)
return RETVAL_DATA_ERROR;
nextSym = hufGroup->permute[j];
/* If this is a repeated run, loop collecting data */
if (nextSym == SYMBOL_RUNA || nextSym == SYMBOL_RUNB) {
/* If this is the start of a new run, zero out counter */
if (!runPos) {
runPos = 1;
t = 0;
}
/* Neat trick that saves 1 symbol: instead of or-ing 0 or 1
at each bit position, add 1 or 2 instead. For example,
1011 is 1<<0 + 1<<1 + 2<<2. 1010 is 2<<0 + 2<<1 + 1<<2.
You can make any bit pattern that way using 1 less symbol
than the basic or 0/1 method (except all bits 0, which
would use no symbols, but a run of length 0 doesn't mean
anything in this context). Thus space is saved. */
if (nextSym == SYMBOL_RUNA)
t += runPos;
else
t += 2 * runPos;
runPos <<= 1;
continue;
}
/* When we hit the first non-run symbol after a run, we now know
how many times to repeat the last literal, so append that
many copies to our buffer of decoded symbols (dbuf) now. (The
last literal used is the one at the head of the mtfSymbol
array.) */
if (runPos) {
runPos = 0;
if (dbufCount + t >= dbufSize)
return RETVAL_DATA_ERROR;
uc = symToByte[mtfSymbol[0]];
byteCount[uc] += t;
while (t--)
dbuf[dbufCount++] = uc;
}
/* Is this the terminating symbol? */
if (nextSym > symTotal)
break;
/* At this point, the symbol we just decoded indicates a new
literal character. Subtract one to get the position in the
MTF array at which this literal is currently to be found.
(Note that the result can't be -1 or 0, because 0 and 1 are
RUNA and RUNB. Another instance of the first symbol in the
mtf array, position 0, would have been handled as part of a
run.) */
if (dbufCount >= dbufSize)
return RETVAL_DATA_ERROR;
i = nextSym - 1;
uc = mtfSymbol[i];
memmove(mtfSymbol + 1, mtfSymbol, i);
mtfSymbol[0] = uc;
uc = symToByte[uc];
/* We have our literal byte. Save it into dbuf. */
byteCount[uc]++;
dbuf[dbufCount++] = (unsigned int)uc;
}
/* At this point, we've finished reading huffman-coded symbols and
compressed runs from the input stream. There are dbufCount many
of them in dbuf[]. Now undo the Burrows-Wheeler transform on
dbuf. See http://dogma.net/markn/articles/bwt/bwt.htm */
/* Now we know what dbufCount is, do a better sanity check on origPtr. */
if (origPtr < 0 || origPtr >= dbufCount)
return RETVAL_DATA_ERROR;
/* Turn byteCount into cumulative occurrence counts of 0 to n-1. */
j = 0;
for (i = 0; i < 256; i++) {
k = j + byteCount[i];
byteCount[i] = j;
j = k;
}
/* Figure out what order dbuf would be in if we sorted it. */
for (i = 0; i < dbufCount; i++) {
uc = (unsigned char)(dbuf[i] & 0xff);
dbuf[byteCount[uc]] |= (i << 8);
byteCount[uc]++;
}
/* blockRandomised support would go here. */
/* Using i as position, j as previous character, t as current character,
and uc as run count */
bd->dataCRC = 0xffffffffL;
/* Decode first byte by hand to initialize "previous" byte. Note
that it doesn't get output, and if the first three characters are
identical it doesn't qualify as a run (hence uc=255, which will
either wrap to 1 or get reset). */
if (dbufCount) {
bd->writePos = dbuf[origPtr];
bd->writeCurrent = (unsigned char)(bd->writePos & 0xff);
bd->writePos >>= 8;
bd->writeRun = -1;
}
bd->writeCount = dbufCount;
return RETVAL_OK;
}
/* Flush output buffer to disk */
static void flush_bunzip_outbuf(bunzip_data *bd, int64_t out_fd)
{
if (bd->outbufPos) {
if (write(out_fd, bd->outbuf, bd->outbufPos) != bd->outbufPos)
longjmp(bd->jmpbuf, RETVAL_UNEXPECTED_OUTPUT_EOF);
bd->outbufPos = 0;
}
}
/* Undo burrows-wheeler transform on intermediate buffer to produce output.
   If len is nonzero, write up to len bytes of data to outbuf; otherwise write to out_fd.
Returns len ? bytes written : RETVAL_OK. Notice all errors negative #'s. */
static int write_bunzip_data(
bunzip_data *bd, int64_t out_fd, char *outbuf, int len)
{
unsigned int *dbuf = bd->dbuf;
int count, pos, current, run, copies, outbyte, previous, gotcount = 0;
for (;;) {
/* If last read was short due to end of file, return last block now */
if (bd->writeCount < 0)
return bd->writeCount;
/* If we need to refill dbuf, do it. */
if (!bd->writeCount) {
int i = read_bunzip_data(bd);
if (i) {
if (i == RETVAL_LAST_BLOCK) {
bd->writeCount = i;
return gotcount;
} else
return i;
}
}
/* Loop generating output */
count = bd->writeCount;
pos = bd->writePos;
current = bd->writeCurrent;
run = bd->writeRun;
while (count) {
/* If somebody (like busybox tar) wants a certain number of
bytes of data from memory instead of written to a file,
humor them */
if (len && bd->outbufPos >= len)
goto dataus_interruptus;
count--;
/* Follow sequence vector to undo Burrows-Wheeler transform */
previous = current;
pos = dbuf[pos];
current = pos & 0xff;
pos >>= 8;
/* Whenever we see 3 consecutive copies of the same byte,
the 4th is a repeat count */
if (run++ == 3) {
copies = current;
outbyte = previous;
current = -1;
} else {
copies = 1;
outbyte = current;
}
/* Output bytes to buffer, flushing to file if necessary */
while (copies--) {
if (bd->outbufPos == IOBUF_SIZE)
flush_bunzip_outbuf(bd, out_fd);
bd->outbuf[bd->outbufPos++] = outbyte;
bd->dataCRC = (bd->dataCRC << 8)
^ bd->crc32Table[(bd->dataCRC >> 24) ^ outbyte];
}
if (current != previous)
run = 0;
}
/* Decompression of this block completed successfully */
bd->dataCRC = ~(bd->dataCRC);
bd->totalCRC
= ((bd->totalCRC << 1) | (bd->totalCRC >> 31)) ^ bd->dataCRC;
/* If this block had a CRC error, force file level CRC error. */
if (bd->dataCRC != bd->headerCRC) {
bd->totalCRC = bd->headerCRC + 1;
return RETVAL_LAST_BLOCK;
}
dataus_interruptus:
bd->writeCount = count;
if (len) {
gotcount += bd->outbufPos;
memcpy(outbuf, bd->outbuf, len);
/* If we got enough data, checkpoint loop state and return */
if ((len -= bd->outbufPos) < 1) {
bd->outbufPos -= len;
if (bd->outbufPos)
memmove(bd->outbuf, bd->outbuf + len, bd->outbufPos);
bd->writePos = pos;
bd->writeCurrent = current;
bd->writeRun = run;
return gotcount;
}
}
}
}
/* Allocate the structure, read file header. If !len, src_fd contains
filehandle to read from. Else inbuf contains data. */
static int start_bunzip(bunzip_data **bdp, int64_t src_fd, char *inbuf, int len)
{
bunzip_data *bd;
unsigned int i, j, c;
/* Figure out how much data to allocate */
i = sizeof(bunzip_data);
if (!len)
i += IOBUF_SIZE;
/* Allocate bunzip_data. Most fields initialize to zero. */
if (!(bd = *bdp = malloc(i)))
return RETVAL_OUT_OF_MEMORY;
memset(bd, 0, sizeof(bunzip_data));
if (len) {
bd->inbuf = (unsigned char *)inbuf;
bd->inbufCount = len;
bd->in_fd = -1;
} else {
bd->inbuf = (unsigned char *)(bd + 1);
bd->in_fd = src_fd;
}
/* Init the CRC32 table (big endian) */
for (i = 0; i < 256; i++) {
c = i << 24;
for (j = 8; j; j--)
c = c & 0x80000000 ? (c << 1) ^ 0x04c11db7 : (c << 1);
bd->crc32Table[i] = c;
}
/* Setup for I/O error handling via longjmp */
i = setjmp(bd->jmpbuf);
if (i)
return i;
/* Ensure that file starts with "BZh" */
for (i = 0; i < 3; i++)
if (get_bits(bd, 8) != "BZh"[i])
return RETVAL_NOT_BZIP_DATA;
/* Next byte ascii '1'-'9', indicates block size in units of 100k of
uncompressed data. Allocate intermediate buffer for block. */
i = get_bits(bd, 8);
if (i < '1' || i > '9')
return RETVAL_NOT_BZIP_DATA;
bd->dbufSize = 100000 * (i - '0');
if (!(bd->dbuf = malloc(bd->dbufSize * sizeof(int))))
return RETVAL_OUT_OF_MEMORY;
return RETVAL_OK;
}
/* Example usage: decompress src_fd to dst_fd. (Stops at end of bzip data,
not end of file.) */
static char *uncompressStream(int64_t src_fd, int64_t dst_fd)
{
bunzip_data *bd;
int i;
if (!(i = start_bunzip(&bd, src_fd, 0, 0))) {
i = write_bunzip_data(bd, dst_fd, 0, 0);
if (i == RETVAL_LAST_BLOCK && bd->headerCRC == bd->totalCRC)
i = RETVAL_OK;
}
flush_bunzip_outbuf(bd, dst_fd);
if (bd->dbuf)
free(bd->dbuf);
free(bd);
return bunzip_errors[-i];
}
int main(int argc, char *argv[])
{
char *err;
if (!(err = uncompressStream(STDIN_FILENO, STDOUT_FILENO))) {
return 0;
} else {
fprintf(stderr, "\n%s\n", err);
return 1;
}
}

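The run-length comment inside read_bunzip_data() above compresses a lot into two example bit patterns. The standalone sketch below (editorial, not vendored code) reproduces that arithmetic: RUNA contributes the current place value, RUNB contributes twice it, and the place value doubles after every symbol.

/* Editorial illustration of the RUNA/RUNB run-length trick documented in
   read_bunzip_data() above. */
#include "libc/stdio/stdio.h"

static int decode_run(const int *syms, int n) { /* syms[i]: 0 = RUNA, 1 = RUNB */
  int t = 0, place = 1;
  for (int i = 0; i < n; i++) {
    t += (syms[i] ? 2 : 1) * place; /* RUNA adds place, RUNB adds 2*place */
    place <<= 1;                    /* next symbol is worth twice as much */
  }
  return t;
}

int main(void) {
  int a[] = {0, 0, 1}; /* RUNA RUNA RUNB -> 1*1 + 1*2 + 2*4 = 11 (binary 1011) */
  int b[] = {1, 1, 0}; /* RUNB RUNB RUNA -> 2*1 + 2*2 + 1*4 = 10 (binary 1010) */
  printf("%d %d\n", decode_run(a, 3), decode_run(b, 3)); /* prints "11 10" */
  return 0;
}
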
32
third_party/compiler_rt/absvdi2.c vendored Normal file
View file

@@ -0,0 +1,32 @@
/* clang-format off */
/*===-- absvdi2.c - Implement __absvdi2 -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
*===----------------------------------------------------------------------===
*
* This file implements __absvdi2 for the compiler_rt library.
*
*===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: absolute value */
/* Effects: aborts if abs(x) < 0 */
COMPILER_RT_ABI di_int
__absvdi2(di_int a)
{
const int N = (int)(sizeof(di_int) * CHAR_BIT);
if (a == ((di_int)1 << (N-1)))
compilerrt_abort();
const di_int t = a >> (N - 1);
return (a ^ t) - t;
}

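The body of __absvdi2 above leans on the two's-complement sign-mask identity: with t = a >> (N-1), t is all ones for negative a and zero otherwise, so (a ^ t) - t flips the bits and adds one exactly when a is negative. Below is a small editorial check of that identity, assuming arithmetic right shift of negative values, which this library also assumes; it is not part of compiler_rt.

/* Editorial sketch: branch-free absolute value via the sign mask. */
#include "libc/stdio/stdio.h"

static long long mask_abs(long long a) {
  long long t = a >> 63; /* t is -1 if a < 0, else 0 */
  return (a ^ t) - t;    /* flips the bits and adds one iff a was negative */
}

int main(void) {
  printf("%lld %lld\n", mask_abs(-7), mask_abs(7)); /* prints "7 7" */
  return 0;
}
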
32
third_party/compiler_rt/absvsi2.c vendored Normal file
View file

@@ -0,0 +1,32 @@
/* clang-format off */
/* ===-- absvsi2.c - Implement __absvsi2 -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __absvsi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: absolute value */
/* Effects: aborts if abs(x) < 0 */
COMPILER_RT_ABI si_int
__absvsi2(si_int a)
{
const int N = (int)(sizeof(si_int) * CHAR_BIT);
if (a == (1 << (N-1)))
compilerrt_abort();
const si_int t = a >> (N - 1);
return (a ^ t) - t;
}

37
third_party/compiler_rt/absvti2.c vendored Normal file
View file

@@ -0,0 +1,37 @@
/* clang-format off */
/* ===-- absvti2.c - Implement __absvti2 -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __absvti2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: absolute value */
/* Effects: aborts if abs(x) < 0 */
COMPILER_RT_ABI ti_int
__absvti2(ti_int a)
{
const int N = (int)(sizeof(ti_int) * CHAR_BIT);
if (a == ((ti_int)1 << (N-1)))
compilerrt_abort();
const ti_int s = a >> (N - 1);
return (a ^ s) - s;
}
#endif /* CRT_HAS_128BIT */

33
third_party/compiler_rt/adddf3.c vendored Normal file
View file

@@ -0,0 +1,33 @@
/* clang-format off */
//===-- lib/adddf3.c - Double-precision addition ------------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements double-precision soft-float addition with the IEEE-754
// default rounding (to nearest, ties to even).
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_add_impl.inc"
COMPILER_RT_ABI double __adddf3(double a, double b){
return __addXf3__(a, b);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI double __aeabi_dadd(double a, double b) {
return __adddf3(a, b);
}
#else
AEABI_RTABI double __aeabi_dadd(double a, double b) COMPILER_RT_ALIAS(__adddf3);
#endif
#endif

33
third_party/compiler_rt/addsf3.c vendored Normal file
View file

@@ -0,0 +1,33 @@
/* clang-format off */
//===-- lib/addsf3.c - Single-precision addition ------------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements single-precision soft-float addition with the IEEE-754
// default rounding (to nearest, ties to even).
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_add_impl.inc"
COMPILER_RT_ABI float __addsf3(float a, float b) {
return __addXf3__(a, b);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI float __aeabi_fadd(float a, float b) {
return __addsf3(a, b);
}
#else
AEABI_RTABI float __aeabi_fadd(float a, float b) COMPILER_RT_ALIAS(__addsf3);
#endif
#endif

28
third_party/compiler_rt/addtf3.c vendored Normal file
View file

@@ -0,0 +1,28 @@
/* clang-format off */
//===-- lib/addtf3.c - Quad-precision addition --------------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements quad-precision soft-float addition with the IEEE-754
// default rounding (to nearest, ties to even).
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
#include "third_party/compiler_rt/fp_add_impl.inc"
COMPILER_RT_ABI long double __addtf3(long double a, long double b){
return __addXf3__(a, b);
}
#endif

48
third_party/compiler_rt/ashldi3.c vendored Normal file
View file

@@ -0,0 +1,48 @@
/* clang-format off */
/* ====-- ashldi3.c - Implement __ashldi3 -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ashldi3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: a << b */
/* Precondition: 0 <= b < bits_in_dword */
COMPILER_RT_ABI di_int
__ashldi3(di_int a, si_int b)
{
const int bits_in_word = (int)(sizeof(si_int) * CHAR_BIT);
dwords input;
dwords result;
input.all = a;
if (b & bits_in_word) /* bits_in_word <= b < bits_in_dword */
{
result.s.low = 0;
result.s.high = input.s.low << (b - bits_in_word);
}
else /* 0 <= b < bits_in_word */
{
if (b == 0)
return a;
result.s.low = input.s.low << b;
result.s.high = (input.s.high << b) | (input.s.low >> (bits_in_word - b));
}
return result.all;
}
#if defined(__ARM_EABI__)
AEABI_RTABI di_int __aeabi_llsl(di_int a, si_int b) COMPILER_RT_ALIAS(__ashldi3);
#endif

48
third_party/compiler_rt/ashlti3.c vendored Normal file
View file

@@ -0,0 +1,48 @@
/* clang-format off */
/* ===-- ashlti3.c - Implement __ashlti3 -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ashlti3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: a << b */
/* Precondition: 0 <= b < bits_in_tword */
COMPILER_RT_ABI ti_int
__ashlti3(ti_int a, si_int b)
{
const int bits_in_dword = (int)(sizeof(di_int) * CHAR_BIT);
twords input;
twords result;
input.all = a;
if (b & bits_in_dword) /* bits_in_dword <= b < bits_in_tword */
{
result.s.low = 0;
result.s.high = input.s.low << (b - bits_in_dword);
}
else /* 0 <= b < bits_in_dword */
{
if (b == 0)
return a;
result.s.low = input.s.low << b;
result.s.high = (input.s.high << b) | (input.s.low >> (bits_in_dword - b));
}
return result.all;
}
#endif /* CRT_HAS_128BIT */

49
third_party/compiler_rt/ashrdi3.c vendored Normal file
View file

@@ -0,0 +1,49 @@
/* clang-format off */
/*===-- ashrdi3.c - Implement __ashrdi3 -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ashrdi3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: arithmetic a >> b */
/* Precondition: 0 <= b < bits_in_dword */
COMPILER_RT_ABI di_int
__ashrdi3(di_int a, si_int b)
{
const int bits_in_word = (int)(sizeof(si_int) * CHAR_BIT);
dwords input;
dwords result;
input.all = a;
if (b & bits_in_word) /* bits_in_word <= b < bits_in_dword */
{
/* result.s.high = input.s.high < 0 ? -1 : 0 */
result.s.high = input.s.high >> (bits_in_word - 1);
result.s.low = input.s.high >> (b - bits_in_word);
}
else /* 0 <= b < bits_in_word */
{
if (b == 0)
return a;
result.s.high = input.s.high >> b;
result.s.low = (input.s.high << (bits_in_word - b)) | (input.s.low >> b);
}
return result.all;
}
#if defined(__ARM_EABI__)
AEABI_RTABI di_int __aeabi_lasr(di_int a, si_int b) COMPILER_RT_ALIAS(__ashrdi3);
#endif

49
third_party/compiler_rt/ashrti3.c vendored Normal file
View file

@@ -0,0 +1,49 @@
/* clang-format off */
/* ===-- ashrti3.c - Implement __ashrti3 -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ashrti3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: arithmetic a >> b */
/* Precondition: 0 <= b < bits_in_tword */
COMPILER_RT_ABI ti_int
__ashrti3(ti_int a, si_int b)
{
const int bits_in_dword = (int)(sizeof(di_int) * CHAR_BIT);
twords input;
twords result;
input.all = a;
if (b & bits_in_dword) /* bits_in_dword <= b < bits_in_tword */
{
/* result.s.high = input.s.high < 0 ? -1 : 0 */
result.s.high = input.s.high >> (bits_in_dword - 1);
result.s.low = input.s.high >> (b - bits_in_dword);
}
else /* 0 <= b < bits_in_dword */
{
if (b == 0)
return a;
result.s.high = input.s.high >> b;
result.s.low = (input.s.high << (bits_in_dword - b)) | (input.s.low >> b);
}
return result.all;
}
#endif /* CRT_HAS_128BIT */

205
third_party/compiler_rt/assembly.h vendored Normal file
View file

@@ -0,0 +1,205 @@
/* clang-format off */
/* ===-- assembly.h - compiler-rt assembler support macros -----------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file defines macros for use in compiler-rt assembler source.
* This file is not part of the interface of this library.
*
* ===----------------------------------------------------------------------===
*/
#ifndef COMPILERRT_ASSEMBLY_H
#define COMPILERRT_ASSEMBLY_H
#if defined(__POWERPC__) || defined(__powerpc__) || defined(__ppc__)
#define SEPARATOR @
#else
#define SEPARATOR ;
#endif
#if defined(__APPLE__)
#define HIDDEN(name) .private_extern name
#define LOCAL_LABEL(name) L_##name
// tell linker it can break up file at label boundaries
#define FILE_LEVEL_DIRECTIVE .subsections_via_symbols
#define SYMBOL_IS_FUNC(name)
#define CONST_SECTION .const
#define NO_EXEC_STACK_DIRECTIVE
#elif defined(__ELF__)
#define HIDDEN(name) .hidden name
#define LOCAL_LABEL(name) .L_##name
#define FILE_LEVEL_DIRECTIVE
#if defined(__arm__)
#define SYMBOL_IS_FUNC(name) .type name,%function
#else
#define SYMBOL_IS_FUNC(name) .type name,@function
#endif
#define CONST_SECTION .section .rodata
#if defined(__GNU__) || defined(__FreeBSD__) || defined(__Fuchsia__) || \
defined(__linux__)
#define NO_EXEC_STACK_DIRECTIVE .section .note.GNU-stack,"",%progbits
#else
#define NO_EXEC_STACK_DIRECTIVE
#endif
#else // !__APPLE__ && !__ELF__
#define HIDDEN(name)
#define LOCAL_LABEL(name) .L ## name
#define FILE_LEVEL_DIRECTIVE
#define SYMBOL_IS_FUNC(name) \
.def name SEPARATOR \
.scl 2 SEPARATOR \
.type 32 SEPARATOR \
.endef
#define CONST_SECTION .section .rdata,"rd"
#define NO_EXEC_STACK_DIRECTIVE
#endif
#if defined(__arm__)
/*
* Determine actual [ARM][THUMB[1][2]] ISA using compiler predefined macros:
* - for '-mthumb -march=armv6' compiler defines '__thumb__'
* - for '-mthumb -march=armv7' compiler defines '__thumb__' and '__thumb2__'
*/
#if defined(__thumb2__) || defined(__thumb__)
#define DEFINE_CODE_STATE .thumb SEPARATOR
#define DECLARE_FUNC_ENCODING .thumb_func SEPARATOR
#if defined(__thumb2__)
#define USE_THUMB_2
#define IT(cond) it cond
#define ITT(cond) itt cond
#define ITE(cond) ite cond
#else
#define USE_THUMB_1
#define IT(cond)
#define ITT(cond)
#define ITE(cond)
#endif // defined(__thumb__2)
#else // !defined(__thumb2__) && !defined(__thumb__)
#define DEFINE_CODE_STATE .arm SEPARATOR
#define DECLARE_FUNC_ENCODING
#define IT(cond)
#define ITT(cond)
#define ITE(cond)
#endif
#if defined(USE_THUMB_1) && defined(USE_THUMB_2)
#error "USE_THUMB_1 and USE_THUMB_2 can't be defined together."
#endif
#if defined(__ARM_ARCH_4T__) || __ARM_ARCH >= 5
#define ARM_HAS_BX
#endif
#if !defined(__ARM_FEATURE_CLZ) && !defined(USE_THUMB_1) && \
(__ARM_ARCH >= 6 || (__ARM_ARCH == 5 && !defined(__ARM_ARCH_5__)))
#define __ARM_FEATURE_CLZ
#endif
#ifdef ARM_HAS_BX
#define JMP(r) bx r
#define JMPc(r, c) bx##c r
#else
#define JMP(r) mov pc, r
#define JMPc(r, c) mov##c pc, r
#endif
// pop {pc} can't switch Thumb mode on ARMv4T
#if __ARM_ARCH >= 5
#define POP_PC() pop {pc}
#else
#define POP_PC() \
pop {ip}; \
JMP(ip)
#endif
#if defined(USE_THUMB_2)
#define WIDE(op) op.w
#else
#define WIDE(op) op
#endif
#else // !defined(__arm)
#define DECLARE_FUNC_ENCODING
#define DEFINE_CODE_STATE
#endif
#define GLUE2(a, b) a##b
#define GLUE(a, b) GLUE2(a, b)
#define SYMBOL_NAME(name) GLUE(__USER_LABEL_PREFIX__, name)
#ifdef VISIBILITY_HIDDEN
#define DECLARE_SYMBOL_VISIBILITY(name) \
HIDDEN(SYMBOL_NAME(name)) SEPARATOR
#else
#define DECLARE_SYMBOL_VISIBILITY(name)
#endif
#define DEFINE_COMPILERRT_FUNCTION(name) \
DEFINE_CODE_STATE \
FILE_LEVEL_DIRECTIVE SEPARATOR \
.globl SYMBOL_NAME(name) SEPARATOR \
SYMBOL_IS_FUNC(SYMBOL_NAME(name)) SEPARATOR \
DECLARE_SYMBOL_VISIBILITY(name) \
DECLARE_FUNC_ENCODING \
SYMBOL_NAME(name):
#define DEFINE_COMPILERRT_THUMB_FUNCTION(name) \
DEFINE_CODE_STATE \
FILE_LEVEL_DIRECTIVE SEPARATOR \
.globl SYMBOL_NAME(name) SEPARATOR \
SYMBOL_IS_FUNC(SYMBOL_NAME(name)) SEPARATOR \
DECLARE_SYMBOL_VISIBILITY(name) SEPARATOR \
.thumb_func SEPARATOR \
SYMBOL_NAME(name):
#define DEFINE_COMPILERRT_PRIVATE_FUNCTION(name) \
DEFINE_CODE_STATE \
FILE_LEVEL_DIRECTIVE SEPARATOR \
.globl SYMBOL_NAME(name) SEPARATOR \
SYMBOL_IS_FUNC(SYMBOL_NAME(name)) SEPARATOR \
HIDDEN(SYMBOL_NAME(name)) SEPARATOR \
DECLARE_FUNC_ENCODING \
SYMBOL_NAME(name):
#define DEFINE_COMPILERRT_PRIVATE_FUNCTION_UNMANGLED(name) \
DEFINE_CODE_STATE \
.globl name SEPARATOR \
SYMBOL_IS_FUNC(name) SEPARATOR \
HIDDEN(name) SEPARATOR \
DECLARE_FUNC_ENCODING \
name:
#define DEFINE_COMPILERRT_FUNCTION_ALIAS(name, target) \
.globl SYMBOL_NAME(name) SEPARATOR \
SYMBOL_IS_FUNC(SYMBOL_NAME(name)) SEPARATOR \
DECLARE_SYMBOL_VISIBILITY(SYMBOL_NAME(name)) SEPARATOR \
.set SYMBOL_NAME(name), SYMBOL_NAME(target) SEPARATOR
#if defined(__ARM_EABI__)
#define DEFINE_AEABI_FUNCTION_ALIAS(aeabi_name, name) \
DEFINE_COMPILERRT_FUNCTION_ALIAS(aeabi_name, name)
#else
#define DEFINE_AEABI_FUNCTION_ALIAS(aeabi_name, name)
#endif
#ifdef __ELF__
#define END_COMPILERRT_FUNCTION(name) \
.size SYMBOL_NAME(name), . - SYMBOL_NAME(name)
#else
#define END_COMPILERRT_FUNCTION(name)
#endif
#endif /* COMPILERRT_ASSEMBLY_H */

30
third_party/compiler_rt/bswapdi2.c vendored Normal file
View file

@ -0,0 +1,30 @@
/* clang-format off */
/* ===-- bswapdi2.c - Implement __bswapdi2 ---------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __bswapdi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
COMPILER_RT_ABI uint64_t __bswapdi2(uint64_t u) {
return (
(((u)&0xff00000000000000ULL) >> 56) |
(((u)&0x00ff000000000000ULL) >> 40) |
(((u)&0x0000ff0000000000ULL) >> 24) |
(((u)&0x000000ff00000000ULL) >> 8) |
(((u)&0x00000000ff000000ULL) << 8) |
(((u)&0x0000000000ff0000ULL) << 24) |
(((u)&0x000000000000ff00ULL) << 40) |
(((u)&0x00000000000000ffULL) << 56));
}
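
A quick sanity check of the mask-and-shift formulation above: the standalone sketch below (not part of the imported sources; assumes a GCC/Clang host so __builtin_bswap64 is available, and the helper name is illustrative) confirms it matches the compiler's byte-swap builtin.

/* Standalone sketch: verify the mask-and-shift byte swap against the builtin. */
#include <assert.h>
#include <stdint.h>

static uint64_t bswap64_by_masks(uint64_t u) {
  return (((u) & 0xff00000000000000ULL) >> 56) |
         (((u) & 0x00ff000000000000ULL) >> 40) |
         (((u) & 0x0000ff0000000000ULL) >> 24) |
         (((u) & 0x000000ff00000000ULL) >> 8)  |
         (((u) & 0x00000000ff000000ULL) << 8)  |
         (((u) & 0x0000000000ff0000ULL) << 24) |
         (((u) & 0x000000000000ff00ULL) << 40) |
         (((u) & 0x00000000000000ffULL) << 56);
}

int main(void) {
  uint64_t x = 0x0123456789abcdefULL;
  assert(bswap64_by_masks(x) == __builtin_bswap64(x));  /* same bytes, reversed order */
  assert(bswap64_by_masks(x) == 0xefcdab8967452301ULL);
  return 0;
}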

26
third_party/compiler_rt/bswapsi2.c vendored Normal file
View file

@ -0,0 +1,26 @@
/* clang-format off */
/* ===-- bswapsi2.c - Implement __bswapsi2 ---------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __bswapsi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
COMPILER_RT_ABI uint32_t __bswapsi2(uint32_t u) {
return (
(((u)&0xff000000) >> 24) |
(((u)&0x00ff0000) >> 8) |
(((u)&0x0000ff00) << 8) |
(((u)&0x000000ff) << 24));
}

183
third_party/compiler_rt/clear_cache.c vendored Normal file
View file

@ -0,0 +1,183 @@
/* clang-format off */
/* ===-- clear_cache.c - Implement __clear_cache ---------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#if __APPLE__
#include <libkern/OSCacheControl.h>
#endif
#if defined(_WIN32)
/* Forward declare Win32 APIs since the GCC mode driver does not handle the
newer SDKs as well as needed. */
uint32_t FlushInstructionCache(uintptr_t hProcess, void *lpBaseAddress,
uintptr_t dwSize);
uintptr_t GetCurrentProcess(void);
#endif
#if defined(__linux__) && defined(__mips__)
#if defined(__ANDROID__) && defined(__LP64__)
/*
* clear_mips_cache - Invalidates instruction cache for Mips.
*/
static void clear_mips_cache(const void* Addr, size_t Size) {
__asm__ volatile (
".set push\n"
".set noreorder\n"
".set noat\n"
"beq %[Size], $zero, 20f\n" /* If size == 0, branch around. */
"nop\n"
"daddu %[Size], %[Addr], %[Size]\n" /* Calculate end address + 1 */
"rdhwr $v0, $1\n" /* Get step size for SYNCI.
$1 is $HW_SYNCI_Step */
"beq $v0, $zero, 20f\n" /* If no caches require
synchronization, branch
around. */
"nop\n"
"10:\n"
"synci 0(%[Addr])\n" /* Synchronize all caches around
address. */
"daddu %[Addr], %[Addr], $v0\n" /* Add step size. */
"sltu $at, %[Addr], %[Size]\n" /* Compare current with end
address. */
"bne $at, $zero, 10b\n" /* Branch if more to do. */
"nop\n"
"sync\n" /* Clear memory hazards. */
"20:\n"
"bal 30f\n"
"nop\n"
"30:\n"
"daddiu $ra, $ra, 12\n" /* $ra has a value of $pc here.
Add offset of 12 to point to the
instruction after the last nop.
*/
"jr.hb $ra\n" /* Return, clearing instruction
hazards. */
"nop\n"
".set pop\n"
: [Addr] "+r"(Addr), [Size] "+r"(Size)
:: "at", "ra", "v0", "memory"
);
}
#endif
#endif
/*
* The compiler generates calls to __clear_cache() when creating
* trampoline functions on the stack for use with nested functions.
* It is expected to invalidate the instruction cache for the
* specified range.
*/
void __clear_cache(void *start, void *end) {
#if __i386__ || __x86_64__ || defined(_M_IX86) || defined(_M_X64)
/*
* Intel processors have a unified instruction and data cache
* so there is nothing to do
*/
#elif defined(_WIN32) && (defined(__arm__) || defined(__aarch64__))
FlushInstructionCache(GetCurrentProcess(), start, end - start);
#elif defined(__arm__) && !defined(__APPLE__)
#if defined(__FreeBSD__) || defined(__NetBSD__)
struct arm_sync_icache_args arg;
arg.addr = (uintptr_t)start;
arg.len = (uintptr_t)end - (uintptr_t)start;
sysarch(ARM_SYNC_ICACHE, &arg);
#elif defined(__linux__)
/*
* We used to include asm/unistd.h for the __ARM_NR_cacheflush define, but
* it also brought many other unused defines, as well as a dependency on
* kernel headers to be installed.
*
* This value is stable at least since Linux 3.13 and should remain so for
 * compatibility reasons, warranting its re-definition here.
*/
#define __ARM_NR_cacheflush 0x0f0002
register int start_reg __asm("r0") = (int) (intptr_t) start;
const register int end_reg __asm("r1") = (int) (intptr_t) end;
const register int flags __asm("r2") = 0;
const register int syscall_nr __asm("r7") = __ARM_NR_cacheflush;
__asm __volatile("svc 0x0"
: "=r"(start_reg)
: "r"(syscall_nr), "r"(start_reg), "r"(end_reg),
"r"(flags));
assert(start_reg == 0 && "Cache flush syscall failed.");
#else
compilerrt_abort();
#endif
#elif defined(__linux__) && defined(__mips__)
const uintptr_t start_int = (uintptr_t) start;
const uintptr_t end_int = (uintptr_t) end;
#if defined(__ANDROID__) && defined(__LP64__)
// Call synci implementation for short address range.
const uintptr_t address_range_limit = 256;
if ((end_int - start_int) <= address_range_limit) {
clear_mips_cache(start, (end_int - start_int));
} else {
syscall(__NR_cacheflush, start, (end_int - start_int), BCACHE);
}
#else
syscall(__NR_cacheflush, start, (end_int - start_int), BCACHE);
#endif
#elif defined(__mips__) && defined(__OpenBSD__)
cacheflush(start, (uintptr_t)end - (uintptr_t)start, BCACHE);
#elif defined(__aarch64__) && !defined(__APPLE__)
uint64_t xstart = (uint64_t)(uintptr_t) start;
uint64_t xend = (uint64_t)(uintptr_t) end;
uint64_t addr;
// Get Cache Type Info
uint64_t ctr_el0;
__asm __volatile("mrs %0, ctr_el0" : "=r"(ctr_el0));
/*
* dc & ic instructions must use 64bit registers so we don't use
* uintptr_t in case this runs in an IPL32 environment.
*/
const size_t dcache_line_size = 4 << ((ctr_el0 >> 16) & 15);
for (addr = xstart & ~(dcache_line_size - 1); addr < xend;
addr += dcache_line_size)
__asm __volatile("dc cvau, %0" :: "r"(addr));
__asm __volatile("dsb ish");
const size_t icache_line_size = 4 << ((ctr_el0 >> 0) & 15);
for (addr = xstart & ~(icache_line_size - 1); addr < xend;
addr += icache_line_size)
__asm __volatile("ic ivau, %0" :: "r"(addr));
__asm __volatile("isb sy");
#elif defined (__powerpc64__)
const size_t line_size = 32;
const size_t len = (uintptr_t)end - (uintptr_t)start;
const uintptr_t mask = ~(line_size - 1);
const uintptr_t start_line = ((uintptr_t)start) & mask;
const uintptr_t end_line = ((uintptr_t)start + len + line_size - 1) & mask;
for (uintptr_t line = start_line; line < end_line; line += line_size)
__asm__ volatile("dcbf 0, %0" : : "r"(line));
__asm__ volatile("sync");
for (uintptr_t line = start_line; line < end_line; line += line_size)
__asm__ volatile("icbi 0, %0" : : "r"(line));
__asm__ volatile("isync");
#else
#if __APPLE__
/* On Darwin, sys_icache_invalidate() provides this functionality */
sys_icache_invalidate(start, end-start);
#else
compilerrt_abort();
#endif
#endif
}
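
To see why __clear_cache exists, the hedged JIT-style sketch below (not part of the imported sources; assumes a Linux x86-64 host that still permits PROT_WRITE|PROT_EXEC mappings, and uses the GCC/Clang __builtin___clear_cache wrapper) copies machine code into an executable buffer and invalidates the instruction cache for that range before calling it. On x86 the flush is a no-op, but on ARM/AArch64 skipping it can execute stale instructions.

/* Standalone sketch: write code, flush the i-cache for its range, then run it. */
#include <string.h>
#include <sys/mman.h>

typedef int (*fn_t)(void);

int main(void) {
  /* x86-64 bytes for: mov eax, 42; ret  (host architecture is an assumption) */
  static const unsigned char code[] = {0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3};
  void *buf = mmap(0, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (buf == MAP_FAILED) return 1;      /* W^X policies may refuse this mapping */
  memcpy(buf, code, sizeof(code));
  __builtin___clear_cache((char *)buf, (char *)buf + sizeof(code));
  return ((fn_t)buf)() == 42 ? 0 : 1;
}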

43
third_party/compiler_rt/clzdi2.c vendored Normal file
View file

@ -0,0 +1,43 @@
/* clang-format off */
/* ===-- clzdi2.c - Implement __clzdi2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __clzdi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: the number of leading 0-bits */
#if !defined(__clang__) && \
((defined(__sparc__) && defined(__arch64__)) || \
defined(__mips64) || \
(defined(__riscv) && __SIZEOF_POINTER__ >= 8))
/* On 64-bit architectures with neither a native clz instruction nor a native
* ctz instruction, gcc resolves __builtin_clz to __clzdi2 rather than
* __clzsi2, leading to infinite recursion. */
#define __builtin_clz(a) __clzsi2(a)
extern si_int __clzsi2(si_int);
#endif
/* Precondition: a != 0 */
COMPILER_RT_ABI si_int
__clzdi2(di_int a)
{
dwords x;
x.all = a;
const si_int f = -(x.s.high == 0);
return __builtin_clz((x.s.high & ~f) | (x.s.low & f)) +
(f & ((si_int)(sizeof(si_int) * CHAR_BIT)));
}
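
The `f = -(x.s.high == 0)` line above is a branchless word select: when the high word is nonzero, f == 0 and the high word is counted; when it is zero, f == -1 selects the low word and adds 32. The standalone sketch below (helper name is illustrative; assumes GCC/Clang builtins) checks that selection against __builtin_clzll.

/* Standalone sketch: 64-bit clz built from a 32-bit clz via branchless select. */
#include <assert.h>
#include <stdint.h>

static int clz64_from_words(uint64_t a) {
  uint32_t high = (uint32_t)(a >> 32);
  uint32_t low = (uint32_t)a;
  uint32_t f = -(uint32_t)(high == 0);         /* all ones iff the high word is zero */
  return __builtin_clz((high & ~f) | (low & f)) + (int)(f & 32);
}

int main(void) {
  uint64_t tests[] = {1, 0x80000000ull, 0x100000000ull, 0xffffffffffffffffull};
  for (int i = 0; i < 4; i++)
    assert(clz64_from_words(tests[i]) == __builtin_clzll(tests[i]));
  return 0;
}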

56
third_party/compiler_rt/clzsi2.c vendored Normal file
View file

@ -0,0 +1,56 @@
/* clang-format off */
/* ===-- clzsi2.c - Implement __clzsi2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __clzsi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: the number of leading 0-bits */
/* Precondition: a != 0 */
COMPILER_RT_ABI si_int
__clzsi2(si_int a)
{
su_int x = (su_int)a;
si_int t = ((x & 0xFFFF0000) == 0) << 4; /* if (x is small) t = 16 else 0 */
x >>= 16 - t; /* x = [0 - 0xFFFF] */
su_int r = t; /* r = [0, 16] */
/* return r + clz(x) */
t = ((x & 0xFF00) == 0) << 3;
x >>= 8 - t; /* x = [0 - 0xFF] */
r += t; /* r = [0, 8, 16, 24] */
/* return r + clz(x) */
t = ((x & 0xF0) == 0) << 2;
x >>= 4 - t; /* x = [0 - 0xF] */
r += t; /* r = [0, 4, 8, 12, 16, 20, 24, 28] */
/* return r + clz(x) */
t = ((x & 0xC) == 0) << 1;
x >>= 2 - t; /* x = [0 - 3] */
r += t; /* r = [0 - 30] and is even */
/* return r + clz(x) */
/* switch (x)
* {
* case 0:
* return r + 2;
* case 1:
* return r + 1;
* case 2:
* case 3:
* return r;
* }
*/
return r + ((2 - x) & -((x & 2) == 0));
}
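
The sequence above is a branchless binary search: each step asks whether the top half of the remaining bits is zero, conditionally shifts, and accumulates the count. The standalone sketch below (helper names are illustrative; nothing from this tree is linked) replays the same steps and compares them against a naive loop over a small range.

/* Standalone sketch: branchless clz vs. a one-bit-at-a-time reference. */
#include <assert.h>
#include <stdint.h>

static int naive_clz32(uint32_t x) {        /* caller guarantees x != 0 */
  int n = 0;
  while (!(x & 0x80000000u)) { x <<= 1; n++; }
  return n;
}

static int branchless_clz32(uint32_t x) {
  int t, r;
  t = ((x & 0xFFFF0000u) == 0) << 4; x >>= 16 - t; r = t;
  t = ((x & 0xFF00u) == 0) << 3;     x >>= 8 - t;  r += t;
  t = ((x & 0xF0u) == 0) << 2;       x >>= 4 - t;  r += t;
  t = ((x & 0xCu) == 0) << 1;        x >>= 2 - t;  r += t;
  return r + ((2 - x) & -((x & 2) == 0));
}

int main(void) {
  for (uint32_t x = 1; x < (1u << 20); x++)
    assert(naive_clz32(x) == branchless_clz32(x));
  return 0;
}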

36
third_party/compiler_rt/clzti2.c vendored Normal file
View file

@ -0,0 +1,36 @@
/* clang-format off */
/* ===-- clzti2.c - Implement __clzti2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __clzti2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: the number of leading 0-bits */
/* Precondition: a != 0 */
COMPILER_RT_ABI si_int
__clzti2(ti_int a)
{
twords x;
x.all = a;
const di_int f = -(x.s.high == 0);
return __builtin_clzll((x.s.high & ~f) | (x.s.low & f)) +
((si_int)f & ((si_int)(sizeof(di_int) * CHAR_BIT)));
}
#endif /* CRT_HAS_128BIT */

54
third_party/compiler_rt/cmpdi2.c vendored Normal file
View file

@ -0,0 +1,54 @@
/* clang-format off */
/* ===-- cmpdi2.c - Implement __cmpdi2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __cmpdi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: if (a < b) returns 0
* if (a == b) returns 1
* if (a > b) returns 2
*/
COMPILER_RT_ABI si_int
__cmpdi2(di_int a, di_int b)
{
dwords x;
x.all = a;
dwords y;
y.all = b;
if (x.s.high < y.s.high)
return 0;
if (x.s.high > y.s.high)
return 2;
if (x.s.low < y.s.low)
return 0;
if (x.s.low > y.s.low)
return 2;
return 1;
}
#ifdef __ARM_EABI__
/* Returns: if (a < b) returns -1
* if (a == b) returns 0
* if (a > b) returns 1
*/
COMPILER_RT_ABI si_int
__aeabi_lcmp(di_int a, di_int b)
{
return __cmpdi2(a, b) - 1;
}
#endif

45
third_party/compiler_rt/cmpti2.c vendored Normal file
View file

@ -0,0 +1,45 @@
/* clang-format off */
/* ===-- cmpti2.c - Implement __cmpti2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __cmpti2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: if (a < b) returns 0
* if (a == b) returns 1
* if (a > b) returns 2
*/
COMPILER_RT_ABI si_int
__cmpti2(ti_int a, ti_int b)
{
twords x;
x.all = a;
twords y;
y.all = b;
if (x.s.high < y.s.high)
return 0;
if (x.s.high > y.s.high)
return 2;
if (x.s.low < y.s.low)
return 0;
if (x.s.low > y.s.low)
return 2;
return 1;
}
#endif /* CRT_HAS_128BIT */

156
third_party/compiler_rt/comparedf2.c vendored Normal file
View file

@ -0,0 +1,156 @@
/* clang-format off */
//===-- lib/comparedf2.c - Double-precision comparisons -----------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements the following soft-float comparison routines:
//
// __eqdf2 __gedf2 __unorddf2
// __ledf2 __gtdf2
// __ltdf2
// __nedf2
//
// The semantics of the routines grouped in each column are identical, so there
// is a single implementation for each, and wrappers to provide the other names.
//
// The main routines behave as follows:
//
// __ledf2(a,b) returns -1 if a < b
// 0 if a == b
// 1 if a > b
// 1 if either a or b is NaN
//
// __gedf2(a,b) returns -1 if a < b
// 0 if a == b
// 1 if a > b
// -1 if either a or b is NaN
//
// __unorddf2(a,b) returns 0 if both a and b are numbers
// 1 if either a or b is NaN
//
// Note that __ledf2( ) and __gedf2( ) are identical except in their handling of
// NaN values.
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
enum LE_RESULT {
LE_LESS = -1,
LE_EQUAL = 0,
LE_GREATER = 1,
LE_UNORDERED = 1
};
COMPILER_RT_ABI enum LE_RESULT
__ledf2(fp_t a, fp_t b) {
const srep_t aInt = toRep(a);
const srep_t bInt = toRep(b);
const rep_t aAbs = aInt & absMask;
const rep_t bAbs = bInt & absMask;
// If either a or b is NaN, they are unordered.
if (aAbs > infRep || bAbs > infRep) return LE_UNORDERED;
// If a and b are both zeros, they are equal.
if ((aAbs | bAbs) == 0) return LE_EQUAL;
// If at least one of a and b is positive, we get the same result comparing
// a and b as signed integers as we would with a floating-point compare.
if ((aInt & bInt) >= 0) {
if (aInt < bInt) return LE_LESS;
else if (aInt == bInt) return LE_EQUAL;
else return LE_GREATER;
}
// Otherwise, both are negative, so we need to flip the sense of the
// comparison to get the correct result. (This assumes a twos- or ones-
// complement integer representation; if integers are represented in a
// sign-magnitude representation, then this flip is incorrect).
else {
if (aInt > bInt) return LE_LESS;
else if (aInt == bInt) return LE_EQUAL;
else return LE_GREATER;
}
}
#if defined(__ELF__)
// Alias for libgcc compatibility
FNALIAS(__cmpdf2, __ledf2);
#endif
enum GE_RESULT {
GE_LESS = -1,
GE_EQUAL = 0,
GE_GREATER = 1,
GE_UNORDERED = -1 // Note: different from LE_UNORDERED
};
COMPILER_RT_ABI enum GE_RESULT
__gedf2(fp_t a, fp_t b) {
const srep_t aInt = toRep(a);
const srep_t bInt = toRep(b);
const rep_t aAbs = aInt & absMask;
const rep_t bAbs = bInt & absMask;
if (aAbs > infRep || bAbs > infRep) return GE_UNORDERED;
if ((aAbs | bAbs) == 0) return GE_EQUAL;
if ((aInt & bInt) >= 0) {
if (aInt < bInt) return GE_LESS;
else if (aInt == bInt) return GE_EQUAL;
else return GE_GREATER;
} else {
if (aInt > bInt) return GE_LESS;
else if (aInt == bInt) return GE_EQUAL;
else return GE_GREATER;
}
}
COMPILER_RT_ABI int
__unorddf2(fp_t a, fp_t b) {
const rep_t aAbs = toRep(a) & absMask;
const rep_t bAbs = toRep(b) & absMask;
return aAbs > infRep || bAbs > infRep;
}
// The following are alternative names for the preceding routines.
COMPILER_RT_ABI enum LE_RESULT
__eqdf2(fp_t a, fp_t b) {
return __ledf2(a, b);
}
COMPILER_RT_ABI enum LE_RESULT
__ltdf2(fp_t a, fp_t b) {
return __ledf2(a, b);
}
COMPILER_RT_ABI enum LE_RESULT
__nedf2(fp_t a, fp_t b) {
return __ledf2(a, b);
}
COMPILER_RT_ABI enum GE_RESULT
__gtdf2(fp_t a, fp_t b) {
return __gedf2(a, b);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI int __aeabi_dcmpun(fp_t a, fp_t b) {
return __unorddf2(a, b);
}
#else
AEABI_RTABI int __aeabi_dcmpun(fp_t a, fp_t b) COMPILER_RT_ALIAS(__unorddf2);
#endif
#endif
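
The trick in __ledf2/__gedf2 above is that IEEE-754 bit patterns order the same way as signed integers when at least one operand is non-negative, and in the reverse direction when both are negative; zeros and NaNs are peeled off first. The standalone model below (hypothetical helper model_ledf2; assumes 64-bit IEEE-754 doubles and never links the library) checks that convention against the hardware comparison.

/* Standalone sketch: model the __ledf2 bit-pattern comparison and cross-check it. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

static int model_ledf2(double a, double b) {
  int64_t ai, bi;
  memcpy(&ai, &a, sizeof(ai));
  memcpy(&bi, &b, sizeof(bi));
  const uint64_t aAbs = (uint64_t)ai & 0x7fffffffffffffffull;
  const uint64_t bAbs = (uint64_t)bi & 0x7fffffffffffffffull;
  const uint64_t inf = 0x7ff0000000000000ull;
  if (aAbs > inf || bAbs > inf) return 1;   /* unordered: at least one NaN */
  if ((aAbs | bAbs) == 0) return 0;         /* +0 and -0 compare equal */
  if ((ai & bi) >= 0)                       /* at least one operand is >= +0 */
    return ai < bi ? -1 : ai == bi ? 0 : 1;
  else                                      /* both negative: flip the sense */
    return ai > bi ? -1 : ai == bi ? 0 : 1;
}

int main(void) {
  const double v[] = {-3.5, -0.0, 0.0, 1.25, 7e300};
  for (int i = 0; i < 5; i++)
    for (int j = 0; j < 5; j++) {
      int r = model_ledf2(v[i], v[j]);
      assert((v[i] < v[j]) == (r < 0));
      assert((v[i] == v[j]) == (r == 0));
    }
  return 0;
}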

156
third_party/compiler_rt/comparesf2.c vendored Normal file
View file

@ -0,0 +1,156 @@
/* clang-format off */
//===-- lib/comparesf2.c - Single-precision comparisons -----------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements the following soft-float comparison routines:
//
// __eqsf2 __gesf2 __unordsf2
// __lesf2 __gtsf2
// __ltsf2
// __nesf2
//
// The semantics of the routines grouped in each column are identical, so there
// is a single implementation for each, and wrappers to provide the other names.
//
// The main routines behave as follows:
//
// __lesf2(a,b) returns -1 if a < b
// 0 if a == b
// 1 if a > b
// 1 if either a or b is NaN
//
// __gesf2(a,b) returns -1 if a < b
// 0 if a == b
// 1 if a > b
// -1 if either a or b is NaN
//
// __unordsf2(a,b) returns 0 if both a and b are numbers
// 1 if either a or b is NaN
//
// Note that __lesf2( ) and __gesf2( ) are identical except in their handling of
// NaN values.
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
enum LE_RESULT {
LE_LESS = -1,
LE_EQUAL = 0,
LE_GREATER = 1,
LE_UNORDERED = 1
};
COMPILER_RT_ABI enum LE_RESULT
__lesf2(fp_t a, fp_t b) {
const srep_t aInt = toRep(a);
const srep_t bInt = toRep(b);
const rep_t aAbs = aInt & absMask;
const rep_t bAbs = bInt & absMask;
// If either a or b is NaN, they are unordered.
if (aAbs > infRep || bAbs > infRep) return LE_UNORDERED;
// If a and b are both zeros, they are equal.
if ((aAbs | bAbs) == 0) return LE_EQUAL;
// If at least one of a and b is positive, we get the same result comparing
// a and b as signed integers as we would with a floating-point compare.
if ((aInt & bInt) >= 0) {
if (aInt < bInt) return LE_LESS;
else if (aInt == bInt) return LE_EQUAL;
else return LE_GREATER;
}
// Otherwise, both are negative, so we need to flip the sense of the
// comparison to get the correct result. (This assumes a twos- or ones-
// complement integer representation; if integers are represented in a
// sign-magnitude representation, then this flip is incorrect).
else {
if (aInt > bInt) return LE_LESS;
else if (aInt == bInt) return LE_EQUAL;
else return LE_GREATER;
}
}
#if defined(__ELF__)
// Alias for libgcc compatibility
FNALIAS(__cmpsf2, __lesf2);
#endif
enum GE_RESULT {
GE_LESS = -1,
GE_EQUAL = 0,
GE_GREATER = 1,
GE_UNORDERED = -1 // Note: different from LE_UNORDERED
};
COMPILER_RT_ABI enum GE_RESULT
__gesf2(fp_t a, fp_t b) {
const srep_t aInt = toRep(a);
const srep_t bInt = toRep(b);
const rep_t aAbs = aInt & absMask;
const rep_t bAbs = bInt & absMask;
if (aAbs > infRep || bAbs > infRep) return GE_UNORDERED;
if ((aAbs | bAbs) == 0) return GE_EQUAL;
if ((aInt & bInt) >= 0) {
if (aInt < bInt) return GE_LESS;
else if (aInt == bInt) return GE_EQUAL;
else return GE_GREATER;
} else {
if (aInt > bInt) return GE_LESS;
else if (aInt == bInt) return GE_EQUAL;
else return GE_GREATER;
}
}
COMPILER_RT_ABI int
__unordsf2(fp_t a, fp_t b) {
const rep_t aAbs = toRep(a) & absMask;
const rep_t bAbs = toRep(b) & absMask;
return aAbs > infRep || bAbs > infRep;
}
// The following are alternative names for the preceding routines.
COMPILER_RT_ABI enum LE_RESULT
__eqsf2(fp_t a, fp_t b) {
return __lesf2(a, b);
}
COMPILER_RT_ABI enum LE_RESULT
__ltsf2(fp_t a, fp_t b) {
return __lesf2(a, b);
}
COMPILER_RT_ABI enum LE_RESULT
__nesf2(fp_t a, fp_t b) {
return __lesf2(a, b);
}
COMPILER_RT_ABI enum GE_RESULT
__gtsf2(fp_t a, fp_t b) {
return __gesf2(a, b);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI int __aeabi_fcmpun(fp_t a, fp_t b) {
return __unordsf2(a, b);
}
#else
AEABI_RTABI int __aeabi_fcmpun(fp_t a, fp_t b) COMPILER_RT_ALIAS(__unordsf2);
#endif
#endif

141
third_party/compiler_rt/comparetf2.c vendored Normal file
View file

@ -0,0 +1,141 @@
/* clang-format off */
//===-- lib/comparetf2.c - Quad-precision comparisons -------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements the following soft-float comparison routines:
//
// __eqtf2 __getf2 __unordtf2
// __letf2 __gttf2
// __lttf2
// __netf2
//
// The semantics of the routines grouped in each column are identical, so there
// is a single implementation for each, and wrappers to provide the other names.
//
// The main routines behave as follows:
//
// __letf2(a,b) returns -1 if a < b
// 0 if a == b
// 1 if a > b
// 1 if either a or b is NaN
//
// __getf2(a,b) returns -1 if a < b
// 0 if a == b
// 1 if a > b
// -1 if either a or b is NaN
//
// __unordtf2(a,b) returns 0 if both a and b are numbers
// 1 if either a or b is NaN
//
// Note that __letf2( ) and __getf2( ) are identical except in their handling of
// NaN values.
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
enum LE_RESULT {
LE_LESS = -1,
LE_EQUAL = 0,
LE_GREATER = 1,
LE_UNORDERED = 1
};
COMPILER_RT_ABI enum LE_RESULT __letf2(fp_t a, fp_t b) {
const srep_t aInt = toRep(a);
const srep_t bInt = toRep(b);
const rep_t aAbs = aInt & absMask;
const rep_t bAbs = bInt & absMask;
// If either a or b is NaN, they are unordered.
if (aAbs > infRep || bAbs > infRep) return LE_UNORDERED;
// If a and b are both zeros, they are equal.
if ((aAbs | bAbs) == 0) return LE_EQUAL;
// If at least one of a and b is positive, we get the same result comparing
// a and b as signed integers as we would with a floating-point compare.
if ((aInt & bInt) >= 0) {
if (aInt < bInt) return LE_LESS;
else if (aInt == bInt) return LE_EQUAL;
else return LE_GREATER;
}
else {
// Otherwise, both are negative, so we need to flip the sense of the
// comparison to get the correct result. (This assumes a twos- or ones-
// complement integer representation; if integers are represented in a
// sign-magnitude representation, then this flip is incorrect).
if (aInt > bInt) return LE_LESS;
else if (aInt == bInt) return LE_EQUAL;
else return LE_GREATER;
}
}
#if defined(__ELF__)
// Alias for libgcc compatibility
FNALIAS(__cmptf2, __letf2);
#endif
enum GE_RESULT {
GE_LESS = -1,
GE_EQUAL = 0,
GE_GREATER = 1,
GE_UNORDERED = -1 // Note: different from LE_UNORDERED
};
COMPILER_RT_ABI enum GE_RESULT __getf2(fp_t a, fp_t b) {
const srep_t aInt = toRep(a);
const srep_t bInt = toRep(b);
const rep_t aAbs = aInt & absMask;
const rep_t bAbs = bInt & absMask;
if (aAbs > infRep || bAbs > infRep) return GE_UNORDERED;
if ((aAbs | bAbs) == 0) return GE_EQUAL;
if ((aInt & bInt) >= 0) {
if (aInt < bInt) return GE_LESS;
else if (aInt == bInt) return GE_EQUAL;
else return GE_GREATER;
} else {
if (aInt > bInt) return GE_LESS;
else if (aInt == bInt) return GE_EQUAL;
else return GE_GREATER;
}
}
COMPILER_RT_ABI int __unordtf2(fp_t a, fp_t b) {
const rep_t aAbs = toRep(a) & absMask;
const rep_t bAbs = toRep(b) & absMask;
return aAbs > infRep || bAbs > infRep;
}
// The following are alternative names for the preceding routines.
COMPILER_RT_ABI enum LE_RESULT __eqtf2(fp_t a, fp_t b) {
return __letf2(a, b);
}
COMPILER_RT_ABI enum LE_RESULT __lttf2(fp_t a, fp_t b) {
return __letf2(a, b);
}
COMPILER_RT_ABI enum LE_RESULT __netf2(fp_t a, fp_t b) {
return __letf2(a, b);
}
COMPILER_RT_ABI enum GE_RESULT __gttf2(fp_t a, fp_t b) {
return __getf2(a, b);
}
#endif

60
third_party/compiler_rt/compiler_rt.mk vendored Normal file
View file

@ -0,0 +1,60 @@
#-*-mode:makefile-gmake;indent-tabs-mode:t;tab-width:8;coding:utf-8-*-┐
#───vi: set et ft=make ts=8 tw=8 fenc=utf-8 :vi───────────────────────┘
PKGS += THIRD_PARTY_COMPILER_RT
THIRD_PARTY_COMPILER_RT_ARTIFACTS += THIRD_PARTY_COMPILER_RT_A
THIRD_PARTY_COMPILER_RT = $(THIRD_PARTY_COMPILER_RT_A_DEPS) $(THIRD_PARTY_COMPILER_RT_A)
THIRD_PARTY_COMPILER_RT_A = o/$(MODE)/third_party/compiler_rt/compiler_rt.a
THIRD_PARTY_COMPILER_RT_A_FILES := \
$(wildcard third_party/compiler_rt/*) \
$(wildcard third_party/compiler_rt/nexgen32e/*)
THIRD_PARTY_COMPILER_RT_A_HDRS = $(filter %.h,$(THIRD_PARTY_COMPILER_RT_A_FILES))
THIRD_PARTY_COMPILER_RT_A_SRCS_S = $(filter %.S,$(THIRD_PARTY_COMPILER_RT_A_FILES))
THIRD_PARTY_COMPILER_RT_A_SRCS_C = $(filter %.c,$(THIRD_PARTY_COMPILER_RT_A_FILES))
THIRD_PARTY_COMPILER_RT_A_SRCS = \
$(THIRD_PARTY_COMPILER_RT_A_SRCS_S) \
$(THIRD_PARTY_COMPILER_RT_A_SRCS_C)
THIRD_PARTY_COMPILER_RT_A_OBJS = \
$(THIRD_PARTY_COMPILER_RT_A_SRCS:%=o/$(MODE)/%.zip.o) \
$(THIRD_PARTY_COMPILER_RT_A_SRCS_S:%.S=o/$(MODE)/%.o) \
$(THIRD_PARTY_COMPILER_RT_A_SRCS_C:%.c=o/$(MODE)/%.o)
THIRD_PARTY_COMPILER_RT_A_CHECKS = \
$(THIRD_PARTY_COMPILER_RT_A).pkg \
$(THIRD_PARTY_COMPILER_RT_A_HDRS:%=o/$(MODE)/%.ok)
THIRD_PARTY_COMPILER_RT_A_DIRECTDEPS = \
LIBC_MATH \
LIBC_STUBS
THIRD_PARTY_COMPILER_RT_A_DEPS := \
$(call uniq,$(foreach x,$(THIRD_PARTY_COMPILER_RT_A_DIRECTDEPS),$($(x))))
$(THIRD_PARTY_COMPILER_RT_A): \
third_party/compiler_rt/ \
$(THIRD_PARTY_COMPILER_RT_A).pkg \
$(THIRD_PARTY_COMPILER_RT_A_OBJS)
$(THIRD_PARTY_COMPILER_RT_A).pkg: \
$(THIRD_PARTY_COMPILER_RT_A_OBJS) \
$(foreach x,$(THIRD_PARTY_COMPILER_RT_A_DIRECTDEPS),$($(x)_A).pkg)
$(THIRD_PARTY_COMPILER_RT_A_OBJS): \
DEFAULT_COPTS += \
-DCRT_HAS_128BIT
o/$(MODE)/third_party/compiler_rt/multc3.o \
o/$(MODE)/third_party/compiler_rt/divtc3.o: \
DEFAULT_COPTS += \
-w
THIRD_PARTY_COMPILER_RT_LIBS = $(foreach x,$(THIRD_PARTY_COMPILER_RT_ARTIFACTS),$($(x)))
THIRD_PARTY_COMPILER_RT_SRCS = $(foreach x,$(THIRD_PARTY_COMPILER_RT_ARTIFACTS),$($(x)_SRCS))
THIRD_PARTY_COMPILER_RT_CHECKS = $(foreach x,$(THIRD_PARTY_COMPILER_RT_ARTIFACTS),$($(x)_CHECKS))
THIRD_PARTY_COMPILER_RT_OBJS = $(foreach x,$(THIRD_PARTY_COMPILER_RT_ARTIFACTS),$($(x)_OBJS))
.PHONY: o/$(MODE)/third_party/compiler_rt
o/$(MODE)/third_party/compiler_rt: $(THIRD_PARTY_COMPILER_RT_CHECKS)

50
third_party/compiler_rt/comprt.S vendored Normal file
View file

@ -0,0 +1,50 @@
#include "libc/macros.h"
/ Nop ref this to force the license strings below to be pulled into linkage.
.section .yoink
huge_compiler_rt_license:
int3
.endobj huge_compiler_rt_license,globl,hidden
.previous
.ident "\n
compiler_rt (Licensed MIT)
Copyright (c) 2009-2015 by the contributors listed in:
github.com/llvm-mirror/compiler-rt/blob/master/CREDITS.TXT"
.ident "\n
compiler_rt (Licensed \"University of Illinois/NCSA Open Source License\")
Copyright (c) 2009-2018 by the contributors listed in:
github.com/llvm-mirror/compiler-rt/blob/master/CREDITS.TXT
All rights reserved.
Developed by:
LLVM Team
University of Illinois at Urbana-Champaign
http://llvm.org
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the \"Software\"), to deal with
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimers.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimers in the
documentation and/or other materials provided with the distribution.
* Neither the names of the LLVM Team, University of Illinois at
Urbana-Champaign, nor the names of its contributors may be used to
endorse or promote products derived from this Software without specific
prior written permission.
THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE
SOFTWARE."

43
third_party/compiler_rt/ctzdi2.c vendored Normal file
View file

@ -0,0 +1,43 @@
/* clang-format off */
/* ===-- ctzdi2.c - Implement __ctzdi2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ctzdi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: the number of trailing 0-bits */
#if !defined(__clang__) && \
((defined(__sparc__) && defined(__arch64__)) || \
defined(__mips64) || \
(defined(__riscv) && __SIZEOF_POINTER__ >= 8))
/* On 64-bit architectures with neither a native clz instruction nor a native
* ctz instruction, gcc resolves __builtin_ctz to __ctzdi2 rather than
* __ctzsi2, leading to infinite recursion. */
#define __builtin_ctz(a) __ctzsi2(a)
extern si_int __ctzsi2(si_int);
#endif
/* Precondition: a != 0 */
COMPILER_RT_ABI si_int
__ctzdi2(di_int a)
{
dwords x;
x.all = a;
const si_int f = -(x.s.low == 0);
return __builtin_ctz((x.s.high & f) | (x.s.low & ~f)) +
(f & ((si_int)(sizeof(si_int) * CHAR_BIT)));
}

60
third_party/compiler_rt/ctzsi2.c vendored Normal file
View file

@ -0,0 +1,60 @@
/* clang-format off */
/* ===-- ctzsi2.c - Implement __ctzsi2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ctzsi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: the number of trailing 0-bits */
/* Precondition: a != 0 */
COMPILER_RT_ABI si_int
__ctzsi2(si_int a)
{
su_int x = (su_int)a;
si_int t = ((x & 0x0000FFFF) == 0) << 4; /* if (x has no small bits) t = 16 else 0 */
x >>= t; /* x = [0 - 0xFFFF] + higher garbage bits */
su_int r = t; /* r = [0, 16] */
/* return r + ctz(x) */
t = ((x & 0x00FF) == 0) << 3;
x >>= t; /* x = [0 - 0xFF] + higher garbage bits */
r += t; /* r = [0, 8, 16, 24] */
/* return r + ctz(x) */
t = ((x & 0x0F) == 0) << 2;
x >>= t; /* x = [0 - 0xF] + higher garbage bits */
r += t; /* r = [0, 4, 8, 12, 16, 20, 24, 28] */
/* return r + ctz(x) */
t = ((x & 0x3) == 0) << 1;
x >>= t;
x &= 3; /* x = [0 - 3] */
r += t; /* r = [0 - 30] and is even */
/* return r + ctz(x) */
/* The branch-less return statement below is equivalent
* to the following switch statement:
* switch (x)
* {
* case 0:
* return r + 2;
* case 2:
* return r + 1;
* case 1:
* case 3:
* return r;
* }
*/
return r + ((2 - (x >> 1)) & -((x & 1) == 0));
}
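
Same idea as __clzsi2, mirrored to the low end: each step tests whether the bottom half of the remaining bits is zero and shifts it away if so. A compact standalone check (assumes __builtin_ctz is available; nothing from this tree is linked) over a small range:

/* Standalone sketch: branchless ctz vs. the compiler builtin. */
#include <assert.h>
#include <stdint.h>

static int branchless_ctz32(uint32_t x) {
  int t, r;
  t = ((x & 0x0000FFFFu) == 0) << 4; x >>= t; r = t;
  t = ((x & 0x00FFu) == 0) << 3;     x >>= t; r += t;
  t = ((x & 0x0Fu) == 0) << 2;       x >>= t; r += t;
  t = ((x & 0x3u) == 0) << 1;        x >>= t; x &= 3; r += t;
  return r + (int)((2 - (x >> 1)) & -((x & 1) == 0));
}

int main(void) {
  for (uint32_t x = 1; x < (1u << 20); x++)
    assert(branchless_ctz32(x) == __builtin_ctz(x));
  return 0;
}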

36
third_party/compiler_rt/ctzti2.c vendored Normal file
View file

@ -0,0 +1,36 @@
/* clang-format off */
/* ===-- ctzti2.c - Implement __ctzti2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ctzti2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: the number of trailing 0-bits */
/* Precondition: a != 0 */
COMPILER_RT_ABI si_int
__ctzti2(ti_int a)
{
twords x;
x.all = a;
const di_int f = -(x.s.low == 0);
return __builtin_ctzll((x.s.high & f) | (x.s.low & ~f)) +
((si_int)f & ((si_int)(sizeof(di_int) * CHAR_BIT)));
}
#endif /* CRT_HAS_128BIT */

58
third_party/compiler_rt/divdc3.c vendored Normal file
View file

@ -0,0 +1,58 @@
/* clang-format off */
/* ===-- divdc3.c - Implement __divdc3 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divdc3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#include "third_party/compiler_rt/int_lib.h"
#include "third_party/compiler_rt/int_math.h"
/* Returns: the quotient of (a + ib) / (c + id) */
COMPILER_RT_ABI Dcomplex __divdc3(double __a, double __b, double __c,
double __d) {
int __ilogbw = 0;
double __logbw = __compiler_rt_logb(crt_fmax(crt_fabs(__c), crt_fabs(__d)));
if (crt_isfinite(__logbw)) {
__ilogbw = (int)__logbw;
__c = crt_scalbn(__c, -__ilogbw);
__d = crt_scalbn(__d, -__ilogbw);
}
double __denom = __c * __c + __d * __d;
Dcomplex z;
COMPLEX_REAL(z) = crt_scalbn((__a * __c + __b * __d) / __denom, -__ilogbw);
COMPLEX_IMAGINARY(z) =
crt_scalbn((__b * __c - __a * __d) / __denom, -__ilogbw);
if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z))) {
if ((__denom == 0.0) && (!crt_isnan(__a) || !crt_isnan(__b))) {
COMPLEX_REAL(z) = crt_copysign(CRT_INFINITY, __c) * __a;
COMPLEX_IMAGINARY(z) = crt_copysign(CRT_INFINITY, __c) * __b;
} else if ((crt_isinf(__a) || crt_isinf(__b)) && crt_isfinite(__c) &&
crt_isfinite(__d)) {
__a = crt_copysign(crt_isinf(__a) ? 1.0 : 0.0, __a);
__b = crt_copysign(crt_isinf(__b) ? 1.0 : 0.0, __b);
COMPLEX_REAL(z) = CRT_INFINITY * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = CRT_INFINITY * (__b * __c - __a * __d);
} else if (crt_isinf(__logbw) && __logbw > 0.0 && crt_isfinite(__a) &&
crt_isfinite(__b)) {
__c = crt_copysign(crt_isinf(__c) ? 1.0 : 0.0, __c);
__d = crt_copysign(crt_isinf(__d) ? 1.0 : 0.0, __d);
COMPLEX_REAL(z) = 0.0 * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = 0.0 * (__b * __c - __a * __d);
}
}
return z;
}
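
The logb/scalbn scaling above keeps the denominator c*c + d*d representable: without it, huge (or tiny) divisors overflow (or underflow) the intermediate and the quotient is lost even though the true result is ordinary. The standalone sketch below (standard C99 <complex.h>/<math.h> only, link with -lm; ilogb stands in for the logb-and-cast the real routine uses) contrasts the textbook formula with the scaled one.

/* Standalone sketch: naive complex division loses the result when c*c + d*d
 * overflows; rescaling c and d by 2^-ilogb keeps it. */
#include <complex.h>
#include <math.h>
#include <stdio.h>

static double complex naive_div(double a, double b, double c, double d) {
  double denom = c * c + d * d;                     /* overflows to +inf here */
  return (a * c + b * d) / denom + (b * c - a * d) / denom * I;
}

static double complex scaled_div(double a, double b, double c, double d) {
  int e = ilogb(fmax(fabs(c), fabs(d)));
  c = scalbn(c, -e);
  d = scalbn(d, -e);
  double denom = c * c + d * d;                     /* now roughly in [1, 8) */
  return scalbn((a * c + b * d) / denom, -e) +
         scalbn((b * c - a * d) / denom, -e) * I;
}

int main(void) {
  /* (1 + i) / (1e200 + 1e200i) should be about 1e-200. */
  double complex n = naive_div(1, 1, 1e200, 1e200);
  double complex s = scaled_div(1, 1, 1e200, 1e200);
  printf("naive:  %g %+gi\n", creal(n), cimag(n));  /* collapses to 0 */
  printf("scaled: %g %+gi\n", creal(s), cimag(s));  /* ~1e-200 */
  return 0;
}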

197
third_party/compiler_rt/divdf3.c vendored Normal file
View file

@ -0,0 +1,197 @@
/* clang-format off */
//===-- lib/divdf3.c - Double-precision division ------------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements double-precision soft-float division
// with the IEEE-754 default rounding (to nearest, ties to even).
//
// For simplicity, this implementation currently flushes denormals to zero.
// It should be a fairly straightforward exercise to implement gradual
// underflow with correct rounding.
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "libc/literal.h"
#include "third_party/compiler_rt/fp_lib.inc"
COMPILER_RT_ABI fp_t
__divdf3(fp_t a, fp_t b) {
const unsigned int aExponent = toRep(a) >> significandBits & maxExponent;
const unsigned int bExponent = toRep(b) >> significandBits & maxExponent;
const rep_t quotientSign = (toRep(a) ^ toRep(b)) & signBit;
rep_t aSignificand = toRep(a) & significandMask;
rep_t bSignificand = toRep(b) & significandMask;
int scale = 0;
// Detect if a or b is zero, denormal, infinity, or NaN.
if (aExponent-1U >= maxExponent-1U || bExponent-1U >= maxExponent-1U) {
const rep_t aAbs = toRep(a) & absMask;
const rep_t bAbs = toRep(b) & absMask;
// NaN / anything = qNaN
if (aAbs > infRep) return fromRep(toRep(a) | quietBit);
// anything / NaN = qNaN
if (bAbs > infRep) return fromRep(toRep(b) | quietBit);
if (aAbs == infRep) {
// infinity / infinity = NaN
if (bAbs == infRep) return fromRep(qnanRep);
// infinity / anything else = +/- infinity
else return fromRep(aAbs | quotientSign);
}
// anything else / infinity = +/- 0
if (bAbs == infRep) return fromRep(quotientSign);
if (!aAbs) {
// zero / zero = NaN
if (!bAbs) return fromRep(qnanRep);
// zero / anything else = +/- zero
else return fromRep(quotientSign);
}
// anything else / zero = +/- infinity
if (!bAbs) return fromRep(infRep | quotientSign);
// one or both of a or b is denormal, the other (if applicable) is a
// normal number. Renormalize one or both of a and b, and set scale to
// include the necessary exponent adjustment.
if (aAbs < implicitBit) scale += normalize(&aSignificand);
if (bAbs < implicitBit) scale -= normalize(&bSignificand);
}
// Or in the implicit significand bit. (If we fell through from the
// denormal path it was already set by normalize( ), but setting it twice
// won't hurt anything.)
aSignificand |= implicitBit;
bSignificand |= implicitBit;
int quotientExponent = aExponent - bExponent + scale;
// Align the significand of b as a Q31 fixed-point number in the range
// [1, 2.0) and get a Q32 approximate reciprocal using a small minimax
// polynomial approximation: reciprocal = 3/4 + 1/sqrt(2) - b/2. This
// is accurate to about 3.5 binary digits.
const uint32_t q31b = bSignificand >> 21;
uint32_t recip32 = UINT32_C(0x7504f333) - q31b;
// Now refine the reciprocal estimate using a Newton-Raphson iteration:
//
// x1 = x0 * (2 - x0 * b)
//
// This doubles the number of correct binary digits in the approximation
// with each iteration, so after three iterations, we have about 28 binary
// digits of accuracy.
uint32_t correction32;
correction32 = -((uint64_t)recip32 * q31b >> 32);
recip32 = (uint64_t)recip32 * correction32 >> 31;
correction32 = -((uint64_t)recip32 * q31b >> 32);
recip32 = (uint64_t)recip32 * correction32 >> 31;
correction32 = -((uint64_t)recip32 * q31b >> 32);
recip32 = (uint64_t)recip32 * correction32 >> 31;
// recip32 might have overflowed to exactly zero in the preceding
// computation if the high word of b is exactly 1.0. This would sabotage
// the full-width final stage of the computation that follows, so we adjust
// recip32 downward by one bit.
recip32--;
// We need to perform one more iteration to get us to 56 binary digits;
// The last iteration needs to happen with extra precision.
const uint32_t q63blo = bSignificand << 11;
uint64_t correction, reciprocal;
correction = -((uint64_t)recip32*q31b + ((uint64_t)recip32*q63blo >> 32));
uint32_t cHi = correction >> 32;
uint32_t cLo = correction;
reciprocal = (uint64_t)recip32*cHi + ((uint64_t)recip32*cLo >> 32);
// We already adjusted the 32-bit estimate, now we need to adjust the final
// 64-bit reciprocal estimate downward to ensure that it is strictly smaller
// than the infinitely precise exact reciprocal. Because the computation
// of the Newton-Raphson step is truncating at every step, this adjustment
// is small; most of the work is already done.
reciprocal -= 2;
// The numerical reciprocal is accurate to within 2^-56, lies in the
// interval [0.5, 1.0), and is strictly smaller than the true reciprocal
// of b. Multiplying a by this reciprocal thus gives a numerical q = a/b
// in Q53 with the following properties:
//
// 1. q < a/b
// 2. q is in the interval [0.5, 2.0)
// 3. the error in q is bounded away from 2^-53 (actually, we have a
// couple of bits to spare, but this is all we need).
// We need a 64 x 64 multiply high to compute q, which isn't a basic
// operation in C, so we need to be a little bit fussy.
rep_t quotient, quotientLo;
wideMultiply(aSignificand << 2, reciprocal, &quotient, &quotientLo);
// Two cases: quotient is in [0.5, 1.0) or quotient is in [1.0, 2.0).
// In either case, we are going to compute a residual of the form
//
// r = a - q*b
//
// We know from the construction of q that r satisfies:
//
// 0 <= r < ulp(q)*b
//
// if r is greater than 1/2 ulp(q)*b, then q rounds up. Otherwise, we
// already have the correct result. The exact halfway case cannot occur.
// We also take this time to right shift quotient if it falls in the [1,2)
// range and adjust the exponent accordingly.
rep_t residual;
if (quotient < (implicitBit << 1)) {
residual = (aSignificand << 53) - quotient * bSignificand;
quotientExponent--;
} else {
quotient >>= 1;
residual = (aSignificand << 52) - quotient * bSignificand;
}
const int writtenExponent = quotientExponent + exponentBias;
if (writtenExponent >= maxExponent) {
// If we have overflowed the exponent, return infinity.
return fromRep(infRep | quotientSign);
}
else if (writtenExponent < 1) {
// Flush denormals to zero. In the future, it would be nice to add
// code to round them correctly.
return fromRep(quotientSign);
}
else {
const bool round = (residual << 1) > bSignificand;
// Clear the implicit bit
rep_t absResult = quotient & significandMask;
// Insert the exponent
absResult |= (rep_t)writtenExponent << significandBits;
// Round
absResult += round;
// Insert the sign and return
const double result = fromRep(absResult | quotientSign);
return result;
}
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI fp_t __aeabi_ddiv(fp_t a, fp_t b) {
return __divdf3(a, b);
}
#else
AEABI_RTABI fp_t __aeabi_ddiv(fp_t a, fp_t b) COMPILER_RT_ALIAS(__divdf3);
#endif
#endif
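
The reciprocal refinement above is standard Newton-Raphson for 1/b: x1 = x0*(2 - x0*b) roughly doubles the number of correct bits each step, which is why three narrow iterations plus one widened iteration reach the 56-bit accuracy the quotient needs. The floating-point sketch below (illustration only; the real code runs the same recurrence in Q31/Q32 fixed point) shows the convergence from the same minimax seed. Link with -lm.

/* Standalone sketch: watch the Newton-Raphson reciprocal error shrink. */
#include <math.h>
#include <stdio.h>

int main(void) {
  double b = 1.37;                              /* significand-like value in [1, 2) */
  double x = 0.75 + 1.0 / sqrt(2.0) - b / 2.0;  /* seed: 3/4 + 1/sqrt(2) - b/2 */
  for (int i = 0; i < 4; i++) {
    x = x * (2.0 - x * b);
    printf("iteration %d: |x - 1/b| = %.3e\n", i + 1, fabs(x - 1.0 / b));
  }
  return 0;
}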

32
third_party/compiler_rt/divdi3.c vendored Normal file
View file

@ -0,0 +1,32 @@
/* clang-format off */
/* ===-- divdi3.c - Implement __divdi3 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divdi3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: a / b */
COMPILER_RT_ABI di_int
__divdi3(di_int a, di_int b)
{
const int bits_in_dword_m1 = (int)(sizeof(di_int) * CHAR_BIT) - 1;
di_int s_a = a >> bits_in_dword_m1; /* s_a = a < 0 ? -1 : 0 */
di_int s_b = b >> bits_in_dword_m1; /* s_b = b < 0 ? -1 : 0 */
a = (a ^ s_a) - s_a; /* negate if s_a == -1 */
b = (b ^ s_b) - s_b; /* negate if s_b == -1 */
s_a ^= s_b; /*sign of quotient */
return (__udivmoddi4(a, b, (du_int*)0) ^ s_a) - s_a; /* negate if s_a == -1 */
}
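
The `(a ^ s_a) - s_a` idiom above is a branchless conditional negate: when s == -1 it computes ~a + 1 == -a, and when s == 0 it leaves a untouched, so the quotient's sign can be applied without a branch. A tiny standalone check (int64_t is guaranteed two's complement):

/* Standalone sketch: conditional negate via xor-and-subtract. */
#include <assert.h>
#include <stdint.h>

static int64_t cond_negate(int64_t a, int64_t s) { return (a ^ s) - s; }

int main(void) {
  const int64_t vals[] = {0, 1, -1, 42, -1234567890123LL};
  for (int i = 0; i < 5; i++) {
    assert(cond_negate(vals[i], 0) == vals[i]);    /* s == 0: unchanged */
    assert(cond_negate(vals[i], -1) == -vals[i]);  /* s == -1: negated  */
  }
  return 0;
}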

28
third_party/compiler_rt/divmoddi4.c vendored Normal file
View file

@ -0,0 +1,28 @@
/* clang-format off */
/*===-- divmoddi4.c - Implement __divmoddi4 --------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divmoddi4 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: a / b, *rem = a % b */
COMPILER_RT_ABI di_int
__divmoddi4(di_int a, di_int b, di_int* rem)
{
di_int d = __divdi3(a,b);
*rem = a - (d*b);
return d;
}
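
Since C division truncates toward zero, recovering the remainder as a - (a/b)*b agrees with the '%' operator for every sign combination, which is all this helper relies on; a minimal standalone check:

/* Standalone sketch: remainder reconstructed from the truncated quotient. */
#include <assert.h>
#include <stdint.h>

int main(void) {
  const int64_t as[] = {7, -7, 7, -7, 0};
  const int64_t bs[] = {3, 3, -3, -3, 5};
  for (int i = 0; i < 5; i++) {
    int64_t d = as[i] / bs[i];
    assert(as[i] - d * bs[i] == as[i] % bs[i]);
  }
  return 0;
}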

30
third_party/compiler_rt/divmodsi4.c vendored Normal file
View file

@ -0,0 +1,30 @@
/* clang-format off */
/*===-- divmodsi4.c - Implement __divmodsi4 --------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divmodsi4 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: a / b, *rem = a % b */
COMPILER_RT_ABI si_int
__divmodsi4(si_int a, si_int b, si_int* rem)
{
si_int d = __divsi3(a,b);
*rem = a - (d*b);
return d;
}

66
third_party/compiler_rt/divsc3.c vendored Normal file
View file

@ -0,0 +1,66 @@
/* clang-format off */
/*===-- divsc3.c - Implement __divsc3 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divsc3 for the compiler_rt library.
*
*===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#include "third_party/compiler_rt/int_lib.h"
#include "third_party/compiler_rt/int_math.h"
/* Returns: the quotient of (a + ib) / (c + id) */
COMPILER_RT_ABI Fcomplex
__divsc3(float __a, float __b, float __c, float __d)
{
int __ilogbw = 0;
float __logbw =
__compiler_rt_logbf(crt_fmaxf(crt_fabsf(__c), crt_fabsf(__d)));
if (crt_isfinite(__logbw))
{
__ilogbw = (int)__logbw;
__c = crt_scalbnf(__c, -__ilogbw);
__d = crt_scalbnf(__d, -__ilogbw);
}
float __denom = __c * __c + __d * __d;
Fcomplex z;
COMPLEX_REAL(z) = crt_scalbnf((__a * __c + __b * __d) / __denom, -__ilogbw);
COMPLEX_IMAGINARY(z) = crt_scalbnf((__b * __c - __a * __d) / __denom, -__ilogbw);
if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z)))
{
if ((__denom == 0) && (!crt_isnan(__a) || !crt_isnan(__b)))
{
COMPLEX_REAL(z) = crt_copysignf(CRT_INFINITY, __c) * __a;
COMPLEX_IMAGINARY(z) = crt_copysignf(CRT_INFINITY, __c) * __b;
}
else if ((crt_isinf(__a) || crt_isinf(__b)) &&
crt_isfinite(__c) && crt_isfinite(__d))
{
__a = crt_copysignf(crt_isinf(__a) ? 1 : 0, __a);
__b = crt_copysignf(crt_isinf(__b) ? 1 : 0, __b);
COMPLEX_REAL(z) = CRT_INFINITY * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = CRT_INFINITY * (__b * __c - __a * __d);
}
else if (crt_isinf(__logbw) && __logbw > 0 &&
crt_isfinite(__a) && crt_isfinite(__b))
{
__c = crt_copysignf(crt_isinf(__c) ? 1 : 0, __c);
__d = crt_copysignf(crt_isinf(__d) ? 1 : 0, __d);
COMPLEX_REAL(z) = 0 * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = 0 * (__b * __c - __a * __d);
}
}
return z;
}

181
third_party/compiler_rt/divsf3.c vendored Normal file
View file

@ -0,0 +1,181 @@
/* clang-format off */
//===-- lib/divsf3.c - Single-precision division ------------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements single-precision soft-float division
// with the IEEE-754 default rounding (to nearest, ties to even).
//
// For simplicity, this implementation currently flushes denormals to zero.
// It should be a fairly straightforward exercise to implement gradual
// underflow with correct rounding.
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "libc/literal.h"
#include "third_party/compiler_rt/fp_lib.inc"
COMPILER_RT_ABI fp_t
__divsf3(fp_t a, fp_t b) {
const unsigned int aExponent = toRep(a) >> significandBits & maxExponent;
const unsigned int bExponent = toRep(b) >> significandBits & maxExponent;
const rep_t quotientSign = (toRep(a) ^ toRep(b)) & signBit;
rep_t aSignificand = toRep(a) & significandMask;
rep_t bSignificand = toRep(b) & significandMask;
int scale = 0;
// Detect if a or b is zero, denormal, infinity, or NaN.
if (aExponent-1U >= maxExponent-1U || bExponent-1U >= maxExponent-1U) {
const rep_t aAbs = toRep(a) & absMask;
const rep_t bAbs = toRep(b) & absMask;
// NaN / anything = qNaN
if (aAbs > infRep) return fromRep(toRep(a) | quietBit);
// anything / NaN = qNaN
if (bAbs > infRep) return fromRep(toRep(b) | quietBit);
if (aAbs == infRep) {
// infinity / infinity = NaN
if (bAbs == infRep) return fromRep(qnanRep);
// infinity / anything else = +/- infinity
else return fromRep(aAbs | quotientSign);
}
// anything else / infinity = +/- 0
if (bAbs == infRep) return fromRep(quotientSign);
if (!aAbs) {
// zero / zero = NaN
if (!bAbs) return fromRep(qnanRep);
// zero / anything else = +/- zero
else return fromRep(quotientSign);
}
// anything else / zero = +/- infinity
if (!bAbs) return fromRep(infRep | quotientSign);
// one or both of a or b is denormal, the other (if applicable) is a
// normal number. Renormalize one or both of a and b, and set scale to
// include the necessary exponent adjustment.
if (aAbs < implicitBit) scale += normalize(&aSignificand);
if (bAbs < implicitBit) scale -= normalize(&bSignificand);
}
// Or in the implicit significand bit. (If we fell through from the
// denormal path it was already set by normalize( ), but setting it twice
// won't hurt anything.)
aSignificand |= implicitBit;
bSignificand |= implicitBit;
int quotientExponent = aExponent - bExponent + scale;
// Align the significand of b as a Q31 fixed-point number in the range
// [1, 2.0) and get a Q32 approximate reciprocal using a small minimax
// polynomial approximation: reciprocal = 3/4 + 1/sqrt(2) - b/2. This
// is accurate to about 3.5 binary digits.
uint32_t q31b = bSignificand << 8;
uint32_t reciprocal = UINT32_C(0x7504f333) - q31b;
// Now refine the reciprocal estimate using a Newton-Raphson iteration:
//
// x1 = x0 * (2 - x0 * b)
//
// This doubles the number of correct binary digits in the approximation
// with each iteration, so after three iterations, we have about 28 binary
// digits of accuracy.
uint32_t correction;
correction = -((uint64_t)reciprocal * q31b >> 32);
reciprocal = (uint64_t)reciprocal * correction >> 31;
correction = -((uint64_t)reciprocal * q31b >> 32);
reciprocal = (uint64_t)reciprocal * correction >> 31;
correction = -((uint64_t)reciprocal * q31b >> 32);
reciprocal = (uint64_t)reciprocal * correction >> 31;
// Exhaustive testing shows that the error in reciprocal after three steps
// is in the interval [-0x1.f58108p-31, 0x1.d0e48cp-29], in line with our
// expectations. We bump the reciprocal by a tiny value to force the error
// to be strictly positive (in the range [0x1.4fdfp-37,0x1.287246p-29], to
// be specific). This also causes 1/1 to give a sensible approximation
// instead of zero (due to overflow).
reciprocal -= 2;
// The numerical reciprocal is accurate to within 2^-28, lies in the
// interval [0x1.000000eep-1, 0x1.fffffffcp-1], and is strictly smaller
// than the true reciprocal of b. Multiplying a by this reciprocal thus
// gives a numerical q = a/b in Q24 with the following properties:
//
// 1. q < a/b
// 2. q is in the interval [0x1.000000eep-1, 0x1.fffffffcp0)
//     3. the error in q is at most 2^-24 + 2^-27 -- the 2^-24 term comes
//        from the fact that we truncate the product, and the 2^-27 term
//        is the error in the reciprocal of b scaled by the maximum
//        possible value of a.  As a consequence of this error bound,
//        either q or nextafter(q) is the correctly rounded result.
rep_t quotient = (uint64_t)reciprocal*(aSignificand << 1) >> 32;
// Two cases: quotient is in [0.5, 1.0) or quotient is in [1.0, 2.0).
// In either case, we are going to compute a residual of the form
//
// r = a - q*b
//
// We know from the construction of q that r satisfies:
//
// 0 <= r < ulp(q)*b
//
// if r is greater than 1/2 ulp(q)*b, then q rounds up. Otherwise, we
// already have the correct result. The exact halfway case cannot occur.
// We also take this time to right shift quotient if it falls in the [1,2)
// range and adjust the exponent accordingly.
rep_t residual;
if (quotient < (implicitBit << 1)) {
residual = (aSignificand << 24) - quotient * bSignificand;
quotientExponent--;
} else {
quotient >>= 1;
residual = (aSignificand << 23) - quotient * bSignificand;
}
const int writtenExponent = quotientExponent + exponentBias;
if (writtenExponent >= maxExponent) {
// If we have overflowed the exponent, return infinity.
return fromRep(infRep | quotientSign);
}
else if (writtenExponent < 1) {
// Flush denormals to zero. In the future, it would be nice to add
// code to round them correctly.
return fromRep(quotientSign);
}
else {
const bool round = (residual << 1) > bSignificand;
// Clear the implicit bit
rep_t absResult = quotient & significandMask;
// Insert the exponent
absResult |= (rep_t)writtenExponent << significandBits;
// Round
absResult += round;
// Insert the sign and return
return fromRep(absResult | quotientSign);
}
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI fp_t __aeabi_fdiv(fp_t a, fp_t b) {
return __divsf3(a, b);
}
#else
AEABI_RTABI fp_t __aeabi_fdiv(fp_t a, fp_t b) COMPILER_RT_ALIAS(__divsf3);
#endif
#endif

42
third_party/compiler_rt/divsi3.c vendored Normal file
View file

@ -0,0 +1,42 @@
/* clang-format off */
/* ===-- divsi3.c - Implement __divsi3 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divsi3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: a / b */
COMPILER_RT_ABI si_int
__divsi3(si_int a, si_int b)
{
const int bits_in_word_m1 = (int)(sizeof(si_int) * CHAR_BIT) - 1;
si_int s_a = a >> bits_in_word_m1; /* s_a = a < 0 ? -1 : 0 */
si_int s_b = b >> bits_in_word_m1; /* s_b = b < 0 ? -1 : 0 */
a = (a ^ s_a) - s_a; /* negate if s_a == -1 */
b = (b ^ s_b) - s_b; /* negate if s_b == -1 */
s_a ^= s_b; /* sign of quotient */
/*
* On CPUs without unsigned hardware division support,
* this calls __udivsi3 (notice the cast to su_int).
* On CPUs with unsigned hardware division support,
* this uses the unsigned division instruction.
*/
return ((su_int)a/(su_int)b ^ s_a) - s_a; /* negate if s_a == -1 */
}
#if defined(__ARM_EABI__)
AEABI_RTABI si_int __aeabi_idiv(si_int a, si_int b) COMPILER_RT_ALIAS(__divsi3);
#endif
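
The branch-free sign handling above, (a ^ s) - s, is worth spelling out. A minimal standalone demonstration (not part of compiler_rt), assuming the usual arithmetic right shift of negative values:

#include <stdio.h>

int main(void) {
  int values[] = { 7, -7, 42, -42, 0 };
  for (int i = 0; i < 5; i++) {
    int a = values[i];
    int s = a >> 31;           /* -1 if a < 0, else 0 (arithmetic shift assumed) */
    int abs_a = (a ^ s) - s;   /* two's complement negation exactly when s == -1 */
    printf("a = %4d   s = %2d   (a ^ s) - s = %4d\n", a, s, abs_a);
  }
  return 0;
}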

66
third_party/compiler_rt/divtc3.c vendored Normal file
View file

@@ -0,0 +1,66 @@
/* clang-format off */
/*===-- divtc3.c - Implement __divtc3 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divtc3 for the compiler_rt library.
*
*===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#include "third_party/compiler_rt/int_lib.h"
#include "third_party/compiler_rt/int_math.h"
/* Returns: the quotient of (a + ib) / (c + id) */
COMPILER_RT_ABI Lcomplex
__divtc3(long double __a, long double __b, long double __c, long double __d)
{
int __ilogbw = 0;
long double __logbw =
__compiler_rt_logbl(crt_fmaxl(crt_fabsl(__c), crt_fabsl(__d)));
if (crt_isfinite(__logbw))
{
__ilogbw = (int)__logbw;
__c = crt_scalbnl(__c, -__ilogbw);
__d = crt_scalbnl(__d, -__ilogbw);
}
long double __denom = __c * __c + __d * __d;
Lcomplex z;
COMPLEX_REAL(z) = crt_scalbnl((__a * __c + __b * __d) / __denom, -__ilogbw);
COMPLEX_IMAGINARY(z) = crt_scalbnl((__b * __c - __a * __d) / __denom, -__ilogbw);
if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z)))
{
if ((__denom == 0.0) && (!crt_isnan(__a) || !crt_isnan(__b)))
{
COMPLEX_REAL(z) = crt_copysignl(CRT_INFINITY, __c) * __a;
COMPLEX_IMAGINARY(z) = crt_copysignl(CRT_INFINITY, __c) * __b;
}
else if ((crt_isinf(__a) || crt_isinf(__b)) &&
crt_isfinite(__c) && crt_isfinite(__d))
{
__a = crt_copysignl(crt_isinf(__a) ? 1.0 : 0.0, __a);
__b = crt_copysignl(crt_isinf(__b) ? 1.0 : 0.0, __b);
COMPLEX_REAL(z) = CRT_INFINITY * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = CRT_INFINITY * (__b * __c - __a * __d);
}
else if (crt_isinf(__logbw) && __logbw > 0.0 &&
crt_isfinite(__a) && crt_isfinite(__b))
{
__c = crt_copysignl(crt_isinf(__c) ? 1.0 : 0.0, __c);
__d = crt_copysignl(crt_isinf(__d) ? 1.0 : 0.0, __d);
COMPLEX_REAL(z) = 0.0 * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = 0.0 * (__b * __c - __a * __d);
}
}
return z;
}
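
The scaling by 2^-ilogbw above is what keeps c*c + d*d from overflowing or underflowing. A standalone sketch of the same idea in double precision (not part of compiler_rt; plain doubles stand in for the long double complex arithmetic, link with -lm):

#include <math.h>
#include <stdio.h>

int main(void) {
  double a = 1.0, b = 2.0;        /* numerator   a + i*b */
  double c = 3e200, d = 4e200;    /* denominator c + i*d; c*c + d*d would overflow */
  int ilogbw = (int)logb(fmax(fabs(c), fabs(d)));
  double cs = scalbn(c, -ilogbw), ds = scalbn(d, -ilogbw);
  double denom = cs * cs + ds * ds;   /* safe: cs and ds are now O(1) */
  double re = scalbn((a * cs + b * ds) / denom, -ilogbw);
  double im = scalbn((b * cs - a * ds) / denom, -ilogbw);
  printf("quotient ~= %g + %g*i\n", re, im);   /* about 4.4e-201 + 8e-202*i */
  return 0;
}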

207
third_party/compiler_rt/divtf3.c vendored Normal file
View file

@@ -0,0 +1,207 @@
/* clang-format off */
//===-- lib/divtf3.c - Quad-precision division --------------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements quad-precision soft-float division
// with the IEEE-754 default rounding (to nearest, ties to even).
//
// For simplicity, this implementation currently flushes denormals to zero.
// It should be a fairly straightforward exercise to implement gradual
// underflow with correct rounding.
//
//===----------------------------------------------------------------------===//
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "libc/literal.h"
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
COMPILER_RT_ABI fp_t __divtf3(fp_t a, fp_t b) {
const unsigned int aExponent = toRep(a) >> significandBits & maxExponent;
const unsigned int bExponent = toRep(b) >> significandBits & maxExponent;
const rep_t quotientSign = (toRep(a) ^ toRep(b)) & signBit;
rep_t aSignificand = toRep(a) & significandMask;
rep_t bSignificand = toRep(b) & significandMask;
int scale = 0;
// Detect if a or b is zero, denormal, infinity, or NaN.
if (aExponent-1U >= maxExponent-1U || bExponent-1U >= maxExponent-1U) {
const rep_t aAbs = toRep(a) & absMask;
const rep_t bAbs = toRep(b) & absMask;
// NaN / anything = qNaN
if (aAbs > infRep) return fromRep(toRep(a) | quietBit);
// anything / NaN = qNaN
if (bAbs > infRep) return fromRep(toRep(b) | quietBit);
if (aAbs == infRep) {
// infinity / infinity = NaN
if (bAbs == infRep) return fromRep(qnanRep);
// infinity / anything else = +/- infinity
else return fromRep(aAbs | quotientSign);
}
// anything else / infinity = +/- 0
if (bAbs == infRep) return fromRep(quotientSign);
if (!aAbs) {
// zero / zero = NaN
if (!bAbs) return fromRep(qnanRep);
// zero / anything else = +/- zero
else return fromRep(quotientSign);
}
// anything else / zero = +/- infinity
if (!bAbs) return fromRep(infRep | quotientSign);
// one or both of a or b is denormal, the other (if applicable) is a
// normal number. Renormalize one or both of a and b, and set scale to
// include the necessary exponent adjustment.
if (aAbs < implicitBit) scale += normalize(&aSignificand);
if (bAbs < implicitBit) scale -= normalize(&bSignificand);
}
// Or in the implicit significand bit. (If we fell through from the
// denormal path it was already set by normalize( ), but setting it twice
// won't hurt anything.)
aSignificand |= implicitBit;
bSignificand |= implicitBit;
int quotientExponent = aExponent - bExponent + scale;
// Align the significand of b as a Q63 fixed-point number in the range
// [1, 2.0) and get a Q64 approximate reciprocal using a small minimax
// polynomial approximation: reciprocal = 3/4 + 1/sqrt(2) - b/2. This
// is accurate to about 3.5 binary digits.
const uint64_t q63b = bSignificand >> 49;
uint64_t recip64 = UINT64_C(0x7504f333F9DE6484) - q63b;
// 0x7504f333F9DE6484 / 2^64 + 1 = 3/4 + 1/sqrt(2)
// Now refine the reciprocal estimate using a Newton-Raphson iteration:
//
// x1 = x0 * (2 - x0 * b)
//
// This doubles the number of correct binary digits in the approximation
// with each iteration.
uint64_t correction64;
correction64 = -((rep_t)recip64 * q63b >> 64);
recip64 = (rep_t)recip64 * correction64 >> 63;
correction64 = -((rep_t)recip64 * q63b >> 64);
recip64 = (rep_t)recip64 * correction64 >> 63;
correction64 = -((rep_t)recip64 * q63b >> 64);
recip64 = (rep_t)recip64 * correction64 >> 63;
correction64 = -((rep_t)recip64 * q63b >> 64);
recip64 = (rep_t)recip64 * correction64 >> 63;
correction64 = -((rep_t)recip64 * q63b >> 64);
recip64 = (rep_t)recip64 * correction64 >> 63;
// recip64 might have overflowed to exactly zero in the preceding
// computation if the high word of b is exactly 1.0. This would sabotage
// the full-width final stage of the computation that follows, so we adjust
// recip64 downward by one bit.
recip64--;
// We need to perform one more iteration to get us to 112 binary digits;
// The last iteration needs to happen with extra precision.
const uint64_t q127blo = bSignificand << 15;
rep_t correction, reciprocal;
// NOTE: This operation is equivalent to __multi3, which is not implemented
// on some architectures.
rep_t r64q63, r64q127, r64cH, r64cL, dummy;
wideMultiply((rep_t)recip64, (rep_t)q63b, &dummy, &r64q63);
wideMultiply((rep_t)recip64, (rep_t)q127blo, &dummy, &r64q127);
correction = -(r64q63 + (r64q127 >> 64));
uint64_t cHi = correction >> 64;
uint64_t cLo = correction;
wideMultiply((rep_t)recip64, (rep_t)cHi, &dummy, &r64cH);
wideMultiply((rep_t)recip64, (rep_t)cLo, &dummy, &r64cL);
reciprocal = r64cH + (r64cL >> 64);
// We already adjusted the 64-bit estimate, now we need to adjust the final
// 128-bit reciprocal estimate downward to ensure that it is strictly smaller
// than the infinitely precise exact reciprocal. Because the computation
// of the Newton-Raphson step is truncating at every step, this adjustment
// is small; most of the work is already done.
reciprocal -= 2;
// The numerical reciprocal is accurate to within 2^-112, lies in the
// interval [0.5, 1.0), and is strictly smaller than the true reciprocal
// of b. Multiplying a by this reciprocal thus gives a numerical q = a/b
// in Q127 with the following properties:
//
// 1. q < a/b
// 2. q is in the interval [0.5, 2.0)
// 3. the error in q is bounded away from 2^-113 (actually, we have a
// couple of bits to spare, but this is all we need).
// We need a 128 x 128 multiply high to compute q, which isn't a basic
// operation in C, so we need to be a little bit fussy.
rep_t quotient, quotientLo;
wideMultiply(aSignificand << 2, reciprocal, &quotient, &quotientLo);
// Two cases: quotient is in [0.5, 1.0) or quotient is in [1.0, 2.0).
// In either case, we are going to compute a residual of the form
//
// r = a - q*b
//
// We know from the construction of q that r satisfies:
//
// 0 <= r < ulp(q)*b
//
// if r is greater than 1/2 ulp(q)*b, then q rounds up. Otherwise, we
// already have the correct result. The exact halfway case cannot occur.
// We also take this time to right shift quotient if it falls in the [1,2)
// range and adjust the exponent accordingly.
rep_t residual;
rep_t qb;
if (quotient < (implicitBit << 1)) {
wideMultiply(quotient, bSignificand, &dummy, &qb);
residual = (aSignificand << 113) - qb;
quotientExponent--;
} else {
quotient >>= 1;
wideMultiply(quotient, bSignificand, &dummy, &qb);
residual = (aSignificand << 112) - qb;
}
const int writtenExponent = quotientExponent + exponentBias;
if (writtenExponent >= maxExponent) {
// If we have overflowed the exponent, return infinity.
return fromRep(infRep | quotientSign);
}
else if (writtenExponent < 1) {
// Flush denormals to zero. In the future, it would be nice to add
// code to round them correctly.
return fromRep(quotientSign);
}
else {
const bool round = (residual << 1) >= bSignificand;
// Clear the implicit bit
rep_t absResult = quotient & significandMask;
// Insert the exponent
absResult |= (rep_t)writtenExponent << significandBits;
// Round
absResult += round;
// Insert the sign and return
const long double result = fromRep(absResult | quotientSign);
return result;
}
}
#endif
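
The wideMultiply() calls above compute a full-width product so that only the high half needs to be kept. A standalone sketch of that multiply-high decomposition, shown at 64x64 -> 128 with 32-bit limbs (not part of compiler_rt; the quad-precision wideMultiply applies the same idea to 128-bit operands):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* full 64x64 -> 128 product assembled from four 32x32 partial products */
static void wide_mul_64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo) {
  uint64_t alo = a & 0xFFFFFFFFu, ahi = a >> 32;
  uint64_t blo = b & 0xFFFFFFFFu, bhi = b >> 32;
  uint64_t p0 = alo * blo;             /* low  x low  */
  uint64_t p1 = alo * bhi;             /* cross terms */
  uint64_t p2 = ahi * blo;
  uint64_t p3 = ahi * bhi;             /* high x high */
  uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);
  *lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
  *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

int main(void) {
  uint64_t hi, lo;
  wide_mul_64(0xDEADBEEFCAFEBABEull, 0x0123456789ABCDEFull, &hi, &lo);
  printf("hi = %016" PRIx64 "  lo = %016" PRIx64 "\n", hi, lo);
  return 0;
}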

36
third_party/compiler_rt/divti3.c vendored Normal file
View file

@@ -0,0 +1,36 @@
/* clang-format off */
/* ===-- divti3.c - Implement __divti3 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divti3 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: a / b */
COMPILER_RT_ABI ti_int
__divti3(ti_int a, ti_int b)
{
const int bits_in_tword_m1 = (int)(sizeof(ti_int) * CHAR_BIT) - 1;
ti_int s_a = a >> bits_in_tword_m1; /* s_a = a < 0 ? -1 : 0 */
ti_int s_b = b >> bits_in_tword_m1; /* s_b = b < 0 ? -1 : 0 */
a = (a ^ s_a) - s_a; /* negate if s_a == -1 */
b = (b ^ s_b) - s_b; /* negate if s_b == -1 */
s_a ^= s_b; /* sign of quotient */
return (__udivmodti4(a, b, (tu_int*)0) ^ s_a) - s_a; /* negate if s_a == -1 */
}
#endif /* CRT_HAS_128BIT */

66
third_party/compiler_rt/divxc3.c vendored Normal file
View file

@@ -0,0 +1,66 @@
/* clang-format off */
/* ===-- divxc3.c - Implement __divxc3 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __divxc3 for the compiler_rt library.
*
*/
STATIC_YOINK("huge_compiler_rt_license");
#if !_ARCH_PPC
#include "third_party/compiler_rt/int_lib.h"
#include "third_party/compiler_rt/int_math.h"
/* Returns: the quotient of (a + ib) / (c + id) */
COMPILER_RT_ABI Lcomplex
__divxc3(long double __a, long double __b, long double __c, long double __d)
{
int __ilogbw = 0;
long double __logbw = crt_logbl(crt_fmaxl(crt_fabsl(__c), crt_fabsl(__d)));
if (crt_isfinite(__logbw))
{
__ilogbw = (int)__logbw;
__c = crt_scalbnl(__c, -__ilogbw);
__d = crt_scalbnl(__d, -__ilogbw);
}
long double __denom = __c * __c + __d * __d;
Lcomplex z;
COMPLEX_REAL(z) = crt_scalbnl((__a * __c + __b * __d) / __denom, -__ilogbw);
COMPLEX_IMAGINARY(z) = crt_scalbnl((__b * __c - __a * __d) / __denom, -__ilogbw);
if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z)))
{
if ((__denom == 0) && (!crt_isnan(__a) || !crt_isnan(__b)))
{
COMPLEX_REAL(z) = crt_copysignl(CRT_INFINITY, __c) * __a;
COMPLEX_IMAGINARY(z) = crt_copysignl(CRT_INFINITY, __c) * __b;
}
else if ((crt_isinf(__a) || crt_isinf(__b)) &&
crt_isfinite(__c) && crt_isfinite(__d))
{
__a = crt_copysignl(crt_isinf(__a) ? 1 : 0, __a);
__b = crt_copysignl(crt_isinf(__b) ? 1 : 0, __b);
COMPLEX_REAL(z) = CRT_INFINITY * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = CRT_INFINITY * (__b * __c - __a * __d);
}
else if (crt_isinf(__logbw) && __logbw > 0 &&
crt_isfinite(__a) && crt_isfinite(__b))
{
__c = crt_copysignl(crt_isinf(__c) ? 1 : 0, __c);
__d = crt_copysignl(crt_isinf(__d) ? 1 : 0, __d);
COMPLEX_REAL(z) = 0 * (__a * __c + __b * __d);
COMPLEX_IMAGINARY(z) = 0 * (__b * __c - __a * __d);
}
}
return z;
}
#endif

26
third_party/compiler_rt/extenddftf2.c vendored Normal file
View file

@@ -0,0 +1,26 @@
/* clang-format off */
//===-- lib/extenddftf2.c - double -> quad conversion -------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
#define SRC_DOUBLE
#define DST_QUAD
#include "third_party/compiler_rt/fp_extend_impl.inc"
COMPILER_RT_ABI long double __extenddftf2(double a) {
return __extendXfYf2__(a);
}
#endif

36
third_party/compiler_rt/extendhfsf2.c vendored Normal file
View file

@@ -0,0 +1,36 @@
/* clang-format off */
//===-- lib/extendhfsf2.c - half -> single conversion -------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
STATIC_YOINK("huge_compiler_rt_license");
#define SRC_HALF
#define DST_SINGLE
#include "third_party/compiler_rt/fp_extend_impl.inc"
// Use a forwarding definition and noinline to implement a poor man's alias,
// as there isn't a good cross-platform way of defining one.
COMPILER_RT_ABI __attribute__((__noinline__)) float __extendhfsf2(uint16_t a) {
return __extendXfYf2__(a);
}
COMPILER_RT_ABI float __gnu_h2f_ieee(uint16_t a) {
return __extendhfsf2(a);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI float __aeabi_h2f(uint16_t a) {
return __extendhfsf2(a);
}
#else
AEABI_RTABI float __aeabi_h2f(uint16_t a) COMPILER_RT_ALIAS(__extendhfsf2);
#endif
#endif
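
For reference, the essence of the half -> single widening that __extendXfYf2__ performs can be sketched by hand for normal numbers (a standalone illustration, not compiler_rt code; denormals, infinities and NaNs are deliberately ignored here):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* widen a normal IEEE binary16 value to binary32 (illustration only) */
static float half_to_float_normal(uint16_t h) {
  uint32_t sign = (uint32_t)(h >> 15) << 31;
  uint32_t exp  = (h >> 10) & 0x1F;          /* 5-bit exponent, bias 15 */
  uint32_t frac = (uint32_t)(h & 0x3FF);     /* 10-bit significand */
  /* rebias 15 -> 127 and left-align the significand into 23 bits */
  uint32_t bits = sign | ((exp - 15 + 127) << 23) | (frac << 13);
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

int main(void) {
  printf("%f\n", half_to_float_normal(0x3C00));   /* 1.000000 */
  printf("%f\n", half_to_float_normal(0xC500));   /* -5.000000 */
  return 0;
}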

30
third_party/compiler_rt/extendsfdf2.c vendored Normal file
View file

@@ -0,0 +1,30 @@
/* clang-format off */
//===-- lib/extendsfdf2.c - single -> double conversion -----------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
STATIC_YOINK("huge_compiler_rt_license");
#define SRC_SINGLE
#define DST_DOUBLE
#include "third_party/compiler_rt/fp_extend_impl.inc"
COMPILER_RT_ABI double __extendsfdf2(float a) {
return __extendXfYf2__(a);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI double __aeabi_f2d(float a) {
return __extendsfdf2(a);
}
#else
AEABI_RTABI double __aeabi_f2d(float a) COMPILER_RT_ALIAS(__extendsfdf2);
#endif
#endif

26
third_party/compiler_rt/extendsftf2.c vendored Normal file
View file

@@ -0,0 +1,26 @@
/* clang-format off */
//===-- lib/extendsftf2.c - single -> quad conversion -------------*- C -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is dual licensed under the MIT and the University of Illinois Open
// Source Licenses. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
#define SRC_SINGLE
#define DST_QUAD
#include "third_party/compiler_rt/fp_extend_impl.inc"
COMPILER_RT_ABI long double __extendsftf2(float a) {
return __extendXfYf2__(a);
}
#endif

36
third_party/compiler_rt/ffsdi2.c vendored Normal file
View file

@@ -0,0 +1,36 @@
/* clang-format off */
/* ===-- ffsdi2.c - Implement __ffsdi2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ffsdi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: the index of the least significant 1-bit in a, or
* the value zero if a is zero. The least significant bit is index one.
*/
COMPILER_RT_ABI si_int
__ffsdi2(di_int a)
{
dwords x;
x.all = a;
if (x.s.low == 0)
{
if (x.s.high == 0)
return 0;
return __builtin_ctz(x.s.high) + (1 + sizeof(si_int) * CHAR_BIT);
}
return __builtin_ctz(x.s.low) + 1;
}

32
third_party/compiler_rt/ffssi2.c vendored Normal file
View file

@@ -0,0 +1,32 @@
/* clang-format off */
/* ===-- ffssi2.c - Implement __ffssi2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ffssi2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
/* Returns: the index of the least significant 1-bit in a, or
* the value zero if a is zero. The least significant bit is index one.
*/
COMPILER_RT_ABI si_int
__ffssi2(si_int a)
{
if (a == 0)
{
return 0;
}
return __builtin_ctz(a) + 1;
}
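
The ffs convention used by these routines (1-based bit index, 0 for a zero input) is easy to misremember, so a quick standalone check may help (not part of compiler_rt; __builtin_ctz is the same GCC/Clang builtin used above):

#include <stdio.h>

/* same convention as __ffssi2 above: 1-based bit index, 0 for a zero input */
static int ffs_demo(unsigned a) {
  return a ? __builtin_ctz(a) + 1 : 0;
}

int main(void) {
  unsigned tests[] = { 0, 1, 8, 0x80000000u };
  for (int i = 0; i < 4; i++)
    printf("ffs(0x%08x) = %d\n", tests[i], ffs_demo(tests[i]));
  /* prints 0, 1, 4 and 32 */
  return 0;
}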

40
third_party/compiler_rt/ffsti2.c vendored Normal file
View file

@@ -0,0 +1,40 @@
/* clang-format off */
/* ===-- ffsti2.c - Implement __ffsti2 -------------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __ffsti2 for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
/* Returns: the index of the least significant 1-bit in a, or
* the value zero if a is zero. The least significant bit is index one.
*/
COMPILER_RT_ABI si_int
__ffsti2(ti_int a)
{
twords x;
x.all = a;
if (x.s.low == 0)
{
if (x.s.high == 0)
return 0;
return __builtin_ctzll(x.s.high) + (1 + sizeof(di_int) * CHAR_BIT);
}
return __builtin_ctzll(x.s.low) + 1;
}
#endif /* CRT_HAS_128BIT */

58
third_party/compiler_rt/fixdfdi.c vendored Normal file
View file

@@ -0,0 +1,58 @@
/* clang-format off */
/* ===-- fixdfdi.c - Implement __fixdfdi -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#ifndef __SOFT_FP__
/* Support for systems that have hardware floating-point; can set the invalid
* flag as a side-effect of computation.
*/
COMPILER_RT_ABI du_int __fixunsdfdi(double a);
COMPILER_RT_ABI di_int
__fixdfdi(double a)
{
if (a < 0.0) {
return -__fixunsdfdi(-a);
}
return __fixunsdfdi(a);
}
#else
/* Support for systems that don't have hardware floating-point; there are no
* flags to set, and we don't want to code-gen to an unknown soft-float
* implementation.
*/
typedef di_int fixint_t;
typedef du_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI di_int
__fixdfdi(fp_t a) {
return __fixint(a);
}
#endif
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI di_int __aeabi_d2lz(fp_t a) {
return __fixdfdi(a);
}
#else
AEABI_RTABI di_int __aeabi_d2lz(fp_t a) COMPILER_RT_ALIAS(__fixdfdi);
#endif
#endif
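
The hardware-FP path above reduces the signed conversion to an unsigned conversion of |a| and restores the sign afterwards. A standalone sketch of that wrapper pattern (not part of compiler_rt; a plain cast stands in for __fixunsdfdi):

#include <stdint.h>
#include <stdio.h>

static uint64_t cvt_unsigned(double a) {   /* stand-in for __fixunsdfdi */
  return (uint64_t)a;
}

static int64_t cvt_signed(double a) {      /* mirrors the wrapper above */
  if (a < 0.0) return -(int64_t)cvt_unsigned(-a);
  return (int64_t)cvt_unsigned(a);
}

int main(void) {
  printf("%lld\n", (long long)cvt_signed(-2.9));          /* -2 (truncates toward zero) */
  printf("%lld\n", (long long)cvt_signed(1e15 + 0.75));   /* 1000000000000000 */
  return 0;
}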

33
third_party/compiler_rt/fixdfsi.c vendored Normal file
View file

@@ -0,0 +1,33 @@
/* clang-format off */
/* ===-- fixdfsi.c - Implement __fixdfsi -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
typedef si_int fixint_t;
typedef su_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI si_int
__fixdfsi(fp_t a) {
return __fixint(a);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI si_int __aeabi_d2iz(fp_t a) {
return __fixdfsi(a);
}
#else
AEABI_RTABI si_int __aeabi_d2iz(fp_t a) COMPILER_RT_ALIAS(__fixdfsi);
#endif
#endif

29
third_party/compiler_rt/fixdfti.c vendored Normal file
View file

@@ -0,0 +1,29 @@
/* clang-format off */
/* ===-- fixdfti.c - Implement __fixdfti -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
typedef ti_int fixint_t;
typedef tu_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI ti_int
__fixdfti(fp_t a) {
return __fixint(a);
}
#endif /* CRT_HAS_128BIT */

58
third_party/compiler_rt/fixsfdi.c vendored Normal file
View file

@@ -0,0 +1,58 @@
/* clang-format off */
/* ===-- fixsfdi.c - Implement __fixsfdi -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#ifndef __SOFT_FP__
/* Support for systems that have hardware floating-point; can set the invalid
* flag as a side-effect of computation.
*/
COMPILER_RT_ABI du_int __fixunssfdi(float a);
COMPILER_RT_ABI di_int
__fixsfdi(float a)
{
if (a < 0.0f) {
return -__fixunssfdi(-a);
}
return __fixunssfdi(a);
}
#else
/* Support for systems that don't have hardware floating-point; there are no
* flags to set, and we don't want to code-gen to an unknown soft-float
* implementation.
*/
typedef di_int fixint_t;
typedef du_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI di_int
__fixsfdi(fp_t a) {
return __fixint(a);
}
#endif
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI di_int __aeabi_f2lz(fp_t a) {
return __fixsfdi(a);
}
#else
AEABI_RTABI di_int __aeabi_f2lz(fp_t a) COMPILER_RT_ALIAS(__fixsfdi);
#endif
#endif

33
third_party/compiler_rt/fixsfsi.c vendored Normal file
View file

@@ -0,0 +1,33 @@
/* clang-format off */
/* ===-- fixsfsi.c - Implement __fixsfsi -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
typedef si_int fixint_t;
typedef su_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI si_int
__fixsfsi(fp_t a) {
return __fixint(a);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI si_int __aeabi_f2iz(fp_t a) {
return __fixsfsi(a);
}
#else
AEABI_RTABI si_int __aeabi_f2iz(fp_t a) COMPILER_RT_ALIAS(__fixsfsi);
#endif
#endif

29
third_party/compiler_rt/fixsfti.c vendored Normal file
View file

@@ -0,0 +1,29 @@
/* clang-format off */
/* ===-- fixsfti.c - Implement __fixsfti -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
typedef ti_int fixint_t;
typedef tu_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI ti_int
__fixsfti(fp_t a) {
return __fixint(a);
}
#endif /* CRT_HAS_128BIT */

26
third_party/compiler_rt/fixtfdi.c vendored Normal file
View file

@@ -0,0 +1,26 @@
/* clang-format off */
/* ===-- fixtfdi.c - Implement __fixtfdi -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
typedef di_int fixint_t;
typedef du_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI di_int
__fixtfdi(fp_t a) {
return __fixint(a);
}
#endif

26
third_party/compiler_rt/fixtfsi.c vendored Normal file
View file

@@ -0,0 +1,26 @@
/* clang-format off */
/* ===-- fixtfsi.c - Implement __fixtfsi -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
typedef si_int fixint_t;
typedef su_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI si_int
__fixtfsi(fp_t a) {
return __fixint(a);
}
#endif

26
third_party/compiler_rt/fixtfti.c vendored Normal file
View file

@@ -0,0 +1,26 @@
/* clang-format off */
/* ===-- fixtfti.c - Implement __fixtfti -----------------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
typedef ti_int fixint_t;
typedef tu_int fixuint_t;
#include "third_party/compiler_rt/fp_fixint_impl.inc"
COMPILER_RT_ABI ti_int
__fixtfti(fp_t a) {
return __fixint(a);
}
#endif

55
third_party/compiler_rt/fixunsdfdi.c vendored Normal file
View file

@@ -0,0 +1,55 @@
/* clang-format off */
/* ===-- fixunsdfdi.c - Implement __fixunsdfdi -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#ifndef __SOFT_FP__
/* Support for systems that have hardware floating-point; can set the invalid
* flag as a side-effect of computation.
*/
COMPILER_RT_ABI du_int
__fixunsdfdi(double a)
{
if (a <= 0.0) return 0;
su_int high = a / 4294967296.f; /* a / 0x1p32f; */
su_int low = a - (double)high * 4294967296.f; /* high * 0x1p32f; */
return ((du_int)high << 32) | low;
}
#else
/* Support for systems that don't have hardware floating-point; there are no
* flags to set, and we don't want to code-gen to an unknown soft-float
* implementation.
*/
typedef du_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI du_int
__fixunsdfdi(fp_t a) {
return __fixuint(a);
}
#endif
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI du_int __aeabi_d2ulz(fp_t a) {
return __fixunsdfdi(a);
}
#else
AEABI_RTABI du_int __aeabi_d2ulz(fp_t a) COMPILER_RT_ALIAS(__fixunsdfdi);
#endif
#endif
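
The split into 32-bit halves performed by the hardware-FP path above can be checked in isolation. A standalone sketch (not part of compiler_rt), using a value that is exactly representable as a double:

#include <stdint.h>
#include <stdio.h>

int main(void) {
  double a = 4503600615024640.0;                   /* 2^52 + 987654144, exact in a double */
  uint32_t high = (uint32_t)(a / 4294967296.0);    /* a / 2^32 */
  uint32_t low  = (uint32_t)(a - (double)high * 4294967296.0);
  uint64_t r    = ((uint64_t)high << 32) | low;
  printf("%llu\n", (unsigned long long)r);         /* 4503600615024640 */
  return 0;
}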

32
third_party/compiler_rt/fixunsdfsi.c vendored Normal file
View file

@@ -0,0 +1,32 @@
/* clang-format off */
/* ===-- fixunsdfsi.c - Implement __fixunsdfsi -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
typedef su_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI su_int
__fixunsdfsi(fp_t a) {
return __fixuint(a);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI su_int __aeabi_d2uiz(fp_t a) {
return __fixunsdfsi(a);
}
#else
AEABI_RTABI su_int __aeabi_d2uiz(fp_t a) COMPILER_RT_ALIAS(__fixunsdfsi);
#endif
#endif

26
third_party/compiler_rt/fixunsdfti.c vendored Normal file
View file

@@ -0,0 +1,26 @@
/* clang-format off */
/* ===-- fixunsdfti.c - Implement __fixunsdfti -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#include "third_party/compiler_rt/int_lib.h"
#ifdef CRT_HAS_128BIT
#define DOUBLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
typedef tu_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI tu_int
__fixunsdfti(fp_t a) {
return __fixuint(a);
}
#endif /* CRT_HAS_128BIT */

56
third_party/compiler_rt/fixunssfdi.c vendored Normal file
View file

@@ -0,0 +1,56 @@
/* clang-format off */
/* ===-- fixunssfdi.c - Implement __fixunssfdi -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#ifndef __SOFT_FP__
/* Support for systems that have hardware floating-point; can set the invalid
* flag as a side-effect of computation.
*/
COMPILER_RT_ABI du_int
__fixunssfdi(float a)
{
if (a <= 0.0f) return 0;
double da = a;
su_int high = da / 4294967296.f; /* da / 0x1p32f; */
su_int low = da - (double)high * 4294967296.f; /* high * 0x1p32f; */
return ((du_int)high << 32) | low;
}
#else
/* Support for systems that don't have hardware floating-point; there are no
* flags to set, and we don't want to code-gen to an unknown soft-float
* implementation.
*/
typedef du_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI du_int
__fixunssfdi(fp_t a) {
return __fixuint(a);
}
#endif
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI du_int __aeabi_f2ulz(fp_t a) {
return __fixunssfdi(a);
}
#else
AEABI_RTABI du_int __aeabi_f2ulz(fp_t a) COMPILER_RT_ALIAS(__fixunssfdi);
#endif
#endif
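
Unlike the double variant, this routine widens the float to double before splitting, so that high * 2^32 and the subtraction stay exact for every float input. A standalone sketch of the same steps (not part of compiler_rt):

#include <stdint.h>
#include <stdio.h>

int main(void) {
  float a = 6.0e9f;            /* exactly representable as a float */
  double da = a;               /* exact widening, as in the routine above */
  uint32_t high = (uint32_t)(da / 4294967296.0);
  uint32_t low  = (uint32_t)(da - (double)high * 4294967296.0);
  printf("%llu\n", (unsigned long long)(((uint64_t)high << 32) | low));
  /* prints 6000000000; doing the split in double keeps the subtraction exact */
  return 0;
}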

36
third_party/compiler_rt/fixunssfsi.c vendored Normal file
View file

@@ -0,0 +1,36 @@
/* clang-format off */
/* ===-- fixunssfsi.c - Implement __fixunssfsi -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __fixunssfsi for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
typedef su_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI su_int
__fixunssfsi(fp_t a) {
return __fixuint(a);
}
#if defined(__ARM_EABI__)
#if defined(COMPILER_RT_ARMHF_TARGET)
AEABI_RTABI su_int __aeabi_f2uiz(fp_t a) {
return __fixunssfsi(a);
}
#else
AEABI_RTABI su_int __aeabi_f2uiz(fp_t a) COMPILER_RT_ALIAS(__fixunssfsi);
#endif
#endif

29
third_party/compiler_rt/fixunssfti.c vendored Normal file
View file

@@ -0,0 +1,29 @@
/* clang-format off */
/* ===-- fixunssfti.c - Implement __fixunssfti -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __fixunssfti for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define SINGLE_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT)
typedef tu_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI tu_int
__fixunssfti(fp_t a) {
return __fixuint(a);
}
#endif

25
third_party/compiler_rt/fixunstfdi.c vendored Normal file
View file

@@ -0,0 +1,25 @@
/* clang-format off */
/* ===-- fixunstfdi.c - Implement __fixunstfdi -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
typedef du_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI du_int
__fixunstfdi(fp_t a) {
return __fixuint(a);
}
#endif

25
third_party/compiler_rt/fixunstfsi.c vendored Normal file
View file

@@ -0,0 +1,25 @@
/* clang-format off */
/* ===-- fixunstfsi.c - Implement __fixunstfsi -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
typedef su_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI su_int
__fixunstfsi(fp_t a) {
return __fixuint(a);
}
#endif

25
third_party/compiler_rt/fixunstfti.c vendored Normal file
View file

@@ -0,0 +1,25 @@
/* clang-format off */
/* ===-- fixunstfti.c - Implement __fixunstfti -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#define QUAD_PRECISION
#include "third_party/compiler_rt/fp_lib.inc"
#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
typedef tu_int fixuint_t;
#include "third_party/compiler_rt/fp_fixuint_impl.inc"
COMPILER_RT_ABI tu_int
__fixunstfti(fp_t a) {
return __fixuint(a);
}
#endif

49
third_party/compiler_rt/fixunsxfdi.c vendored Normal file
View file

@@ -0,0 +1,49 @@
/* clang-format off */
/* ===-- fixunsxfdi.c - Implement __fixunsxfdi -----------------------------===
*
* The LLVM Compiler Infrastructure
*
* This file is dual licensed under the MIT and the University of Illinois Open
* Source Licenses. See LICENSE.TXT for details.
*
* ===----------------------------------------------------------------------===
*
* This file implements __fixunsxfdi for the compiler_rt library.
*
* ===----------------------------------------------------------------------===
*/
STATIC_YOINK("huge_compiler_rt_license");
#if !_ARCH_PPC
#include "third_party/compiler_rt/int_lib.h"
/* Returns: convert a to an unsigned long long, rounding toward zero.
* Negative values all become zero.
*/
/* Assumption: long double is an Intel 80-bit floating point type padded with 6 bytes
* du_int is a 64 bit integral type
* value in long double is representable in du_int or is negative
* (no range checking performed)
*/
/* gggg gggg gggg gggg gggg gggg gggg gggg | gggg gggg gggg gggg seee eeee eeee eeee |
* 1mmm mmmm mmmm mmmm mmmm mmmm mmmm mmmm | mmmm mmmm mmmm mmmm mmmm mmmm mmmm mmmm
*/
COMPILER_RT_ABI du_int
__fixunsxfdi(long double a)
{
long_double_bits fb;
fb.f = a;
int e = (fb.u.high.s.low & 0x00007FFF) - 16383;
if (e < 0 || (fb.u.high.s.low & 0x00008000))
return 0;
if ((unsigned)e > sizeof(du_int) * CHAR_BIT)
return ~(du_int)0;
return fb.u.low.all >> (63 - e);
}
#endif
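
The bit layout relied on above (a 64-bit significand with an explicit integer bit, followed by the sign and a 15-bit exponent) can be poked at directly on x86. A standalone sketch (not part of compiler_rt; assumes the x87 80-bit long double and union-based type punning):

#include <stdint.h>
#include <stdio.h>

int main(void) {
  union {
    long double f;
    struct { uint64_t mantissa; uint16_t se; } u;  /* x87: significand, then sign+exponent */
  } x;
  x.f = 123456789.0L;
  int e = (x.u.se & 0x7FFF) - 16383;        /* unbias the 15-bit exponent */
  uint64_t r = x.u.mantissa >> (63 - e);    /* shift away the fraction bits */
  printf("exponent = %d, value = %llu\n", e, (unsigned long long)r); /* 26, 123456789 */
  return 0;
}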

Some files were not shown because too many files have changed in this diff.