mirror of
				https://github.com/jart/cosmopolitan.git
				synced 2025-10-24 18:20:59 +00:00 
			
		
		
		
	- Fix bugs in kDos2Errno definition - malloc() should now be thread safe - Fix bug in rollup.com header generator - Fix open(O_APPEND) on the New Technology - Fix select() on the New Technology and test it - Work towards refactoring i/o for thread safety - Socket reads and writes on NT now poll for signals - Work towards i/o completion ports on the New Technology - Make read() and write() intermittently check for signals - Blinkenlights keyboard i/o so much better on NT w/ poll() - You can now poll() files and sockets at the same time on NT - Fix bug in appendr() that manifests with dlmalloc footers off
		
			
				
	
	
		
			1191 lines
		
	
	
	
		
			60 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
		
	
	
	
	
	
|   This is a version (aka dlmalloc) of malloc/free/realloc written by
 | |
|   Doug Lea and released to the public domain, as explained at
 | |
|   http://creativecommons.org/publicdomain/zero/1.0/ Send questions,
 | |
|   comments, complaints, performance data, etc to dl@cs.oswego.edu
 | |
| 
 | |
| * Version 2.8.6 Wed Aug 29 06:57:58 2012  Doug Lea
 | |
|    Note: There may be an updated version of this malloc obtainable at
 | |
|            ftp://gee.cs.oswego.edu/pub/misc/malloc.c
 | |
|          Check before installing!
 | |
| 
 | |
| * Quickstart
 | |
| 
 | |
|   This library is all in one file to simplify the most common usage:
 | |
|   ftp it, compile it (-O3), and link it into another program. All of
 | |
|   the compile-time options default to reasonable values for use on
 | |
|   most platforms.  You might later want to step through various
 | |
|   compile-time and dynamic tuning options.
 | |
| 
 | |
|   For convenience, an include file for code using this malloc is at:
 | |
|      ftp://gee.cs.oswego.edu/pub/misc/malloc-2.8.6.h
 | |
|   You don't really need this .h file unless you call functions not
 | |
|   defined in your system include files.  The .h file contains only the
 | |
|   excerpts from this file needed for using this malloc on ANSI C/C++
 | |
|   systems, so long as you haven't changed compile-time options about
 | |
|   naming and tuning parameters.  If you do, then you can create your
 | |
|   own malloc.h that does include all settings by cutting at the point
 | |
|   indicated below. Note that you may already by default be using a C
 | |
|   library containing a malloc that is based on some version of this
 | |
|   malloc (for example in linux). You might still want to use the one
 | |
|   in this file to customize settings or to avoid overheads associated
 | |
|   with library versions.
 | |
| 
 | |
| * Vital statistics:
 | |
| 
 | |
|   Supported pointer/size_t representation:       4 or 8 bytes
 | |
|        size_t MUST be an unsigned type of the same width as
 | |
|        pointers. (If you are using an ancient system that declares
 | |
|        size_t as a signed type, or need it to be a different width
 | |
|        than pointers, you can use a previous release of this malloc
 | |
|        (e.g. 2.7.2) supporting these.)
 | |
| 
 | |
|   Alignment:                                     8 bytes (minimum)
 | |
|        This suffices for nearly all current machines and C compilers.
 | |
|        However, you can define MALLOC_ALIGNMENT to be wider than this
 | |
|        if necessary (up to 128 bytes), at the expense of using more space.
 | |
| 
 | |
|   Minimum overhead per allocated chunk:   4 or  8 bytes (if 4byte sizes)
 | |
|                                           8 or 16 bytes (if 8byte sizes)
 | |
|        Each malloced chunk has a hidden word of overhead holding size
 | |
|        and status information, and additional cross-check word
 | |
|        if FOOTERS is defined.
 | |
| 
 | |
|   Minimum allocated size: 4-byte ptrs:  16 bytes    (including overhead)
 | |
|                           8-byte ptrs:  32 bytes    (including overhead)
 | |
| 
 | |
|        Even a request for zero bytes (i.e., malloc(0)) returns a
 | |
|        pointer to something of the minimum allocatable size.
 | |
|        The maximum overhead wastage (i.e., number of bytes allocated
 | |
|        beyond those requested in malloc) is less than or equal
 | |
|        to the minimum size, except for requests >= mmap_threshold that
 | |
|        are serviced via mmap(), where the worst case wastage is about
 | |
|        32 bytes plus the remainder from a system page (the minimal
 | |
|        mmap unit); typically 4096 or 8192 bytes.
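The size rules above can be sketched roughly as follows. The helper name `padded_chunk_size` and its `word` parameter are hypothetical and not part of this malloc; the sketch assumes FOOTERS is off (one word of overhead), the 8-byte minimum alignment, and a minimum chunk of four words, and it ignores overflow for very large requests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes consumed by a chunk serving `request` bytes.
   `word` stands for sizeof(size_t) on the target (4 or 8). */
static size_t padded_chunk_size(size_t request, size_t word) {
  assert(word == 4 || word == 8);
  const size_t align = 8;            /* minimum alignment from the text */
  const size_t overhead = word;      /* one hidden size/status word */
  const size_t min_chunk = 4 * word; /* 16 or 32 bytes, as stated above */
  /* Add the overhead, then round up to the alignment. */
  size_t padded = (request + overhead + (align - 1)) & ~(align - 1);
  return padded < min_chunk ? min_chunk : padded;
}
```

Under these assumptions a zero-byte request with 8-byte pointers still occupies a 32-byte chunk, which matches the minimum allocated size given above.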
 | |
| 
 | |
|   Security: static-safe; optionally more or less
 | |
|        The "security" of malloc refers to the ability of malicious
 | |
|        code to accentuate the effects of errors (for example, freeing
 | |
|        space that is not currently malloc'ed or overwriting past the
 | |
|        ends of chunks) in code that calls malloc.  This malloc
 | |
|        guarantees not to modify any memory locations below the base of
 | |
|        heap, i.e., static variables, even in the presence of usage
 | |
|        errors.  The routines additionally detect most improper frees
 | |
|        and reallocs.  All this holds as long as the static bookkeeping
 | |
|        for malloc itself is not corrupted by some other means.  This
 | |
|        is only one aspect of security -- these checks do not, and
 | |
|        cannot, detect all possible programming errors.
 | |
| 
 | |
|        If FOOTERS is defined nonzero, then each allocated chunk
 | |
|        carries an additional check word to verify that it was malloced
 | |
|        from its space.  These check words are the same within each
 | |
|        execution of a program using malloc, but differ across
 | |
|        executions, so externally crafted fake chunks cannot be
 | |
|        freed. This improves security by rejecting frees/reallocs that
 | |
|        could corrupt heap memory, in addition to the checks preventing
 | |
|        writes to statics that are always on.  This may further improve
 | |
|        security at the expense of time and space overhead.  (Note that
 | |
|        FOOTERS may also be worth using with MSPACES.)
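As an illustration of the check-word idea, here is a minimal sketch. The function names are hypothetical, and XOR-ing the owner's address with a per-run secret is an assumption chosen for simplicity, not a statement of how this file computes its footers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: mix the owning space's address with a secret that is
   fixed within one execution but differs across executions. */
static uintptr_t make_check_word(const void *owner, uintptr_t magic) {
  assert(owner != NULL);
  return (uintptr_t)owner ^ magic;
}

/* Returns 1 if the check word decodes back to the expected owner. */
static int check_word_ok(uintptr_t word, const void *owner, uintptr_t magic) {
  return (word ^ magic) == (uintptr_t)owner;
}
```

A fake chunk crafted without knowledge of the secret would almost never carry a matching word, so a free or realloc on it can be rejected.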
 | |
| 
 | |
|        By default detected errors cause the program to abort (calling
 | |
|        "abort()"). You can override this to instead proceed past
 | |
|        errors by defining PROCEED_ON_ERROR.  In this case, a bad free
 | |
|        has no effect, and a malloc that encounters a bad address
 | |
|        caused by user overwrites will ignore the bad address by
 | |
|        dropping pointers and indices to all known memory. This may
 | |
|        be appropriate for programs that should continue if at all
 | |
|        possible in the face of programming errors, although they may
 | |
|        run out of memory because dropped memory is never reclaimed.
 | |
| 
 | |
|        If you don't like either of these options, you can define
 | |
|        CORRUPTION_ERROR_ACTION and USAGE_ERROR_ACTION to do anything
 | |
|        else. And if you are sure that your program using malloc has
 | |
|        no errors or vulnerabilities, you can define INSECURE to 1,
 | |
|        which might (or might not) provide a small performance improvement.
 | |
| 
 | |
|        It is also possible to limit the maximum total allocatable
 | |
|        space, using malloc_set_footprint_limit. This is not
 | |
|        designed as a security feature in itself (calls to set limits
 | |
|        are not screened or privileged), but may be useful as one
 | |
|        aspect of a secure implementation.
 | |
| 
 | |
|   Thread-safety: NOT thread-safe unless USE_LOCKS defined non-zero
 | |
|        When USE_LOCKS is defined, each public call to malloc, free,
 | |
|        etc is surrounded with a lock. By default, this uses a plain
 | |
|        pthread mutex, win32 critical section, or a spin-lock if
 | |
|        available for the platform and not disabled by setting
 | |
|        USE_SPIN_LOCKS=0.  However, if USE_RECURSIVE_LOCKS is defined,
 | |
|        recursive versions are used instead (which are not required for
 | |
|        base functionality but may be needed in layered extensions).
 | |
|        Using a global lock is not especially fast, and can be a major
 | |
|        bottleneck.  It is designed only to provide minimal protection
 | |
|        in concurrent environments, and to provide a basis for
 | |
|        extensions.  If you are using malloc in a concurrent program,
 | |
|        consider instead using nedmalloc
 | |
|        (http://www.nedprod.com/programs/portable/nedmalloc/) or
 | |
|        ptmalloc (See http://www.malloc.de), which are derived from
 | |
|        versions of this malloc.
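The effect of a single global lock can be pictured with the sketch below. The wrapper names `locked_malloc` and `locked_free` are hypothetical, and the system malloc/free stand in for the unlocked routines; this shows the locking pattern only, not the USE_LOCKS code itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One lock shared by every call: simple, but all threads serialize on it. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

static void *locked_malloc(size_t n) {
  int rc = pthread_mutex_lock(&heap_lock);
  assert(rc == 0);
  (void)rc;
  void *p = malloc(n); /* stand-in for the unlocked allocator */
  pthread_mutex_unlock(&heap_lock);
  return p;
}

static void locked_free(void *p) {
  int rc = pthread_mutex_lock(&heap_lock);
  assert(rc == 0);
  (void)rc;
  free(p);
  pthread_mutex_unlock(&heap_lock);
}
```

Because every caller contends for the same mutex, throughput under heavy concurrent allocation is limited, which is the bottleneck the text warns about.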
 | |
| 
 | |
|   System requirements: Any combination of MORECORE and/or MMAP/MUNMAP
 | |
|        This malloc can use unix sbrk or any emulation (invoked using
 | |
|        the CALL_MORECORE macro) and/or mmap/munmap or any emulation
 | |
|        (invoked using CALL_MMAP/CALL_MUNMAP) to get and release system
 | |
|        memory.  On most unix systems, it tends to work best if both
 | |
|        MORECORE and MMAP are enabled.  On Win32, it uses emulations
 | |
|        based on VirtualAlloc. It also uses common C library functions
 | |
|        like memset.
 | |
| 
 | |
|   Compliance: I believe it is compliant with the Single Unix Specification
 | |
|        (See http://www.unix.org). Also SVID/XPG, ANSI C, and probably
 | |
|        others as well.
 | |
| 
 | |
| * Overview of algorithms
 | |
| 
 | |
|   This is not the fastest, most space-conserving, most portable, or
 | |
|   most tunable malloc ever written. However, it is among the fastest
 | |
|   while also being among the most space-conserving, portable and
 | |
|   tunable.  Consistent balance across these factors results in a good
 | |
|   general-purpose allocator for malloc-intensive programs.
 | |
| 
 | |
|   In most ways, this malloc is a best-fit allocator. Generally, it
 | |
|   chooses the best-fitting existing chunk for a request, with ties
 | |
|   broken in approximately least-recently-used order. (This strategy
 | |
|   normally maintains low fragmentation.) However, for requests less
 | |
|   than 256 bytes, it deviates from best-fit when there is not an
 | |
|   exactly fitting available chunk by preferring to use space adjacent
 | |
|   to that used for the previous small request, as well as by breaking
 | |
|   ties in approximately most-recently-used order. (These enhance
 | |
|   locality of series of small allocations.)  And for very large requests
 | |
|   (>= 256KB by default), it relies on system memory mapping
 | |
|   facilities, if supported.  (This helps avoid carrying around and
 | |
|   possibly fragmenting memory used only for large chunks.)
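The three size regimes described above can be summarized in a small sketch. The enum and function names are hypothetical; the 256-byte boundary and the default 256KB mmap threshold are taken from the text.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { REQ_SMALL, REQ_LARGE, REQ_MMAP } req_class;

/* Hypothetical classifier: which strategy serves a request of `bytes`. */
static req_class classify_request(size_t bytes, size_t mmap_threshold) {
  assert(mmap_threshold >= 256);
  if (bytes < 256) return REQ_SMALL;            /* locality-biased small path */
  if (bytes >= mmap_threshold) return REQ_MMAP; /* direct system mapping */
  return REQ_LARGE;                             /* best-fit among free chunks */
}
```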
 | |
| 
 | |
|   All operations (except malloc_stats and mallinfo) have execution
 | |
|   times that are bounded by a constant factor of the number of bits in
 | |
|   a size_t, not counting any clearing in calloc or copying in realloc,
 | |
|   or actions surrounding MORECORE and MMAP that have times
 | |
|   proportional to the number of non-contiguous regions returned by
 | |
|   system allocation routines, which is often just 1. In real-time
 | |
|   applications, you can optionally suppress segment traversals using
 | |
|   NO_SEGMENT_TRAVERSAL, which assures bounded execution even when
 | |
|   system allocators return non-contiguous spaces, at the typical
 | |
|   expense of carrying around more memory and increased fragmentation.
 | |
| 
 | |
|   The implementation is not very modular and seriously overuses
 | |
|   macros. Perhaps someday all C compilers will do as good a job
 | |
|   inlining modular code as can now be done by brute-force expansion,
 | |
|   but now, enough of them seem not to.
 | |
| 
 | |
|   Some compilers issue a lot of warnings about code that is
 | |
|   dead/unreachable only on some platforms, and also about intentional
 | |
|   uses of negation on unsigned types. All known cases of each can be
 | |
|   ignored.
 | |
| 
 | |
|   For a longer but out of date high-level description, see
 | |
|      http://gee.cs.oswego.edu/dl/html/malloc.html
 | |
| 
 | |
|   -----------------------  Chunk representations ------------------------
 | |
| 
 | |
|   (The following includes lightly edited explanations by Colin Plumb.)
 | |
| 
 | |
|   The malloc_chunk declaration below is misleading (but accurate and
 | |
|   necessary).  It declares a "view" into memory allowing access to
 | |
|   necessary fields at known offsets from a given base.
 | |
| 
 | |
|   Chunks of memory are maintained using a `boundary tag' method as
 | |
|   originally described by Knuth.  (See the paper by Paul Wilson
 | |
|   ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a survey of such
 | |
|   techniques.)  Sizes of free chunks are stored both in the front of
 | |
|   each chunk and at the end.  This makes consolidating fragmented
 | |
|   chunks into bigger chunks fast.  The head fields also hold bits
 | |
|   representing whether chunks are free or in use.
 | |
| 
 | |
|   Here are some pictures to make it clearer.  They are "exploded" to
 | |
|   show that the state of a chunk can be thought of as extending from
 | |
|   the high 31 bits of the head field of its header through the
 | |
|   prev_foot and PINUSE_BIT bit of the following chunk header.
 | |
| 
 | |
|   A chunk that's in use looks like:
 | |
| 
 | |
|    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|            | Size of previous chunk (if P = 0)                             |
 | |
|            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |P|
 | |
|          | Size of this chunk                                         1| +-+
 | |
|    mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|          |                                                               |
 | |
|          +-                                                             -+
 | |
|          |                                                               |
 | |
|          +-                                                             -+
 | |
|          |                                                               :
 | |
|          +-      size - sizeof(size_t) available payload bytes          -+
 | |
|          :                                                               |
 | |
|  chunk-> +-                                                             -+
 | |
|          |                                                               |
 | |
|          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |1|
 | |
|        | Size of next chunk (may or may not be in use)               | +-+
 | |
|  mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
| 
 | |
|     And if it's free, it looks like this:
 | |
| 
 | |
|    chunk-> +-                                                             -+
 | |
|            | User payload (must be in use, or we would have merged!)       |
 | |
|            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |P|
 | |
|          | Size of this chunk                                         0| +-+
 | |
|    mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|          | Next pointer                                                  |
 | |
|          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|          | Prev pointer                                                  |
 | |
|          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|          |                                                               :
 | |
|          +-      size - sizeof(struct chunk) unused bytes               -+
 | |
|          :                                                               |
 | |
|  chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|          | Size of this chunk                                            |
 | |
|          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0|
 | |
|        | Size of next chunk (must be in use, or we would have merged)| +-+
 | |
|  mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|        |                                                               :
 | |
|        +- User payload                                                -+
 | |
|        :                                                               |
 | |
|        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|                                                                      |0|
 | |
|                                                                      +-+
 | |
|   Note that since we always merge adjacent free chunks, the chunks
 | |
|   adjacent to a free chunk must be in use.
 | |
| 
 | |
|   Given a pointer to a chunk (which can be derived trivially from the
 | |
|   payload pointer) we can, in O(1) time, find out whether the adjacent
 | |
|   chunks are free, and if so, unlink them from the lists that they
 | |
|   are on and merge them with the current chunk.
 | |
| 
 | |
|   Chunks always begin on even word boundaries, so the mem portion
 | |
|   (which is returned to the user) is also on an even word boundary, and
 | |
|   thus at least double-word aligned.
 | |
| 
 | |
|   The P (PINUSE_BIT) bit, stored in the unused low-order bit of the
 | |
|   chunk size (which is always a multiple of two words), is an in-use
 | |
|   bit for the *previous* chunk.  If that bit is *clear*, then the
 | |
|   word before the current chunk size contains the previous chunk
 | |
|   size, and can be used to find the front of the previous chunk.
 | |
|   The very first chunk allocated always has this bit set, preventing
 | |
|   access to non-existent (or non-owned) memory. If pinuse is set for
 | |
|   any given chunk, then you CANNOT determine the size of the
 | |
|   previous chunk, and might even get a memory addressing fault when
 | |
|   trying to do so.
 | |
| 
 | |
|   The C (CINUSE_BIT) bit, stored in the unused second-lowest bit of
 | |
|   the chunk size, redundantly records whether the current chunk is
 | |
|   inuse (unless the chunk is mmapped). This redundancy enables usage
 | |
|   checks within free and realloc, and reduces indirection when freeing
 | |
|   and consolidating chunks.
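A minimal sketch of how the two status bits share a word with the size. PINUSE_BIT and CINUSE_BIT are named in the text; their values (1 and 2) follow from "low-order" and "second-lowest", and the helper functions are hypothetical. Only the two bits described here are masked.

```c
#include <assert.h>
#include <stddef.h>

#define PINUSE_BIT ((size_t)1) /* previous chunk is in use */
#define CINUSE_BIT ((size_t)2) /* this chunk is in use */
#define FLAG_MASK (PINUSE_BIT | CINUSE_BIT)

/* Pack a chunk size and both status bits into one head word. */
static size_t make_head(size_t size, int pinuse, int cinuse) {
  assert((size & FLAG_MASK) == 0); /* sizes are multiples of two words */
  return size | (pinuse ? PINUSE_BIT : 0) | (cinuse ? CINUSE_BIT : 0);
}

static size_t head_size(size_t head) { return head & ~FLAG_MASK; }
static int head_pinuse(size_t head) { return (head & PINUSE_BIT) != 0; }
static int head_cinuse(size_t head) { return (head & CINUSE_BIT) != 0; }
```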
 | |
| 
 | |
|   Each freshly allocated chunk must have both cinuse and pinuse set.
 | |
|   That is, each allocated chunk borders either a previously allocated
 | |
|   and still in-use chunk, or the base of its memory arena. This is
 | |
|   ensured by making all allocations from the `lowest' part of any
 | |
|   found chunk.  Further, no free chunk physically borders another one,
 | |
|   so each free chunk is known to be preceded and followed by either
 | |
|   inuse chunks or the ends of memory.
 | |
| 
 | |
|   Note that the `foot' of the current chunk is actually represented
 | |
|   as the prev_foot of the NEXT chunk. This makes it easier to
 | |
|   deal with alignments etc but can be very confusing when trying
 | |
|   to extend or adapt this code.
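The following sketch shows why storing the foot in the next chunk's prev_foot gives O(1) access to neighbors. The two-field `chunk` struct and the helper names are simplified, hypothetical stand-ins for the real declarations.

```c
#include <assert.h>
#include <stddef.h>

#define PINUSE_BIT ((size_t)1)
#define CINUSE_BIT ((size_t)2)

typedef struct chunk {
  size_t prev_foot; /* size of previous chunk, valid only if it is free */
  size_t head;      /* size of this chunk plus status bits */
} chunk;

static size_t chunk_size(const chunk *p) {
  return p->head & ~(PINUSE_BIT | CINUSE_BIT);
}

static chunk *next_chunk(chunk *p) {
  return (chunk *)((char *)p + chunk_size(p));
}

/* Mark p free: its foot is written into the NEXT chunk's prev_foot. */
static void mark_free(chunk *p) {
  chunk *n = next_chunk(p);
  p->head &= ~CINUSE_BIT;
  n->prev_foot = chunk_size(p);
  n->head &= ~PINUSE_BIT;
}

/* Only legal when the P bit is clear, as the text explains. */
static chunk *prev_chunk(chunk *p) {
  assert((p->head & PINUSE_BIT) == 0);
  return (chunk *)((char *)p - p->prev_foot);
}
```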
 | |
| 
 | |
|   The exceptions to all this are
 | |
| 
 | |
|      1. The special chunk `top' is the top-most available chunk (i.e.,
 | |
|         the one bordering the end of available memory). It is treated
 | |
|         specially.  Top is never included in any bin, is used only if
 | |
|         no other chunk is available, and is released back to the
 | |
|         system if it is very large (see M_TRIM_THRESHOLD).  In effect,
 | |
|         the top chunk is treated as larger (and thus less well
 | |
|         fitting) than any other available chunk.  The top chunk
 | |
|         doesn't update its trailing size field since there is no next
 | |
|         contiguous chunk that would have to index off it. However,
 | |
|         space is still allocated for it (TOP_FOOT_SIZE) to enable
 | |
|         separation or merging when space is extended.
 | |
| 
 | |
|      2. Chunks allocated via mmap have both cinuse and pinuse bits
 | |
|         cleared in their head fields.  Because they are allocated
 | |
|         one-by-one, each must carry its own prev_foot field, which is
 | |
|         also used to hold the offset this chunk has within its mmapped
 | |
|         region, which is needed to preserve alignment. Each mmapped
 | |
|         chunk is trailed by the first two fields of a fake next-chunk
 | |
|         for sake of usage checks.
 | |
| 
 | |
|   ---------------------- Overlaid data structures -----------------------
 | |
| 
 | |
|   When chunks are not in use, they are treated as nodes of either
 | |
|   lists or trees.
 | |
| 
 | |
|   "Small"  chunks are stored in circular doubly-linked lists, and look
 | |
|   like this:
 | |
| 
 | |
|     chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Size of previous chunk                            |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|     `head:' |             Size of chunk, in bytes                         |P|
 | |
|       mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Forward pointer to next chunk in list             |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Back pointer to previous chunk in list            |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Unused space (may be 0 bytes long)                .
 | |
|             .                                                               .
 | |
|             .                                                               |
 | |
| nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|     `foot:' |             Size of chunk, in bytes                           |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
| 
 | |
|   Larger chunks are kept in a form of bitwise digital trees (aka
 | |
|   tries) keyed on chunksizes.  Because malloc_tree_chunks are only for
 | |
|   free chunks greater than 256 bytes, their size doesn't impose any
 | |
|   constraints on user chunk sizes.  Each node looks like:
 | |
| 
 | |
|     chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Size of previous chunk                            |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|     `head:' |             Size of chunk, in bytes                         |P|
 | |
|       mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Forward pointer to next chunk of same size        |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Back pointer to previous chunk of same size       |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Pointer to left child (child[0])                  |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Pointer to right child (child[1])                 |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Pointer to parent                                 |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             bin index of this chunk                           |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|             |             Unused space                                      .
 | |
|             .                                                               |
 | |
| nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
|     `foot:' |             Size of chunk, in bytes                           |
 | |
|             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | |
| 
 | |
|   Each tree holding treenodes is a tree of unique chunk sizes.  Chunks
 | |
|   of the same size are arranged in a circularly-linked list, with only
 | |
|   the oldest chunk (the next to be used, in our FIFO ordering)
 | |
|   actually in the tree.  (Tree members are distinguished by a non-null
 | |
|   parent pointer.)  If a chunk with the same size as an existing node
 | |
|   is inserted, it is linked off the existing node using pointers that
 | |
|   work in the same way as fd/bk pointers of small chunks.
 | |
| 
 | |
|   Each tree contains a power of 2 sized range of chunk sizes (the
 | |
|   smallest is 0x100 <= x < 0x180), which is divided in half at each
 | |
|   tree level, with the chunks in the smaller half of the range (0x100
 | |
|   <= x < 0x140 for the top node) in the left subtree and the larger
 | |
|   half (0x140 <= x < 0x180) in the right subtree.  This is, of course,
 | |
|   done by inspecting individual bits.
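A sketch of the bit inspection, assuming a tree shift of 8 (so the smallest tree covers 0x100 <= x < 0x180) and excluding the last catch-all bin. The function name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TREE_SHIFT 8u /* assumed: smallest tree size is 1 << 8 = 0x100 */

/* Which child (0 = left, 1 = right) a size descends into at `depth`
   levels below the root of tree bin `bin`. */
static unsigned child_slot(size_t size, unsigned bin, unsigned depth) {
  assert(bin < 31); /* the last bin holds "anything larger" */
  unsigned top_bit = (bin >> 1) + TREE_SHIFT - 2; /* first deciding bit */
  assert(depth <= top_bit);
  return (unsigned)((size >> (top_bit - depth)) & 1u);
}
```

For the smallest bin the first deciding bit is bit 6 (0x40), which splits the range at 0x140 exactly as described above.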
 | |
| 
 | |
|   Using these rules, each node's left subtree contains all smaller
 | |
|   sizes than its right subtree.  However, the node at the root of each
 | |
|   subtree has no particular ordering relationship to either.  (The
 | |
|   dividing line between the subtree sizes is based on trie relation.)
 | |
|   If we remove the last chunk of a given size from the interior of the
 | |
|   tree, we need to replace it with a leaf node.  The tree ordering
 | |
|   rules permit a node to be replaced by any leaf below it.
 | |
| 
 | |
|   The smallest chunk in a tree (a common operation in a best-fit
 | |
|   allocator) can be found by walking a path to the leftmost leaf in
 | |
|   the tree.  Unlike a usual binary tree, where we follow left child
 | |
|   pointers until we reach a null, here we follow the right child
 | |
|   pointer any time the left one is null, until we reach a leaf with
 | |
|   both child pointers null. The smallest chunk in the tree will be
 | |
|   somewhere along that path.
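The leftmost-path walk can be sketched as below. The `tnode` struct is a hypothetical, stripped-down tree node holding only a size and two child pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef struct tnode {
  size_t size;
  struct tnode *child[2]; /* child[0] = left, child[1] = right */
} tnode;

/* Follow the left child when present, else the right child, and remember
   the smallest size seen anywhere along that path. */
static tnode *smallest_in_tree(tnode *root) {
  assert(root != NULL);
  tnode *best = root;
  tnode *t = root;
  while ((t = t->child[0] ? t->child[0] : t->child[1]) != NULL) {
    if (t->size < best->size) best = t;
  }
  return best;
}
```

The running minimum is needed because a node has no particular ordering relationship to its own subtrees, so the smallest chunk may be an interior node rather than the final leaf.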
 | |
| 
 | |
|   The worst case number of steps to add, find, or remove a node is
 | |
|   bounded by the number of bits differentiating chunks within
 | |
|   bins. Under current bin calculations, this ranges from 6 up to 21
 | |
|   (for 32 bit sizes) or up to 53 (for 64 bit sizes). The typical case
 | |
|   is of course much better.
 | |
| 
 | |
|   ----------------------------- Segments --------------------------------
 | |
| 
 | |
|   Each malloc space may include non-contiguous segments, held in a
 | |
|   list headed by an embedded malloc_segment record representing the
 | |
|   top-most space. Segments also include flags holding properties of
 | |
|   the space. Large chunks that are directly allocated by mmap are not
 | |
|   included in this list. They are instead independently created and
 | |
|   destroyed without otherwise keeping track of them.
 | |
| 
 | |
|   Segment management mainly comes into play for spaces allocated by
 | |
|   MMAP.  Any call to MMAP might or might not return memory that is
 | |
|   adjacent to an existing segment.  MORECORE normally contiguously
 | |
|   extends the current space, so this space is almost always adjacent,
 | |
|   which is simpler and faster to deal with. (This is why MORECORE is
 | |
|   used preferentially to MMAP when both are available -- see
 | |
|   sys_alloc.)  When allocating using MMAP, we don't use any of the
 | |
|   hinting mechanisms (inconsistently) supported in various
 | |
|   implementations of unix mmap, or distinguish reserving from
 | |
|   committing memory. Instead, we just ask for space, and exploit
 | |
|   contiguity when we get it.  It is probably possible to do
 | |
|   better than this on some systems, but no general scheme seems
 | |
|   to be significantly better.
 | |
| 
 | |
|   Management entails a simpler variant of the consolidation scheme
 | |
|   used for chunks to reduce fragmentation -- new adjacent memory is
 | |
|   normally prepended or appended to an existing segment. However,
 | |
|   there are limitations compared to chunk consolidation that mostly
 | |
|   reflect the fact that segment processing is relatively infrequent
 | |
|   (occurring only when getting memory from system) and that we
 | |
|   don't expect to have huge numbers of segments:
 | |
| 
 | |
|   * Segments are not indexed, so traversal requires linear scans.  (It
 | |
|     would be possible to index these, but is not worth the extra
 | |
|     overhead and complexity for most programs on most platforms.)
 | |
|   * New segments are only appended to old ones when holding top-most
 | |
|     memory; if they cannot be prepended to others, they are held in
 | |
|     different segments.
 | |
| 
 | |
|   Except for the top-most segment of an mstate, each segment record
 | |
|   is kept at the tail of its segment. Segments are added by pushing
 | |
|   segment records onto the list headed by &mstate.seg for the
 | |
|   containing mstate.
 | |
| 
 | |
|   Segment flags control allocation/merge/deallocation policies:
 | |
|   * If EXTERN_BIT set, then we did not allocate this segment,
 | |
|     and so should not try to deallocate or merge with others.
 | |
|     (This currently holds only for the initial segment passed
 | |
|     into create_mspace_with_base.)
 | |
|   * If USE_MMAP_BIT set, the segment may be merged with
 | |
|     other surrounding mmapped segments and trimmed/de-allocated
 | |
|     using munmap.
 | |
|   * If neither bit is set, then the segment was obtained using
 | |
|     MORECORE so can be merged with surrounding MORECORE'd segments
 | |
|     and deallocated/trimmed using MORECORE with negative arguments.
 | |
| 
 | |
|   ---------------------------- malloc_state -----------------------------
 | |
| 
 | |
|    A malloc_state holds all of the bookkeeping for a space.
 | |
|    The main fields are:
 | |
| 
 | |
|   Top
 | |
|     The topmost chunk of the currently active segment. Its size is
 | |
|     cached in topsize.  The actual size of topmost space is
 | |
|     topsize+TOP_FOOT_SIZE, which includes space reserved for adding
 | |
|     fenceposts and segment records if necessary when getting more
 | |
|     space from the system.  The size at which to autotrim top is
 | |
|     cached from mparams in trim_check, except that it is disabled if
 | |
|     an autotrim fails.
 | |
| 
 | |
|   Designated victim (dv)
 | |
|     This is the preferred chunk for servicing small requests that
 | |
|     don't have exact fits.  It is normally the chunk split off most
 | |
|     recently to service another small request.  Its size is cached in
 | |
|     dvsize. The link fields of this chunk are not maintained since it
 | |
|     is not kept in a bin.
 | |
| 
 | |
|   SmallBins
 | |
|     An array of bin headers for free chunks.  These bins hold chunks
 | |
|     with sizes less than MIN_LARGE_SIZE bytes. Each bin contains
 | |
|     chunks of all the same size, spaced 8 bytes apart.  To simplify
 | |
|     use in double-linked lists, each bin header acts as a malloc_chunk
 | |
|     pointing to the real first node, if it exists (else pointing to
 | |
|     itself).  This avoids special-casing for headers.  But to avoid
 | |
|     waste, we allocate only the fd/bk pointers of bins, and then use
 | |
|     repositioning tricks to treat these as the fields of a chunk.
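Because small bins are spaced 8 bytes apart, mapping a chunk size to its bin is a single shift. In this sketch MIN_LARGE_SIZE is named in the text; its value of 256 and the helper names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define SMALLBIN_SHIFT 3u   /* bins are 8 bytes apart */
#define MIN_LARGE_SIZE 256u /* assumed boundary between small and tree bins */

static int is_small(size_t s) { return s < MIN_LARGE_SIZE; }

/* Bin index for a small chunk size. */
static unsigned small_index(size_t s) {
  assert(is_small(s));
  return (unsigned)(s >> SMALLBIN_SHIFT);
}

/* The single chunk size held by bin i. */
static size_t small_index_to_size(unsigned i) {
  return (size_t)i << SMALLBIN_SHIFT;
}
```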
 | |
| 
 | |
|   TreeBins
 | |
|     Treebins are pointers to the roots of trees holding a range of
 | |
|     sizes. There are 2 equally spaced treebins for each power of two
 | |
|     from TREE_SHIFT to TREE_SHIFT+16. The last bin holds anything
 | |
|     larger.
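
As a rough illustration of the spacing described above, the following sketch maps a chunk size to a treebin index. It assumes a shift of 8 (so trees start at 256 bytes) and 32 treebins in total; the SKETCH_ names and those two values are assumptions made for this example, and the loop stands in for whatever bit-scan instruction a tuned build might use.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define SKETCH_TREEBIN_SHIFT 8u
#define SKETCH_NTREEBINS     32u

/* Two bins per power of two: the bit just below the leading bit of the
   size selects the lower or upper half of each power-of-two range. */
unsigned sketch_tree_index(size_t size) {
  size_t x = size >> SKETCH_TREEBIN_SHIFT;
  unsigned k = 0;
  if (x == 0) return 0;                         /* below the tree range */
  if (x > 0xFFFF) return SKETCH_NTREEBINS - 1;  /* last bin holds the rest */
  while ((x >> (k + 1)) != 0) ++k;              /* k = floor(log2(x)) */
  return (k << 1) +
         (unsigned)((size >> (k + SKETCH_TREEBIN_SHIFT - 1)) & 1);
}
```

Under these assumptions, sizes 256 and 384 land in bins 0 and 1, sizes 512 and 768 in bins 2 and 3, and so on up to the final catch-all bin.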
 | |
| 
 | |
|   Bin maps
 | |
|     There is one bit map for small bins ("smallmap") and one for
 | |
|     treebins ("treemap").  Each bin sets its bit when non-empty, and
 | |
|     clears the bit when empty.  Bit operations are then used to avoid
 | |
|     bin-by-bin searching -- nearly all "search" is done without ever
 | |
|     looking at bins that won't be selected.  The bit maps
 | |
|     conservatively use 32 bits per map word, even on 64-bit systems.
 | |
|     For a good description of some of the bit-based techniques used
 | |
|     here, see Henry S. Warren Jr's book "Hacker's Delight" (and
 | |
|     supplement at http://hackersdelight.org/). Many of these are
 | |
|     intended to reduce the branchiness of paths through malloc etc, as
 | |
|     well as to reduce the number of memory locations read or written.
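
The bit-map idea can be sketched in a few lines. The example below is a minimal, self-contained illustration rather than the code in this file: all sketch_ names are hypothetical, and the final bit-to-index loop is a portable stand-in for an ffs-style instruction.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t binmap_t;  /* 32 bits per map word, as described above */

/* Isolate the lowest set bit of x (zero if x is zero). */
binmap_t sketch_least_bit(binmap_t x) { return x & (binmap_t)(0u - x); }

/* Mask of all bits at index i and above (i must be in 0..31). */
binmap_t sketch_bits_from(unsigned i) { return (binmap_t)(~(binmap_t)0 << i); }

void sketch_mark_bin(binmap_t *map, unsigned i)  { *map |= (binmap_t)1 << i; }
void sketch_clear_bin(binmap_t *map, unsigned i) { *map &= ~((binmap_t)1 << i); }

/* Index of the first non-empty bin at or above i, or -1 if there is none.
   No bin is inspected; only the map word is. */
int sketch_first_bin_at_or_above(binmap_t map, unsigned i) {
  binmap_t bit = sketch_least_bit(map & sketch_bits_from(i));
  int index = 0;
  if (bit == 0) return -1;
  while ((bit >>= 1) != 0) ++index;  /* portable bit-to-index conversion */
  return index;
}
```

A request that maps to bin i can then jump straight to the nearest usable bin, or learn from a single masked word that no such bin exists.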
 | |
| 
 | |
|   Segments
 | |
|     A list of segments headed by an embedded malloc_segment record
 | |
|     representing the initial space.
 | |
| 
 | |
|   Address check support
 | |
|     The least_addr field is the least address ever obtained from
 | |
|     MORECORE or MMAP. Attempted frees and reallocs of any address less
 | |
|     than this are trapped (unless INSECURE is defined).
 | |
| 
 | |
|   Magic tag
 | |
|     A cross-check field that should always hold the same value as mparams.magic.
 | |
| 
 | |
|   Max allowed footprint
 | |
|     The maximum allowed bytes to allocate from system (zero means no limit)
 | |
| 
 | |
|   Flags
 | |
|     Bits recording whether to use MMAP, locks, or contiguous MORECORE
 | |
| 
 | |
|   Statistics
 | |
|     Each space keeps track of current and maximum system memory
 | |
|     obtained via MORECORE or MMAP.
 | |
| 
 | |
|   Trim support
 | |
|     Fields holding the amount of unused topmost memory that should trigger
 | |
|     trimming, and a counter to force periodic scanning to release unused
 | |
|     non-topmost segments.
 | |
| 
 | |
|   Locking
 | |
|     If USE_LOCKS is defined, the "mutex" lock is acquired and released
 | |
|     around every public call using this mspace.
 | |
| 
 | |
|   Extension support
 | |
|     A void* pointer and a size_t field that can be used to help implement
 | |
|     extensions to this malloc.
 | |
| 
 | |
| ////////////////////////////////////////////////////////////////////////////////
 | |
| 
 | |
| * MSPACES
 | |
|   If MSPACES is defined, then in addition to malloc, free, etc.,
 | |
|   this file also defines mspace_malloc, mspace_free, etc. These
 | |
|   are versions of malloc routines that take an "mspace" argument
 | |
|   obtained using create_mspace, to control all internal bookkeeping.
 | |
|   If ONLY_MSPACES is defined, only these versions are compiled.
 | |
|   So if you would like to use this allocator for only some allocations,
 | |
|   and your system malloc for others, you can compile with
 | |
|   ONLY_MSPACES and then do something like...
 | |
|     static mspace mymspace = create_mspace(0,0); // for example
 | |
|     #define mymalloc(bytes)  mspace_malloc(mymspace, bytes)
 | |
| 
 | |
|   (Note: If you only need one instance of an mspace, you can instead
 | |
|   use "USE_DL_PREFIX" to relabel the global malloc.)
 | |
| 
 | |
|   You can similarly create thread-local allocators by storing
 | |
|   mspaces as thread-locals. For example:
 | |
|     static __thread mspace tlms = 0;
 | |
|     void*  tlmalloc(size_t bytes) {
 | |
|       if (tlms == 0) tlms = create_mspace(0, 0);
 | |
|       return mspace_malloc(tlms, bytes);
 | |
|     }
 | |
|     void  tlfree(void* mem) { mspace_free(tlms, mem); }
 | |
| 
 | |
|   Unless FOOTERS is defined, each mspace is completely independent.
 | |
|   You cannot allocate from one and free to another (although
 | |
|   conformance is only weakly checked, so usage errors are not always
 | |
|   caught). If FOOTERS is defined, then each chunk carries around a tag
 | |
|   indicating its originating mspace, and frees are directed to their
 | |
|   originating spaces. Normally, this requires use of locks.
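
One plausible way to carry such a tag, shown here only as a sketch, is to store the owning space's address combined with a per-process magic value, so that a corrupted or forged footer is unlikely to decode to a valid space. The struct and function names below are hypothetical, and the XOR encoding is an assumption of this example rather than something stated in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in; the real malloc_state is far larger. */
struct sketch_space { int id; };

/* Encode the owning space into a footer word, masked by the magic value. */
uintptr_t sketch_make_footer(const struct sketch_space *owner, uintptr_t magic) {
  return (uintptr_t)owner ^ magic;
}

/* Recover the owner at free time so the chunk goes back where it came from. */
struct sketch_space *sketch_owner_of(uintptr_t footer, uintptr_t magic) {
  return (struct sketch_space *)(footer ^ magic);
}
```

A free routine built this way would decode the footer first, check that the result looks like a live space (for example by comparing its magic field), and only then take that space's lock.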
 | |
| 
 | |
|  -------------------------  Compile-time options ---------------------------
 | |
| 
 | |
| Be careful in setting #define values for numerical constants of type
 | |
| size_t. On some systems, literal values are not automatically extended
 | |
| to size_t precision unless they are explicitly cast. You can also
 | |
| use the symbolic values MAX_SIZE_T, SIZE_T_ONE, etc below.
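
To make the warning concrete, here is a small sketch of the casting style it recommends. The SKETCH_ macros are illustrative names chosen for this example (not the definitions used by this file); the point is only that the cast comes before the arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Cast first, then shift or multiply, so arithmetic happens at size_t width. */
#define SKETCH_SIZE_T_ONE     ((size_t)1)
#define SKETCH_MAX_SIZE_T     (~(size_t)0)
#define SKETCH_TRIM_THRESHOLD ((size_t)2U * (size_t)1024U * (size_t)1024U)

/* The top bit of a size_t. Written as (1 << n) without the cast, the shift
   would be evaluated as int and would overflow for large n. */
size_t sketch_high_bit(void) {
  return SKETCH_SIZE_T_ONE << (sizeof(size_t) * CHAR_BIT - 1);
}
```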
 | |
| 
 | |
| WIN32                    default: defined if _WIN32 defined
 | |
|   Defining WIN32 sets up defaults for MS environment and compilers.
 | |
|   Otherwise defaults are for unix. Beware that there seem to be some
 | |
|   cases where this malloc might not be a pure drop-in replacement for
 | |
|   Win32 malloc: Random-looking failures from Win32 GDI APIs (e.g.,
 | |
|   SetDIBits()) may be due to bugs in some video driver implementations
 | |
|   when pixel buffers are malloc()ed, and the region spans more than
 | |
|   one VirtualAlloc()ed region. Because dlmalloc uses a small (64KB)
 | |
|   default granularity, pixel buffers may straddle virtual allocation
 | |
|   regions more often than when using the Microsoft allocator.  You can
 | |
|   avoid this by using VirtualAlloc() and VirtualFree() for all pixel
 | |
|   buffers rather than using malloc().  If this is not possible,
 | |
|   recompile this malloc with a larger DEFAULT_GRANULARITY. Note:
 | |
|   in cases where MSC and gcc (cygwin) are known to differ on WIN32,
 | |
|   conditions use _MSC_VER to distinguish them.
 | |
| 
 | |
| DLMALLOC_EXPORT       default: extern
 | |
|   Defines how public APIs are declared. If you want to export via a
 | |
|   Windows DLL, you might define this as
 | |
|     #define DLMALLOC_EXPORT extern  __declspec(dllexport)
 | |
|   If you want a POSIX ELF shared object, you might use
 | |
|     #define DLMALLOC_EXPORT extern __attribute__((visibility("default")))
 | |
| 
 | |
| MALLOC_ALIGNMENT         default: (size_t)(2 * sizeof(void *))
 | |
|   Controls the minimum alignment for malloc'ed chunks.  It must be a
 | |
|   power of two and at least 8, even on machines for which smaller
 | |
|   alignments would suffice. It may be defined as larger than this
 | |
|   though. Note however that code and data structures are optimized for
 | |
|   the case of 8-byte alignment.
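
The constraints above (a power of two, at least 8) are what make mask-based rounding work. The following sketch shows the usual idiom; the SKETCH_ and sketch_ names are made up for this example and the alignment value simply mirrors the default listed above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ALIGNMENT  ((size_t)(2 * sizeof(void *)))
#define SKETCH_ALIGN_MASK (SKETCH_ALIGNMENT - (size_t)1)

/* A usable alignment is a power of two and at least 8. */
int sketch_is_valid_alignment(size_t a) {
  return a >= 8 && (a & (a - 1)) == 0;
}

/* Round n up to the next multiple of the alignment. This relies on the
   alignment being a power of two; otherwise the mask trick is wrong. */
size_t sketch_align_up(size_t n) {
  return (n + SKETCH_ALIGN_MASK) & ~SKETCH_ALIGN_MASK;
}
```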
 | |
| 
 | |
| MSPACES                  default: 0 (false)
 | |
|   If true, compile in support for independent allocation spaces.
 | |
|   This is only supported if HAVE_MMAP is true.
 | |
| 
 | |
| ONLY_MSPACES             default: 0 (false)
 | |
|   If true, only compile in mspace versions, not regular versions.
 | |
| 
 | |
| USE_LOCKS                default: 0 (false)
 | |
|   Causes each call to each public routine to be surrounded with
 | |
|   pthread or WIN32 mutex lock/unlock. (If set true, this can be
 | |
|   overridden on a per-mspace basis for mspace versions.) If set to a
 | |
|   non-zero value other than 1, locks are used, but their
 | |
|   implementation is left out, so lock functions must be supplied manually,
 | |
|   as described below.
 | |
| 
 | |
| USE_SPIN_LOCKS           default: 1 iff USE_LOCKS and spin locks available
 | |
|   If true, uses custom spin locks for locking. This is currently
 | |
|   supported only on gcc >= 4.1, older gccs on x86 platforms, and recent
 | |
|   MS compilers.  Otherwise, posix locks or win32 critical sections are
 | |
|   used.
 | |
| 
 | |
| USE_RECURSIVE_LOCKS      default: not defined
 | |
|   If defined nonzero, uses recursive (aka reentrant) locks, otherwise
 | |
|   uses plain mutexes. This is not required for malloc proper, but may
 | |
|   be needed for layered allocators such as nedmalloc.
 | |
| 
 | |
| LOCK_AT_FORK            default: not defined
 | |
|   If defined nonzero, performs pthread_atfork upon initialization
 | |
|   to initialize child lock while holding parent lock. The implementation
 | |
|   assumes that pthread locks (not custom locks) are being used. In other
 | |
|   cases, you may need to customize the implementation.
 | |
| 
 | |
| FOOTERS                  default: 0
 | |
|   If true, provide extra checking and dispatching by placing
 | |
|   information in the footers of allocated chunks. This adds
 | |
|   space and time overhead.
 | |
| 
 | |
| INSECURE                 default: 0
 | |
|   If true, omit checks for usage errors and heap space overwrites.
 | |
| 
 | |
| USE_DL_PREFIX            default: NOT defined
 | |
|   Causes compiler to prefix all public routines with the string 'dl'.
 | |
|   This can be useful when you only want to use this malloc in one part
 | |
|   of a program, using your regular system malloc elsewhere.
 | |
| 
 | |
| MALLOC_INSPECT_ALL       default: NOT defined
 | |
|   If defined, compiles malloc_inspect_all and mspace_inspect_all, that
 | |
|   perform traversal of all heap space.  Unless access to these
 | |
|   functions is otherwise restricted, you probably do not want to
 | |
|   include them in secure implementations.
 | |
| 
 | |
| ABORT                    default: defined as abort()
 | |
|   Defines how to abort on failed checks.  On most systems, a failed
 | |
|   check cannot die with an "assert" or even print an informative
 | |
|   message, because the underlying print routines in turn call malloc,
 | |
|   which will fail again.  Generally, the best policy is to simply call
 | |
|   abort(). It's not very useful to do more than this because many
 | |
|   errors due to overwriting will show up as address faults (null, odd
 | |
|   addresses etc) rather than malloc-triggered checks, so will also
 | |
|   abort.  Also, most compilers know that abort() does not return, so
 | |
|   can better optimize code conditionally calling it.
 | |
| 
 | |
| PROCEED_ON_ERROR           default: defined as 0 (false)
 | |
|   Controls whether detected bad addresses cause them to be bypassed
 | |
|   rather than aborting. If set, detected bad arguments to free and
 | |
|   realloc are ignored. And all bookkeeping information is zeroed out
 | |
|   upon a detected overwrite of freed heap space, thus losing the
 | |
|   ability to ever return it from malloc again, but enabling the
 | |
|   application to proceed. If PROCEED_ON_ERROR is defined, the
 | |
|   static variable malloc_corruption_error_count is compiled in
 | |
|   and can be examined to see if errors have occurred. This option
 | |
|   generates slower code than the default abort policy.
 | |
| 
 | |
| DEBUG                    default: NOT defined
 | |
|   The DEBUG setting is mainly intended for people trying to modify
 | |
|   this code or diagnose problems when porting to new platforms.
 | |
|   However, it may also be able to better isolate user errors than just
 | |
|   using runtime checks.  The assertions in the check routines spell
 | |
|   out in more detail the assumptions and invariants underlying the
 | |
|   algorithms.  The checking is fairly extensive, and will slow down
 | |
|   execution noticeably. Calling malloc_stats or mallinfo with DEBUG
 | |
|   set will attempt to check every non-mmapped allocated and free chunk
 | |
|   in the course of computing the summaries.
 | |
| 
 | |
| ABORT_ON_ASSERT_FAILURE   default: defined as 1 (true)
 | |
|   Debugging assertion failures can be nearly impossible if your
 | |
|   version of the assert macro causes malloc to be called, which will
 | |
|   lead to a cascade of further failures, blowing the runtime stack.
 | |
|   ABORT_ON_ASSERT_FAILURE causes assertion failures to call abort(),
 | |
|   which will usually make debugging easier.
 | |
| 
 | |
| MALLOC_FAILURE_ACTION     default: sets errno to ENOMEM, or no-op on win32
 | |
|   The action to take before "return 0" when malloc is unable to
 | |
|   return memory because there is none available.
 | |
| 
 | |
| HAVE_MORECORE             default: 1 (true) unless win32 or ONLY_MSPACES
 | |
|   True if this system supports sbrk or an emulation of it.
 | |
| 
 | |
| MORECORE                  default: sbrk
 | |
|   The name of the sbrk-style system routine to call to obtain more
 | |
|   memory.  See below for guidance on writing custom MORECORE
 | |
|   functions. The type of the argument to sbrk/MORECORE varies across
 | |
|   systems.  It cannot be size_t, because it supports negative
 | |
|   arguments, so it is normally the signed type of the same width as
 | |
|   size_t (sometimes declared as "intptr_t").  It doesn't much matter
 | |
|   though. Internally, we only call it with arguments less than half
 | |
|   the max value of a size_t, which should work across all reasonable
 | |
|   possibilities, although sometimes generating compiler warnings.
 | |
| 
 | |
| MORECORE_CONTIGUOUS       default: 1 (true) if HAVE_MORECORE
 | |
|   If true, take advantage of fact that consecutive calls to MORECORE
 | |
|   with positive arguments always return contiguous increasing
 | |
|   addresses.  This is true of unix sbrk. It does not hurt too much to
 | |
|   set it true anyway, since malloc copes with non-contiguities.
 | |
|   Setting it false when definitely non-contiguous saves time
 | |
|   and possibly wasted space it would take to discover this though.
 | |
| 
 | |
| MORECORE_CANNOT_TRIM      default: NOT defined
 | |
|   True if MORECORE cannot release space back to the system when given
 | |
|   negative arguments. This is generally necessary only if you are
 | |
|   using a hand-crafted MORECORE function that cannot handle negative
 | |
|   arguments.
 | |
| 
 | |
| NO_SEGMENT_TRAVERSAL       default: 0
 | |
|   If non-zero, suppresses traversals of memory segments
 | |
|   returned by either MORECORE or CALL_MMAP. This disables
 | |
|   merging of segments that are contiguous, and selectively
 | |
|   releasing them to the OS if unused, but bounds execution times.
 | |
| 
 | |
| HAVE_MMAP                 default: 1 (true)
 | |
|   True if this system supports mmap or an emulation of it.  If so, and
 | |
|   HAVE_MORECORE is not true, MMAP is used for all system
 | |
|   allocation. If set and HAVE_MORECORE is true as well, MMAP is
 | |
|   primarily used to directly allocate very large blocks. It is also
 | |
|   used as a backup strategy in cases where MORECORE fails to provide
 | |
|   space from system. Note: A single call to MUNMAP is assumed to be
 | |
|   able to unmap memory that may have been allocated using multiple calls
 | |
|   to MMAP, so long as they are adjacent.
 | |
| 
 | |
| HAVE_MREMAP               default: 1 on linux, else 0
 | |
|   If true, realloc() uses mremap() to re-allocate large blocks and
 | |
|   extend or shrink allocation spaces.
 | |
| 
 | |
| MMAP_CLEARS               default: 1 except on WINCE.
 | |
|   True if mmap clears memory so calloc doesn't need to. This is true
 | |
|   for standard unix mmap using /dev/zero and on WIN32 except for WINCE.
 | |
| 
 | |
| USE_BUILTIN_FFS            default: 0 (i.e., not used)
 | |
|   Causes malloc to use the builtin ffs() function to compute indices.
 | |
|   Some compilers may recognize and intrinsify ffs to be faster than the
 | |
|   supplied C version. Also, the case of x86 using gcc is special-cased
 | |
|   to an asm instruction, so is already as fast as it can be, and so
 | |
|   this setting has no effect. Similarly for Win32 under recent MS compilers.
 | |
|   (On most x86s, the asm version is only slightly faster than the C version.)
 | |
| 
 | |
| malloc_getpagesize         default: derive from system includes, or 4096.
 | |
|   The system page size. To the extent possible, this malloc manages
 | |
|   memory from the system in page-size units.  This may be (and
 | |
|   usually is) a function rather than a constant. This is ignored
 | |
|   if WIN32, where page size is determined using GetSystemInfo during
 | |
|   initialization.
 | |
| 
 | |
| USE_DEV_RANDOM             default: 0 (i.e., not used)
 | |
|   Causes malloc to use /dev/random to initialize secure magic seed for
 | |
|   stamping footers. Otherwise, the current time is used.
 | |
| 
 | |
| NO_MALLINFO                default: 0
 | |
|   If defined, don't compile "mallinfo". This can be a simple way
 | |
|   of dealing with mismatches between system declarations and
 | |
|   those in this file.
 | |
| 
 | |
| MALLINFO_FIELD_TYPE        default: size_t
 | |
|   The type of the fields in the mallinfo struct. This was originally
 | |
|   defined as "int" in SVID etc, but is more usefully defined as
 | |
|   size_t. The value is used only if HAVE_USR_INCLUDE_MALLOC_H is not set.
 | |
| 
 | |
| NO_MALLOC_STATS            default: 0
 | |
|   If defined, don't compile "malloc_stats". This avoids calls to
 | |
|   fprintf and bringing in stdio dependencies you might not want.
 | |
| 
 | |
| REALLOC_ZERO_BYTES_FREES    default: not defined
 | |
|   This should be set if a call to realloc with zero bytes should
 | |
|   be the same as a call to free. Some people think it should. Otherwise,
 | |
|   since this malloc returns a unique pointer for malloc(0), so does
 | |
|   realloc(p, 0).
 | |
| 
 | |
| LACKS_UNISTD_H, LACKS_FCNTL_H, LACKS_SYS_PARAM_H, LACKS_SYS_MMAN_H
 | |
| LACKS_STRINGS_H, LACKS_STRING_H, LACKS_SYS_TYPES_H,  LACKS_ERRNO_H
 | |
| LACKS_STDLIB_H LACKS_SCHED_H LACKS_TIME_H  default: NOT defined unless on WIN32
 | |
|   Define these if your system does not have these header files.
 | |
|   You might need to manually insert some of the declarations they provide.
 | |
| 
 | |
| DEFAULT_GRANULARITY        default: page size if MORECORE_CONTIGUOUS,
 | |
|                                 system_info.dwAllocationGranularity in WIN32,
 | |
|                                 otherwise 64K.
 | |
|       Also settable using mallopt(M_GRANULARITY, x)
 | |
|   The unit for allocating and deallocating memory from the system.  On
 | |
|   most systems with contiguous MORECORE, there is no reason to
 | |
|   make this more than a page. However, systems with MMAP tend to
 | |
|   either require or encourage larger granularities.  You can increase
 | |
|   this value to keep system allocation functions from being called so
 | |
|   often, especially if they are slow.  The value must be at least one
 | |
|   page and must be a power of two.  Setting to 0 causes initialization
 | |
|   to either page size or win32 region size.  (Note: In previous
 | |
|   versions of malloc, the equivalent of this option was called
 | |
|   "TOP_PAD")
 | |
| 
 | |
| DEFAULT_TRIM_THRESHOLD    default: 2MB
 | |
|       Also settable using mallopt(M_TRIM_THRESHOLD, x)
 | |
|   The maximum amount of unused top-most memory to keep before
 | |
|   releasing via malloc_trim in free().  Automatic trimming is mainly
 | |
|   useful in long-lived programs using contiguous MORECORE.  Because
 | |
|   trimming via sbrk can be slow on some systems, and can sometimes be
 | |
|   wasteful (in cases where programs immediately afterward allocate
 | |
|   more large chunks) the value should be high enough so that your
 | |
|   overall system performance would improve by releasing this much
 | |
|   memory.  As a rough guide, you might set to a value close to the
 | |
|   average size of a process (program) running on your system.
 | |
|   Releasing this much memory would allow such a process to run in
 | |
|   memory.  Generally, it is worth tuning trim thresholds when a
 | |
|   program undergoes phases where several large chunks are allocated
 | |
|   and released in ways that can reuse each other's storage, perhaps
 | |
|   mixed with phases where there are no such chunks at all. The trim
 | |
|   value must be greater than page size to have any useful effect.  To
 | |
|   disable trimming completely, you can set to MAX_SIZE_T. Note that the trick
 | |
|   some people use of mallocing a huge space and then freeing it at
 | |
|   program startup, in an attempt to reserve system memory, doesn't
 | |
|   have the intended effect under automatic trimming, since that memory
 | |
|   will immediately be returned to the system.
 | |
| 
 | |
| DEFAULT_MMAP_THRESHOLD       default: 256K
 | |
|       Also settable using mallopt(M_MMAP_THRESHOLD, x)
 | |
|   The request size threshold for using MMAP to directly service a
 | |
|   request. Requests of at least this size that cannot be allocated
 | |
|   using already-existing space will be serviced via mmap.  (If enough
 | |
|   normal freed space already exists it is used instead.)  Using mmap
 | |
|   segregates relatively large chunks of memory so that they can be
 | |
|   individually obtained and released from the host system. A request
 | |
|   serviced through mmap is never reused by any other request (at least
 | |
|   not directly; the system may just so happen to remap successive
 | |
|   requests to the same locations).  Segregating space in this way has
 | |
|   the benefits that: Mmapped space can always be individually released
 | |
|   back to the system, which helps keep the system level memory demands
 | |
|   of a long-lived program low.  Also, mapped memory doesn't become
 | |
|   `locked' between other chunks, as can happen with normally allocated
 | |
|   chunks, which means that even trimming via malloc_trim would not
 | |
|   release them.  However, it has the disadvantage that the space
 | |
|   cannot be reclaimed, consolidated, and then used to service later
 | |
|   requests, as happens with normal chunks.  The advantages of mmap
 | |
|   nearly always outweigh disadvantages for "large" chunks, but the
 | |
|   value of "large" may vary across systems.  The default is an
 | |
|   empirically derived value that works well in most systems. You can
 | |
|   disable mmap by setting to MAX_SIZE_T.
 | |
| 
 | |
| MAX_RELEASE_CHECK_RATE   default: 4095 unless not HAVE_MMAP
 | |
|   The number of consolidated frees between checks to release
 | |
|   unused segments when freeing. When using non-contiguous segments,
 | |
|   especially with multiple mspaces, checking only for topmost space
 | |
|   doesn't always suffice to trigger trimming. To compensate for this,
 | |
|   free() will, with a period of MAX_RELEASE_CHECK_RATE (or the
 | |
|   current number of segments, if greater) try to release unused
 | |
|   segments to the OS when freeing chunks that result in
 | |
|   consolidation. The best value for this parameter is a compromise
 | |
|   between slowing down frees with relatively costly checks that
 | |
|   rarely trigger versus holding on to unused memory. To effectively
 | |
|   disable, set to MAX_SIZE_T. This may lead to a very slight speed
 | |
|   improvement at the expense of carrying around more memory.
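
The periodic check can be pictured as a simple countdown. The sketch below is an assumption-laden illustration, not the code in this file: the struct, its fields, and both function names are hypothetical, and the "scan" only counts how often it ran.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_RELEASE_CHECK_RATE ((size_t)4095)

struct sketch_state {
  size_t release_checks;  /* consolidating frees left before the next scan */
  size_t nsegs;           /* current number of segments */
  size_t scans;           /* how many scans have run (illustration only) */
};

/* Stand-in for the segment scan. Afterwards the countdown restarts at the
   check rate, or at the segment count if that is greater. */
void sketch_release_unused_segments(struct sketch_state *s) {
  s->scans++;
  s->release_checks = s->nsegs > SKETCH_MAX_RELEASE_CHECK_RATE
                          ? s->nsegs
                          : SKETCH_MAX_RELEASE_CHECK_RATE;
}

/* Called on each free that consolidates with a neighboring chunk.
   release_checks must start out nonzero. */
void sketch_on_consolidating_free(struct sketch_state *s) {
  if (--s->release_checks == 0) sketch_release_unused_segments(s);
}
```

Setting the rate very high simply means the countdown almost never reaches zero, which is why MAX_SIZE_T effectively disables the scan.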
 | |
| 
 | |
|   Guidelines for creating a custom version of MORECORE:
 | |
| 
 | |
|   * For best performance, MORECORE should allocate in multiples of pagesize.
 | |
|   * MORECORE may allocate more memory than requested. (Or even less,
 | |
|       but this will usually result in a malloc failure.)
 | |
|   * MORECORE must not allocate memory when given argument zero, but
 | |
|       instead return one past the end address of memory from previous
 | |
|       nonzero call.
 | |
|   * For best performance, consecutive calls to MORECORE with positive
 | |
|       arguments should return increasing addresses, indicating that
 | |
|       space has been contiguously extended.
 | |
|   * Even though consecutive calls to MORECORE need not return contiguous
 | |
|       addresses, it must be OK for malloc'ed chunks to span multiple
 | |
|       regions in those cases where they do happen to be contiguous.
 | |
|   * MORECORE need not handle negative arguments -- it may instead
 | |
|       just return MFAIL when given negative arguments.
 | |
|       Negative arguments are always multiples of pagesize. MORECORE
 | |
|       must not misinterpret negative args as large positive unsigned
 | |
|       args. You can suppress all such calls from even occurring by defining
 | |
|       MORECORE_CANNOT_TRIM.
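
A minimal MORECORE that follows the guidelines above might hand out memory from a fixed static arena. This is only a sketch under stated assumptions: sketch_morecore, SKETCH_ARENA_SIZE, and SKETCH_MFAIL are hypothetical names, the failure value is assumed to be an all-ones pointer, and the arena is not forced to page alignment as a production version probably should be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ARENA_SIZE ((size_t)1 << 20)   /* arbitrary 1 MiB arena */
#define SKETCH_MFAIL ((void *)(~(size_t)0))   /* assumed failure value */

static unsigned char sketch_arena[SKETCH_ARENA_SIZE];
static size_t sketch_used;                    /* bytes handed out so far */

void *sketch_morecore(intptr_t increment) {
  if (increment == 0)
    return sketch_arena + sketch_used;        /* one past the previous end */
  if (increment < 0)
    return SKETCH_MFAIL;                      /* trimming is not supported */
  if ((size_t)increment > SKETCH_ARENA_SIZE - sketch_used)
    return SKETCH_MFAIL;                      /* arena exhausted */
  void *start = sketch_arena + sketch_used;   /* contiguous with last call */
  sketch_used += (size_t)increment;
  return start;
}
```

Because the argument is a signed type, a negative request is seen as negative rather than as a huge positive size. A build using something like this would presumably also define MORECORE_CANNOT_TRIM, since the function never gives space back.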
 | |
| 
 | |
|   As an example alternative MORECORE, here is a custom allocator
 | |
|   kindly contributed for classic (pre-OS X) Mac OS.  It uses virtually but not
 | |
|   necessarily physically contiguous non-paged memory (locked in,
 | |
|   present and won't get swapped out).  You can use it by uncommenting
 | |
|   this section, adding some #includes, and setting up the appropriate
 | |
|   defines above:
 | |
| 
 | |
|       #define MORECORE osMoreCore
 | |
| 
 | |
|   There is also a shutdown routine that should somehow be called for
 | |
|   cleanup upon program exit.
 | |
| 
 | |
|   #define MAX_POOL_ENTRIES 100
 | |
|   #define MINIMUM_MORECORE_SIZE  (64 * 1024U)
 | |
|   static int next_os_pool;
 | |
|   void *our_os_pools[MAX_POOL_ENTRIES];
 | |
| 
 | |
|   void *osMoreCore(int size)
 | |
|   {
 | |
|     void *ptr = 0;
 | |
|     static void *sbrk_top = 0;
 | |
| 
 | |
|     if (size > 0)
 | |
|     {
 | |
|       if (size < MINIMUM_MORECORE_SIZE)
 | |
|          size = MINIMUM_MORECORE_SIZE;
 | |
|       if (CurrentExecutionLevel() == kTaskLevel)
 | |
|          ptr = PoolAllocateResident(size + RM_PAGE_SIZE, 0);
 | |
|       if (ptr == 0)
 | |
|       {
 | |
|         return (void *) MFAIL;
 | |
|       }
 | |
|       // save ptrs so they can be freed during cleanup
 | |
|       our_os_pools[next_os_pool] = ptr;
 | |
|       next_os_pool++;
 | |
|       ptr = (void *) ((((size_t) ptr) + RM_PAGE_MASK) & ~RM_PAGE_MASK);
 | |
|       sbrk_top = (char *) ptr + size;
 | |
|       return ptr;
 | |
|     }
 | |
|     else if (size < 0)
 | |
|     {
 | |
|       // we don't currently support shrink behavior
 | |
|       return (void *) MFAIL;
 | |
|     }
 | |
|     else
 | |
|     {
 | |
|       return sbrk_top;
 | |
|     }
 | |
|   }
 | |
| 
 | |
|   // cleanup any allocated memory pools
 | |
|   // called as last thing before shutting down driver
 | |
| 
 | |
|   void osCleanupMem(void)
 | |
|   {
 | |
|     void **ptr;
 | |
| 
 | |
|     for (ptr = our_os_pools; ptr < &our_os_pools[MAX_POOL_ENTRIES]; ptr++)
 | |
|       if (*ptr)
 | |
|       {
 | |
|          PoolDeallocate(*ptr);
 | |
|          *ptr = 0;
 | |
|       }
 | |
|   }
 | |
| 
 | |
| */
 | |
| 
 | |
| 
 | |
| /* -----------------------------------------------------------------------
 | |
| History:
 | |
|     v2.8.6 Wed Aug 29 06:57:58 2012  Doug Lea
 | |
|       * fix bad comparison in dlposix_memalign
 | |
|       * don't reuse adjusted asize in sys_alloc
 | |
|       * add LOCK_AT_FORK -- thanks to Kirill Artamonov for the suggestion
 | |
|       * reduce compiler warnings -- thanks to all who reported/suggested these
 | |
| 
 | |
|     v2.8.5 Sun May 22 10:26:02 2011  Doug Lea  (dl at gee)
 | |
|       * Always perform unlink checks unless INSECURE
 | |
|       * Add posix_memalign.
 | |
|       * Improve realloc to expand in more cases; expose realloc_in_place.
 | |
|         Thanks to Peter Buhr for the suggestion.
 | |
|       * Add footprint_limit, inspect_all, bulk_free. Thanks
 | |
|         to Barry Hayes and others for the suggestions.
 | |
|       * Internal refactorings to avoid calls while holding locks
 | |
|       * Use non-reentrant locks by default. Thanks to Roland McGrath
 | |
|         for the suggestion.
 | |
|       * Small fixes to mspace_destroy, reset_on_error.
 | |
|       * Various configuration extensions/changes. Thanks
 | |
|          to all who contributed these.
 | |
| 
 | |
|     V2.8.4a Thu Apr 28 14:39:43 2011 (dl at gee.cs.oswego.edu)
 | |
|       * Update Creative Commons URL
 | |
| 
 | |
|     V2.8.4 Wed May 27 09:56:23 2009  Doug Lea  (dl at gee)
 | |
|       * Use zeros instead of prev foot for is_mmapped
 | |
|       * Add mspace_track_large_chunks; thanks to Jean Brouwers
 | |
|       * Fix set_inuse in internal_realloc; thanks to Jean Brouwers
 | |
|       * Fix insufficient sys_alloc padding when using 16byte alignment
 | |
|       * Fix bad error check in mspace_footprint
 | |
|       * Adaptations for ptmalloc; thanks to Wolfram Gloger.
 | |
|       * Reentrant spin locks; thanks to Earl Chew and others
 | |
|       * Win32 improvements; thanks to Niall Douglas and Earl Chew
 | |
|       * Add NO_SEGMENT_TRAVERSAL and MAX_RELEASE_CHECK_RATE options
 | |
|       * Extension hook in malloc_state
 | |
|       * Various small adjustments to reduce warnings on some compilers
 | |
|       * Various configuration extensions/changes for more platforms. Thanks
 | |
|          to all who contributed these.
 | |
| 
 | |
|     V2.8.3 Thu Sep 22 11:16:32 2005  Doug Lea  (dl at gee)
 | |
|       * Add max_footprint functions
 | |
|       * Ensure all appropriate literals are size_t
 | |
|       * Fix conditional compilation problem for some #define settings
 | |
|       * Avoid concatenating segments with the one provided
 | |
|         in create_mspace_with_base
 | |
|       * Rename some variables to avoid compiler shadowing warnings
 | |
|       * Use explicit lock initialization.
 | |
|       * Better handling of sbrk interference.
 | |
|       * Simplify and fix segment insertion, trimming and mspace_destroy
 | |
|       * Reinstate REALLOC_ZERO_BYTES_FREES option from 2.7.x
 | |
|       * Thanks especially to Dennis Flanagan for help on these.
 | |
| 
 | |
|     V2.8.2 Sun Jun 12 16:01:10 2005  Doug Lea  (dl at gee)
 | |
|       * Fix memalign brace error.
 | |
| 
 | |
|     V2.8.1 Wed Jun  8 16:11:46 2005  Doug Lea  (dl at gee)
 | |
|       * Fix improper #endif nesting in C++
 | |
|       * Add explicit casts needed for C++
 | |
| 
 | |
|     V2.8.0 Mon May 30 14:09:02 2005  Doug Lea  (dl at gee)
 | |
|       * Use trees for large bins
 | |
|       * Support mspaces
 | |
|       * Use segments to unify sbrk-based and mmap-based system allocation,
 | |
|         removing need for emulation on most platforms without sbrk.
 | |
|       * Default safety checks
 | |
|       * Optional footer checks. Thanks to William Robertson for the idea.
 | |
|       * Internal code refactoring
 | |
|       * Incorporate suggestions and platform-specific changes.
 | |
|         Thanks to Dennis Flanagan, Colin Plumb, Niall Douglas,
 | |
|         Aaron Bachmann,  Emery Berger, and others.
 | |
|       * Speed up non-fastbin processing enough to remove fastbins.
 | |
|       * Remove useless cfree() to avoid conflicts with other apps.
 | |
|       * Remove internal memcpy, memset. Compilers handle builtins better.
 | |
|       * Remove some options that no one ever used and rename others.
 | |
| 
 | |
|     V2.7.2 Sat Aug 17 09:07:30 2002  Doug Lea  (dl at gee)
 | |
|       * Fix malloc_state bitmap array misdeclaration
 | |
| 
 | |
|     V2.7.1 Thu Jul 25 10:58:03 2002  Doug Lea  (dl at gee)
 | |
|       * Allow tuning of FIRST_SORTED_BIN_SIZE
 | |
|       * Use PTR_UINT as type for all ptr->int casts. Thanks to John Belmonte.
 | |
|       * Better detection and support for non-contiguousness of MORECORE.
 | |
|         Thanks to Andreas Mueller, Conal Walsh, and Wolfram Gloger
 | |
|       * Bypass most of malloc if no frees. Thanks to Emery Berger.
 | |
|       * Fix freeing of old top non-contiguous chunk in sysmalloc.
 | |
|       * Raised default trim and map thresholds to 256K.
 | |
|       * Fix mmap-related #defines. Thanks to Lubos Lunak.
 | |
|       * Fix copy macros; added LACKS_FCNTL_H. Thanks to Neal Walfield.
 | |
|       * Branch-free bin calculation
 | |
|       * Default trim and mmap thresholds now 256K.
 | |
| 
 | |
|     V2.7.0 Sun Mar 11 14:14:06 2001  Doug Lea  (dl at gee)
 | |
|       * Introduce independent_comalloc and independent_calloc.
 | |
|         Thanks to Michael Pachos for motivation and help.
 | |
|       * Make optional .h file available
 | |
|       * Allow > 2GB requests on 32bit systems.
 | |
|       * new WIN32 sbrk, mmap, munmap, lock code from <Walter@GeNeSys-e.de>.
 | |
|         Thanks also to Andreas Mueller <a.mueller at paradatec.de>,
 | |
|         and Anonymous.
 | |
|       * Allow override of MALLOC_ALIGNMENT (Thanks to Ruud Waij for
 | |
|         helping test this.)
 | |
|       * memalign: check alignment arg
 | |
|       * realloc: don't try to shift chunks backwards, since this
 | |
|         leads to more fragmentation in some programs and doesn't
 | |
|         seem to help in any others.
 | |
|       * Collect all cases in malloc requiring system memory into sysmalloc
 | |
|       * Use mmap as backup to sbrk
 | |
|       * Place all internal state in malloc_state
 | |
|       * Introduce fastbins (although similar to 2.5.1)
 | |
|       * Many minor tunings and cosmetic improvements
 | |
|       * Introduce USE_PUBLIC_MALLOC_WRAPPERS, USE_MALLOC_LOCK
 | |
|       * Introduce MALLOC_FAILURE_ACTION, MORECORE_CONTIGUOUS
 | |
|         Thanks to Tony E. Bennett <tbennett@nvidia.com> and others.
 | |
|       * Include errno.h to support default failure action.
 | |
| 
 | |
|     V2.6.6 Sun Dec  5 07:42:19 1999  Doug Lea  (dl at gee)
 | |
|       * return null for negative arguments
 | |
|       * Added several WIN32 cleanups from Martin C. Fong <mcfong at yahoo.com>
 | |
|          * Add 'LACKS_SYS_PARAM_H' for those systems without 'sys/param.h'
 | |
|           (e.g. WIN32 platforms)
 | |
|          * Cleanup header file inclusion for WIN32 platforms
 | |
|          * Cleanup code to avoid Microsoft Visual C++ compiler complaints
 | |
|          * Add 'USE_DL_PREFIX' to quickly allow co-existence with existing
 | |
|            memory allocation routines
 | |
|          * Set 'malloc_getpagesize' for WIN32 platforms (needs more work)
 | |
|          * Use 'assert' rather than 'ASSERT' in WIN32 code to conform to
 | |
|            usage of 'assert' in non-WIN32 code
 | |
|          * Improve WIN32 'sbrk()' emulation's 'findRegion()' routine to
 | |
|            avoid infinite loop
 | |
|       * Always call 'fREe()' rather than 'free()'
 | |
| 
 | |
|     V2.6.5 Wed Jun 17 15:57:31 1998  Doug Lea  (dl at gee)
 | |
|       * Fixed ordering problem with boundary-stamping
 | |
| 
 | |
|     V2.6.3 Sun May 19 08:17:58 1996  Doug Lea  (dl at gee)
 | |
|       * Added pvalloc, as recommended by H.J. Lu
 | |
|       * Added 64bit pointer support mainly from Wolfram Gloger
 | |
|       * Added anonymously donated WIN32 sbrk emulation
 | |
|       * Malloc, calloc, getpagesize: add optimizations from Raymond Nijssen
 | |
|       * malloc_extend_top: fix mask error that caused wastage after
 | |
|         foreign sbrks
 | |
|       * Add linux mremap support code from H.J. Lu
 | |
| 
 | |
|     V2.6.2 Tue Dec  5 06:52:55 1995  Doug Lea  (dl at gee)
 | |
|       * Integrated most documentation with the code.
 | |
|       * Add support for mmap, with help from
 | |
|         Wolfram Gloger (Gloger@lrz.uni-muenchen.de).
 | |
|       * Use last_remainder in more cases.
 | |
|       * Pack bins using idea from colin@nyx10.cs.du.edu
 | |
|       * Use ordered bins instead of best-fit threshold
 | |
|       * Eliminate block-local decls to simplify tracing and debugging.
 | |
|       * Support another case of realloc via move into top
 | |
|       * Fix error occurring when initial sbrk_base not word-aligned.
 | |
|       * Rely on page size for units instead of SBRK_UNIT to
 | |
|         avoid surprises about sbrk alignment conventions.
 | |
|       * Add mallinfo, mallopt. Thanks to Raymond Nijssen
 | |
|         (raymond@es.ele.tue.nl) for the suggestion.
 | |
|       * Add `pad' argument to malloc_trim and top_pad mallopt parameter.
 | |
|       * More precautions for cases where other routines call sbrk,
 | |
|         courtesy of Wolfram Gloger (Gloger@lrz.uni-muenchen.de).
 | |
|       * Added macros etc., allowing use in linux libc from
 | |
|         H.J. Lu (hjl@gnu.ai.mit.edu)
 | |
|       * Inverted this history list
 | |
| 
 | |
|     V2.6.1 Sat Dec  2 14:10:57 1995  Doug Lea  (dl at gee)
 | |
|       * Re-tuned and fixed to behave more nicely with V2.6.0 changes.
 | |
|       * Removed all preallocation code since under current scheme
 | |
|         the work required to undo bad preallocations exceeds
 | |
|         the work saved in good cases for most test programs.
 | |
|       * No longer use return list or unconsolidated bins since
 | |
|         no scheme using them consistently outperforms those that don't
 | |
|         given above changes.
 | |
|       * Use best fit for very large chunks to prevent some worst-cases.
 | |
|       * Added some support for debugging
 | |
| 
 | |
|     V2.6.0 Sat Nov  4 07:05:23 1995  Doug Lea  (dl at gee)
 | |
|       * Removed footers when chunks are in use. Thanks to
 | |
|         Paul Wilson (wilson@cs.texas.edu) for the suggestion.
 | |
| 
 | |
|     V2.5.4 Wed Nov  1 07:54:51 1995  Doug Lea  (dl at gee)
 | |
|       * Added malloc_trim, with help from Wolfram Gloger
 | |
|         (wmglo@Dent.MED.Uni-Muenchen.DE).
 | |
| 
 | |
|     V2.5.3 Tue Apr 26 10:16:01 1994  Doug Lea  (dl at g)
 | |
| 
 | |
|     V2.5.2 Tue Apr  5 16:20:40 1994  Doug Lea  (dl at g)
 | |
|       * realloc: try to expand in both directions
 | |
|       * malloc: swap order of clean-bin strategy;
 | |
|       * realloc: only conditionally expand backwards
 | |
|       * Try not to scavenge used bins
 | |
|       * Use bin counts as a guide to preallocation
 | |
|       * Occasionally bin return list chunks in first scan
 | |
|       * Add a few optimizations from colin@nyx10.cs.du.edu
 | |
| 
 | |
|     V2.5.1 Sat Aug 14 15:40:43 1993  Doug Lea  (dl at g)
 | |
|       * faster bin computation & slightly different binning
 | |
|       * merged all consolidations to one part of malloc proper
 | |
|          (eliminating old malloc_find_space & malloc_clean_bin)
 | |
|       * Scan 2 returns chunks (not just 1)
 | |
|       * Propagate failure in realloc if malloc returns 0
 | |
|       * Add stuff to allow compilation on non-ANSI compilers
 | |
|           from kpv@research.att.com
 | |
| 
 | |
|     V2.5 Sat Aug  7 07:41:59 1993  Doug Lea  (dl at g.oswego.edu)
 | |
|       * removed potential for odd address access in prev_chunk
 | |
|       * removed dependency on getpagesize.h
 | |
|       * misc cosmetics and a bit more internal documentation
 | |
|       * anticosmetics: mangled names in macros to evade debugger strangeness
 | |
|       * tested on sparc, hp-700, dec-mips, rs6000
 | |
|           with gcc & native cc (hp, dec only) allowing
 | |
|           Detlefs & Zorn comparison study (in SIGPLAN Notices.)
 | |
| 
 | |
|     Trial version Fri Aug 28 13:14:29 1992  Doug Lea  (dl at g.oswego.edu)
 | |
|       * Based loosely on libg++-1.2X malloc. (It retains some of the overall
 | |
|          structure of old version, but most details differ.)
 |