2024-06-03 16:09:33 +00:00
|
|
|
// -*-mode:c++;indent-tabs-mode:nil;c-basic-offset:4;tab-width:8;coding:utf-8-*-
|
|
|
|
// vi: set et ft=cpp ts=4 sts=4 sw=4 fenc=utf-8 :vi
|
2024-06-23 17:08:48 +00:00
|
|
|
#ifndef CTL_STRING_H_
|
|
|
|
#define CTL_STRING_H_
|
Introduce more CTL content
This change introduces accumulate, addressof, advance, all_of, distance,
array, enable_if, allocator_traits, back_inserter, bad_alloc, is_signed,
any_of, copy, exception, fill, fill_n, is_same, is_same_v, out_of_range,
lexicographical_compare, is_integral, uninitialized_fill_n, is_unsigned,
numeric_limits, uninitialized_fill, iterator_traits, move_backward, min,
max, iterator_tag, move_iterator, reverse_iterator, uninitialized_move_n
This change experiments with rewriting the ctl::vector class to make the
CTL design more similar to the STL. So far it has not slowed things down
to have 42 #include lines rather than 2, since it's still almost nothing
compared to LLVM's code. In fact the closer we can flirt with being just
like libcxx, the better chance we might have of discovering exactly what
makes it so slow to compile. It would be an enormous discovery if we can
find one simple trick to solving the issue there instead.
This also fixes a bug in `ctl::string(const string &s)` when `s` is big.
2024-06-28 05:18:55 +00:00
|
|
|
#include "reverse_iterator.h"
|
2024-06-04 12:41:53 +00:00
|
|
|
#include "string_view.h"
|
2024-06-03 16:09:33 +00:00
|
|
|
|
2024-06-04 12:41:53 +00:00
|
|
|
namespace ctl {
|
2024-06-03 16:09:33 +00:00
|
|
|
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
class string;
|
2024-06-03 16:09:33 +00:00
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
ctl::string strcat(ctl::string_view, ctl::string_view) noexcept __wur;
|
2024-06-03 16:09:33 +00:00
|
|
|
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
namespace __ {
|
|
|
|
|
|
|
|
constexpr size_t string_size = 3 * sizeof(size_t);
|
|
|
|
constexpr size_t sso_max = string_size - 1;
|
|
|
|
constexpr size_t big_mask = ~(1ull << (8ull * sizeof(size_t) - 1ull));
|
|
|
|
|
|
|
|
struct small_string
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
char buf[sso_max];
|
|
|
|
// interpretation is: size == sso_max - rem
|
|
|
|
unsigned char rem;
|
|
|
|
#if 0
|
|
|
|
size_t rem : 7;
|
|
|
|
size_t big : 1 /* = 0 */;
|
|
|
|
#endif
|
|
|
|
};
|
|
|
|
|
|
|
|
struct big_string
|
|
|
|
{
|
|
|
|
char* p;
|
|
|
|
size_t n;
|
|
|
|
// interpretation is: capacity == c & big_mask
|
|
|
|
size_t c;
|
|
|
|
#if 0
|
|
|
|
size_t c : sizeof(size_t) * 8 - 1;
|
|
|
|
size_t big : 1 /* = 1 */;
|
|
|
|
#endif
|
|
|
|
};
|
|
|
|
|
|
|
|
} // namespace __
|
2024-06-03 16:09:33 +00:00
|
|
|
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
class string
|
|
|
|
{
|
|
|
|
public:
|
Introduce more CTL content
This change introduces accumulate, addressof, advance, all_of, distance,
array, enable_if, allocator_traits, back_inserter, bad_alloc, is_signed,
any_of, copy, exception, fill, fill_n, is_same, is_same_v, out_of_range,
lexicographical_compare, is_integral, uninitialized_fill_n, is_unsigned,
numeric_limits, uninitialized_fill, iterator_traits, move_backward, min,
max, iterator_tag, move_iterator, reverse_iterator, uninitialized_move_n
This change experiments with rewriting the ctl::vector class to make the
CTL design more similar to the STL. So far it has not slowed things down
to have 42 #include lines rather than 2, since it's still almost nothing
compared to LLVM's code. In fact the closer we can flirt with being just
like libcxx, the better chance we might have of discovering exactly what
makes it so slow to compile. It would be an enormous discovery if we can
find one simple trick to solving the issue there instead.
This also fixes a bug in `ctl::string(const string &s)` when `s` is big.
2024-06-28 05:18:55 +00:00
|
|
|
using value_type = char;
|
|
|
|
using size_type = size_t;
|
|
|
|
using difference_type = ptrdiff_t;
|
|
|
|
using pointer = value_type*;
|
|
|
|
using const_pointer = const value_type*;
|
|
|
|
using reference = value_type&;
|
|
|
|
using const_reference = const value_type&;
|
|
|
|
using iterator = pointer;
|
|
|
|
using const_iterator = const_pointer;
|
|
|
|
using reverse_iterator = ctl::reverse_iterator<iterator>;
|
|
|
|
using const_reverse_iterator = ctl::reverse_iterator<const_iterator>;
|
|
|
|
|
2024-06-03 16:09:33 +00:00
|
|
|
static constexpr size_t npos = -1;
|
|
|
|
|
2024-06-20 18:52:12 +00:00
|
|
|
string() noexcept
|
|
|
|
{
|
|
|
|
__builtin_memset(blob, 0, sizeof(size_t) * 2);
|
|
|
|
// equivalent to set_small_size(0) but also zeroes memory
|
|
|
|
*(((size_t*)blob) + 2) = __::sso_max << (sizeof(size_t) - 1) * 8;
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
string(const ctl::string_view s) noexcept
|
2024-06-20 18:52:12 +00:00
|
|
|
{
|
|
|
|
if (s.n <= __::sso_max) {
|
|
|
|
__builtin_memcpy(blob, s.p, s.n);
|
|
|
|
__builtin_memset(blob + s.n, 0, __::sso_max - s.n);
|
|
|
|
set_small_size(s.n);
|
|
|
|
} else {
|
|
|
|
init_big(s);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
explicit string(const size_t n, const char ch = 0) noexcept
|
|
|
|
{
|
|
|
|
if (n <= __::sso_max) {
|
|
|
|
__builtin_memset(blob, ch, n);
|
|
|
|
__builtin_memset(blob + n, 0, __::sso_max - n);
|
|
|
|
set_small_size(n);
|
|
|
|
} else {
|
|
|
|
init_big(n, ch);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
string(const char* const p) noexcept
|
2024-07-01 10:48:28 +00:00
|
|
|
: string(ctl::string_view(p, __builtin_strlen(p)))
|
2024-06-20 18:52:12 +00:00
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
string(const string& r) noexcept
|
|
|
|
{
|
|
|
|
if (r.size() <= __::sso_max) {
|
|
|
|
__builtin_memcpy(blob, r.data(), __::string_size);
|
|
|
|
set_small_size(r.size());
|
|
|
|
} else {
|
|
|
|
init_big(r);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
string(const char* const p, const size_t n) noexcept
|
2024-07-01 10:48:28 +00:00
|
|
|
: string(ctl::string_view(p, n))
|
2024-06-20 18:52:12 +00:00
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
~string() /* noexcept */
|
|
|
|
{
|
|
|
|
if (isbig())
|
|
|
|
destroy_big();
|
|
|
|
}
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
|
|
|
|
string& operator=(string) noexcept;
|
2024-06-03 16:09:33 +00:00
|
|
|
const char* c_str() const noexcept;
|
|
|
|
|
|
|
|
void pop_back() noexcept;
|
|
|
|
void grow(size_t) noexcept;
|
|
|
|
void reserve(size_t) noexcept;
|
|
|
|
void resize(size_t, char = 0) noexcept;
|
|
|
|
void append(char) noexcept;
|
|
|
|
void append(char, size_t) noexcept;
|
|
|
|
void append(unsigned long) noexcept;
|
|
|
|
void append(const void*, size_t) noexcept;
|
2024-07-01 10:48:28 +00:00
|
|
|
string& insert(size_t, ctl::string_view) noexcept;
|
2024-06-04 12:41:53 +00:00
|
|
|
string& erase(size_t = 0, size_t = npos) noexcept;
|
|
|
|
string substr(size_t = 0, size_t = npos) const noexcept;
|
2024-07-01 10:48:28 +00:00
|
|
|
string& replace(size_t, size_t, ctl::string_view) noexcept;
|
|
|
|
bool operator==(ctl::string_view) const noexcept;
|
|
|
|
bool operator!=(ctl::string_view) const noexcept;
|
|
|
|
bool contains(ctl::string_view) const noexcept;
|
|
|
|
bool ends_with(ctl::string_view) const noexcept;
|
|
|
|
bool starts_with(ctl::string_view) const noexcept;
|
2024-06-03 16:09:33 +00:00
|
|
|
size_t find(char, size_t = 0) const noexcept;
|
2024-07-01 10:48:28 +00:00
|
|
|
size_t find(ctl::string_view, size_t = 0) const noexcept;
|
2024-06-03 16:09:33 +00:00
|
|
|
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
void swap(string& s) noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
2024-06-19 05:00:59 +00:00
|
|
|
ctl::swap(blob, s.blob);
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
string(string&& s) noexcept
|
|
|
|
{
|
2024-06-15 17:45:52 +00:00
|
|
|
__builtin_memcpy(blob, s.blob, __::string_size);
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
s.set_small_size(0);
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void clear() noexcept
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
if (isbig()) {
|
2024-06-29 02:07:35 +00:00
|
|
|
__b.n = 0;
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
} else {
|
|
|
|
set_small_size(0);
|
|
|
|
}
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
bool empty() const noexcept
|
|
|
|
{
|
2024-06-29 02:07:35 +00:00
|
|
|
return isbig() ? !__b.n : __s.rem >= __::sso_max;
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
}
|
|
|
|
|
2024-06-29 02:07:35 +00:00
|
|
|
__attribute__((__always_inline__)) char* data() noexcept
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
{
|
2024-06-29 02:07:35 +00:00
|
|
|
return isbig() ? __b.p : __s.buf;
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
2024-06-29 02:07:35 +00:00
|
|
|
__attribute__((__always_inline__)) const char* data() const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
2024-06-29 02:07:35 +00:00
|
|
|
return isbig() ? __b.p : __s.buf;
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
2024-06-29 02:07:35 +00:00
|
|
|
__attribute__((__always_inline__)) size_t size() const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
#if 0
|
2024-06-29 02:07:35 +00:00
|
|
|
if (!isbig() && __s.rem > __::sso_max)
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
__builtin_trap();
|
|
|
|
#endif
|
2024-06-29 02:07:35 +00:00
|
|
|
return isbig() ? __b.n : __::sso_max - __s.rem;
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
size_t length() const noexcept
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return size();
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
size_t capacity() const noexcept
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
#if 0
|
2024-06-29 02:07:35 +00:00
|
|
|
if (isbig() && __b.c <= __::sso_max)
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
__builtin_trap();
|
|
|
|
#endif
|
2024-06-29 02:07:35 +00:00
|
|
|
return isbig() ? __::big_mask & __b.c : __::string_size;
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
iterator begin() noexcept
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data();
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
Introduce more CTL content
This change introduces accumulate, addressof, advance, all_of, distance,
array, enable_if, allocator_traits, back_inserter, bad_alloc, is_signed,
any_of, copy, exception, fill, fill_n, is_same, is_same_v, out_of_range,
lexicographical_compare, is_integral, uninitialized_fill_n, is_unsigned,
numeric_limits, uninitialized_fill, iterator_traits, move_backward, min,
max, iterator_tag, move_iterator, reverse_iterator, uninitialized_move_n
This change experiments with rewriting the ctl::vector class to make the
CTL design more similar to the STL. So far it has not slowed things down
to have 42 #include lines rather than 2, since it's still almost nothing
compared to LLVM's code. In fact the closer we can flirt with being just
like libcxx, the better chance we might have of discovering exactly what
makes it so slow to compile. It would be an enormous discovery if we can
find one simple trick to solving the issue there instead.
This also fixes a bug in `ctl::string(const string &s)` when `s` is big.
2024-06-28 05:18:55 +00:00
|
|
|
const_iterator begin() const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
Introduce more CTL content
This change introduces accumulate, addressof, advance, all_of, distance,
array, enable_if, allocator_traits, back_inserter, bad_alloc, is_signed,
any_of, copy, exception, fill, fill_n, is_same, is_same_v, out_of_range,
lexicographical_compare, is_integral, uninitialized_fill_n, is_unsigned,
numeric_limits, uninitialized_fill, iterator_traits, move_backward, min,
max, iterator_tag, move_iterator, reverse_iterator, uninitialized_move_n
This change experiments with rewriting the ctl::vector class to make the
CTL design more similar to the STL. So far it has not slowed things down
to have 42 #include lines rather than 2, since it's still almost nothing
compared to LLVM's code. In fact the closer we can flirt with being just
like libcxx, the better chance we might have of discovering exactly what
makes it so slow to compile. It would be an enormous discovery if we can
find one simple trick to solving the issue there instead.
This also fixes a bug in `ctl::string(const string &s)` when `s` is big.
2024-06-28 05:18:55 +00:00
|
|
|
return data();
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
const_iterator cbegin() const noexcept
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data();
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
Introduce more CTL content
This change introduces accumulate, addressof, advance, all_of, distance,
array, enable_if, allocator_traits, back_inserter, bad_alloc, is_signed,
any_of, copy, exception, fill, fill_n, is_same, is_same_v, out_of_range,
lexicographical_compare, is_integral, uninitialized_fill_n, is_unsigned,
numeric_limits, uninitialized_fill, iterator_traits, move_backward, min,
max, iterator_tag, move_iterator, reverse_iterator, uninitialized_move_n
This change experiments with rewriting the ctl::vector class to make the
CTL design more similar to the STL. So far it has not slowed things down
to have 42 #include lines rather than 2, since it's still almost nothing
compared to LLVM's code. In fact the closer we can flirt with being just
like libcxx, the better chance we might have of discovering exactly what
makes it so slow to compile. It would be an enormous discovery if we can
find one simple trick to solving the issue there instead.
This also fixes a bug in `ctl::string(const string &s)` when `s` is big.
2024-06-28 05:18:55 +00:00
|
|
|
reverse_iterator rbegin() noexcept
|
|
|
|
{
|
|
|
|
return reverse_iterator(end());
|
|
|
|
}
|
|
|
|
|
|
|
|
const_reverse_iterator rbegin() const noexcept
|
|
|
|
{
|
|
|
|
return const_reverse_iterator(end());
|
|
|
|
}
|
|
|
|
|
|
|
|
const_reverse_iterator crbegin() const noexcept
|
|
|
|
{
|
|
|
|
return const_reverse_iterator(end());
|
|
|
|
}
|
|
|
|
|
|
|
|
iterator end() noexcept
|
|
|
|
{
|
|
|
|
return data() + size();
|
|
|
|
}
|
|
|
|
|
|
|
|
const_iterator end() const noexcept
|
|
|
|
{
|
|
|
|
return data() + size();
|
|
|
|
}
|
|
|
|
|
2024-06-03 16:09:33 +00:00
|
|
|
const_iterator cend() const noexcept
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data() + size();
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
Introduce more CTL content
This change introduces accumulate, addressof, advance, all_of, distance,
array, enable_if, allocator_traits, back_inserter, bad_alloc, is_signed,
any_of, copy, exception, fill, fill_n, is_same, is_same_v, out_of_range,
lexicographical_compare, is_integral, uninitialized_fill_n, is_unsigned,
numeric_limits, uninitialized_fill, iterator_traits, move_backward, min,
max, iterator_tag, move_iterator, reverse_iterator, uninitialized_move_n
This change experiments with rewriting the ctl::vector class to make the
CTL design more similar to the STL. So far it has not slowed things down
to have 42 #include lines rather than 2, since it's still almost nothing
compared to LLVM's code. In fact the closer we can flirt with being just
like libcxx, the better chance we might have of discovering exactly what
makes it so slow to compile. It would be an enormous discovery if we can
find one simple trick to solving the issue there instead.
This also fixes a bug in `ctl::string(const string &s)` when `s` is big.
2024-06-28 05:18:55 +00:00
|
|
|
reverse_iterator rend() noexcept
|
|
|
|
{
|
|
|
|
return reverse_iterator(begin());
|
|
|
|
}
|
|
|
|
|
|
|
|
const_reverse_iterator rend() const noexcept
|
|
|
|
{
|
|
|
|
return const_reverse_iterator(cbegin());
|
|
|
|
}
|
|
|
|
|
|
|
|
const_reverse_iterator crend() const noexcept
|
|
|
|
{
|
|
|
|
return const_reverse_iterator(cbegin());
|
|
|
|
}
|
|
|
|
|
2024-06-03 16:09:33 +00:00
|
|
|
char& front()
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
if (!size())
|
2024-06-03 16:09:33 +00:00
|
|
|
__builtin_trap();
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data()[0];
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
const char& front() const
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
if (!size())
|
2024-06-03 16:09:33 +00:00
|
|
|
__builtin_trap();
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data()[0];
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
char& back()
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
if (!size())
|
2024-06-03 16:09:33 +00:00
|
|
|
__builtin_trap();
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data()[size() - 1];
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
const char& back() const
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
if (!size())
|
2024-06-03 16:09:33 +00:00
|
|
|
__builtin_trap();
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data()[size() - 1];
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
char& operator[](size_t i) noexcept
|
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
if (i >= size())
|
2024-06-03 16:09:33 +00:00
|
|
|
__builtin_trap();
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data()[i];
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
2024-06-16 01:09:05 +00:00
|
|
|
const char& operator[](const size_t i) const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
if (i >= size())
|
2024-06-03 16:09:33 +00:00
|
|
|
__builtin_trap();
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
return data()[i];
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
2024-06-16 01:09:05 +00:00
|
|
|
void push_back(const char ch) noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
append(ch);
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
void append(const ctl::string_view s) noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
append(s.p, s.n);
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
operator ctl::string_view() const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
2024-07-01 10:48:28 +00:00
|
|
|
return ctl::string_view(data(), size());
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
|
|
|
|
2024-06-04 12:41:53 +00:00
|
|
|
string& operator=(const char* s) noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
clear();
|
|
|
|
append(s);
|
|
|
|
return *this;
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
string& operator=(const ctl::string_view s) noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
clear();
|
|
|
|
append(s);
|
|
|
|
return *this;
|
|
|
|
}
|
|
|
|
|
2024-06-16 01:09:05 +00:00
|
|
|
string& operator+=(const char x) noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
append(x);
|
|
|
|
return *this;
|
|
|
|
}
|
|
|
|
|
2024-07-01 14:10:35 +00:00
|
|
|
string& operator+=(const char* s) noexcept
|
|
|
|
{
|
|
|
|
append(s);
|
|
|
|
return *this;
|
|
|
|
}
|
|
|
|
|
|
|
|
string& operator+=(const ctl::string s) noexcept
|
|
|
|
{
|
|
|
|
append(s);
|
|
|
|
return *this;
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
string& operator+=(const ctl::string_view s) noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
append(s);
|
|
|
|
return *this;
|
|
|
|
}
|
|
|
|
|
2024-07-01 12:40:38 +00:00
|
|
|
string operator+(const char c) const noexcept
|
|
|
|
{
|
|
|
|
char s[2] = { c };
|
|
|
|
return strcat(*this, s);
|
|
|
|
}
|
|
|
|
|
2024-07-01 14:10:35 +00:00
|
|
|
string operator+(const char* s) const noexcept
|
2024-07-01 12:40:38 +00:00
|
|
|
{
|
|
|
|
return strcat(*this, s);
|
|
|
|
}
|
|
|
|
|
2024-07-01 14:10:35 +00:00
|
|
|
string operator+(const string& s) const noexcept
|
2024-06-04 12:41:53 +00:00
|
|
|
{
|
|
|
|
return strcat(*this, s);
|
|
|
|
}
|
|
|
|
|
2024-07-01 14:10:35 +00:00
|
|
|
string operator+(const ctl::string_view s) const noexcept
|
2024-07-01 12:40:38 +00:00
|
|
|
{
|
|
|
|
return strcat(*this, s);
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
int compare(const ctl::string_view s) const noexcept
|
2024-06-04 12:41:53 +00:00
|
|
|
{
|
|
|
|
return strcmp(*this, s);
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
bool operator<(const ctl::string_view s) const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
return compare(s) < 0;
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
bool operator<=(const ctl::string_view s) const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
return compare(s) <= 0;
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
bool operator>(const ctl::string_view s) const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
return compare(s) > 0;
|
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
bool operator>=(const ctl::string_view s) const noexcept
|
2024-06-03 16:09:33 +00:00
|
|
|
{
|
|
|
|
return compare(s) >= 0;
|
|
|
|
}
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
|
|
|
|
private:
|
2024-06-20 18:52:12 +00:00
|
|
|
void destroy_big() noexcept;
|
|
|
|
void init_big(const string&) noexcept;
|
2024-07-01 10:48:28 +00:00
|
|
|
void init_big(ctl::string_view) noexcept;
|
2024-06-20 18:52:12 +00:00
|
|
|
void init_big(size_t, char) noexcept;
|
|
|
|
|
2024-06-29 02:07:35 +00:00
|
|
|
__attribute__((__always_inline__)) bool isbig() const noexcept
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
{
|
2024-06-15 17:45:52 +00:00
|
|
|
return *(blob + __::sso_max) & 0x80;
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
}
|
|
|
|
|
2024-06-20 18:52:12 +00:00
|
|
|
void set_small_size(const size_t size) noexcept
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
{
|
|
|
|
if (size > __::sso_max)
|
|
|
|
__builtin_trap();
|
2024-06-20 18:52:12 +00:00
|
|
|
__s.rem = __::sso_max - size;
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
}
|
|
|
|
|
2024-06-20 18:52:12 +00:00
|
|
|
void set_big_string(char* const p, const size_t n, const size_t c2) noexcept
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
{
|
|
|
|
if (c2 > __::big_mask)
|
|
|
|
__builtin_trap();
|
2024-06-20 18:52:12 +00:00
|
|
|
__b.p = p;
|
|
|
|
__b.n = n;
|
|
|
|
__b.c = c2 | ~__::big_mask;
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
}
|
|
|
|
|
2024-07-01 10:48:28 +00:00
|
|
|
friend string strcat(ctl::string_view, ctl::string_view) noexcept;
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
|
2024-06-20 18:52:12 +00:00
|
|
|
union
|
|
|
|
{
|
|
|
|
__::big_string __b;
|
|
|
|
__::small_string __s;
|
|
|
|
char blob[__::string_size];
|
|
|
|
};
|
2024-06-03 16:09:33 +00:00
|
|
|
};
|
|
|
|
|
ctl::string small-string optimization (#1199)
A small-string optimization is a way of reusing inline storage space for
sufficiently small strings, rather than allocating them on the heap. The
current approach takes after an old Facebook string class: it reuses the
highest-order byte for flags and small-string size, in such a way that a
maximally-sized small string will have its last byte zeroed, making it a
null terminator for the C string.
The only flag we have is in the highest-order bit, that says whether the
string is big (set) or small (cleared.) Most of the logic switches based
on the value of this bit; e.g. data() returns big()->p if it's set, else
small()->buf if it's cleared. For a small string, the capacity is always
fixed at sizeof(string) - 1 bytes; we store the length in the last byte,
but we store it as the number of remaining bytes of capacity, so that at
max size, the last byte will read zero and serve as our null terminator.
Morally speaking, our class's storage is a union over two POD C structs.
For now I gravitated towards a slightly more obtuse approach: the string
class itself contains a blob of the right size, and we alias that blob's
pointer for the two structs, taking some care not to run afoul of object
lifetime rules in C++. If anyone wants to improve on this, contributions
are welcome.
This commit also introduces the `ctl::__` namespace. It can't be legally
spelled by library users, and serves as our version of boost's "detail".
We introduced a string::swap function, and we now use that in operator=.
operator= now takes its argument by value, so we never need to check for
the case where the pointers are equal and can just swap the entire store
of the argument with our own, leaving the C++ destructor to free our old
storage afterwards.
There are probably still a few places where our capacity is slightly off
and we grow too fast, although there don't appear to be any where we are
too slow. I will leave these to be fixed in future changes.
2024-06-07 00:50:51 +00:00
|
|
|
static_assert(sizeof(string) == __::string_size);
|
|
|
|
static_assert(sizeof(__::small_string) == __::string_size);
|
|
|
|
static_assert(sizeof(__::big_string) == __::string_size);
|
|
|
|
|
2024-07-01 12:40:38 +00:00
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(int) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(long) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(long long) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(unsigned) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(unsigned long) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(unsigned long long) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(float) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(double) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
|
|
|
ctl::string
|
2024-07-01 12:54:22 +00:00
|
|
|
to_string(long double) noexcept;
|
2024-07-01 12:40:38 +00:00
|
|
|
|
2024-06-04 12:41:53 +00:00
|
|
|
} // namespace ctl
|
2024-06-03 16:09:33 +00:00
|
|
|
|
2024-06-04 12:41:53 +00:00
|
|
|
#pragma GCC diagnostic push
|
|
|
|
#pragma GCC diagnostic ignored "-Wliteral-suffix"
|
|
|
|
inline ctl::string
|
2024-06-03 16:09:33 +00:00
|
|
|
operator"" s(const char* s, size_t n)
|
|
|
|
{
|
2024-06-04 12:41:53 +00:00
|
|
|
return ctl::string(s, n);
|
2024-06-03 16:09:33 +00:00
|
|
|
}
|
2024-06-04 12:41:53 +00:00
|
|
|
#pragma GCC diagnostic pop
|
2024-06-03 16:09:33 +00:00
|
|
|
|
2024-06-23 17:08:48 +00:00
|
|
|
#endif // CTL_STRING_H_
|