When reading hexadecimal floats, cosmopolitan would previously sometimes
print a number of warnings relating to undefined behavior on left shift:
third_party/gdtoa/gethex.c:172: ubsan warning: signed left shift changed
sign bit or overflowed 12 'int' 28 'int' is undefined behavior
This is because gdtoa assumes left shifts are safe when overflow happens
even on signed integers - this is false: the C standard considers it UB.
This is easy to fix, by simply casting the shifted value to unsigned, as
doing so does not change the value or the semantics of the left shifting
(except for avoiding the undefined behavior, as the C standard specifies
that unsigned overflow yields wraparound, avoiding undefined behaviour).
This commit does this, and adds a testcase that previously triggered UB.
(this also adds test macros to test for exact float equality, instead of
the existing {EXPECT,ASSERT}_FLOAT_EQ macros which only tests inputs for
being "almost equal" (with a significant epsilon) whereas exact equality
makes more sense for certain things such as reading floats from strings,
and modifies other testcases for sscanf/fscanf of floats to utilize it).
* Fix reading the same symbol twice when using `{f,}scanf()`
PR #924 appears to use `unget()` subtly incorrectly when parsing
floating point numbers. The rest of the code only uses `unget()`
immediately followed by `goto Done;` to return back the symbol that
can't possibly belong to the directive we're processing.
With floating-point, however, the ungot characters could very well
be valid for the *next* directive, so we will essentially read them
twice. It can't be seen in `sscanf()` tests because `unget()` is a
no-op there, but the test I added for `fscanf()` fails like this:
...
EXPECT_EQ(0xDEAD, i1)
need 57005 (or 0xdead) =
got 908973 (or 0x000ddead)
...
EXPECT_EQ(0xBEEF, i2)
need 48879 (or 0xbeef) =
got 769775 (or 0x000bbeef)
This means we read 0xDDEAD instead of 0xDEAD and 0xBBEEF instead of
0xBEEF. I checked that both musl and glibc read 0xDEAD/0xBEEF, as
expected.
Fix the failing test by removing the unneeded `unget()` calls.
* Don't read invalid floating-point numbers in `*scanf()`
Currently, we just ignore any errors from `strtod()`. They can
happen either because no valid float can be parsed at all, or
because the state machine recognizes only a prefix of a valid
floating-point number.
Fix this by making sure `strtod()` parses everything we recognized,
provided it's non-empty. This requires to pop the last character
off the FP buffer, which is supposed to be parsed by the next
`*scanf()` directive.
* Make `%c` parsing in `*scanf()` respect the C standard
Currently, `%c`-style directives always succeed even if there
are actually fewer characters in the input than requested.
Before the fix, the added test fails like this:
...
EXPECT_EQ(2, sscanf("ab", "%c %c %c", &c2, &c3, &c4))
need 2 (or 0x02 or '\2' or ENOENT) =
got 3 (or 0x03 or '\3' or ESRCH)
...
EXPECT_EQ(0, sscanf("abcd", "%5c", s2))
need 0 (or 0x0 or '\0') =
got 1 (or 0x01 or '\1' or EPERM)
musl and glibc pass this test.