This is supported by both OpenBSD and FreBSD so it seems possible that
we will run into \" sequences at some point. The handling is basically
identical to \\ sequences.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
The mixing of "byte" and "rune" usage made the code a little more
complicated than necessary. The benchmarks seem to indicate that this
bumps the speed of most operations up by ~3% but I would just chalk that
up to noise.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
There was a TODO to make this code more legible. I still think it's
somewhat ugly, but it does read _slightly_ better as a switch statement.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
By avoiding lots of small string allocations and reallocations when
appending to the output buffer, we can get a pretty decent performance
improvement (~6x for strings that do not require escaping, and ~2x for
most other multi-byte utf8 strings).
goos: linux
goarch: amd64
pkg: github.com/vbatts/go-mtree/pkg/govis
cpu: AMD Ryzen 7 7840U w/ Radeon 780M Graphics
│ before │ after │
│ sec/op │ sec/op vs base │
Vis/NoChange-16 2372.5n ± 2% 379.1n ± 1% -84.02% (p=0.000 n=10)
Vis/Binary-16 2.104µ ± 8% 1.319µ ± 8% -37.35% (p=0.000 n=10)
Vis/ASCII-16 2070.0n ± 1% 737.3n ± 0% -64.38% (p=0.000 n=10)
Vis/German-16 3.380µ ± 1% 1.181µ ± 2% -65.04% (p=0.000 n=10)
Vis/Russian-16 10.927µ ± 2% 5.293µ ± 2% -51.56% (p=0.000 n=10)
Vis/Japanese-16 7.489µ ± 1% 3.990µ ± 0% -46.72% (p=0.000 n=10)
geomean 3.767µ 1.447µ -61.58%
In theory we could get more performance if switch away from fmt.Sprintf,
but the %.N handling would be a little annoying to implement and so we
can punt on that for now.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
By using a buffer, we can avoid a bunch of small allocations that the
previous implementation did. Based on a few small benchmarks, the
performance improvement is very stark (~3x faster for strings that don't
require any escaping, and ~20% faster for multi-byte utf8 strings):
goos: linux
goarch: amd64
pkg: github.com/vbatts/go-mtree/pkg/govis
cpu: AMD Ryzen 7 7840U w/ Radeon 780M Graphics
│ before │ after │
│ sec/op │ sec/op vs base │
Unvis/NoChange-16 1501.0n ± 0% 497.7n ± 1% -66.84% (p=0.000 n=10)
Unvis/Binary-16 1317.5n ± 3% 934.9n ± 9% -29.04% (p=0.000 n=10)
Unvis/ASCII-16 1325.5n ± 1% 616.8n ± 1% -53.47% (p=0.000 n=10)
Unvis/German-16 1884.5n ± 1% 986.9n ± 2% -47.63% (p=0.000 n=10)
Unvis/Russian-16 4.636µ ± 1% 3.796µ ± 1% -18.11% (p=0.000 n=10)
Unvis/Japanese-16 3.453µ ± 1% 2.867µ ± 1% -16.99% (p=0.000 n=10)
geomean 2.072µ 1.206µ -41.77%
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This makes it a little easier to read the common case code which
consumes the token and helps highlight which sub-parsers are explicitly
not consuming tokens until we are sure we are using it.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Passing the parsers as an argument is very C-like and is not really as
idiomadic as just using methods (in my defence, I was still pretty green
when I wrote this code and I was trying to port some logic from C).
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
These error cases were already handled correctly, but we really should
have tests for them anyway. Now that we have proper error variables
declared we can also test for specific errors as well.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This code was written before %w was added to Go, and there were a fair
few mistakes in the copy-pasted error code.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
testify makes most bog-standard test checks much easier to read and
maintain, and is quite widely used. It wasn't really well known back
when govis was first written, but the migration is fairly
straight-forward for most tests.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
govis is a reimplementation of vis(3) and unvis(3) specifically made to
be unicode aware. It was specifically rewritten to replace cvis and the
other go vis reimplementation we have in go-mtree.
Signed-off-by: Aleksa Sarai <asarai@suse.de>