By writing into a single buffer, we avoid the many small allocations
that the previous implementation made. Based on a few small benchmarks,
the performance improvement is stark (~3x faster for strings that don't
require any escaping, and ~20% faster for multi-byte UTF-8 strings):
goos: linux
goarch: amd64
pkg: github.com/vbatts/go-mtree/pkg/govis
cpu: AMD Ryzen 7 7840U w/ Radeon 780M Graphics
                  │    before    │                after                │
                  │    sec/op    │   sec/op     vs base                │
Unvis/NoChange-16   1501.0n ± 0%   497.7n ± 1%  -66.84% (p=0.000 n=10)
Unvis/Binary-16     1317.5n ± 3%   934.9n ± 9%  -29.04% (p=0.000 n=10)
Unvis/ASCII-16      1325.5n ± 1%   616.8n ± 1%  -53.47% (p=0.000 n=10)
Unvis/German-16     1884.5n ± 1%   986.9n ± 2%  -47.63% (p=0.000 n=10)
Unvis/Russian-16     4.636µ ± 1%   3.796µ ± 1%  -18.11% (p=0.000 n=10)
Unvis/Japanese-16    3.453µ ± 1%   2.867µ ± 1%  -16.99% (p=0.000 n=10)
geomean              2.072µ        1.206µ       -41.77%
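To illustrate why a buffer helps, here is a minimal sketch (not the
actual govis code). The function names and the toy escaping rule, which
only collapses a doubled backslash, are assumptions made for the example.
The first version grows the output by string concatenation, the second
writes into one pre-sized strings.Builder:

```go
package main

import (
	"fmt"
	"strings"
)

// unescapeConcat is a hypothetical "before" shape: every += builds a new
// string, so the output is reallocated repeatedly as it grows.
func unescapeConcat(input string) string {
	var out string
	for i := 0; i < len(input); i++ {
		if input[i] == '\\' && i+1 < len(input) && input[i+1] == '\\' {
			i++ // skip the escaping backslash, keep the escaped one
		}
		out += input[i : i+1]
	}
	return out
}

// unescapeBuffered is a hypothetical "after" shape: all bytes go into a
// single buffer that is sized once up front.
func unescapeBuffered(input string) string {
	var out strings.Builder
	out.Grow(len(input)) // with this toy rule the output never exceeds the input
	for i := 0; i < len(input); i++ {
		if input[i] == '\\' && i+1 < len(input) && input[i+1] == '\\' {
			i++ // skip the escaping backslash, keep the escaped one
		}
		out.WriteByte(input[i])
	}
	return out.String()
}

func main() {
	in := `foo\\bar`
	fmt.Println(unescapeConcat(in) == unescapeBuffered(in))
}
```

Both functions return the same result; only the allocation pattern
differs.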
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This makes it a little easier to read the common-case code, which
consumes the token, and helps highlight which sub-parsers explicitly do
not consume a token until we are sure we are going to use it.
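As a rough sketch of the distinction (the parser type and the peek and
next helpers here are hypothetical, not the govis API), the common case
consumes a token right away, while a sub-parser only peeks until it
knows the token belongs to it:

```go
package main

import "fmt"

// parser is a hypothetical token stream over runes.
type parser struct {
	tokens []rune
	idx    int
}

// peek returns the current token without consuming it.
func (p *parser) peek() (rune, bool) {
	if p.idx >= len(p.tokens) {
		return 0, false
	}
	return p.tokens[p.idx], true
}

// next returns the current token and consumes it.
func (p *parser) next() (rune, bool) {
	ch, ok := p.peek()
	if ok {
		p.idx++
	}
	return ch, ok
}

// collapse turns a doubled backslash into a single one (a toy rule
// assumed for this example).
func collapse(input string) string {
	p := &parser{tokens: []rune(input)}
	var out []rune
	for {
		ch, ok := p.next() // common case: consume the token immediately
		if !ok {
			break
		}
		if ch == '\\' {
			// Only peek here; consume the second backslash once we
			// are sure it is part of this escape.
			if nxt, ok := p.peek(); ok && nxt == '\\' {
				p.next()
			}
		}
		out = append(out, ch)
	}
	return string(out)
}

func main() {
	fmt.Println(collapse(`a\\b`))
}
```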
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Passing the parsers as an argument is very C-like and is not really as
idiomatic as just using methods (in my defence, I was still pretty green
when I wrote this code and I was trying to port some logic from C).
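For comparison, a minimal sketch of the two shapes. The parser type and
both function names are hypothetical and only illustrate the style
difference, not the real govis code:

```go
package main

import "fmt"

// parser is a hypothetical stand-in for the parser state.
type parser struct {
	input []rune
	idx   int
}

// nextToken is the C-like shape: a free function that takes the parser
// state as an explicit argument.
func nextToken(p *parser) (rune, bool) {
	if p.idx >= len(p.input) {
		return 0, false
	}
	ch := p.input[p.idx]
	p.idx++
	return ch, true
}

// next is the same logic as a method on the parser.
func (p *parser) next() (rune, bool) {
	if p.idx >= len(p.input) {
		return 0, false
	}
	ch := p.input[p.idx]
	p.idx++
	return ch, true
}

func main() {
	a := &parser{input: []rune("ab")}
	b := &parser{input: []rune("ab")}
	x, _ := nextToken(a) // C-like call site
	y, _ := b.next()     // method call site
	fmt.Println(x == y)
}
```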
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This code was written before %w was added to Go, and there were a fair
few mistakes in the copy-pasted error code.
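A short sketch of what %w buys us (the sentinel error and helper names
are made up for this example). Wrapping with %v flattens the cause into
text, while %w (available since Go 1.13) keeps it reachable through
errors.Is:

```go
package main

import (
	"errors"
	"fmt"
)

// errBadEscape is a hypothetical sentinel error.
var errBadEscape = errors.New("invalid escape sequence")

// wrapV formats the cause as text only, so the error chain is lost.
func wrapV(err error) error {
	return fmt.Errorf("unvis: %v", err)
}

// wrapW wraps the cause so callers can still inspect it.
func wrapW(err error) error {
	return fmt.Errorf("unvis: %w", err)
}

// isBadEscape reports whether err is, or wraps, errBadEscape.
func isBadEscape(err error) bool {
	return errors.Is(err, errBadEscape)
}

func main() {
	fmt.Println(isBadEscape(wrapV(errBadEscape))) // false
	fmt.Println(isBadEscape(wrapW(errBadEscape))) // true
}
```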
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
govis is a reimplementation of vis(3) and unvis(3) designed to be
Unicode-aware. It was written specifically to replace cvis and the
other Go vis reimplementation we have in go-mtree.
Signed-off-by: Aleksa Sarai <asarai@suse.de>