##
## Netstrings spec copied as-is from http://cr.yp.to/proto/netstrings.txt
##

Netstrings
D. J. Bernstein, djb@pobox.com
19970201


1. Introduction

A netstring is a self-delimiting encoding of a string. Netstrings are
very easy to generate and to parse. Any string may be encoded as a
netstring; there are no restrictions on length or on allowed bytes.
Another virtue of a netstring is that it declares the string size up
front. Thus an application can check in advance whether it has enough
space to store the entire string.
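
As an illustration of checking the declared size in advance, here is a
minimal sketch that reads one netstring from a stdio stream into a
fixed-size buffer and rejects an oversized string before reading any of
its bytes. The function name netstring_read_fixed and its return
convention are hypothetical, not part of this spec; the sketch assumes
an ASCII character set and a stream in binary mode.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the spec): read one netstring from `in`
 * into buf, which has room for cap bytes. Returns the string length, or -1
 * on malformed input, early end of input, or a declared length above cap.
 */
long netstring_read_fixed(FILE *in, char *buf, size_t cap)
{
    unsigned long len = 0;
    int c, digits = 0, first = 0;

    /* Read [len]: a nonempty run of ASCII digits. */
    while ((c = getc(in)) >= '0' && c <= '9') {
        if (digits == 0) first = c;
        else if (first == '0') return -1;   /* extra zeros at the front */
        if (++digits > 9) return -1;        /* assumed cap of 9 digits */
        len = len * 10 + (unsigned long)(c - '0');
    }
    if (digits == 0 || c != ':') return -1;

    /* The size is known here, before any byte of the string is read. */
    if (len > cap) return -1;

    if (fread(buf, 1, (size_t)len, in) < len) return -1;
    if (getc(in) != ',') return -1;
    return (long)len;
}
```

A caller with a 16-byte buffer could pass it as
netstring_read_fixed(stdin, buf, sizeof buf) and treat -1 as a reason to
drop the connection; no string longer than the buffer is ever copied.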

Netstrings may be used as a basic building block for reliable network
protocols. Most high-level protocols, in effect, transmit a sequence
of strings; those strings may be encoded as netstrings and then
concatenated into a sequence of characters, which in turn may be
transmitted over a reliable stream protocol such as TCP.

Note that netstrings can be used recursively. The result of encoding
a sequence of strings is a single string. A series of those encoded
strings may in turn be encoded into a single string. And so on.
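
The recursive use can be sketched as follows: two strings are encoded
and concatenated, and the resulting sequence is itself encoded as one
netstring. The names netstring_append and netstring_wrap_pair, and the
256-byte scratch buffer, are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append the netstring encoding of str[0..len) to out
 * at offset *pos, where out has room for cap bytes. Returns 0 on success,
 * -1 if the output buffer is too small.
 */
int netstring_append(char *out, size_t cap, size_t *pos,
                     const char *str, size_t len)
{
    char head[32];
    int hlen = snprintf(head, sizeof head, "%lu:", (unsigned long)len);

    if (hlen < 0 || (size_t)hlen >= sizeof head) return -1;
    if (*pos > cap || cap - *pos < (size_t)hlen) return -1;
    /* Room is needed for the string itself plus the trailing ','. */
    if (cap - *pos - (size_t)hlen <= len) return -1;

    memcpy(out + *pos, head, (size_t)hlen);
    memcpy(out + *pos + (size_t)hlen, str, len);
    out[*pos + (size_t)hlen + len] = ',';
    *pos += (size_t)hlen + len + 1;
    return 0;
}

/*
 * Hypothetical helper: encode a and b as a sequence of two netstrings, then
 * encode that sequence as a single netstring in out. Inputs are assumed to
 * be short enough for the 256-byte scratch buffer.
 */
int netstring_wrap_pair(char *out, size_t cap, size_t *outlen,
                        const char *a, const char *b)
{
    char inner[256];
    size_t ipos = 0;

    *outlen = 0;
    if (netstring_append(inner, sizeof inner, &ipos, a, strlen(a)) < 0) return -1;
    if (netstring_append(inner, sizeof inner, &ipos, b, strlen(b)) < 0) return -1;
    return netstring_append(out, cap, outlen, inner, ipos);
}
```

With the inputs "hello" and "world!", the inner sequence is
"5:hello,6:world!," (17 bytes) and the outer netstring is
"17:5:hello,6:world!,,".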

In this document, a string of 8-bit bytes may be written in two
different forms: as a series of hexadecimal numbers between angle
brackets, or as a sequence of ASCII characters between double quotes.
For example, <68 65 6c 6c 6f 20 77 6f 72 6c 64 21> is a string of
length 12; it is the same as the string "hello world!".

Although this document restricts attention to strings of 8-bit bytes,
netstrings could be used with any 6-bit-or-larger character set.


2. Definition

Any string of 8-bit bytes may be encoded as [len]":"[string]",".
Here [string] is the string and [len] is a nonempty sequence of ASCII
digits giving the length of [string] in decimal. The ASCII digits are
<30> for 0, <31> for 1, and so on up through <39> for 9. Extra zeros
at the front of [len] are prohibited: [len] begins with <30> exactly
when [string] is empty.

For example, the string "hello world!" is encoded as <31 32 3a 68
65 6c 6c 6f 20 77 6f 72 6c 64 21 2c>, i.e., "12:hello world!,". The
empty string is encoded as "0:,".

[len]":"[string]"," is called a netstring. [string] is called the
interpretation of the netstring.
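
The rules above can be checked mechanically. The following is a minimal
sketch of a strict in-memory parser that enforces the nonempty [len],
the ban on extra leading zeros, and the ":" and "," delimiters. The
function name netstring_parse and its return convention are
hypothetical; the sketch assumes the local character set is ASCII.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the spec): parse one netstring from the
 * first n bytes of buf. On success, store the start and length of the
 * interpretation in *str and *len and return the number of bytes consumed.
 * Return 0 if the input is malformed or incomplete.
 */
size_t netstring_parse(const char *buf, size_t n,
                       const char **str, size_t *len)
{
    size_t i = 0, value = 0;

    /* [len] must be a nonempty sequence of ASCII digits. */
    if (n == 0 || buf[0] < '0' || buf[0] > '9') return 0;

    /* [len] may begin with <30> only when it is exactly "0". */
    if (buf[0] == '0' && n > 1 && buf[1] >= '0' && buf[1] <= '9') return 0;

    while (i < n && buf[i] >= '0' && buf[i] <= '9') {
        size_t digit = (size_t)(buf[i] - '0');
        if (value > ((size_t)-1 - digit) / 10) return 0;  /* would overflow */
        value = value * 10 + digit;
        i++;
    }

    if (i >= n || buf[i] != ':') return 0;
    i++;

    /* The remaining input must hold the string plus the trailing ','. */
    if (n - i <= value) return 0;
    if (buf[i + value] != ',') return 0;

    *str = buf + i;
    *len = value;
    return i + value + 1;
}
```

Because the return value is the number of bytes consumed, a caller can
walk a concatenated sequence of netstrings by advancing buf by that
amount after each successful call. A return of 0 is unambiguous as a
failure signal, since the shortest netstring, "0:,", is 3 bytes long.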


3. Sample code

The following C code starts with a buffer buf of length len and
prints it as a netstring.

    if (printf("%lu:",len) < 0) barf();
    if (fwrite(buf,1,len,stdout) < len) barf();
    if (putchar(',') < 0) barf();

The following C code reads a netstring and decodes it into a
dynamically allocated buffer buf of length len.

    if (scanf("%9lu",&len) < 1) barf(); /* >999999999 bytes is bad */
    if (getchar() != ':') barf();
    buf = malloc(len + 1); /* malloc(0) is not portable */
    if (!buf) barf();
    if (fread(buf,1,len,stdin) < len) barf();
    if (getchar() != ',') barf();

Both of these code fragments assume that the local character set is
ASCII, and that the relevant stdio streams are in binary mode.


4. Security considerations

The famous Finger security hole may be blamed on Finger's use of the
CRLF encoding. In that encoding, each string is simply terminated by
CRLF. This encoding has several problems. Most importantly, it does
not declare the string size in advance. This means that a correct
CRLF parser must be prepared to ask for more and more memory as it is
reading the string. In the case of Finger, a lazy implementor found
this to be too much trouble; instead he simply declared a fixed-size
buffer and used C's gets() function. The rest is history.

In contrast, as the above sample code shows, it is very easy to
handle netstrings without risking buffer overflow. Thus widespread
use of netstrings may improve network security.