why is GPG_ERR_SEXP_ZERO_PREFIX an error for gcry_sexp_canon_len()?
Over in https://dev.gnupg.org/T4501 i discovered that
gcry_sexp_canon_len() can fail and report an error if the S-expression
contains a zero-length string.

In particular, it fails with errcode set to
GPG_ERR_SEXP_ZERO_PREFIX.

Why is it an error to have a string of zero-length in an s-expression in
canonical form?
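
To make the behavior in question concrete, here is a minimal Python sketch of a length scanner for canonical S-expressions. It is not libgcrypt's implementation: the name canon_len, the reject_zero_prefix flag and most error strings are hypothetical, and only the "zero prefixed length" label is borrowed from the dumpsexp message shown further down. Display hints are deliberately left unsupported to keep it short.

```python
# Illustrative sketch only, NOT libgcrypt's code. It walks a canonical
# S-expression and returns its total length, or 0 plus an error label.

ZERO_PREFIX = "zero prefixed length"  # label borrowed from dumpsexp's output


def canon_len(buf: bytes, reject_zero_prefix: bool = True):
    """Return (length, error); length is 0 whenever an error is reported."""
    if buf[0:1] != b"(":
        return 0, "not a list"
    depth = 0
    i = 0
    while i < len(buf):
        c = buf[i:i + 1]
        if c == b"(":
            depth += 1
            i += 1
        elif c == b")":
            depth -= 1
            i += 1
            if depth == 0:
                return i, None  # outermost list closed: i is the length
        elif c.isdigit():
            # Collect the decimal length prefix that precedes the colon.
            j = i
            while buf[j:j + 1].isdigit():
                j += 1
            digits = buf[i:j]
            # Strict mode refuses any prefix starting with '0'. That rules
            # out leading zeros such as "01:", but also the empty string "0:".
            if reject_zero_prefix and digits.startswith(b"0"):
                return 0, ZERO_PREFIX
            if buf[j:j + 1] != b":":
                return 0, "bad length specification"
            end = j + 1 + int(digits)
            if end > len(buf):
                return 0, "string too long"
            i = end  # skip over the raw octets
        else:
            return 0, "bad character"  # display hints etc. are not handled
    return 0, "unmatched parenthesis"


print(canon_len(b"(4:foo:(4:bar:0:))"))
print(canon_len(b"(4:foo:(4:bar:0:))", reject_zero_prefix=False))
```

With the flag on, the scanner gives up at the "0:" token; with it off, the same input is accepted and its full length is reported. The only difference between the two modes is that single check on the first digit.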

i tried to do some testing with higher-level S-expression tools,
including dumpsexp (from gcrypt) and sexp-conv (from nettle).

nettle doesn't seem to have a problem with it, and dumpsexp also doesn't
have a problem with it in non-canonical form, but it reports an error in
"canonical" form.

Here is a simple test that shows the weirdness:

----------------------
0 dkg@alice:~$ echo '(foo: (bar: ""))' > sexp
0 dkg@alice:~$ sexp-conv -s canonical < sexp | hd
00000000 28 34 3a 66 6f 6f 3a 28 34 3a 62 61 72 3a 30 3a |(4:foo:(4:bar:0:|
00000010 29 29 |))|
00000012
0 dkg@alice:~$ dumpsexp < sexp
foo:bar:00000000 28 66 6f 6f 3a 20 28 62 61 72 3a 20 22 22 29 29 |(foo: (bar: ""))|
00000010 0a |.|
0 dkg@alice:~$ sexp-conv -s canonical < sexp | dumpsexp
00000000 28 34 3a 66 6f 6f 3a 28 34 3a 62 61 72 3a 30 |(4:foo:(4:bar:0|
^ ^
Error: zero prefixed length
0000000f 3a | :|
^ ^
Error: no data length
00000010 | |
00000010 29 29 |))|
0 dkg@alice:~$
----------------------
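
For readers without sexp-conv at hand, the canonical bytes in the transcript above can be reproduced with a few lines of Python. This is only a sketch of the verbatim "length:octets" encoding; the function name to_canonical and the nested-list input format are assumptions made for illustration, not part of nettle or gcrypt.

```python
def to_canonical(node) -> bytes:
    """Encode nested Python lists of str/bytes as a canonical S-expression."""
    if isinstance(node, list):
        # A list is its children's encodings wrapped in parentheses.
        return b"(" + b"".join(to_canonical(child) for child in node) + b")"
    data = node.encode("utf-8") if isinstance(node, str) else bytes(node)
    # Verbatim form: decimal length, a colon, then the raw octets.
    return str(len(data)).encode("ascii") + b":" + data


# The same structure as '(foo: (bar: ""))' in the transcript; note that
# "foo:" and "bar:" are four-octet tokens that include the trailing colon.
encoded = to_canonical(["foo:", ["bar:", ""]])
print(encoded)       # b'(4:foo:(4:bar:0:))'
print(len(encoded))  # 18, i.e. 0x12 as in the hexdump
```

The empty string naturally comes out as "0:", which is exactly the token dumpsexp objects to.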

Can someone who understands S-Expressions better than me point me to
documentation that will help me understand why gcry_sexp_canon_len()
should treat this as an error?

--dkg

Re: why is GPG_ERR_SEXP_ZERO_PREFIX an error for gcry_sexp_canon_len()?
On Tue, 14 May 2019 16:56, dkg@fifthhorseman.net said:

> Can someone who understands S-Expressions better than me point me to
> documentation that will help me understand why gcry_sexp_canon_len()
> should treat this as an error?

From the specs: http://theory.lcs.mit.edu/~rivest/sexp.html

| 10. Utilization of S-expressions
|
| This note has described S-expressions in general form. Application writers
| may wish to restrict their use of S-expressions in various ways. Here are
| some possible restrictions that might be considered:
|
| -- no display-hints
| -- no lengths on hexadecimal, quoted-strings, or base-64 encodings
| -- no empty lists
| -- no empty octet-strings


Salam-Shalom,

Werner


--
Thoughts are free. Exceptions are regulated by a federal law.

Re: why is GPG_ERR_SEXP_ZERO_PREFIX an error for gcry_sexp_canon_len()?
Thanks for the response Werner--

On Wed 2019-05-15 08:27:22 +0200, Werner Koch wrote:
> On Tue, 14 May 2019 16:56, dkg@fifthhorseman.net said:
>
>> Can someone who understands S-Expressions better than me point me to
>> documentation that will help me understand why gcry_sexp_canon_len()
>> should treat this as an error?
>
> From the specs: http://theory.lcs.mit.edu/~rivest/sexp.html
>
> | 10. Utilization of S-expressions
> |
> | This note has described S-expressions in general form. Application writers
> | may wish to restrict their use of S-expressions in various ways. Here are
> | some possible restrictions that might be considered:
> |
> | -- no display-hints
> | -- no lengths on hexadecimal, quoted-strings, or base-64 encodings
> | -- no empty lists
> | -- no empty octet-strings

Hm, i don't see this text on that webpage at all. Can you tell me where
you're getting it from?

Wherever it came from, this doesn't read like an actual justification to
me, though: there is no explanation of why that might be a good or
useful restriction, or in which contexts it would or would not be
appropriate. It's just a statement (presumably from an authority) that
says it's ok to constrain in some contexts.

But the choice gcrypt makes is also inconsistent with this list -- why
say "no empty octet-strings" but not "no empty lists", for example?

0 dkg@alice:~$ printf '()' | dumpsexp
00000000 28 29 |()|
0 dkg@alice:~$
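
To compare the two restrictions side by side, one can parse a canonical S-expression into a tree and list every empty list and every empty octet-string it contains. The Python below is a rough sketch under stated assumptions: parse_canonical and empties are hypothetical helper names, display hints are not supported, and nothing here reflects how dumpsexp or gcrypt are actually written.

```python
def parse_canonical(buf: bytes, pos: int = 0):
    """Parse the list starting at buf[pos]; return (tree, next_pos)."""
    if buf[pos:pos + 1] != b"(":
        raise ValueError("expected '('")
    pos += 1
    items = []
    while pos < len(buf):
        c = buf[pos:pos + 1]
        if c == b")":
            return items, pos + 1
        if c == b"(":
            child, pos = parse_canonical(buf, pos)
            items.append(child)
            continue
        colon = buf.index(b":", pos)  # raises ValueError if no colon follows
        digits = buf[pos:colon]
        if not digits.isdigit():
            raise ValueError("bad length prefix")
        start = colon + 1
        end = start + int(digits)
        if end > len(buf):
            raise ValueError("string runs past end of input")
        items.append(buf[start:end])  # a zero length yields b""
        pos = end
    raise ValueError("unterminated list")


def empties(tree, path=()):
    """Yield (kind, path) for each empty list or empty octet-string."""
    if isinstance(tree, list):
        if not tree:
            yield "empty list", path
        for index, child in enumerate(tree):
            yield from empties(child, path + (index,))
    elif tree == b"":
        yield "empty octet-string", path


for sample in (b"()", b"(4:foo:(4:bar:0:))"):
    tree, _ = parse_canonical(sample)
    print(sample, list(empties(tree)))
```

A lenient parser like this one treats both cases as ordinary data; a library that wants to forbid either of them could apply the same kind of check as a separate, documented policy step.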

Moreover, on:

https://people.csail.mit.edu/rivest/Sexp.txt

it says:

>>> 4. Octet string representations
>>>
>>> This section describes in detail the ways in which an octet-string may
>>> be represented.
>>>
>>> We recall that an octet-string is any finite sequence of octets, and
>>> that the octet-string may have length zero.

So i still don't understand the rationale, and would appreciate a
clearer justification. What is the benefit here? What risks would we
see if gcry_sexp_canon_len() *didn't* treat that as an error? What are
the implications for interoperability with other tools that use
S-expressions that may or may not have chosen the same representations?

--dkg

Re: why is GPG_ERR_SEXP_ZERO_PREFIX an error for gcry_sexp_canon_len()?
On Fri, 17 May 2019 01:59, dkg@fifthhorseman.net said:
> Hm, i don't see this text on that webpage at all. Can you tell me where
> you're getting it from?

Second heading ("References and Documentation"), first line:

SEXP 1.0 guide (text) --> http://people.csail.mit.edu/rivest/Sexp.txt

> But the choice gcrypt makes is also inconsistent with this list -- why
> say "no empty octet-strings" but not "no empty lists", for example?

Because it was implemented this way 17 years ago and any change
now may result in broken applications.


Shalom-Salam,

Werner


Re: why is GPG_ERR_SEXP_ZERO_PREFIX an error for gcry_sexp_canon_len()?
On Tue 2019-05-21 10:32:21 +0200, Werner Koch wrote:
> On Fri, 17 May 2019 01:59, dkg@fifthhorseman.net said:
>> Hm, i don't see this text on that webpage at all. Can you tell me where
>> you're getting it from?
>
> Second heading ("References and Documentation"), first line:
>
> SEXP 1.0 guide (text) --> http://people.csail.mit.edu/rivest/Sexp.txt
>
>> But the choice gcrypt makes is also inconsistent with this list -- why
>> say "no empty octet-strings" but not "no empty lists", for example?
>
> Because it was implemented this way 17 years ago and any change
> now may result in broken applications.

Do you have an example of an application that might be broken by
dropping this constraint?

Nothing in the documentation of libgcrypt guarantees that
gcry_sexp_canon_len() will continue to enforce this constraint. Rather,
it says "for a valid S-expression, it should never return 0", but it
does in fact return zero for the valid S-expression (x: "").

I've opened https://dev.gnupg.org/T4534, which identifies the
inconsistency between the documentation and the behavior of the
function.

If gcrypt intends to rigidly enforce arbitrary limits like this, it
needs to declare it explicitly, so that applications that need to deal
with S-expressions from the outside world (which may actually contain
zero-length octet strings) can select a more suitable library for
S-expression parsing.

--dkg