Asp Forum - Ruby ASN1 examples?

Brian Candler

3/26/2005 11:35:00 AM

Does anyone have any example code I can see for dealing with ASN1 in Ruby?

I see that the openssl library has a wrapper for its ASN1 functions; but I'm
currently having to use reverse-engineering and guesswork to try to make it
do what I want :-) Or there might be other ASN1 libraries I could use.

A simple program which encodes and decodes the example given in Annex A of
ITU X.690 would be great ...

Cheers,

Brian.

7 Answers

Brian Candler

4/7/2005 9:14:00 PM

On Sat, Mar 26, 2005 at 11:35:07AM +0000, Brian Candler wrote:
> Does anyone have any example code I can see for dealing with ASN1 in Ruby?

I finally hacked my way through this, and am posting it here for reference.

The hardest part was how to make a tagged instance of a tagged type. That
is, in the example in Annex A of X.690, we have:

dateOfHire [1] Date,
...
Date ::= [APPLICATION 3] IMPLICIT VisibleString -- YYYYMMDD

So we have Date as a tagged type, and dateOfHire as a tagged instance of
that type.

The Annex shows that this encodes as:

dateOfHire  Length
A1          0A      <----------------------->
                    Date  Length  Contents
                    43    08      "19710917"

It's easy to generate an untagged instance of the Date type, like this:

date = OpenSSL::ASN1::ISO64String("19710917", 3, :IMPLICIT, :APPLICATION)
p date.to_der # => 43 08 "19710917"

And it's also easy to generate a tagged instance of a universal type:

foo = OpenSSL::ASN1::ISO64String("19710917", 1, :EXPLICIT, :CONTEXT_SPECIFIC)
p foo.to_der # => A1 0A 1A 08 "19710917"
                        ^
                        `-- VisibleString (not Date)

But a tagged instance of a tagged type is tricky. The only way I could
figure out was:

date = OpenSSL::ASN1::ISO64String("19710917", 3, :IMPLICIT, :APPLICATION)
date2 = OpenSSL::ASN1::ASN1Data.new([date], 1, :CONTEXT_SPECIFIC)
p date2.to_der # => A1 0A 43 08 "19710917"
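For reference, the construction above can be checked end to end. This is only a minimal sketch, assuming the openssl extension that ships with a current Ruby; the variable names are illustrative, and the decoder is expected to hand back generic ASN1Data objects because it knows nothing about the Date type.

```ruby
require 'openssl'

# Date ::= [APPLICATION 3] IMPLICIT VisibleString
date = OpenSSL::ASN1::ISO64String.new("19710917", 3, :IMPLICIT, :APPLICATION)

# dateOfHire [1] Date: wrap the already-tagged value in a constructed
# context-specific element by giving ASN1Data an Array as its value.
date_of_hire = OpenSSL::ASN1::ASN1Data.new([date], 1, :CONTEXT_SPECIFIC)

der = date_of_hire.to_der
hex = der.unpack1("H*")
puts hex

# Decode again and look at the outer and inner tags.
decoded = OpenSSL::ASN1.decode(der)
inner   = decoded.value.first
p [decoded.tag_class, decoded.tag, inner.tag_class, inner.tag, inner.value]
```

The hex dump should correspond to the A1 0A 43 08 "19710917" bytes from the Annex.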

ext/openssl/ossl_asn1.c claims to be written by "'OpenSSL for Ruby' team
members" - are there any of them here who would care to comment? Is there an
easier way to achieve this?

Regards,

Brian.

GOTOU Yuuzou

4/8/2005 8:25:00 AM

Brian Candler

4/9/2005 8:28:00 AM

On Fri, Apr 08, 2005 at 05:24:41PM +0900, GOTOU Yuuzou wrote:
> If this premise is right and to_der method is modified to
> refer user defined default tag, inheriting primitive types
> may make it easy. A conceptual usage is as follows:
>
> # define a new class to override ISO64String's tag number
> # and tag class.
> class X690Date < OpenSSL::ASN1::ISO64String
>   DEFAULT_TAG = 3
>   DEFAULT_TAG_CLASS = :APPLICATION
> end
> X690Date.new("19710917", 1, :EXPLICIT, :CONTEXT_SPECIFIC).to_der
>
> any ideas?

This looks good. But the obvious next thing to do is to record that
information in a parser table, so that ASN1.decode will create an instance
of X690Date instead of an ASN1Data object.

If we go down that route, then what I'd really want is bindings between
arbitrary Ruby classes and ASN1 types (including set/sequence/choice), so
that a tree of objects can be converted to and from der. Attached is one
idea how this might look.

My actual target is to encode and decode ASN1 protocol messages, such as:
http://homepages.tesco.net./~J.deBoynePollard/Proposals/IM2000/Architecture/...

Regards,

Brian.

Sam Roberts

4/9/2005 4:07:00 PM

Quoting B.Candler@pobox.com, on Sat, Apr 09, 2005 at 05:28:12PM +0900:
> On Fri, Apr 08, 2005 at 05:24:41PM +0900, GOTOU Yuuzou wrote:
> > If this premise is right and to_der method is modified to
> > refer user defined default tag, inheriting primitive types
> > may make it easy. A conceptual usage is as follows:
> >
> > # define a new class to override ISO64String's tag number
> > # and tag class.
> > class X690Date < OpenSSL::ASN1::ISO64String
> >   DEFAULT_TAG = 3
> >   DEFAULT_TAG_CLASS = :APPLICATION
> > end
> > X690Date.new("19710917", 1, :EXPLICIT, :CONTEXT_SPECIFIC).to_der
> >
> > any ideas?
>
> This looks good. But the obvious next thing to do is to record that
> information in a parser table, so that ASN1.decode will create an instance
> of X690Date instead of an ASN1Data object.
>
> If we go down that route, then what I'd really want is bindings between
> > arbitrary Ruby classes and ASN1 types (including set/sequence/choice), so
> that a tree of objects can be converted to and from der. Attached is one
> idea how this might look.

I'd suggest not doing that. It's a common desire, but it leads to
incredible problems down the road.

Problems:
- String has multiple representations in ASN.1. To encode it, you need
to choose the String type. Particularly for DER, this causes
round-trip problems - you decode a TeletexString to ruby String, then
reencode, it gets encoded as UTF8String, and now you have mangled the
data. In particular, cryptographic signatures fail.

- Memory overhead goes through the roof, because ASN.1 is very verbose.
This is a variation of what happened in the XML world. XML looks like
a tree, so people write tree-based APIs. Fast and easy... then they
get a large document, or try and figure out why their code is so slow,
and end up having to change to SAX, or some other stream-based API.

I've had direct experience maintaining and writing BER and DER codecs
for PKI/cryptographic protocols. We had a Java one written by people
who believed that DER was actually "distinguished". They were wrong.
When you get a certificate signed by a major CA, and it has an extra
insignificant zero in an INTEGER, and you decode, then reencode
(correctly!) before verifying the signature, and the verification fails,
but MS IE6.0 verifies the signature fine, guess who fixes the problem?
Hint: not the CA, and not MS (and even if they do, you still have to
interop with legacy data floating around).

Anyhow, not to say that this won't work in special cases, just like it
does in XML. It is possible to set up mappings between classes and
ASN.1, but I'd suggest not requiring an entire in-memory tree of your
input to be built, and if you are using DER, you must be very careful
to preserve the original encoding, and there are multiple
representations of things like strings and dates/times, even in DER.
Your mapping has to take this into account.

Don't get off on the wrong foot by assuming ASN.1 works as advertised.

Anyhow, DER and BER are such simple formats (at the bit-level, not in
the way they are used in protocols), you might be better off just
writing your own codec in Ruby. It's basically just a TAG/LENGTH/VALUE
encoding; implementing it might actually be faster than figuring out how
to use OpenSSL, though admittedly I say that as somebody who has read 3
or 4 implementations, and wrote a few generations, so maybe it just
seems easy to me.
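To illustrate how little there is to the TAG/LENGTH/VALUE layer, here is a rough pure-Ruby sketch of a reader. The names read_tlv and walk are made up for this example, and it assumes definite lengths and tag numbers below 31; it is not a complete BER decoder.

```ruby
TAG_CLASSES = [:UNIVERSAL, :APPLICATION, :CONTEXT_SPECIFIC, :PRIVATE].freeze

# Read one TLV starting at offset.
# Returns [tag_class, constructed, tag, value_bytes, next_offset].
def read_tlv(bytes, offset = 0)
  id = bytes.getbyte(offset)
  raise ArgumentError, "truncated input" if id.nil?
  tag_class   = TAG_CLASSES[id >> 6]
  constructed = (id & 0x20) != 0
  tag         = id & 0x1f
  raise NotImplementedError, "high tag numbers not handled" if tag == 0x1f
  offset += 1

  len = bytes.getbyte(offset)
  raise ArgumentError, "truncated input" if len.nil?
  offset += 1
  if len == 0x80
    raise NotImplementedError, "indefinite length not handled"
  elsif len > 0x80
    # Long form: the low 7 bits give the number of length octets.
    n = len & 0x7f
    len_octets = bytes.byteslice(offset, n)
    raise ArgumentError, "truncated length" if len_octets.nil? || len_octets.bytesize < n
    len = len_octets.bytes.inject(0) { |acc, b| (acc << 8) | b }
    offset += n
  end

  value = bytes.byteslice(offset, len)
  raise ArgumentError, "truncated value" if value.nil? || value.bytesize < len
  [tag_class, constructed, tag, value, offset + len]
end

# Walk a buffer recursively, collecting
# [depth, tag_class, constructed, tag, value-or-nil] rows.
def walk(bytes, depth = 0, out = [])
  offset = 0
  while offset < bytes.bytesize
    tag_class, constructed, tag, value, offset = read_tlv(bytes, offset)
    out << [depth, tag_class, constructed, tag, constructed ? nil : value]
    walk(value, depth + 1, out) if constructed
  end
  out
end

der = "\xA1\x0A\x43\x0819710917".b
walk(der).each { |row| p row }
```

The hard parts Sam describes (string types, tolerating sloppy encodings, memory use) all sit above this layer.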

Btw, it's this very low-level simplicity that suckers folks into thinking
it's easy to wrap in high-level OO APIs. I think this is similar to XML -
just tagged data, with some params, how complicated can XML be? :-) If
you do take this approach, I'd spend some time thinking about how you
would do it in XML, the problems that arise, and the API patterns
developed to work-around the problems.

And if this sounds like an incomprehensible rant, and is of no use to
you at all, sorry!

Cheers,
Sam

Btw, DJB's name is a bad word in the mail community, and I'm deeply
suspicious of anybody who suggests that somehow ASN.1 is "simpler" than
the IETF text-based protocols. I've implemented both, and it's not true.

On the plus side, binary protocols almost force the writing of proper
decoders, rather than letting the innocent think that using scanf() to
decode mail headers is a workable idea. On the other hand, I see just as
much brain-damaged protocol complexity in ASN.1 protocols as I do in
IETF mail protocols. And it is really nice to be able to see your data
on the wire without doing hex dumps. Thinking mail is easier with XML or
ASN.1 misses the point - mail isn't hard because of the bits on the wire
- it's hard because it is a globally distributed system used in lots of
ways, for many purposes.

DJB has a tendency to radical over-simplification, and then to abusing
people who want to do things his proposals don't allow. Example would be
his proposal to outlaw accented characters in "internationalized" domain
names. It may be more secure, but it's not that internationalized when
the Turks and French lose a few of their vowels from the allowed
characters in domain names!

Anyhow, have fun, implementing protocols is usually lots of that; it's
pretty cool to pull up the hood and see how things really work!

> My actual target is to encode and decode ASN1 protocol messages, such as:
> http://homepages.tesco.net./~J.deBoynePollard/Proposals/IM2000/Architecture/...
>
> Regards,
>
> Brian.

Brian Candler

4/10/2005 2:51:00 PM

On Sun, Apr 10, 2005 at 01:07:10AM +0900, Sam Roberts wrote:
> > If we go down that route, then what I'd really want is bindings between
> > arbitrary Ruby classes and ASN1 types (including set/sequence/choice), so
> > that a tree of objects can be converted to and from der. Attached is one
> > idea how this might look.
>
>
> I'd suggest not doing that. It's a common desire, but it leads to
> incredible problems down the road.
>
> Problems:
> - String has multiple representations in ASN.1. To encode it, you need
> to choose the String type. Particularly for DER, this causes
> round-trip problems - you decode a TeletexString to ruby String, then
> reencode, it gets encoded as UTF8String, and now you have mangled the
> data. In particular, cryptographic signatures fail.

A solution would be to mark each attribute in the class with its ASN.1 type:
e.g.

class Foo
  attr_accessor :bar, :baz

  asn1_attr :bar, OpenSSL::ASN1::ISO64String
  asn1_attr :baz, OpenSSL::ASN1::UTF8String
end

Serialising Foo to der will then tag @bar and @baz correctly.
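A rough sketch of how such a declaration could be wired up follows. Note that asn1_attr is not part of Ruby's openssl library; the ASN1Attrs module and the to_der method below are hypothetical, and encoding the attributes as a SEQUENCE in declaration order is an assumption made for the example.

```ruby
require 'openssl'

# Hypothetical class-level macro: remembers which ASN.1 type each
# attribute should be encoded as.
module ASN1Attrs
  def asn1_attr(name, type)
    asn1_attrs << [name, type]
  end

  def asn1_attrs
    @asn1_attrs ||= []
  end
end

class Foo
  extend ASN1Attrs
  attr_accessor :bar, :baz

  asn1_attr :bar, OpenSSL::ASN1::ISO64String
  asn1_attr :baz, OpenSSL::ASN1::UTF8String

  # Assumption: attributes are emitted as a SEQUENCE in declaration order.
  def to_der
    fields = self.class.asn1_attrs.map do |name, type|
      type.new(public_send(name))
    end
    OpenSSL::ASN1::Sequence.new(fields).to_der
  end
end

foo = Foo.new
foo.bar = "xxx"
foo.baz = "yyy"
der = foo.to_der
puts der.unpack1("H*")
```

The attributes themselves stay plain Ruby strings; only the class-level table knows that bar is a VisibleString and baz a UTF8String.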

If there's a possibility that a single attribute will be one of multiple
types, then it should be wrapped in an ASN.1 'choice'

> - Memory overhead goes through the roof, because ASN.1 is very verbose.
> This is a variation of what happened in the XML world. XML looks like
> a tree, so people write tree-based APIs. Fast and easy... then they
> get a large document, or try and figure out why their code is so slow,
> and end up having to change to SAX, or some other stream-based API.

Sure, although it depends on how complex your object tree is. The existence
of stream APIs for XML doesn't mean that in-memory data structure APIs for
XML are worthless, and I think the same applies to ASN.1

I'm not intimately familiar with OpenSSL's ASN.1 routines, for example, but
they do seem to be memory-based (ASN1_OBJECT_new, ASN1_OBJECT_free etc)

I don't see ASN.1 as a general-purpose object serialisation tool
incidentally; in particular, having object references and graphs of objects
would be a bit of a nightmare. I'm just interested in a toolset for encoding
and decoding ASN.1 messages.

Incidentally, Ruby's ASN.1 library does appear to have a 'traverse' method
which acts as a stream parser. You still need to build a suitable state
machine for it to 'yield' each element to, of course.

> Btw, DJB's name is a bad word in the mail community, and I'm deeply
> suspicious of anybody who suggests that somehow ASN.1 is "simpler" than
> the IETF text-based protocols. I've implemented both, and it's not true.

Was it DJB who suggested this? I only came across it in J.deBoynePollard's
protocol (which stemmed from DJB's initial idea, but I don't think that
specified what form the protocol should take)

> Anyhow, have fun, implementing protocols is usually lots of that; it's
> pretty cool to pull up the hood and see how things really work!

Exactly. Whilst I don't believe JdeBP's proposal is by itself an improvement
on what we have now for E-mail, it's given me a chance to think it through,
and having a bash at implementation might bring up some more ideas.
http://pobox.com/~b.candler/doc/misc/i...

Cheers,

Brian.

Sam Roberts

4/10/2005 9:22:00 PM

Quoting B.Candler@pobox.com, on Sun, Apr 10, 2005 at 11:51:26PM +0900:
> On Sun, Apr 10, 2005 at 01:07:10AM +0900, Sam Roberts wrote:
> > Problems:
> > - String has multiple representations in ASN.1. To encode it, you need
> > to choose the String type. Particularly for DER, this causes
> > round-trip problems - you decode a TeletexString to ruby String, then
> > reencode, it gets encoded as UTF8String, and now you have mangled the
> > data. In particular, cryptographic signatures fail.
>
> A solution would be to mark each attribute in the class with its ASN.1 type:
> e.g.
>
> class Foo
>   attr_accessor :bar, :baz
>
>   asn1_attr :bar, OpenSSL::ASN1::ISO64String
>   asn1_attr :baz, OpenSSL::ASN1::UTF8String
> end

> Serialising Foo to der will then tag @bar and @baz correctly.

You will also have to take into account invalidly encoded DER, though,
unless you can really take the moral high-ground and refuse to interop
with invalid DER. It's quite common for implementations to neglect the
leading zero necessary to make INTEGER positive if the high bit is set,
for example. So, when you reencode (correctly) you don't have the same
input. There's a whole set of common errors like this.

> If there's a possibility that a single attribute will be one of multiple
> types, then it should be wrapped in an ASN.1 'choice'

ASN.1 choices aren't a "wrapping" in the sense that you see any wrapping
in the BER or DER encoding, not unless you tag, anyhow. When an ASN.1
choice appears, you literally encode whichever one you want. This is the
common case for strings, for example. ASN.1 to BER/DER is one-way, there
are numbers of places where you cannot infer the ASN.1 from the
encoding. Not necessarily a criticism, just an observation.
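A small sketch of the untagged CHOICE point, assuming a hypothetical field defined as a CHOICE between PrintableString and UTF8String: whichever alternative is picked is simply encoded as itself, with no extra wrapper, so the bytes on the wire show the alternative but not the CHOICE.

```ruby
require 'openssl'

# The two alternatives of the hypothetical CHOICE, encoded directly.
as_printable = OpenSSL::ASN1::PrintableString.new("xxx").to_der
as_utf8      = OpenSSL::ASN1::UTF8String.new("xxx").to_der

puts as_printable.unpack1("H*")
puts as_utf8.unpack1("H*")

# The decoder sees only the universal tag of the chosen alternative.
p OpenSSL::ASN1.decode(as_printable).class
p OpenSSL::ASN1.decode(as_utf8).class
```

Only the identifier octet differs between the two encodings.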

> > - Memory overhead goes through the roof, because ASN.1 is very verbose.
> > This is a variation of what happened in the XML world. XML looks like
> > a tree, so people write tree-based APIs. Fast and easy... then they
> > get a large document, or try and figure out why their code is so slow,
> > and end up having to change to SAX, or some other stream-based API.
>
> Sure, although it depends on how complex your object tree is. The existence
> of stream APIs for XML doesn't mean that in-memory data structure APIs for
> XML are worthless, and I think the same applies to ASN.1

No, but they tend to be written now (REXML, that next-gen Java API whose
name I forget) in terms of an underlying stream decoder. If you can live
with the cost of memory, you go for tree, but you have an alternative.
This is the right approach, I think.

> I'm not intimately familiar with OpenSSL's ASN.1 routines, for example, but
> they do seem to be memory-based (ASN1_OBJECT_new, ASN1_OBJECT_free etc)

It runs only on desktop big-memory systems. Take a largish word
document, and encrypt then decrypt it with its PKCS#7 APIs.

Mostly openssl deals with keys and certs, these are (relatively) small.
Even so, the verbosity of ASN.1 is truly astounding.

> Incidentally, Ruby's ASN.1 library does appear to have a 'traverse' method
> which acts as a stream parser. You still need to build a suitable state
> machine for it to 'yield' each element to, of course.

Probably built on top of openssl's tree-based routines, so you pay the
memory cost, and the complexity cost.

Anyhow, mostly I just wanted to say writing a stream-based BER/DER
decoder in Ruby would be easy. Writing stream-based DER encoders is
impossible, unfortunately (the output size is encoded at the beginning;
they should have used CER more often, but it's too late now), but
stream-based BER encoders are also easy.

> > Btw, DJB's name is a bad word in the mail community, and I'm deeply
> > suspicious of anybody who suggests that somehow ASN.1 is "simpler" than
> > the IETF text-based protocols. I've implemented both, and it's not true.
>
> Was it DJB who suggested this?

Sorry, didn't mean to imply he said this. He did suggest the "no
accents" security "fix". His name is a bad word for other reasons.

Sam

Brian Candler

4/14/2005 3:21:00 PM

On Mon, Apr 11, 2005 at 06:21:39AM +0900, Sam Roberts wrote:
> You will also have to take into account invalidly encoded DER, though,
> unless you can really take the moral high-ground and refuse to interop
> with invalid DER. It's quite common for implementations to neglect the
> leading zero necessary to make INTEGER positive if the high bit is set,

But then, they are actually sending you a negative value, are they not?

What I mean is, there's no ambiguity. If somebody sends you b11111111 then
it's -1, not 255, and there's no question about it. It must be a contextual
thing to decide that -1 is an invalid value here, and that therefore the
sender 'must' have meant 255.

> for example. So, when you reencode (correctly) you don't have the same
> input. There's a whole set of common errors like this.

Ah. OK, I can see the case where you receive b11111111 b11111111 - you could
either reject this as an invalid encoding (the BER rules say that it is), or
you could decode it as -1, in which case you'll generate a different
encoding when you re-encode.

I'd prefer to take the view that the encoding is invalid: the standard is
absolutely unambiguous.

However, if I really needed to interoperate with something so broken, I'd
probably define an UNSIGNEDINTEGER type internally. It would encode using
the same universal tag as INTEGER, but the value would be treated as
unsigned. Hence b11111111 would be 255 and b11111111 b11111111 would be
65535. Propagating invalid encodings in this way should be something of a
last resort.
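The signed, unsigned and re-encoding cases can be made concrete with a few lines of plain Ruby working on the INTEGER content octets only (no tag or length). The helper names are invented for this sketch, and the decoders assume at least one content octet.

```ruby
# Interpret content octets as a two's complement signed integer,
# which is what BER/DER specify for INTEGER.
def decode_int(octets)
  n = octets.bytes.inject(0) { |acc, b| (acc << 8) | b }
  n -= 1 << (8 * octets.bytesize) if octets.getbyte(0) >= 0x80
  n
end

# The hypothetical "UNSIGNEDINTEGER" reading: same octets, no sign bit.
def decode_uint(octets)
  octets.bytes.inject(0) { |acc, b| (acc << 8) | b }
end

# Minimal two's complement encoding, as DER requires.
def encode_int(n)
  bytes = []
  loop do
    bytes.unshift(n & 0xff)
    n >>= 8
    # Stop once the remaining bits are pure sign extension.
    break if (n == 0 && bytes.first < 0x80) || (n == -1 && bytes.first >= 0x80)
  end
  bytes.pack("C*")
end

p decode_int("\xFF".b)       # signed reading of one FF octet
p decode_uint("\xFF".b)      # unsigned reading of the same octet
p encode_int(255).unpack1("H*")

# A non-minimal encoding decodes fine but does not survive re-encoding.
sloppy = "\xFF\xFF".b
p encode_int(decode_int(sloppy)) == sloppy
```

The last line is the round-trip hazard in miniature: decoding and correctly re-encoding yields different bytes from those received, which is enough to break a signature computed over the original.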

I'd be interested to know what the other common errors are that you mention.
This is the sort of knowledge which only an experienced implementor will
have...

> > If there's a possibility that a single attribute will be one of multiple
> > types, then it should be wrapped in an ASN.1 'choice'
>
> ASN.1 choices aren't a "wrapping" in the sense that you see any wrapping
> in the BER or DER encoding, not unless you tag, anyhow. When an ASN.1
> choice appears, you literally encode whichever one you want.

Yes indeed. What I meant was, if I have
foo PrintableString,
bar UTF8String,

then I can assign @foo = "xxx" and @bar = "yyy", i.e. using native Ruby
strings, since when it comes to re-encoding them I'll know what ASN.1 type
to use from the ASN.1 definition for each attribute.

However if foo were a CHOICE between PrintableString and UTF8String, then
this information would be lost. One solution would be to decode as
@foo = PrintableString.new("xxx")
or
@foo = UTF8String.new("xxx")
in which case the class of foo carries forward that information. But that
makes a new object with an instance variable (say @value) holding the
string. Alternatively that information could be recorded in the singleton
class of the object:
@foo = "xxx"
@foo.extend PrintableString

That may be cleaner, although this metadata is easily lost:

@foo.downcase! # keeps singleton class
@foo = @foo.downcase # loses it
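The fragility is easy to demonstrate in plain Ruby. PrintableStringTag below is a hypothetical marker module standing in for the type metadata (a class such as OpenSSL::ASN1::PrintableString cannot be passed to extend).

```ruby
# Hypothetical marker module carrying "this string is a PrintableString".
module PrintableStringTag; end

foo = String.new("XXX")
foo.extend(PrintableStringTag)

foo.downcase!                        # mutates in place, same object
p foo.is_a?(PrintableStringTag)

copy = foo.downcase                  # returns a brand-new String
p copy.is_a?(PrintableStringTag)

# dup drops singleton-class modules, clone keeps them.
p foo.dup.is_a?(PrintableStringTag)
p foo.clone.is_a?(PrintableStringTag)
```

Any method that returns a new String silently drops the marker, so this approach only works if every code path that touches the value is careful about it.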

> This is the
> common case for strings, for example. ASN.1 to BER/DER is one-way, there
> are numbers of places where you cannot infer the ASN.1 from the
> encoding. Not necessarily a criticism, just an observation.

Yes, I gathered that. That's why you'd need to carry metadata about the
required ASN.1 encodings with the class, or (in some cases, as outlined
above) individual values.

> > Incidentally, Ruby's ASN.1 library does appear to have a 'traverse' method
> > which acts as a stream parser. You still need to build a suitable state
> > machine for it to 'yield' each element to, of course.
>
> Probably built on top of openssl's tree-based routines, so you pay the
> memory cost, and the complexity cost.

ossl_asn1_decode0 is basically a loop on ASN1_get_object, and as far as I
can tell that just walks along a DER stream in memory, updating a start
pointer as it goes. So it should work along an object in its linear form,
not having expanded to a tree; and with mmap() I guess it could work
directly from a file too. It calls itself recursively when it meets a
constructed item.

$ cat traverse.rb
require 'openssl'

a = "\xA1\x0A\x43\x08..test.."
OpenSSL::ASN1.traverse(a) { |*y| p y }  # splat: current Rubies pass only the first value to |y|

$ ruby traverse.rb
[0, 0, 2, 10, true, :CONTEXT_SPECIFIC, 1]
[1, 2, 2, 8, false, :APPLICATION, 3]
$

The parameters to the block seem to be (looking at ext/openssl/ossl_asn1.c):
depth
start offset
header length
data length
constructed=true (so primitive=false)
tag class
tag
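With those seven values named in the block signature the output becomes easier to read. This is a small sketch using the same Annex A bytes; the hash keys are just labels chosen for the example.

```ruby
require 'openssl'

der = "\xA1\x0A\x43\x0819710917".b

rows = []
OpenSSL::ASN1.traverse(der) do |depth, offset, header_len, length, constructed, tag_class, tag|
  rows << {
    depth: depth, offset: offset, header_len: header_len, length: length,
    constructed: constructed, tag_class: tag_class, tag: tag
  }
end

rows.each do |r|
  kind = r[:constructed] ? "constructed" : "primitive"
  puts "#{'  ' * r[:depth]}[#{r[:tag_class]} #{r[:tag]}] #{kind}, " \
       "#{r[:length]} content bytes at offset #{r[:offset] + r[:header_len]}"
end
```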

A more friendly API could be a stream of tag_start / data / tag_end method
calls on an object, like an REXML stream parser.
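One possible shape for that API, layered on traverse, is sketched below. The listener protocol (tag_start, data, tag_end) and the names stream_der and EventLog are invented for this example, and it assumes definite-length input held in memory.

```ruby
require 'openssl'

# Example listener that just records the events it receives.
class EventLog
  attr_reader :events

  def initialize
    @events = []
  end

  def tag_start(tag_class, tag)
    @events << [:tag_start, tag_class, tag]
  end

  def data(bytes)
    @events << [:data, bytes]
  end

  def tag_end(tag_class, tag)
    @events << [:tag_end, tag_class, tag]
  end
end

def stream_der(der, listener)
  open = [] # stack of [end_offset, tag_class, tag] for constructed elements
  OpenSSL::ASN1.traverse(der) do |_depth, offset, header_len, length, constructed, tag_class, tag|
    # Close any constructed elements that ended before this one starts.
    while !open.empty? && offset >= open.last[0]
      _stop, c, t = open.pop
      listener.tag_end(c, t)
    end
    listener.tag_start(tag_class, tag)
    if constructed
      open.push([offset + header_len + length, tag_class, tag])
    else
      listener.data(der.byteslice(offset + header_len, length))
      listener.tag_end(tag_class, tag)
    end
  end
  # Close whatever is still open at end of input.
  until open.empty?
    _stop, c, t = open.pop
    listener.tag_end(c, t)
  end
  listener
end

log = stream_der("\xA1\x0A\x43\x0819710917".b, EventLog.new)
log.events.each { |e| p e }
```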

I don't think the reverse exists, i.e. for taking a stream of these tags and
turning them into DER/CER.

Shame that none of this appears to be documented! Somebody has taken a lot
of time to wrap openssl's ASN.1 parsing for Ruby, but anyone who wants to
use it (like me) has to do quite a bit of work to reverse-engineer the API.

> Anyhow, mostly I just wanted to say writing a stream-based BER/DER
> decoder in Ruby would be easy. Writing stream-based DER encoders is
> impossible, unfortunately (the output size is encoded at the beginning;
> they should have used CER more often, but it's too late now), but
> stream-based BER encoders are also easy.

Understood. Once upon a time I wrote a one-pass machine-code assembler that
used to rewind to previous points and insert branch offsets once it was able
to resolve a label :-)

DER makes this a bit more difficult with the variable sized encoding of the
length octets, but I think it could be made a two-pass operation. Or you
could write out as CER, and then have a two-pass CER to DER converter (pass
one reads in the CER and writes out some auxiliary data about lengths seen;
pass two reads the CER again and merges in the length data to create DER)
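The reason the length is awkward for a one-pass encoder shows up in a few lines of plain Ruby. In this sketch (der_length and tlv are made-up helper names, and only single-octet identifiers are handled) each element can only be emitted once its complete content is known, so nested values get built from the inside out.

```ruby
# DER definite-length octets: short form below 128, otherwise long form
# with the minimum number of length octets.
def der_length(n)
  return [n].pack("C") if n < 0x80
  octets = []
  while n > 0
    octets.unshift(n & 0xff)
    n >>= 8
  end
  ([0x80 | octets.size] + octets).pack("C*")
end

# Build one element from an identifier octet and finished content.
def tlv(identifier, content)
  [identifier].pack("C") + der_length(content.bytesize) + content.b
end

# Inner element first, then the wrapper around it.
date         = tlv(0x43, "19710917")
date_of_hire = tlv(0xA1, date)
puts date_of_hire.unpack1("H*")

# The length field itself changes size as the content grows.
[127, 128, 300].each { |n| puts "#{n}: #{der_length(n).unpack1('H*')}" }
```

A two-pass scheme like the one described above amounts to computing every der_length in the first pass and writing the bytes in the second.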

Regards,

Brian.

comp.lang.ruby
