
Re: about binary protocol porting

From: Geoffrey Broadwell
Date: January 4, 2022 03:15
Subject: Re: about binary protocol porting
Message ID: 8b9be243-754b-6ce4-2adc-c6aefae11d76@sonic.net

I love doing binary codecs for Raku[1]!  How you approach this really 
depends on what formats and protocols you want to create Raku modules for.

The first thing you need to be able to do is test if your codec is 
correct.  It is notoriously easy to make a tiny mistake in a protocol 
implementation and (especially for binary protocols) miss it entirely 
because it only happens in certain edge cases.

If the format or protocol in question is open and has one or more public 
test suites, you're in good shape.  Raku gives a lot of power for 
refactoring tests to be very clean, and I've had good success doing this 
with several formats.

If there is no public test suite, but you can find RFCs or other 
detailed specs, you can often bootstrap a bespoke test suite from the 
examples in the spec documents.  Failing that, sometimes you can find 
sites (even Wikipedia, for the most common formats) that have 
known-correct examples to start with, or have published reverse 
engineering of files or captured data.

If the format is truly proprietary, you'll be getting lots of reverse 
engineering practice of your own. 😉

Now that you have some way of testing correctness, you'll want to be 
able to diagnose the incorrect bits.  Make sure you have some way of 
presenting easily-readable text expansions of the binary format, because 
just comparing raw buffer contents can be rather tedious (though I admit 
to having found bugs in a public test suite by spending so much time 
staring at the buffers I could tell they'd messed up a translation in a 
way that made the test always pass).  If the format or protocol has an 
official text translation/diagnostic/debug format (CBOR, BSON, 
Protobuf, etc. all have these), so much the better; you should support 
that format as soon as practical.
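
As a minimal sketch of such a text expansion, the simplest useful 
starting point is a hex dump of the buffer.  The helper name `hex-dump` 
is made up for illustration; the sample bytes are the CBOR encoding of 
the array [1, 42].

```raku
# Hypothetical helper: render a Blob as space-separated hex bytes,
# which is far easier to compare by eye than raw buffer output.
sub hex-dump(Blob $bytes --> Str) {
    $bytes.list.map(*.fmt('%02x')).join(' ')
}

# 0x82 = array of 2 items, 0x01 = 1, 0x18 0x2a = 42
my $encoded = Blob.new(0x82, 0x01, 0x18, 0x2a);
say hex-dump($encoded);   # 82 01 18 2a
```

A real codec would go further and print the decoded structure in the 
format's own diagnostic notation, but even this much makes test 
failures readable.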

Once you get down to the nitty-gritty of writing the codec, I find it is 
very important to make it work before making it fast. There is a lot of 
room for tuning Raku code, but it is WAY easier to get things going in 
the right direction by starting off with idiomatic Raku -- given/when, 
treating the data buffer as if it were a normal Array (really, anything 
Positional), and so on.
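
A rough sketch of that style, assuming a made-up toy format (the tag 
bytes and the `decode-value` name are hypothetical, not taken from any 
real protocol): dispatch on the tag byte with given/when and index the 
buffer like any other Positional.

```raku
# Toy format (hypothetical, not a real protocol):
#   0x01 <byte>            unsigned 8-bit integer
#   0x02 <hi> <lo>         unsigned 16-bit integer, big-endian
#   0x03 <len> <bytes...>  UTF-8 string of <len> bytes
sub decode-value(Blob $buf, Int $pos is copy = 0) {
    given $buf[$pos++] {
        when 0x01 { $buf[$pos] }
        when 0x02 { $buf[$pos] * 256 + $buf[$pos + 1] }
        when 0x03 {
            my $len = $buf[$pos++];
            $buf.subbuf($pos, $len).decode('utf-8')
        }
        default { die "Unknown tag byte $_ at offset { $pos - 1 }" }
    }
}

say decode-value(Blob.new(0x02, 0x01, 0x00));         # 256
say decode-value(Blob.new(0x03, 0x02, 0x68, 0x69));   # hi
```

This does no bounds checking and is not tuned at all; the point is 
only that the first working version can read like ordinary Raku.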

Make sure that every protocol feature you add makes new tests pass, 
and (I find, at least) write the encoding and decoding bits at the same 
time, so you can check that data round-trips successfully.  For the 
love of all that is good, don't implement any obscure features before 
the core features are rock solid and pass the test suite with nary a 
hiccup.
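
One way a round-trip check might look, using Raku's standard Test 
module.  The `encode-u16` / `decode-u16` pair is a hypothetical 
stand-in for a single feature of a codec, written together so each can 
be checked against the other.

```raku
use Test;

# Hypothetical codec pair for one toy feature:
# a 16-bit unsigned integer stored big-endian.
sub encode-u16(Int $n where 0..0xFFFF --> Blob) {
    Blob.new($n +> 8, $n +& 0xFF)
}
sub decode-u16(Blob $b --> Int) {
    ($b[0] +< 8) +| $b[1]
}

# Round-trip the boundary values as well as typical ones
for 0, 1, 255, 256, 0xFFFF -> $n {
    is decode-u16(encode-u16($n)), $n, "round-trip $n";
}

# Also pin at least one known-correct encoding, since a matched pair
# of bugs in the encoder and decoder would still round-trip cleanly
is encode-u16(258).list, (1, 2), 'known-correct bytes for 258';

done-testing;
```

Round-trip tests alone cannot catch an encoder and decoder that are 
wrong in the same way, which is why the known-correct examples from a 
spec or public test suite still matter.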

After that, when you think you're ready to optimize, write performance 
/tests/ first.  Make sure you test with data that will both use your 
codec in a typical manner, and also test out all the odd corners.  
You're looking for things that seem weirdly slow; this usually indicates 
a thinko like copying the entire buffer each time you read a byte from 
it, or somesuch.
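
A small sketch of that kind of performance test, assuming nothing 
beyond core Raku.  Both function names are made up; the first one 
deliberately commits the copy-the-buffer-on-every-read thinko so the 
timing difference is visible.

```raku
my @bytes = (^256).roll(20_000);
my $buf   = Blob.new(@bytes);

# Thinko: re-slicing the remaining buffer after every read copies
# everything that is left, so the whole loop is quadratic.
sub sum-by-copying(Blob $b is copy) {
    my $sum = 0;
    while $b.elems {
        $sum += $b[0];
        $b = $b.subbuf(1);
    }
    $sum
}

# Better: leave the buffer alone and just advance an index.
sub sum-by-offset(Blob $b) {
    my $sum = 0;
    $sum += $b[$_] for ^$b.elems;
    $sum
}

# Time both readers on the same data and print the results
for &sum-by-copying, &sum-by-offset -> &reader {
    my $start  = now;
    my $result = reader($buf);
    say "{&reader.name}: $result in {(now - $start).fmt('%.3f')}s";
}
```

Both readers must report the same sum; how far apart the timings land 
depends on the machine and the Rakudo version, so rerun with a larger 
buffer if the gap is not obvious.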

Once you've got the obvious performance kinks worked out, come by and 
ask again, and we can give you further advice from there.  Or heck, just 
come visit us on IRC (#raku at Libera.chat), and we'll be happy to 
help.  (Do stick around for a while though, because traffic varies 
strongly by time of day and day of week.)

Best Regards,


Geoff (japhb)


[1]  I'm a bit of a nut for it, really.  In the distant past, I wrapped 
C libraries to get the job done, but more recently I've done them as 
plain Raku code (and sometimes NQP, the language that Rakudo is written in).

I've written some of the binary format codecs for Raku:

  * https://github.com/japhb/CBOR-Simple
  * https://github.com/japhb/BSON-Simple
  * https://github.com/japhb/Terminal-ANSIParser
  * https://github.com/japhb/TinyFloats

Modified or tuned others:

  * https://github.com/samuraisam/p6-pb/commits?author=japhb
  * https://github.com/japhb/serializer-perf
  * (Lots of stuff spread across various Cro
    <https://github.com/croservices> repositories)

Added a spec extension for an existing standardized format (CBOR):

  * https://github.com/japhb/cbor-specs/blob/main/capture.md

And I think I forgot a few things.  😁


