WELCOME TO WAPNETREALM


  • Hackaday | MessagePack Is A More Efficient JSON
  • It is an age-old problem, that of having some data you want to store somewhere, and later bring it back. How do you format the data? Custom file formats are not that hard, but if you use an existing format you can probably steal code from a library to help you. Common choices include XML or the simpler JSON. However, neither of these is very concise. That's where MessagePack comes in.

    For example, consider this simple JSON stanza:

      {"compact":true, "schema":0}  

    This is easy to understand and, written without the optional space after the comma, weighs in at 27 bytes. MessagePack instead marks types and lengths with special binary bytes of 0x80 and above. Here's the same thing using the MessagePack format:

       0x82 0xA7 c o m p a c t 0xC3 0xA6 s c h e m a 0x00  

    Of course, the spaces are there for readability; they would not be in the actual data stream, which is now 18 bytes. The 0x82 indicates a two-element map (two key/value pairs). The 0xA7 introduces a 7-byte string. The "true" part of the map is the 0xC3. Then there's a six-byte string (0xA6). Finally, there's a zero byte indicating the integer zero.
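
    To make the byte-by-byte walkthrough concrete, here is a minimal Python sketch that builds the same 18 bytes by hand and compares the size with compact JSON. The function name encode_example is made up for illustration; it is not part of any MessagePack library.

```python
import json

def encode_example() -> bytes:
    """Hand-build the MessagePack bytes for {"compact": true, "schema": 0}."""
    out = bytearray()
    out.append(0x80 | 2)                 # fixmap marker: 0x80 plus the pair count (2)
    out.append(0xA0 | len(b"compact"))   # fixstr marker: 0xA0 plus the byte length (7)
    out += b"compact"
    out.append(0xC3)                     # the boolean true
    out.append(0xA0 | len(b"schema"))    # fixstr marker for a 6-byte string
    out += b"schema"
    out.append(0x00)                     # positive fixint 0
    return bytes(out)

packed = encode_example()
# separators=(",", ":") drops the optional spaces so the JSON is as small as it gets
compact_json = json.dumps({"compact": True, "schema": 0}, separators=(",", ":"))

print(packed.hex(" "))                   # 82 a7 63 6f 6d 70 61 63 74 c3 a6 73 63 68 65 6d 61 00
print(len(packed), len(compact_json))    # 18 27
```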

    You can probably puzzle it out for the most part. Any byte with a zero top bit (0x00 through 0x7F) is a fixed positive integer. Bytes from 0x80 through 0x8F encode a map of up to 15 elements, so 0x84 is a four-element map. For arrays, the prefix is 9 instead of 8, and short strings start with 0xA0 through 0xBF, with the low five bits holding the length, so you can easily encode strings of up to 31 bytes.
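
    The "fix" formats pack both the type and a small value or length into one byte. As a rough sketch of how a decoder might read that first byte, the hypothetical helper below (describe_marker is an invented name) sorts a byte into the ranges given in the MessagePack spec. It covers only the four families discussed here.

```python
def describe_marker(byte: int) -> str:
    """Describe the 'fix' family a leading MessagePack byte belongs to."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    if byte <= 0x7F:
        return f"positive fixint {byte}"               # top bit clear: the byte is the value
    if byte <= 0x8F:
        return f"fixmap with {byte & 0x0F} pairs"      # low 4 bits: number of key/value pairs
    if byte <= 0x9F:
        return f"fixarray with {byte & 0x0F} elements" # low 4 bits: number of elements
    if byte <= 0xBF:
        return f"fixstr of {byte & 0x1F} bytes"        # low 5 bits: length, so at most 31
    return "other format (see the spec)"

for b in (0x00, 0x82, 0x93, 0xA7, 0xBF, 0xC3):
    print(f"0x{b:02X}: {describe_marker(b)}")
```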

    Of course, you might need an integer bigger than 0x7F, right? So there are other integer formats such as 0xCC for 8-bit unsigned or 0xD3, which is a 64-bit signed big-endian number. Prefixes of 0xCA and 0xCB store 32- and 64-bit IEEE 754 floating point numbers, respectively.
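
    These wider formats are just a marker byte followed by a fixed-size big-endian payload, which maps neatly onto Python's standard struct module. The helper names below are invented for this sketch, and only the four markers mentioned above are shown.

```python
import struct

# ">" selects big-endian with no padding, matching MessagePack's byte order.
def pack_uint8(n: int) -> bytes:
    return struct.pack(">BB", 0xCC, n)     # marker + 1-byte unsigned integer

def pack_int64(n: int) -> bytes:
    return struct.pack(">Bq", 0xD3, n)     # marker + 8-byte signed integer

def pack_float32(x: float) -> bytes:
    return struct.pack(">Bf", 0xCA, x)     # marker + 4-byte IEEE 754 single

def pack_float64(x: float) -> bytes:
    return struct.pack(">Bd", 0xCB, x)     # marker + 8-byte IEEE 754 double

print(pack_uint8(200).hex(" "))    # cc c8
print(pack_int64(-1).hex(" "))     # d3 ff ff ff ff ff ff ff ff
print(pack_float32(1.0).hex(" "))  # ca 3f 80 00 00
print(pack_float64(1.0).hex(" "))  # cb 3f f0 00 00 00 00 00 00
```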

    For larger strings there is str8 (0xd9), str16 (0xda), and str32 (0xdb). In each case, the number in the name is the width in bits of the string's length field. So 0xd9 gets a single-byte count and 0xdb gets four bytes for the count. There are other formats, of course, and you can see them in the spec.
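
    Picking a string header comes down to choosing the smallest length field that fits. The sketch below assumes UTF-8 text and uses the spec's limit of 31 bytes for the one-byte fixstr form; pack_str is a hypothetical name, not a library function.

```python
import struct

def pack_str(text: str) -> bytes:
    """Encode a string with the smallest MessagePack str header that fits."""
    data = text.encode("utf-8")
    n = len(data)                              # lengths count bytes, not characters
    if n <= 31:
        header = bytes([0xA0 | n])             # fixstr: length in the low 5 bits
    elif n <= 0xFF:
        header = struct.pack(">BB", 0xD9, n)   # str8: 1-byte length
    elif n <= 0xFFFF:
        header = struct.pack(">BH", 0xDA, n)   # str16: 2-byte big-endian length
    elif n <= 0xFFFFFFFF:
        header = struct.pack(">BI", 0xDB, n)   # str32: 4-byte big-endian length
    else:
        raise ValueError("string too long for MessagePack")
    return header + data

print(pack_str("schema").hex(" "))        # a6 73 63 68 65 6d 61
print(pack_str("x" * 32)[:2].hex(" "))    # d9 20
print(pack_str("x" * 300)[:3].hex(" "))   # da 01 2c
```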

    The real trick, of course, is the availability of library code. The project claims support for over 50 languages on its web page. So if you are writing in C, C++, Haskell, Dart, Kotlin, or Matlab, you can find code to help you.

    We've seen a lot of JSON out there, and it will probably remain since most applications don't care about the efficiency of representing data. While XML has fallen out of favor because of its complexity, there are still places you run into it.



    via https://ift.tt/2U1Gl4P
