sergey-podolsky
Binary serialization - sucks
• Language-specific
• Not safe (see “Effective Java”)
• Not extensible
XML, JSON – back to 1999
• Too verbose
• Need to parse
• Slow performance
• Huge size if not compressed
• No strong types (int vs float)
• Need to store field names
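To put a rough number on "too verbose" and "need to store field names", the sketch below serializes one small record as minified JSON and as a naive length-prefixed binary layout. The binary layout is invented for this comparison only (it is not Protobuf, Avro, or Thrift), and the record is a hypothetical example.

```python
import json
import struct

# Hypothetical sample record, used only for the size comparison.
person = {
    "userName": "Martin",
    "favouriteNumber": 1337,
    "interests": ["daydreaming", "hacking"],
}

# Minified JSON: field names and punctuation are repeated in every record.
json_bytes = json.dumps(person, separators=(",", ":")).encode("utf-8")


def pack_str(text):
    """One length byte followed by UTF-8 data (assumes strings under 256 bytes)."""
    data = text.encode("utf-8")
    return struct.pack(">B", len(data)) + data


# Naive fixed layout, made up for this sketch: no field names at all,
# so the reader must already know the field order and types.
binary = (
    pack_str(person["userName"])
    + struct.pack(">q", person["favouriteNumber"])   # 8-byte signed integer
    + struct.pack(">B", len(person["interests"]))    # item count
    + b"".join(pack_str(item) for item in person["interests"])
)

# Bytes spent purely on field names in the JSON form.
field_name_bytes = sum(len(name) for name in person)

print(len(json_bytes), len(binary), field_name_bytes)
```

The naive layout is smaller but has no room for schema changes; the formats on the following slides differ mainly in how they recover that flexibility.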
Google Protocol Buffers
• Cross-language
• Schema evolution
• Compact
• Strongly typed
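Schema evolution rests on the wire format: every field is prefixed with its number and wire type, so a reader built from an older schema can step over fields it does not know. The sketch below is a hand-written decoder for two wire types only (varint and length-delimited). It does not use the protobuf library, and the function names and the sample payload are made up for illustration.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_known_fields(buf, known):
    """Return {name: [values]} for field numbers in `known`, skipping all others."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:                    # length-delimited (string, bytes)
            length, pos = read_varint(buf, pos)
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:                     # unknown field numbers are dropped
            fields.setdefault(known[number], []).append(value)
    return fields


# Hypothetical payload from a newer writer: fields 1 and 2 plus a new field 4.
payload = b"\x0a\x06Martin" + b"\x10\xb9\x0a" + b"\x22\x03new"

# The older reader only knows fields 1 and 2.
old_reader = {1: "user_name", 2: "favourite_number"}
decoded = decode_known_fields(payload, old_reader)
print(decoded)
```

The generated Protobuf classes go further than this sketch: they keep unknown fields around so they can be written back out unchanged.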
Message

Person.json
    {
        "userName": "Martin",
        "favouriteNumber": 1337,
        "interests": ["daydreaming", "hacking"]
    }

Person.proto
    message Person {
        required string user_name        = 1;
        optional int64  favourite_number = 2;
        repeated string interests        = 3;
    }
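To show where the compactness comes from, the following sketch hand-encodes the Person example in the Protocol Buffers wire format: each field is a varint key (field number and wire type) followed by the value, and strings are length-prefixed. It deliberately avoids the protobuf library; the helper names are invented for this example, and real code would use the classes generated by protoc.

```python
VARINT = 0             # wire type for int32/int64/bool/enum
LENGTH_DELIMITED = 2   # wire type for string/bytes/embedded messages


def encode_varint(n):
    """Base-128 varint: 7 bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_key(field_number, wire_type):
    return encode_varint((field_number << 3) | wire_type)


def encode_string(field_number, text):
    data = text.encode("utf-8")
    return encode_key(field_number, LENGTH_DELIMITED) + encode_varint(len(data)) + data


def encode_int64(field_number, value):
    # Negative values are sent as 64-bit two's complement (10 bytes on the wire).
    return encode_key(field_number, VARINT) + encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def encode_person(user_name, favourite_number, interests):
    out = encode_string(1, user_name)
    if favourite_number is not None:           # optional field: omitted when absent
        out += encode_int64(2, favourite_number)
    for interest in interests:                 # repeated field: same tag, once per item
        out += encode_string(3, interest)
    return out


payload = encode_person("Martin", 1337, ["daydreaming", "hacking"])
print(len(payload), payload.hex())
```

The record comes to 33 bytes, against 82 bytes for the same data as minified JSON, because field names are replaced by one-byte tags and 1337 fits in a two-byte varint.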
Options
• Outer class
• Default value
• Deprecated
• Speed / Size
• Custom options
• descriptor.proto
Our protobuf use cases:
• Java, C++, C#
• Payload for ZeroC ICE
• IBM MQ / Solace messages
• DB raw data
• Log messages to disk
• Compress using TAR.GZIP
• Show as XML / JSON
• exe utility associated with protobuf files
Disadvantages
• No Map<K, V> / Dictionary<K, V>
• No Set<T>
• No short / int16 / uint16
• No interning
• Generated classes are immutable
• Compiler and runtime library versions are not backwards compatible with each other
• descriptor.proto is not backwards compatible
• Few officially supported languages
• Enum is not extensible (an unknown value resets to 0)
Apache Avro
• No field tags
• Schema is required
• The entire record is tagged by schema ID
• Fields are matched by name
• No optional values: union { null, long } is used instead
• Resolution rules are used for server vs client schemas
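The resolution rules can be sketched for the simplest case, a flat record: reader fields are matched to writer fields by name, writer-only fields are ignored, and a reader-only field needs a default. The sketch below works on already-decoded dictionaries and leaves out type promotion, aliases, and nested types. The function name and the `email` field are hypothetical and this is not the Avro library's API.

```python
def resolve_record(writer_record, reader_schema):
    """Project a record decoded with the writer's schema onto the reader's schema."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_record:
            resolved[name] = writer_record[name]     # matched by name, not position
        elif "default" in field:
            resolved[name] = field["default"]        # reader-only field: use default
        else:
            raise ValueError(f"no value and no default for reader field {name!r}")
    return resolved                                  # writer-only fields are dropped


# Hypothetical reader schema: drops `interests`, adds `email` with a default.
reader_schema = {
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "userName", "type": "string"},
        {"name": "favouriteNumber", "type": ["null", "long"]},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}

written = {"userName": "Martin", "favouriteNumber": 1337,
           "interests": ["daydreaming", "hacking"]}
resolved = resolve_record(written, reader_schema)
print(resolved)
```

Because matching is by name, fields can be reordered freely between schema versions, but renaming a field is a breaking change unless aliases are used.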
Apache Avro

JSON notation
    {
        "type": "record",
        "name": "Person",
        "fields": [
            {"name": "userName", "type": "string"},
            {"name": "favouriteNumber", "type": ["null", "long"]},
            {"name": "interests", "type": {"type": "array", "items": "string"}}
        ]
    }

IDL
    record Person {
        string userName;
        union { null, long } favouriteNumber;
        array<string> interests;
    }
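With no tags on the wire, an Avro record is just its field values concatenated in schema order. The sketch below hand-encodes a Person record under the schema above, following the Avro binary encoding: longs are zigzag varints, strings are length-prefixed, a union is a branch index followed by the value, and an array is a counted block closed by a zero. It does not use the Avro library, and the helper names are made up for this example.

```python
def encode_long(n):
    """Avro int/long: zigzag-encode, then write as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)          # zigzag: small magnitudes become small codes
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_string(text):
    data = text.encode("utf-8")
    return encode_long(len(data)) + data


def encode_person(user_name, favourite_number, interests):
    out = encode_string(user_name)
    # union { null, long }: branch index first, then the value (nothing for null).
    if favourite_number is None:
        out += encode_long(0)
    else:
        out += encode_long(1) + encode_long(favourite_number)
    # array<string>: one block with an item count, then a zero count to terminate.
    if interests:
        out += encode_long(len(interests))
        out += b"".join(encode_string(item) for item in interests)
    out += encode_long(0)
    return out


payload = encode_person("Martin", 1337, ["daydreaming", "hacking"])
print(len(payload), payload.hex())
```

The result is 32 bytes with no field identifiers at all, which is why the reader cannot decode it without the exact schema the writer used.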
Apache Thrift
• “one-stop shop”
• RPC framework
• Different serialization formats (“protocols”)
Apache Thrift

    struct Person {
        1: string userName,
        2: optional i64 favouriteNumber,
        3: list<string> interests
    }
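For comparison, here is a sketch of the same struct as laid out by Thrift's BinaryProtocol, the simplest of its serialization formats: each field is a one-byte type, a two-byte field id, and a fixed-width big-endian value, and the struct ends with a stop byte. The code covers the struct body only (no RPC message envelope) and does not use the Thrift library; the function names are hypothetical.

```python
import struct

# Thrift type identifiers used on the wire.
T_STOP, T_I64, T_STRING, T_LIST = 0, 10, 11, 15


def field_header(ttype, field_id):
    """One type byte followed by a 16-bit big-endian field id."""
    return struct.pack(">bh", ttype, field_id)


def write_string(text):
    """32-bit big-endian length followed by UTF-8 data."""
    data = text.encode("utf-8")
    return struct.pack(">i", len(data)) + data


def encode_person(user_name, favourite_number, interests):
    out = field_header(T_STRING, 1) + write_string(user_name)
    if favourite_number is not None:               # optional: omitted when unset
        out += field_header(T_I64, 2) + struct.pack(">q", favourite_number)
    # list<string>: element type byte, 32-bit count, then the elements.
    out += field_header(T_LIST, 3) + struct.pack(">bi", T_STRING, len(interests))
    out += b"".join(write_string(item) for item in interests)
    out += struct.pack(">b", T_STOP)               # end-of-struct marker
    return out


payload = encode_person("Martin", 1337, ["daydreaming", "hacking"])
print(len(payload), payload.hex())
```

This comes to 59 bytes. Thrift also offers a CompactProtocol that uses variable-length integers and packs the type and field id more tightly, at the cost of a slightly more involved encoding.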
Comparison

| | Thrift | Protobuf |
|---|---|---|
| Language Bindings | Java, C++, Python, C#, Cocoa, Erlang, Haskell, OCaml, Perl, PHP, Ruby, Smalltalk | Java, C++, Python |
| Primitive Types | bool, byte, 16/32/64-bit integers, double, string, byte sequence, map<t1,t2>, list<t>, set<t> | bool, 32/64-bit integers, float, double, string, byte sequence, “repeated” properties act like lists |
| Enumerations | Yes | Yes |
| Constants | Yes | No |
| Composite Type | Struct | Message |
| Exception Handling | Yes | No |
| Documentation | Lacking | Good |
| License | Apache | BSD-style |
| Compiler | C++ | C++ |
| RPC Interfaces | Yes | Yes |
| RPC Implementation | Yes | No |
| Composite Type Extensions | No | Yes |
| Data Versioning | Yes | Yes |
Pros

Thrift
- More languages supported out of the box
- Richer data structures than Protobuf (e.g. Map and Set)
- Includes RPC implementation for services

Protobuf
- Slightly faster than Thrift when using "optimize_for = SPEED"
- Serialized objects slightly smaller than Thrift due to more compact encoding
- Better documentation
- API a bit cleaner than Thrift

Cons

Thrift
- Good examples are hard to find
- Missing/incomplete documentation

Protobuf
- .proto can define services, but no RPC implementation is provided (although stubs are generated for you)