Apache Avro is a data serialization framework in the Hadoop ecosystem. The main tasks of Avro are:
- Provide complex data structures
- Provide a compact and fast binary data format
- Provide a container file to persist data
- Provide remote procedure calls (RPC)
- Enable simple integration with dynamic languages
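To give a feel for why the binary format is compact, here is a minimal sketch of how Avro's binary encoding writes int and long values (a variable-length zigzag encoding) and strings (a length followed by UTF-8 bytes). The function names `zigzag_encode`, `encode_long` and `encode_string` are made up for this illustration and are not part of any Avro library.

```python
def zigzag_encode(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one so small magnitudes stay small."""
    return (n << 1) ^ (n >> 63)


def encode_long(n: int) -> bytes:
    """Write a signed integer as a zigzag varint: 7 bits per byte, low-order group first."""
    z = zigzag_encode(n) & 0xFFFFFFFFFFFFFFFF
    out = bytearray()
    while z > 0x7F:
        # Set the high bit to signal that more bytes follow.
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Write a string as its byte length (encoded as a long) followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


print(encode_long(1).hex())        # 02
print(encode_long(64).hex())       # 8001
print(encode_string("foo").hex())  # 06666f6f
```

Small numbers take a single byte, and no field names or type tags are written next to the values, because the reader gets that information from the schema.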
Avro schemas are defined in JSON and support several different types:
Primitive types
- null, boolean, int, long, float, double, bytes and string
Complex types
- record, enum, array, map, union and fixed
The sample below shows an Avro schema for a person record:
{
  "namespace": "person.avro",
  "type": "record",
  "name": "Person",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": ["int", "null"]},
    {"name": "street", "type": ["string", "null"]}
  ]
}
Table 4: An Avro schema
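To see how the union types in this schema behave, the following sketch loads the schema with Python's standard `json` module and checks two records against it by hand. It is a simplified illustration, not the Avro library: the helpers `matches_type` and `check_record` are hypothetical, handle only the primitive types and unions used in this schema, and require every field to be present.

```python
import json

SCHEMA_JSON = """
{"namespace": "person.avro",
 "type": "record", "name": "Person",
 "fields": [
   {"name": "name", "type": "string"},
   {"name": "age", "type": ["int", "null"]},
   {"name": "street", "type": ["string", "null"]}
 ]}
"""

# Python types accepted for the Avro types that appear in this schema only.
PRIMITIVES = {"string": str, "int": int, "null": type(None)}


def matches_type(value, avro_type) -> bool:
    """Return True if value fits the given Avro type (a primitive name or a union list)."""
    if isinstance(avro_type, list):
        # A union matches if any one of its branches matches.
        return any(matches_type(value, branch) for branch in avro_type)
    expected = PRIMITIVES[avro_type]
    if expected is int and isinstance(value, bool):
        # bool is a subclass of int in Python, but it is not an Avro int.
        return False
    return isinstance(value, expected)


def check_record(record: dict, schema: dict) -> bool:
    """Return True if every schema field is present in record with a matching value."""
    return all(
        field["name"] in record
        and matches_type(record[field["name"]], field["type"])
        for field in schema["fields"]
    )


schema = json.loads(SCHEMA_JSON)
ok = check_record({"name": "Ada", "age": 36, "street": None}, schema)
bad = check_record({"name": "Ada", "age": "36", "street": None}, schema)
print(ok, bad)  # True False
```

The second record is rejected because "36" is a string, and the age field only accepts an int or null.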