big data Big Data Big Data Technologies

Big Data 101: Data Representation as part of Variety

Representation is an often-mentioned characteristic for Big Data. It goes well with “Variety” in the above stated definition. Each Data is represented in a specific form and it doesn’t matter what form it is. Well-known forms of Data are XML, Json, CSV or binary. Depending on the Representation of Data, different possibilities regarding relations can be integrated. XML and Json for instance allows us to set child-objects or relations for data, whereas it is rather hard with CSV or binary. A possibility for relations can be a dataset of the type “Person”. Each person consists of some attributes that identify the person (e.g. the last name, age, sex) and an address that is an independent entity. To retrieve this data as CSV or binary, you either have to do two queries or create a new entity for a query where the data is merged. XML and Json allows us to nest entities in other entities.


The in Figure described entity would look like the following, if presented in XML:











Listing 1: XML representation of the entity “person”

Similar to that, the Json representation of our Model “Person” would look slightly similar:

[Person :[Common :

[“firstname” : “Mario”, “lastname” : “Meir-Huber”, “Age” : 29]


[Address :

[“zipcode” : “1150”, “city” : “Vienna”]



Listing 2: Json interpretation

If we now look at how we could represent this data from a database as binary data, we need to join two different datasets. This is basically supported by SQL. A possible representation could look like the following:

p.Firstname p.Lastname p.Age a.Zipcode a.City
Mario Meir-Huber 29 1150 Vienna

Listing 3: SQL-based binary representation

The representation of Data isn’t limited to what was described in this chapter so far. There are several other formats available and others might arise in the future. However, data must have a clear and documented representation in a form that can be processed by Tools that built upon that data.

I lead a team of Senior Experts in Data & Data Science as Head of Data & Analytics and AI at A1 Telekom Austria Group. I also teach this topic at various universities and frequently speak at various Conferences. In 2010 I wrote a book about Cloud Computing, which is often used at German & Austrian Universities. In my home country (Austria) I am part of several organisations on Big Data & Data Science.

0 comments on “Big Data 101: Data Representation as part of Variety

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: