Wednesday, March 19, 2014

Big Data

One of the biggest IT trends in the last few years has been 'Big Data'.

"Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications"

Data management has become an important competency in all kinds of organizations. Many of the world's leading companies are developing strategies to examine how they can transform their business using Big Data.

There are three main dimensions we have to consider in the Big Data arena.

  • Volume - Terabytes (or more) of data to be examined. 
  • Variety - Not only numbers; we also have to consider geospatial data, 3D data, audio and video, and unstructured text, including log files and social media.
  • Velocity - Massive amounts of data generated in real time, sometimes within microseconds.

According to GigaOM, the following are happening in the realm of Big Data.
  • Hadoop is becoming a true platform.
  • Artificial intelligence is finally becoming something of a reality.
  • New tools are making it possible for ordinary people to use analytics.
  • Big data and cloud computing are intersecting in a major way.
  • The legal system will attempt to develop and impose new rules.
As IT professionals we must be aware of what's going on with Big Data.

Sunday, March 9, 2014

ASP.Net Web API with BSON

BSON is a binary serialization format. "BSON" stands for "Binary JSON", but BSON and JSON are serialized very differently. BSON is "JSON-like", because objects are represented as name-value pairs, similar to JSON. Unlike JSON, numeric data types are stored as bytes, not strings. BSON is mainly used as a data storage and network transfer format in the MongoDB database.

According to ASP.Net Web API 2.1 release;

  • BSON was designed to be lightweight, easy to scan, and fast to encode/decode.
  • BSON is comparable in size to JSON. Depending on the data, a BSON payload may be smaller or larger than a JSON payload. For serializing binary data, such as an image file, BSON is smaller than JSON, because the binary data is not base64-encoded.
  • BSON documents are easy to scan because elements are prefixed with a length field, so a parser can skip elements without decoding them.
  • Encoding and decoding are efficient, because numeric data types are stored as numbers, not strings.

Native clients, such as .NET client apps, can benefit from using BSON in place of text-based formats such as JSON or XML. For browser clients, you will probably want to stick with JSON, because JavaScript can directly convert the JSON payload.
An advantage of Web API is content negotiation: the client can select the format in which it wants the data.

Create an ASP.NET Web API project and update the ASP.NET Web API NuGet packages.
In the WebApiConfig file, add a BsonMediaTypeFormatter. Now if the client requests "application/bson", Web API will use BSON formatting for the response.
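The original code was shown as an image; a minimal sketch of the registration step, using the standard project-template names (WebApiConfig, the DefaultApi route):

```csharp
using System.Net.Http.Formatting;
using System.Web.Http;

namespace BsonDemo
{
    public static class WebApiConfig
    {
        public static void Register(HttpConfiguration config)
        {
            config.Routes.MapHttpRoute(
                name: "DefaultApi",
                routeTemplate: "api/{controller}/{id}",
                defaults: new { id = RouteParameter.Optional });

            // Register the BSON formatter (shipped with Web API 2.1) so that
            // requests with Accept: application/bson get a BSON response.
            config.Formatters.Add(new BsonMediaTypeFormatter());
        }
    }
}
```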
Add a simple class called Student.
Add an API controller called StudentController and change the Index method like below.
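A sketch of both types, since the post's code was an image; the Student property names are assumptions, and the action simply echoes the posted object back so the wire format can be inspected:

```csharp
using System.Web.Http;

public class Student
{
    // Hypothetical properties; the original post's class is not shown.
    public int Id { get; set; }
    public string Name { get; set; }
    public double Gpa { get; set; }
}

public class StudentController : ApiController
{
    // [HttpPost] lets the verb-based action selector match this
    // non-conventionally-named action for POST api/student.
    [HttpPost]
    public Student Index(Student student)
    {
        // Echo the posted student back; the response format depends
        // on the request's Accept header (JSON or BSON).
        return student;
    }
}
```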
Using Fiddler, compose a JSON message and POST it to the StudentController; the response comes back as JSON.
Now set the Accept header to application/bson, and the response comes back in BSON format.
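A native .NET client can request BSON the same way. A sketch using HttpClient, assuming the hypothetical Student class above and a placeholder base URL; ReadAsAsync with a BsonMediaTypeFormatter deserializes the binary payload:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Formatting;
using System.Net.Http.Headers;

// Minimal copy of the model so the sketch is self-contained.
public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }
    public double Gpa { get; set; }
}

class Program
{
    static void Main()
    {
        using (var client = new HttpClient())
        {
            // Placeholder URL for the Web API project.
            client.BaseAddress = new Uri("http://localhost:12345/");

            // Ask the server for BSON via content negotiation.
            client.DefaultRequestHeaders.Accept.Add(
                new MediaTypeWithQualityHeaderValue("application/bson"));

            var response = client.PostAsJsonAsync(
                "api/student", new Student { Id = 1, Name = "Ann" }).Result;

            // Deserialize the BSON response body back into a Student.
            var formatters = new MediaTypeFormatter[] { new BsonMediaTypeFormatter() };
            var student = response.Content.ReadAsAsync<Student>(formatters).Result;
            Console.WriteLine(student.Name);
        }
    }
}
```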
Generally speaking, if your service returns mostly binary, numeric, and non-textual data, BSON is a good choice.

Saturday, March 8, 2014

Windows Azure Queue Storage

In my previous post I explained Windows Azure Storage and how to create it via the Azure portal. In this post I'll explain Azure Queue storage.

Windows Azure Queue storage is a service for storing large numbers of messages that can be accessed from anywhere in the world via authenticated calls using HTTP or HTTPS.

If you are going to use development storage, you may have to consider the Azure storage client library version and the emulator version. When I was using development storage version 2.2 with Azure storage client library version 3, I got errors related to invalid HTTP headers and 400 Bad Request responses. Please go through this link if you are going to use development storage.

Add a new cloud project, and add a web role and a worker role.
Expand Server Explorer; you can see the Azure storage explorer.
Add a QueueStorage connection string to the web role's cloud service configuration file.
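The connection string lives in the cloud service configuration (.cscfg) file; a minimal sketch, where the setting name "QueueStorage" and the account values are placeholders (for local testing you can use "UseDevelopmentStorage=true" as the value instead):

```xml
<Role name="WebRole1">
  <ConfigurationSettings>
    <!-- Placeholder account name and key. -->
    <Setting name="QueueStorage"
             value="DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=base64key==" />
  </ConfigurationSettings>
</Role>
```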
Add the following code to the Home controller.
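A sketch of the queue-writing code, assuming the "QueueStorage" setting from the config file and a hypothetical queue name "sample-queue":

```csharp
using System.Web.Mvc;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public class HomeController : Controller
{
    public ActionResult Index()
    {
        // Read the connection string added to the cloud service config.
        var account = CloudStorageAccount.Parse(
            CloudConfigurationManager.GetSetting("QueueStorage"));

        var queueClient = account.CreateCloudQueueClient();

        // Queue names must be all lowercase (see the naming rules below).
        var queue = queueClient.GetQueueReference("sample-queue");
        queue.CreateIfNotExists();

        queue.AddMessage(new CloudQueueMessage("Hello from the web role!"));
        return View();
    }
}
```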
You must follow these naming conventions for Azure queues.

  • A queue name must start with a letter or number, and can only contain letters, numbers, and the dash (-) character. 
  • The first and last letters in the queue name must be alphanumeric. The dash (-) character cannot be the first or last character. Consecutive dash characters are not permitted in the queue name.
  • All letters in a queue name must be lowercase.
  • A queue name must be from 3 through 63 characters long.
After running this code, it will create a new queue in Azure and add a message. But you won't be able to see the queue in the Azure portal, because that feature is not yet provided by the Azure team. Refresh the storage explorer in Visual Studio and open the queue; you can see the message in the queue. An Azure queue message will expire after 7 days.

Now we'll look at how to consume queue messages from the worker role.
Add the following code to the worker role.
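A sketch of the worker role's Run loop, assuming the same "QueueStorage" setting and "sample-queue" name as on the web role side:

```csharp
using System.Diagnostics;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public class WorkerRole : RoleEntryPoint
{
    public override void Run()
    {
        var account = CloudStorageAccount.Parse(
            CloudConfigurationManager.GetSetting("QueueStorage"));
        var queue = account.CreateCloudQueueClient()
                           .GetQueueReference("sample-queue");
        queue.CreateIfNotExists();

        while (true)
        {
            // GetMessage makes the message invisible to other consumers
            // for the visibility timeout (30 seconds by default).
            CloudQueueMessage message = queue.GetMessage();
            if (message != null)
            {
                Trace.TraceInformation("Processing: " + message.AsString);

                // Delete only after successful processing; if the role
                // crashes before this, the message reappears in the queue.
                queue.DeleteMessage(message);
            }
            else
            {
                Thread.Sleep(1000);
            }
        }
    }
}
```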
After you call queue.GetMessage(), that message becomes invisible, which means no one else is allowed to read it. At the end you can delete the message. But if the process crashes, after some time the message becomes visible again in the queue; that is, a queue message has an invisibility (visibility timeout) period. You can further read about poison message handling in queues.

Windows Azure Storage

Windows Azure storage consists of blobs, tables, and queues. It is accessible with HTTP/HTTPS requests. It is distinct from SQL Azure.
If you try to create a storage account from the Azure portal, you have to enter the following details.
Location/Affinity group - The geographic location of your cloud service deployments, which affects performance by putting the service near your target audience.

Geo-redundant replication - This replicates your data to a secondary location within the same region, which enables failover handling.

Locally redundant storage - Locally redundant data is replicated three times within the same data center. All storage in Windows Azure is locally redundant.

After you create the storage account, on the dashboard you can see the endpoint URLs. These represent the namespaces of blob, queue, and table storage in your account.
Container - This provides a grouping for a set of stored items. A storage account can contain an unlimited number of containers.
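For example, a blob container can be created from code with the storage client library; a minimal sketch, where the account name, key, and container name are placeholders:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Blob;

class Program
{
    static void Main()
    {
        // Placeholder credentials; use your storage account name and key.
        var account = new CloudStorageAccount(
            new StorageCredentials("myaccount", "base64key=="), useHttps: true);

        var blobClient = account.CreateCloudBlobClient();

        // The container is addressed under the blob endpoint, e.g.
        // https://myaccount.blob.core.windows.net/images
        var container = blobClient.GetContainerReference("images");
        container.CreateIfNotExists();
    }
}
```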