Knoldus

Here I am to going to  write a blog on Hadoop!

“Bigdata is not about data! The value in Bigdata [is in] the analytics. ”

-Harvard Prof. Gary King

So the Hadoop came into Introduction!

Hadoop is an open source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.

                                                                                                                -Cloudera

Hadoop was created by computer scientists Doug Cutting and Mike Cafarella in 2006 to support distribution for the Nutch search engine. It was inspired by Google’s MapReduce.

Why HADOOP?

The problem with RDBMS is , it can not processed semi-structured and unstructured data (text, videos, audios, Facebook posts, clickstream data, etc.). It can only work with structured data(banking transaction, location information, etc.). Both are also different in term of processing data.

RDBMS architecture with ER…

View original post 753 more words

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s