What is Hadoop?
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. – Source : http://hadoop.apache.org/
The above definition can be a bit overwhelming when read all at once, but in reality it is a very simple concept.
I just completed the Cloudera Developer Training for Apache Hadoop course, which was held in Denver, CO, USA from 4th Dec 2012 to 7th Dec 2012.
This program prepares you for the following certification:
Cloudera Certified Developer for Apache Hadoop CDH4 (CCD-410)
The course ran for 4 days and gives you a complete understanding of the Hadoop system (HDFS & MapReduce), along with a few tools that are part of the Hadoop ecosystem.
Flicksery is a simple Netflix Search Engine that uses filters to search the Netflix catalog to find the right movies for you.
Why did I develop Flicksery?
In one of the projects I am working on, there was a requirement to parse a very large XML file (around 1.2 GB) in Ruby. The traditional method of parsing, in which the whole XML file is loaded into memory and then parsed, was not a feasible approach for a file of this size.
So, I started exploring different methods for XML parsing and came across the libxml library.
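To give a rough idea of what parsing without loading the whole file looks like, here is a minimal sketch. Note that it does not use libxml: it uses the stream listener from REXML, which ships with Ruby, so that it runs without extra gems. The element name title, the TitleListener class and the stream_titles helper are made up for illustration and are not from the project mentioned above.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'
require 'stringio'

# Collects the text of every <title> element as the parser walks the input.
# No document tree is built; only the current element's text is buffered.
# (The element name "title" is a hypothetical example.)
class TitleListener
  include REXML::StreamListener

  attr_reader :titles

  def initialize
    @titles = []
    @in_title = false
    @buffer = +''
  end

  # Called when an opening tag is read.
  def tag_start(name, _attrs)
    return unless name == 'title'
    @in_title = true
    @buffer = +''
  end

  # Called for character data; text may arrive in more than one piece.
  def text(data)
    @buffer << data if @in_title
  end

  # Called when a closing tag is read.
  def tag_end(name)
    return unless name == 'title'
    @titles << @buffer.strip
    @in_title = false
  end
end

# Hypothetical helper: accepts any IO-like object (File, StringIO, ...).
def stream_titles(io)
  listener = TitleListener.new
  REXML::Document.parse_stream(io, listener)
  listener.titles
end

sample = StringIO.new(<<~XML)
  <catalog>
    <movie><title>First</title></movie>
    <movie><title>Second</title></movie>
  </catalog>
XML

puts stream_titles(sample).inspect
```

For a real file you would pass File.open("some_file.xml") instead of the StringIO. The same event-driven idea applies to the streaming readers that libxml offers, though the class and method names there are different.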
Note : You can watch the screencast for a more complete tutorial at Hadoop Screencasts – Installing Apache Hadoop
Following are the steps for installing Hadoop. I have just listed the steps, with a very brief explanation in some places. These are more or less reference notes for installation, which I wrote down when I was installing Hadoop on my system for the very first time.