Put Your Head Down

Exploration is human nature. Everyone wants to know more, learn more, read more, write more, and do something more. There is a very thin line between exploring and being distracted, and we often use exploration as an excuse for distraction. Somewhere in our heads is the belief that the more aware we are of what is happening around us, the more productive we will be at our work.

In my opinion, this is not true.

Exploring and being amazed by what you find will not, on its own, get you anywhere. It is applying what you explore that gets you doing the things you always wanted to do. Discovering new motivational articles is not going to motivate you more than the last article did, though it may seem that way for a while. Nothing motivates you more than doing what you have always wanted to do and seeing it through to the end.

Read, but read well
Never stop reading. Set aside some time to read. It should not be something you do only when you happen to have time. Set aside two hours, read, and research the subject you are reading about. Make notes.

Write and write often
It is commonly believed that a good reader will also be a good writer. Again, this is not true. Writing is a different game. You will be amazed at how the words don’t make it out of your head onto paper. Writing needs constant practice, just like any other activity. The more you write, the better you will get.

Keep your knives sharp
Skills acquired over time need to be sharpened once in a while. Keep revising your skills and don't let them fade. It can be very frustrating to realize that something you were once good at has become a tough task.

Growth
Keep your eyes set on growth.

Relax and ponder
You need to relax. As human beings we can relax our bodies, but it is a battle to relax our minds. Stop reading, writing, or listening; sit aside and just ponder. The best ideas often surface when you ponder the things you already have in your head.

Be Simple
Bruce Lee puts this really well, “It’s not the daily increase but daily decrease. Hack away at the unessential.”

So put your head down right now and start working.

[HOW TO] Install Apache Hive

Environment Details
Operating System : Linux Mint Release 14
Hadoop Version : 0.20.2

Following are the steps to install Apache Hive:

  • Download Apache Hive

$ wget http://apache.mirrors.hoobly.com/hive/hive-0.10.0/hive-0.10.0.tar.gz

  • Untar the archive. I extracted it into /usr/local/, so the commands below are run from that directory:

$ tar -xzvf hive-0.10.0.tar.gz
$ mv hive-0.10.0 hive

  • Set the environment variable HIVE_HOME to point to the installation directory:

$ cd hive
$ export HIVE_HOME=/usr/local/hive

  • Add $HIVE_HOME/bin to your PATH, then launch the Hive shell:

$ export PATH=$HIVE_HOME/bin:$PATH
$ hive
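
The export commands above only last for the current terminal session. As a rough sketch of one way to make them permanent, the helper below prints the same two lines so that they can be appended to a shell startup file. The function name hive_env_lines and the choice of ~/.bashrc are my own assumptions for illustration, not part of Hive.

```shell
#!/bin/sh
# Hypothetical helper: print the export lines for a Hive install directory.
# Defaults to /usr/local/hive, the location used in the steps above.
hive_env_lines() {
    hive_dir="${1:-/usr/local/hive}"
    printf 'export HIVE_HOME=%s\n' "$hive_dir"
    # Single quotes keep $HIVE_HOME and $PATH unexpanded, so they are
    # resolved each time the startup file is read.
    printf 'export PATH=$HIVE_HOME/bin:$PATH\n'
}

# Example usage (assumes bash reads ~/.bashrc on startup):
#   hive_env_lines /usr/local/hive >> ~/.bashrc
```

Open a new terminal after appending the lines so the settings are picked up.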

Do let me know if you need any information.

Cloudera Certified Hadoop Developer (CCD-410)

I passed the Cloudera Certified Hadoop Developer (CCD-410) examination and want to share a few suggestions for those planning to take it.

Note : If you are here looking for questions that are part of the CCD-410 test, you have come to the wrong place. However, if you are here to learn what the test is like and how to prepare for it, you have come to the right place.

If you are serious about the Hadoop certification, I highly recommend the Cloudera Developer Training for Apache Hadoop. I attended it and found it very useful for understanding how Hadoop actually works. For more details on the Cloudera training you can read my blog post : Cloudera Developer Training for Apache Hadoop. Continue reading

[HOW TO] Connect 2 phones and make a simple call using Asterisk

Asterisk is a software implementation of a telephone private branch exchange (PBX); it was created in 1999 by Mark Spencer of Digium. Like any PBX, it allows attached telephones to make calls to one another, and to connect to other telephone services including the public switched telephone network (PSTN) and Voice over Internet Protocol (VoIP) services. Its name comes from the asterisk symbol, “*”. – Source : http://en.wikipedia.org/wiki/Asterisk_(PBX)

When I learned about Asterisk, I wanted to try connecting two phones on my LAN and making a call between them. I thought the whole exercise would be fun, and it was indeed. Continue reading

Introducing MapReduce – Part I

MapReduce is the programming model used to process data stored in HDFS. MapReduce programs are typically written in Java, but Hadoop also provides Streaming, which allows MapReduce programs to be written in other languages. All data emitted in the flow of a MapReduce program is in the form of <Key,Value> pairs.
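
To get a feel for the <Key,Value> flow without a cluster, here is a minimal local sketch of a word count using ordinary shell tools. It only imitates the map, shuffle and reduce stages on one machine; the function names mapper, reducer and wordcount are my own, and nothing here calls Hadoop itself.

```shell
#!/bin/sh
# Map: split the input into words and emit one "word<TAB>1" pair per word.
mapper() {
    tr -s '[:space:]' '\n' | awk 'NF { print $0 "\t" 1 }'
}

# Reduce: sum the values for each key and emit "word<TAB>total".
reducer() {
    awk -F '\t' '{ count[$1] += $2 }
                 END { for (k in count) print k "\t" count[k] }' | sort
}

# The sort between the two stages stands in for the shuffle step,
# which brings identical keys together before they are reduced.
wordcount() {
    mapper | sort | reducer
}

# Example usage:
#   printf 'to be or not to be\n' | wordcount
```

Every line passed between the stages is a key and a value separated by a tab, which is the same convention Hadoop Streaming uses by default.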

We have seen in the previous post a typical flow for the Hadoop system. Here we will break down the MapReduce program and try and understand each part in detail. Continue reading

Introducing Hadoop – Part II

Hadoop uses HDFS to store files efficiently in the cluster. When a file is placed in HDFS it is broken down into blocks of 64 MB each by default. These blocks are then replicated across the different nodes (DataNodes) in the cluster. The default replication value is 3, i.e. there will be 3 copies of each block in the cluster. We will see later on why we maintain replicas of the blocks in the cluster. Continue reading
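
As a quick worked example of the numbers above, the sketch below computes how many blocks a file is split into and how many block copies end up in the cluster. The helper names hdfs_blocks and hdfs_block_copies are hypothetical; the 64 MB block size and replication value of 3 are the defaults mentioned in this post.

```shell
#!/bin/sh
# Number of blocks for a file of the given size in MB (rounded up,
# because the last block may be only partly filled).
hdfs_blocks() {
    size_mb="$1"
    block_mb="${2:-64}"
    echo $(( (size_mb + block_mb - 1) / block_mb ))
}

# Total block copies stored in the cluster for that file.
hdfs_block_copies() {
    size_mb="$1"
    replication="${2:-3}"
    echo $(( $(hdfs_blocks "$size_mb") * replication ))
}

# Example usage: a 200 MB file
#   hdfs_blocks 200        # 4 blocks (three of 64 MB and one of 8 MB)
#   hdfs_block_copies 200  # 12 block copies across the DataNodes
```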

Introducing Hadoop – Part I

What is Hadoop?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. - Source : http://hadoop.apache.org/

The above definition can be a bit overwhelming when read all at once, but in reality it is a very simple concept. Continue reading