Cloudera Developer Training for Apache Hadoop




I just completed the Cloudera Developer Training for Apache Hadoop course, which was held in Denver, CO, USA from 4th Dec 2012 to 7th Dec 2012.

This program prepares you for the following certification:

Cloudera Certified Developer for Apache Hadoop CDH4 (CCD-410)

The course is spread over 4 days and gives you a thorough understanding of the core Hadoop system (HDFS and MapReduce), along with a few tools that are part of the Hadoop ecosystem.

Day 1

  • The Motivation for Hadoop
  • Hadoop: Basic Concepts

This was a good start: we were given a detailed overview of the HDFS architecture and the reasons why the system was built this way. A few examples of practical real-world problems, and how Hadoop comes to the rescue, were also mentioned. A lot of emphasis was placed on each of the basic daemons that run within Hadoop and their respective responsibilities. The pace of the training on the first day was very slow, which helped everyone who was completely new to Hadoop get a good grasp of the basic concepts and left time for all doubts to be clarified in detail.

Day 2

  • Writing a MapReduce Program
  • Unit Testing MapReduce Programs
  • Delving Deeper into the Hadoop API

This day introduced us to how a MapReduce program runs and how such programs are coded and executed within the Hadoop environment. All MapReduce examples were in Java. However, there was also a brief discussion of how other languages can be used for MapReduce development (streaming); the language for the streaming example was Python. A hands-on session on MRUnit (which works much like JUnit) was very useful in showing how simple it is to unit test MapReduce code. All MapReduce code was written in the Eclipse IDE, which I personally found very useful.
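To give a feel for the streaming style, here is a minimal word count sketch in Python. It is not the course's exercise code: the function names (`mapper`, `reducer`, `run_local`) are my own, and the whole map, sort and reduce flow is simulated in a single process rather than run on a cluster. In a real streaming job the mapper and reducer would be separate scripts reading from standard input.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(sorted_pairs):
    """Sum the counts for each word; the input must be sorted by key."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in one process (no Hadoop)."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    for word, count in sorted(run_local(sample).items()):
        print(f"{word}\t{count}")
```

Because the mapper and reducer are plain functions, they can be unit tested without a cluster, which is the same idea MRUnit provides for Java code.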

Day 3

  • Practical Development Tips and Techniques
  • Data Input and Output
  • Common MapReduce Algorithms

A simple demo showed how to use Eclipse to run and debug MapReduce programs in LocalJobRunner mode. The exercise on counters gave good insight into how counters are handled within Hadoop. The most useful exercise of the day was using SequenceFiles along with compression, which is a good way to merge a large number of small files. A complete module on some common MapReduce algorithms was also covered; I felt this part of the course could have gone into a bit more detail.
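Counters are easy to experiment with outside the classroom. The course exercise was in Java; the sketch below instead assumes Hadoop Streaming, where a task updates a counter by writing a line of the form `reporter:counter:<group>,<counter>,<amount>` to standard error. The `user,action` record format, the `Records` group name and the `counting_mapper` function are all hypothetical, chosen only for illustration.

```python
import io
import sys


def counting_mapper(lines, err=sys.stderr):
    """Yield (user, 1) for valid 'user,action' records and count bad ones."""
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 2 or not fields[0]:
            # Hadoop Streaming picks up counter updates written to stderr.
            err.write("reporter:counter:Records,Malformed,1\n")
            continue
        err.write("reporter:counter:Records,Valid,1\n")
        yield fields[0], 1


if __name__ == "__main__":
    # Local dry run: capture the counter lines instead of sending them to Hadoop.
    captured = io.StringIO()
    for user, one in counting_mapper(["alice,login", "garbage"], captured):
        print(f"{user}\t{one}")
    print(captured.getvalue(), end="")
```

Passing the error stream in as a parameter keeps the mapper testable: a test can hand it an in-memory buffer and check which counters were incremented.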

Day 4

  • Joining Data Sets in MapReduce
  • Integrating Hadoop into the Enterprise Workflow
  • Machine Learning and Mahout
  • An Introduction to Hive and Pig
  • An Introduction to Oozie

The day started with a detailed discussion on performing joins using MapReduce. I am not sure anyone would really hand-code joins in MapReduce, but it was good to understand the complexity involved. The rest of the day dealt with some of the tools that are part of the Hadoop ecosystem, such as Sqoop, Mahout, Hive and Pig. There were hands-on exercises for these tools; the last tool, Oozie, was left as a homework exercise. The exercises on Hive and Pig made it clear that one should use these tools to perform joins rather than raw MapReduce.
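To show where the complexity comes from, here is a rough sketch of one common approach, a reduce-side join, simulated in plain Python. It is my own illustration and not the course material: the customer and order data sets, the `"C"`/`"O"` tags and all function names are assumptions. Each mapper tags its records with the source table, the shuffle groups them by key, and the reducer pairs up the two sides.

```python
from collections import defaultdict


def map_customers(records):
    """Tag each (customer_id, name) record with its source table."""
    for customer_id, name in records:
        yield customer_id, ("C", name)


def map_orders(records):
    """Tag each (customer_id, amount) record with its source table."""
    for customer_id, amount in records:
        yield customer_id, ("O", amount)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(key, values):
    """Inner join: pair every customer name with every order for this key."""
    names = [v for tag, v in values if tag == "C"]
    amounts = [v for tag, v in values if tag == "O"]
    for name in names:
        for amount in amounts:
            yield key, name, amount


def join(customers, orders):
    """Run the simulated map, shuffle and reduce phases in one process."""
    pairs = list(map_customers(customers)) + list(map_orders(orders))
    groups = shuffle(pairs)
    return [row for key in sorted(groups) for row in reduce_join(key, groups[key])]


if __name__ == "__main__":
    customers = [(1, "Ann"), (2, "Bob")]
    orders = [(1, 10.0), (1, 5.5), (3, 7.0)]
    for row in join(customers, orders):
        print(row)
```

All of this tagging and grouping is what a single JOIN clause in Hive or Pig hides from you.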

After the completion of the course, Jesse (the instructor) spoke about his Million Monkeys project. This was very interesting. You should definitely check it out.

About the Instructor

Jesse Anderson is an Instructor and Curriculum Developer at Cloudera.

He was almost always very clear and kept both the programmers and the non-programmers engaged throughout. He often reiterated important points and made sure to revisit a few concepts time and again. Overall, a fantastic instructor.

A useful tip that he gave was to get a good grip on Regular Expressions.

Do take a look at his website; he has tons of information there:

Jesse Anderson’s website

Who should take the course?

1. Anyone who would like to get into Hadoop, either as a developer or as a solutions designer. This course is definitely a good starting point for exploring Hadoop’s possibilities.

2. Anyone who is evaluating Hadoop to see whether it really fits their organization’s needs and their existing ecosystem.

Would I recommend taking this course?

Yes. Before attending the course, I too had my doubts, especially considering the course fee ($2995 at the time I took it), which I paid myself. However, the course definitely brought to light a lot of things that I would not have picked up just by reading books and the internet. The course also gives you a voucher code to appear for the certification exam.

Note:

Before attending the course I had gone through the Hadoop Tutorial from Yahoo Developer Network, which allowed me to grasp the concepts with ease. I highly recommend it.

Do let me know if you need any information.



29 Comments

  1. rohith Janga

    Thanks for info rohith.
    Do you think one can crack the exam without the training ($3000 is what bothers me), just by reading a couple of books like
    1. Hadoop in Action
    2. Definitive Guide (YDN).

    1. Rohit Menon

      Hi Rohith,

      The training is not a must. I am sure there have been many who have cleared the certification solely on understanding the concepts and referring to the books you have mentioned.
      Spend enough time really visualizing the system and understanding every component of the system.

      Do let me know if you need any information.

  2. Naveen

    I am taking a Hadoop developer course from edureka and later want to take the CCD exam. Can you please recommend some good books for clearing the exam?

    1. Rohit Menon

      Hi Naveen,

      I prefer the following books:
      1. Yahoo Developer Network Hadoop Tutorials (Online)
      2. Hadoop in Action by Chuck Lam
      3. Hadoop – The Definitive Guide.

      All the best !

  3. sanju

    Hi Rohit,
    I have recently joined the big data team in my organization and am pretty new to the Hadoop world. I was part of DW (MSBI). I'm not from a Java background. How difficult will it be for me to take up this certification? I am very keen on both learning about Hadoop and taking up this certification, but Java is a roadblock for me 🙁

    1. Rohit Menon

      Hi Sanju,

      Java is not a prerequisite for the certification. Go through the Yahoo Developer Hadoop Tutorial, followed by the Hadoop in Action book. If you have a thorough understanding of these two, you should be good for the certification. The certification does not ask you to write any kind of Java programs.

  4. Aks K

    I attended the Cloudera ‘Developer’ training, which ran for 4 days. Here is my review of the training.

    They had good material to go through, but unfortunately our trainer was a newbie.
    He could not answer more than 5% of the questions and could not debug a single programming-related issue. I am 100% sure that he has not written any MapReduce jobs.
    There was a lady in our group who had been hired as a consultant by Cloudera, and she said she was going to provide the same training to other students in the near future.

    I called Cloudera and shared my learning experience. The instructor could not explain any concept in even one layer of detail, let alone go deep into it. The person I spoke with was surprised and then offered to let me attend the same training again at no cost.
    I know none of the students were happy, and one guy emailed me that he had also got a similar offer from Cloudera.
    I can retake the course, but then I will lose 4 days of pay, as my company is not going to give me days off for the same training.

    I am telling you this so you know that the training you are going to attend is going to be a waste of your time and money. Even if it is your company that pays for it, you will not be able to show them any results.
    In a way, Cloudera admitted the quality of their training when they told me that if I decided to take the upcoming training, they would change the trainer. That shows how much they trust their trainer.
    On a side note, all the best to students who are going to attend “Cloudera Developer Training for Apache Hadoop” on Aug 19 – Aug 22 in Chicago area.

    If you just want to take a break from your work, then go for it and I assure you that you will get what you want.
    But if you really want to learn Hadoop, then better look somewhere else.

    1. Vic

      Hi Aks K,

      I am also planning to take up the training within a couple of months. If you don't mind, can you share the tutor's name to my email id – vichhu@gmail.com, so that I can make sure before attending the training?

      Also, I am not from the DW side; I am a Java developer. Will this be an issue in getting trained in Hadoop? Please advise.

      thanks,

  5. swapna

    Hi Rohit,
    I am debating between Hadoop administrator and developer. I am not from a Java background. I am good at Oracle PL/SQL and Unix shell scripting. Can you please suggest which would be a good option?

    Thanks
    Swapna.

    1. Rohit Menon

      Hi Swapna,

      You may want to go through the topics in both and see what interests you. If you are good at SQL and scripting, you should find the developer route interesting.

  6. Debajit

    Hi Rohit,
    I have gone through the Definitive Guide once. I am fairly comfortable with the concepts but not sure about the APIs. I have scheduled the exam for 3rd Sep. Could you please give me some last-minute tips? Are there areas in the API-related questions where I need to put more stress? Thanks in advance!

  7. Jai batra

    Hi Rohit,
    I’m planning to go for Cloudera Certified Developer for Apache Hadoop CDH4. I have gone through Hadoop in Action and the cookbook, and have also set up a personal cluster.
    But I’m really unsure whether this is enough for the certification exam. I’m also a little concerned about the exam pattern and difficulty level, as I’m not so good with Java.
    Could you please help me out on this?

  8. Rishikesh Nair

    I stay in India (Pune). Can you please help me with institutes in Pune that provide good training for the Apache Hadoop developer course? I am pursuing my MCA. Will this course help me become a successful developer like you?

  9. hareesha

    Hi Rohit,

    I am looking forward to taking the Cloudera certification exam.
    I have started with the Yahoo Tutorial.
    Do you think certification is really necessary to answer questions in the exam, as you have mentioned there were some tricky questions?
    I have a pretty good Java background.
    Thanks for the information and tutorials too.

  10. Pingback: Cloudera Certified Hadoop Developer (CCD-410) | Rohit Menon

  11. Ago

    Hello Rohit, my background is in ADSL broadband and voice. I am looking at changing career to become a Hadoop developer and intend to take this course. Do you think I would be able to get into companies with this certification and no prior experience in Big Data?

  12. Kishore

    Hi Rohit,

    This page is very useful.

    Thanks for sharing your experience.

    Keep up the good work.

    Regards,
    Kishore

