Tuesday, November 18, 2014

Debug Hadoop Unit Test which runs on Maven


Here I run just a single unit test in debug mode:

mvn -Dmaven.surefire.debug test -Dtest=org.apache.hadoop.hdfs.server.namenode.TestAddBlock


The tests will automatically pause and await a remote debugger on port 5005. You can then attach to the running tests using Eclipse.

In Eclipse, open Run > Debug Configurations, create a new Remote Java Application configuration, and set the port to 5005.


If you need to configure a different port, you may pass a more detailed value. For example, the command below will use port 8000 instead of port 5005.

mvn -Dmaven.surefire.debug="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000 -Xnoagent -Djava.compiler=NONE" test
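If you switch ports often, the option string can be built from a variable. The sketch below is only an illustration: surefire_debug_opts is a hypothetical helper name, and it uses the -agentlib:jdwp form, which newer JVMs accept in place of -Xdebug -Xrunjdwp. It prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the JVM debug options for a given port.
# Falls back to 5005, the port Surefire uses by default.
surefire_debug_opts() {
  local port="${1:-5005}"
  printf '%s' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=${port}"
}

# Print the full command instead of running it, so the sketch is safe to try.
echo "mvn -Dmaven.surefire.debug=\"$(surefire_debug_opts 8000)\" test"
```

Copy the printed line into the terminal, then attach the debugger to the port you chose.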

Monday, November 17, 2014

CAP Theorem

In 2000, Eric Brewer conjectured that a distributed system cannot simultaneously provide all three of the following properties (Seth Gilbert and Nancy Lynch formally proved the conjecture in 2002):
  • Consistency: A read sees all previously completed writes.
  • Availability: Reads (returning actual data) and writes always succeed.
  • Partition tolerance: Guaranteed properties are maintained even when network failures prevent some machines from communicating with others.

The CAP theorem says that a distributed system cannot have all three of these properties; it can provide at most two of them at the same time (CA, AP, or CP).


Sunday, November 16, 2014

Run Hadoop Unit Tests

Run all the tests
mvn test

Run a single test
mvn test -Dtest=org.apache.hadoop.fs.TestSymlinkLocalFSFileContext

Run all the tests, continuing even if some of them fail
mvn -Dmaven.test.failure.ignore=true test

Run a single test without failing the build if the test fails
mvn -Dmaven.test.failure.ignore=true test -Dtest=org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
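The four variations above differ only in the test class and the failure flag, so they can be folded into one small function. This is a minimal sketch; hadoop_test_cmd is a hypothetical name, and it only prints the command so nothing runs by accident.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the mvn command for one test class.
# Pass "yes" as the second argument to ignore test failures.
hadoop_test_cmd() {
  local test_class="$1" ignore_failures="${2:-no}"
  local flags=""
  if [ "$ignore_failures" = "yes" ]; then
    flags=" -Dmaven.test.failure.ignore=true"
  fi
  echo "mvn${flags} test -Dtest=${test_class}"
}

hadoop_test_cmd org.apache.hadoop.fs.TestSymlinkLocalFSFileContext yes
```

Run the printed line from the module directory that contains the test.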

Basic Git Commands

Clone a specific branch of a repository to the local computer
git clone -b [branch name] https://github.com/[namespace]/[project].git


Initialize a new project and push it to GitHub
git init
git add .
git commit -m "first commit"
git remote add origin https://github.com/[namespace]/[project].git
git push -u origin master


Discard all uncommitted changes to tracked files and reset them to HEAD.
git reset --hard

Show the working tree status
git status

Monday, November 10, 2014

Hide current working directory in Ubuntu Terminal


Type the following command in the terminal.
export PS1='\u@\h:~$ '

To avoid retyping the command in every new terminal, add it to your .bashrc file. The file is in your own home directory, so sudo is not needed.
vi $HOME/.bashrc

Go to the bottom of the file and add the following line.
export PS1='\u@\h:~$ '
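If you prefer not to open an editor, the line can also be appended from the shell. The sketch below adds it only when it is missing, so running it twice does not duplicate the entry. BASHRC and add_prompt_line are names I made up for illustration; BASHRC defaults to a scratch file, so point it at $HOME/.bashrc to apply the change for real.

```shell
#!/usr/bin/env bash
# Target file; defaults to a scratch file so the sketch is safe to try.
BASHRC="${BASHRC:-$(mktemp)}"

# The exact line to add: export PS1='\u@\h:~$ '
PROMPT_LINE="export PS1='\\u@\\h:~\$ '"

# Append the line only if the file does not already contain it.
add_prompt_line() {
  grep -qxF "$PROMPT_LINE" "$1" 2>/dev/null || printf '%s\n' "$PROMPT_LINE" >> "$1"
}

add_prompt_line "$BASHRC"
add_prompt_line "$BASHRC"   # second call changes nothing
```

For example, BASHRC=$HOME/.bashrc before running the snippet, then open a new terminal to see the shorter prompt.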

Saturday, November 8, 2014

Compile Hadoop Trunk on Ubuntu

Prerequisites

Install JDK
How to Install Oracle Java7 in Ubuntu Server 14.04 LTS

Install Maven
Install Maven on Ubuntu

Check whether Protocol Buffers is already installed
protoc --version

Install Protocol Buffers
sudo apt-get install protobuf-compiler  

Check the version
protoc --version

If the version is not displayed, try the following commands.
sudo ldconfig
protoc --version

I downloaded "hadoop-trunk.zip" from GitHub.

After extracting it, you can optionally copy the hadoop-trunk directory to your project working area.

sudo cp -r hadoop-trunk /home/mahesh/projects/hadoop/hadoop-trunk


Change the ownership of the hadoop-trunk directory to your own user. Use '-R' so the change is applied recursively.
sudo chown -R mahesh hadoop-trunk

Compile Hadoop
mvn install -DskipTests

If everything goes well, you should see a "BUILD SUCCESS" message. The first build can take a while, depending on your Internet connection speed, since Maven downloads all the required artifacts.
 

Generate the Eclipse project files
mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true

'protoc --version' did not return a version

I came across this error while I was trying to build Hadoop.

Check whether Protocol Buffers is already installed
protoc --version

Install Protocol Buffers
sudo apt-get install protobuf-compiler  

Check the version
protoc --version


If the version is not displayed, try the following commands.
sudo ldconfig
protoc --version
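The check can also be scripted, which helps when the error message alone does not say whether protoc is missing or just not found. This is a rough sketch; have_tool is a hypothetical helper name, and the install hint simply repeats the apt-get command from above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the named tool can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool protoc; then
  # protoc is on PATH; show which version the build will pick up.
  protoc --version
else
  echo "protoc not found; install it with: sudo apt-get install protobuf-compiler"
fi
```

If protoc is found but the Hadoop build still complains, compare the printed version with the one named in Hadoop's BUILDING.txt.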


The above steps should fix the issue. But while I was trying to figure out the issue, I installed the following packages as well.
apt-get -y install autoconf automake libtool cmake zlib1g-dev pkg-config libssl-dev

I adapted this command from the HowToContribute page. The original command was:
apt-get -y install maven build-essential autoconf automake libtool cmake zlib1g-dev pkg-config libssl-dev

I removed "maven build-essential" from the command, since those packages would set up Ubuntu with OpenJDK and other tools that are not required here.


Further details can be found at the following link.
Trunk doesn't compile 

Install Maven on Ubuntu

Download and extract Maven.

Add the Maven bin directory to your PATH in the .bashrc file, so you can run Maven from any location. The file is in your own home directory, so sudo is not needed.
vi $HOME/.bashrc

Go to the bottom of the file and add the following lines.
export M2_HOME=/usr/local/apache-maven-3.2.3
export PATH=$PATH:$M2_HOME/bin

Make sure the JDK path is also set
export JDK_PREFIX=/usr/local/jdk1.7.0_71
export PATH=$PATH:$JDK_PREFIX/bin
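One side effect of the lines above is that every time .bashrc is sourced again, the same directories are appended to PATH once more. A small guard avoids that. This is only a sketch, and path_append is a hypothetical helper name; the Maven and JDK locations are the ones used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to PATH only if it is not there yet.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
}

export M2_HOME=/usr/local/apache-maven-3.2.3
export JDK_PREFIX=/usr/local/jdk1.7.0_71

path_append "$M2_HOME/bin"
path_append "$JDK_PREFIX/bin"
path_append "$M2_HOME/bin"     # repeated call does not add a duplicate
export PATH
```

After opening a new terminal, mvn -version should print the Maven and Java versions if both paths are correct.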