Implementation of MapReduce

The following table shows the number of visitors to the Intellipaat.com page. It lists the monthly visitors and the annual average for each of five years.

YEAR  JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC  AVG
2008   23   23    2   43   24   25   26   26   26   25   26   26   25
2009   26   27   28   28   28   30   31   31   31   30   30   30   29
2010   31   32   32   32   33   34   35   36   36   34   34   34   34
2014   39   38   39   39   39   41   42   43   40   39   39   38   40
2016   38   39   39   39   39   41   41   41   00   40   40   39   45

We use the MapReduce framework to process this data and pick out the years with the most visitors. The example program reports every year whose annual average exceeds 30.

Input data: The data rows above are saved as Intellipaat_visitors.txt, with the fields of each row separated by tabs, and this file is used as the input.
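Before looking at the Hadoop program, it can help to see how a single record is read. The following is a minimal plain-Java sketch, assuming each line is tab-separated with the year first and the annual average last. The class `RecordParser` and the method `parseRecord` are hypothetical names used only for illustration; they are not part of the program on this page.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical helper for illustration only: parses one input record without Hadoop.
class RecordParser {

    // Returns (year, annual average) for a line such as "2010\t31\t...\t34".
    // Assumption: fields are tab-separated, year first, AVG last.
    static Map.Entry<String, Integer> parseRecord(String line) {
        StringTokenizer tokens = new StringTokenizer(line, "\t");
        String year = tokens.nextToken();
        String last = null;
        while (tokens.hasMoreTokens()) {
            last = tokens.nextToken(); // keep overwriting until the final field
        }
        if (last == null) {
            throw new IllegalArgumentException("No values after the year: " + line);
        }
        return new SimpleEntry<>(year, Integer.parseInt(last.trim()));
    }

    public static void main(String[] args) {
        String line = "2010\t31\t32\t32\t32\t33\t34\t35\t36\t36\t34\t34\t34\t34";
        Map.Entry<String, Integer> record = parseRecord(line);
        System.out.println(record.getKey() + " -> " + record.getValue()); // prints "2010 -> 34"
    }
}
```

The mapper in the example program performs the same steps, but wraps the result in Hadoop's Text and IntWritable types.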

Example program of MapReduce framework

package hadoop;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class Intellipaat_visitors
{
   //Mapper class
   public static class E_EMapper extends MapReduceBase implements
   Mapper<LongWritable, /*Input key Type */
   Text,                /*Input value Type*/
   Text,                /*Output key Type*/
   IntWritable>         /*Output value Type*/
   {
      //Map function: emits (year, annual average) for each input line
      public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException
      {
         String line = value.toString();
         String lasttoken = null;
         StringTokenizer s = new StringTokenizer(line, "\t");
         String year = s.nextToken();   //first field is the year

         //skip the monthly values; the last field is the annual average
         while (s.hasMoreTokens()) {
            lasttoken = s.nextToken();
         }

         int avgVisitors = Integer.parseInt(lasttoken);
         output.collect(new Text(year), new IntWritable(avgVisitors));
      }
   }

   //Reducer class
   public static class E_EReduce extends MapReduceBase implements
   Reducer<Text, IntWritable, Text, IntWritable>
   {
      //Reduce function: keeps only the averages above the threshold
      public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException
      {
         int maxavg = 30;   //threshold for the annual average
         int val = Integer.MIN_VALUE;
         while (values.hasNext())
         {
            if ((val = values.next().get()) > maxavg)
            {
               output.collect(key, new IntWritable(val));
            }
         }
      }
   }

   //Main function
   public static void main(String args[]) throws Exception
   {
      JobConf conf = new JobConf(Intellipaat_visitors.class);

      conf.setJobName("max_visitors");

      conf.setOutputKeyClass(Text.class);
      conf.setOutputValueClass(IntWritable.class);

      conf.setMapperClass(E_EMapper.class);
      conf.setCombinerClass(E_EReduce.class);
      conf.setReducerClass(E_EReduce.class);

      conf.setInputFormat(TextInputFormat.class);
      conf.setOutputFormat(TextOutputFormat.class);

      //args[0] is the HDFS input directory, args[1] the output directory
      FileInputFormat.setInputPaths(conf, new Path(args[0]));
      FileOutputFormat.setOutputPath(conf, new Path(args[1]));

      JobClient.runJob(conf);
   }
}
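The data flow of the program above can also be traced without a cluster. The following plain-Java sketch imitates the map, combine and reduce steps in memory on the table data. It is only an approximation under stated assumptions: the class `LocalVisitorsJob` and its methods are hypothetical, the shuffle is replaced by a sorted map, and no Hadoop classes are involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of the job; not the Hadoop API.
class LocalVisitorsJob {
    static final int THRESHOLD = 30; // same cutoff as maxavg in the reducer

    // The table rows, converted to the tab-separated form the mapper expects.
    static List<String> sampleInput() {
        String[] rows = {
            "2008 23 23 2 43 24 25 26 26 26 25 26 26 25",
            "2009 26 27 28 28 28 30 31 31 31 30 30 30 29",
            "2010 31 32 32 32 33 34 35 36 36 34 34 34 34",
            "2014 39 38 39 39 39 41 42 43 40 39 39 38 40",
            "2016 38 39 39 39 39 41 41 41 00 40 40 39 45"
        };
        List<String> lines = new ArrayList<>();
        for (String row : rows) {
            lines.add(row.replace(' ', '\t'));
        }
        return lines;
    }

    // Filter used for both the combiner and the reducer pass.
    static List<Integer> aboveThreshold(List<Integer> values) {
        List<Integer> kept = new ArrayList<>();
        for (int v : values) {
            if (v > THRESHOLD) {
                kept.add(v);
            }
        }
        return kept;
    }

    static Map<String, List<Integer>> run(List<String> lines) {
        // Map + shuffle: emit (year, AVG) and group the values by year, sorted by key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            int avg = Integer.parseInt(fields[fields.length - 1]);
            grouped.computeIfAbsent(fields[0], k -> new ArrayList<>()).add(avg);
        }
        // Combine, then reduce: applying the same filter twice changes nothing,
        // which is why the reducer class can double as the combiner here.
        Map<String, List<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            List<Integer> combined = aboveThreshold(e.getValue());
            List<Integer> reduced = aboveThreshold(combined);
            if (!reduced.isEmpty()) {
                result.put(e.getKey(), reduced);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the years 2010, 2014 and 2016 with the averages 34, 40 and 45.
        for (Map.Entry<String, List<Integer>> e : run(sampleInput()).entrySet()) {
            for (int avg : e.getValue()) {
                System.out.println(e.getKey() + "\t" + avg);
            }
        }
    }
}
```

Because the filter only drops values and never aggregates them, running it once in a combiner and again in the reducer gives the same result as running it once.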

Save the above program as Intellipaat_visitors.java.

The compiled Java classes are stored in a new directory. Use the command below to create it.

$ mkdir visitors

Download the Hadoop core jar (hadoop-core-1.2.1.jar) from the link below.

http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1

Compile Intellipaat_visitors.java and create a jar for the program.

$ javac -classpath hadoop-core-1.2.1.jar -d visitors Intellipaat_visitors.java
$ jar -cvf visitors.jar -C visitors/ .

Create an input directory in HDFS using the command below.

$HADOOP_HOME/bin/hadoop fs -mkdir input_dir

Copy the input file named Intellipaat_visitors.txt into the input directory in HDFS.

$HADOOP_HOME/bin/hadoop fs -put /home/hadoop/Intellipaat_visitors.txt input_dir

Run the application, reading the input from input_dir and writing the result to output_dir.

$HADOOP_HOME/bin/hadoop jar visitors.jar hadoop.Intellipaat_visitors input_dir output_dir

Output

INFO mapreduce.Job: Job job_1414748220717_0002 completed successfully
14/10/31 06:02:52 INFO mapreduce.Job: Counters: 49
 
File System Counters
  
   FILE: Number of bytes read=61
   FILE: Number of bytes written=279400
   FILE: Number of read operations=0
   FILE: Number of large read operations=0
   FILE: Number of write operations=0
 
   HDFS: Number of bytes read=546
   HDFS: Number of bytes written=40
   HDFS: Number of read operations=9
   HDFS: Number of large read operations=0
   HDFS: Number of write operations=2

Job Counters
  
   Launched map tasks=2
   Launched reduce tasks=1
   Data-local map tasks=2
        
   Total time spent by all maps in occupied slots (ms)=146137
   Total time spent by all reduces in occupied slots (ms)=441
   Total time spent by all map tasks (ms)=14613
   Total time spent by all reduce tasks (ms)=44120
        
   Total vcore-seconds taken by all map tasks=146137
   Total vcore-seconds taken by all reduce tasks=44120
        
   Total megabyte-seconds taken by all map tasks=149644288
   Total megabyte-seconds taken by all reduce tasks=45178880
 
Map-Reduce Framework
  
   Map input records=5
        
   Map output records=5
   Map output bytes=45
   Map output materialized bytes=67
        
   Input split bytes=208
   Combine input records=5
   Combine output records=5
        
   Reduce input groups=5
   Reduce shuffle bytes=6
   Reduce input records=5
   Reduce output records=5
        
   Spilled Records=10
   Shuffled Maps =2
   Failed Shuffles=0
   Merged Map outputs=2
   GC time elapsed (ms)=948
   CPU time spent (ms)=5160
   Physical memory (bytes) snapshot=47749120
   Virtual memory (bytes) snapshot=2899349504
   Total committed heap usage (bytes)=277684224

File Output Format Counters

   Bytes Written=40

Use the command below to verify the resulting files in the output folder.

$HADOOP_HOME/bin/hadoop fs -ls output_dir/

The final output of the MapReduce job is:

2010 34
2014 40
2016 45
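Each output line is a key/value pair of year and average; Hadoop's TextOutputFormat separates key and value with a tab by default. As a minimal sketch, assuming the content of the output file has already been read into a string, the result could be loaded into a sorted map for further use. The class `OutputReader` and the method `parseOutput` are hypothetical names, not part of Hadoop.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration: turns the job's text output into a map.
class OutputReader {

    // Parses lines of the form "<year><whitespace><average>" and skips blank lines.
    static Map<String, Integer> parseOutput(String content) {
        Map<String, Integer> result = new TreeMap<>();
        for (String line : content.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            result.put(parts[0], Integer.parseInt(parts[1]));
        }
        return result;
    }

    public static void main(String[] args) {
        String content = "2010\t34\n2014\t40\n2016\t45\n";
        System.out.println(parseOutput(content)); // prints {2010=34, 2014=40, 2016=45}
    }
}
```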

 

