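Section 2: Building the Eclipse Hadoop plugin on Windows 7
This article follows the Linux build procedure described at: http://www.cnblogs.com/chenying99/archive/2013/05/31/3109566.html
1. Download Ant (apache-ant-1.9.2-bin.tar.gz) and Eclipse (one of the following packages):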
eclipse-java-indigo-SR2-win32.zip
eclipse-java-indigo-SR2-win32-x86_64.zip
2. Unpack the archives and configure the environment variables:
ANT_HOME=D:\devlop_apps\apache-ant-1.9.2
PATH=%ANT_HOME%\bin;.......
The ellipsis stands for the entries already in PATH; it is enough to prepend Ant's bin directory to the existing PATH value.
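To confirm that the variables from step 2 are visible to new processes, a small check like the following can help. This is only a sketch: the class name AntEnvCheck and its helper methods are made up for illustration, and the case-insensitive comparison assumes Windows path semantics.

```java
import java.io.File;
import java.util.regex.Pattern;

public class AntEnvCheck {

    /** Strips trailing slashes or backslashes so "C:\ant\bin\" matches "C:\ant\bin". */
    static String normalize(String dir) {
        String d = dir.trim();
        while (d.length() > 1 && (d.endsWith("\\") || d.endsWith("/"))) {
            d = d.substring(0, d.length() - 1);
        }
        return d;
    }

    /** Returns true if one entry of pathValue (split on separator) equals dir. */
    static boolean pathContains(String pathValue, String separator, String dir) {
        String wanted = normalize(dir);
        for (String entry : pathValue.split(Pattern.quote(separator))) {
            // Assumption: paths are compared case-insensitively, as on Windows.
            if (normalize(entry).equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String antHome = System.getenv("ANT_HOME");
        String path = System.getenv("PATH");
        if (antHome == null || path == null) {
            System.out.println("ANT_HOME or PATH is not set");
            return;
        }
        String antBin = normalize(antHome) + File.separator + "bin";
        System.out.println("ANT_HOME = " + antHome);
        System.out.println(antBin + " on PATH: "
                + pathContains(path, File.pathSeparator, antBin));
    }
}
```

Run it from a newly opened command prompt, since prompts that were already open do not pick up changed environment variables.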
3. Unpack Hadoop 1.2.1.
4. Edit build-contrib.xml in %HADOOP_HOME%/src/contrib and add the Eclipse path and the Hadoop version (D:\devlop_apps\eclipse-java-indigo-SR2-win32-x86_64 is my Eclipse path):
<property name="eclipse.home" location="D:/devlop_apps/eclipse-java-indigo-SR2-win32-x86_64" />
<property name="version" value="1.2.1"/>
5. Change the javac.deprecation property (also in build-contrib.xml):
<property name="javac.deprecation" value="on"/>
6. Edit build.xml in %HADOOP_HOME%/src/contrib/eclipse-plugin and add the hadoop-core jar dependency to the path node whose id is classpath:
<!-- Override classpath to include Eclipse SDK jars -->
<path id="classpath">
  <pathelement location="${build.classes}"/>
  <pathelement location="${hadoop.root}/build/classes"/>
  <!-- hadoop-core -->
  <pathelement location="${hadoop.root}/hadoop-core-${version}.jar"/>
  <path refid="eclipse-sdk-jars"/>
</path>
7. In the same build.xml, find the target named jar and copy the required jar files into the plugin's lib directory:
<!-- Override jar target to specify manifest -->
<target name="jar" depends="compile" unless="skip.contrib">
  <mkdir dir="${build.dir}/lib"/>
  <!--<copy file="${hadoop.root}/build/hadoop-core-${version}.jar" tofile="${build.dir}/lib/hadoop-core.jar" verbose="true"/>
  <copy file="${hadoop.root}/build/ivy/lib/Hadoop/common/commons-cli-${commons-cli.version}.jar" todir="${build.dir}/lib" verbose="true"/>-->
  <!-- Changed locations of hadoop-core-version.jar and commons-cli-version.jar -->
  <copy file="${hadoop.root}/hadoop-core-${version}.jar" tofile="${build.dir}/lib/hadoop-core.jar" verbose="true"/>
  <copy file="${hadoop.root}/lib/commons-cli-${commons-cli.version}.jar" todir="${build.dir}/lib" verbose="true"/>
  <!-- Bundle the following jars into hadoop-eclipse-plugin-${version}.jar -->
  <copy file="${hadoop.root}/lib/commons-lang-2.4.jar" todir="${build.dir}/lib" verbose="true"/>
  <copy file="${hadoop.root}/lib/commons-configuration-1.6.jar" todir="${build.dir}/lib" verbose="true"/>
  <copy file="${hadoop.root}/lib/jackson-mapper-asl-1.8.8.jar" todir="${build.dir}/lib" verbose="true"/>
  <copy file="${hadoop.root}/lib/jackson-core-asl-1.8.8.jar" todir="${build.dir}/lib" verbose="true"/>
  <copy file="${hadoop.root}/lib/commons-httpclient-3.0.1.jar" todir="${build.dir}/lib" verbose="true"/>
  <jar jarfile="${build.dir}/hadoop-${name}-${version}.jar" manifest="${root}/META-INF/MANIFEST.MF">
    <fileset dir="${build.dir}" includes="classes/ lib/"/>
    <fileset dir="${root}" includes="resources/ plugin.xml"/>
  </jar>
</target>
8. Edit the Bundle-ClassPath attribute in META-INF/MANIFEST.MF under the eclipse-plugin directory (each continuation line of a manifest attribute must begin with a single space):
Bundle-ClassPath: classes/,
 lib/hadoop-core.jar,
 lib/commons-cli-1.2.jar,
 lib/commons-configuration-1.6.jar,
 lib/commons-httpclient-3.0.1.jar,
 lib/commons-lang-2.4.jar,
 lib/jackson-core-asl-1.8.8.jar,
 lib/jackson-mapper-asl-1.8.8.jar
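Because a manifest attribute that spans several lines relies on the leading space of each continuation line, it can be worth parsing the edited text before building. The sketch below uses the standard java.util.jar.Manifest class; the class name BundleClassPathCheck and the shortened sample manifest are illustrative, not part of the Hadoop build.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

public class BundleClassPathCheck {

    /** Parses manifest text and returns the Bundle-ClassPath entries in order. */
    static List<String> bundleClassPath(String manifestText) throws IOException {
        Manifest manifest = new Manifest(
                new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
        String value = manifest.getMainAttributes().getValue("Bundle-ClassPath");
        List<String> entries = new ArrayList<>();
        if (value == null) {
            return entries;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Shortened sample; continuation lines start with one space,
        // and the text ends with a newline.
        String sample = "Manifest-Version: 1.0\n"
                + "Bundle-ClassPath: classes/,\n"
                + " lib/hadoop-core.jar,\n"
                + " lib/commons-cli-1.2.jar\n";
        for (String entry : bundleClassPath(sample)) {
            System.out.println(entry);
        }
    }
}
```

The parser joins the continuation lines into one value, so the returned list should contain exactly the entries typed in step 8.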
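9. On the command line, change to the %HADOOP_HOME%/src/contrib/eclipse-plugin directory and run the ant command to start the build.
10. The packaged plugin is generated in %HADOOP_HOME%/build/contrib/eclipse-plugin. Copy hadoop-eclipse-plugin-1.2.1.jar into Eclipse's dropins directory.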
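Before copying the plugin into dropins, one could check that the jars named in Bundle-ClassPath really ended up under lib/ inside the plugin jar. The following is a rough sketch using java.util.jar.JarFile; the class name PluginJarCheck is hypothetical, and the expected list simply mirrors the Bundle-ClassPath from step 8.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class PluginJarCheck {

    // Entries that the Bundle-ClassPath of step 8 expects inside the plugin jar.
    static final List<String> EXPECTED = Arrays.asList(
            "lib/hadoop-core.jar",
            "lib/commons-cli-1.2.jar",
            "lib/commons-configuration-1.6.jar",
            "lib/commons-httpclient-3.0.1.jar",
            "lib/commons-lang-2.4.jar",
            "lib/jackson-core-asl-1.8.8.jar",
            "lib/jackson-mapper-asl-1.8.8.jar");

    /** Returns the expected entries that are absent from entryNames. */
    static List<String> missing(Set<String> entryNames) {
        List<String> result = new ArrayList<>();
        for (String expected : EXPECTED) {
            if (!entryNames.contains(expected)) {
                result.add(expected);
            }
        }
        return result;
    }

    /** Collects the names of all entries in the jar at jarPath. */
    static Set<String> entryNames(String jarPath) throws IOException {
        Set<String> names = new HashSet<>();
        try (JarFile jar = new JarFile(jarPath)) {
            for (JarEntry entry : Collections.list(jar.entries())) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: PluginJarCheck <path to plugin jar>");
            return;
        }
        System.out.println("Missing entries: " + missing(entryNames(args[0])));
    }
}
```

An empty list suggests that the copy tasks in the jar target ran as intended; any name that is printed points at a copy line to recheck.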
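11. Open Eclipse and choose Window-->Preferences to open the preferences dialog.
12. Set the Hadoop root directory.
13. Configure the remote Hadoop cluster: open Window-->Show view-->Other-->Map Reduce Tools and fill in the options as shown in the figure below.
Location Name: a name for this set of settings; any value will do.
Map/Reduce Master: the Map/Reduce address of the Hadoop cluster; it should match the mapred.job.tracker setting in mapred-site.xml.
DFS Master: the address of the Hadoop master server; it should match the fs.default.name setting in core-site.xml.
Click Finish to apply the settings.
The DFS directory tree now appears in the Project Explorer on the far left, as shown in the figure below.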
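The two Master fields in step 13 each take a host and a port, which have to be read off the cluster's configuration values. The sketch below shows one way to split them; the class name LocationFields and the sample values hdfs://master:9000 and master:9001 are assumptions for illustration, so substitute the values from your own core-site.xml and mapred-site.xml.

```java
import java.net.URI;

public class LocationFields {

    /** A host and port pair as entered in the plugin's location dialog. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    /** Splits an fs.default.name value such as "hdfs://master:9000". */
    static HostPort fromFsDefaultName(String value) {
        URI uri = URI.create(value);
        return new HostPort(uri.getHost(), uri.getPort());
    }

    /** Splits a mapred.job.tracker value such as "master:9001". */
    static HostPort fromJobTracker(String value) {
        int colon = value.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port but got " + value);
        }
        return new HostPort(value.substring(0, colon),
                Integer.parseInt(value.substring(colon + 1)));
    }

    public static void main(String[] args) {
        // Sample values only; they are not taken from a real cluster.
        System.out.println("DFS Master:        " + fromFsDefaultName("hdfs://master:9000"));
        System.out.println("Map/Reduce Master: " + fromJobTracker("master:9001"));
    }
}
```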
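14. Test creating a DFS directory and uploading files; the uploaded files contain only English text (test file download link). The next steps test running a Map Reduce job.
15. Create a new project: File-->New-->Other-->Map/Reduce Project. Any project name will do, for example hadoop_test_01.
16. Once the project is created, paste in the following code: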
package com.hadoop;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Computes the integer average score per name from lines of the form "name score". */
public class WordAverage {

    /** Emits one (name, score) pair for each input line. */
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer st = new StringTokenizer(value.toString());
            if (st.countTokens() < 2) {
                return; // skip blank or malformed lines
            }
            String name = st.nextToken();
            String score = st.nextToken();
            context.write(new Text(name), new IntWritable(Integer.parseInt(score)));
        }
    }

    /** Averages all scores received for one name. */
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            int count = 0;
            for (IntWritable val : values) {
                sum += val.get();
                count++;
            }
            int avg = sum / count; // integer division drops the fractional part
            context.write(key, new IntWritable(avg));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: WordAverage <input path> <output path>");
            System.exit(2);
        }
        Job job = new Job();
        job.setJarByClass(WordAverage.class);
        job.setJobName("Word Average");
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(Map.class);
        // No combiner is set: Reduce computes an average, and an average of
        // partial averages is generally not the overall average.
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
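A note on combiners for this kind of job: a reducer that outputs an average is generally not safe to reuse as a combiner, because averaging partial averages weights each partial group equally regardless of its size. The plain-Java sketch below illustrates the effect without Hadoop; the class name CombinerPitfall, the scores, and the split into two groups are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerPitfall {

    /** Integer average, using the same arithmetic as the reducer (sum / count). */
    static int average(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        // Hypothetical scores for one name, spread over two map tasks.
        List<Integer> firstSplit = Arrays.asList(90, 80);
        List<Integer> secondSplit = Arrays.asList(60);

        // Reducer sees all three scores: 230 / 3.
        int direct = average(Arrays.asList(90, 80, 60));

        // Combiner averages each split first (85 and 60), then the reducer
        // averages those two partial results: 145 / 2.
        int viaCombiner = average(
                Arrays.asList(average(firstSplit), average(secondSplit)));

        System.out.println("without combiner: " + direct);      // 76
        System.out.println("with combiner:    " + viaCombiner); // 72
    }
}
```

If a combiner is wanted for performance, a common approach is to have it pass along partial sums and counts and to divide only in the reducer.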
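17. Set the run-time arguments (the input path and the output path, in that order) and run the job:
19. Download link for the Hadoop 1.2.1 Eclipse plugin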