eclipse_win7_hadoop1.2.1 Development Environment Setup, Part 3
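3.6 Viewing the WordCount results
In the left pane of Eclipse, right-click "DFS Locations → Win7ToHadoop → user → hadoop" and click "Refresh". The "newoutput" folder that was just created now appears. Remember that the "newoutput" folder is created automatically when the program runs. If a folder with the same name already exists, either give the program a different output folder or delete the existing folder on HDFS; otherwise the job fails.
Open the "newoutput" folder and then the "part-r-00000" file to see the results of the run.
At this point the Eclipse development environment is fully configured and the WordCount program has run successfully. The next step is to really begin our Hadoop journey.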
4. Frequently Asked Questions (FAQ)
4.1 The "error: failure to login" problem
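The following uses "hadoop-0.20.203.0", taken from a post found online, as the example. I hit the same problem when using "V1.0". The cause is that "hadoop-eclipse-plugin-1.0.0_V1.0.jar" was compiled directly from source and therefore lacks the JAR files it depends on. The details are as follows.
Source: http://blog.****.net/chengfei112233/article/details/7252404
In my own experiments I found that if the hadoop-0.20.203.0 version of this package is copied directly into Eclipse's plugin directory, connecting to DFS fails with the message "error: failure to login".
The error dialog reads: "An internal error occurred during: "Connecting to DFS hadoop". org/apache/commons/configuration/Configuration". The Eclipse log shows that the cause is missing JAR files. Further research showed that hadoop-eclipse-plugin-0.20.203.0.jar, when copied as-is, lacks several JAR files in its lib directory.
Based on material gathered online, the correct installation procedure is given here: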
First, modify hadoop-eclipse-plugin-0.20.203.0.jar. Open the package with an archive manager; its lib directory contains only two JARs, commons-cli-1.2.jar and hadoop-core.jar. From the hadoop/lib directory, copy the following five JARs into the lib directory of hadoop-eclipse-plugin-0.20.203.0.jar, as shown in the figure below:
- commons-configuration-1.6.jar
- commons-httpclient-3.0.1.jar
- commons-lang-2.4.jar
- jackson-core-asl-1.0.1.jar
- jackson-mapper-asl-1.0.1.jar
Then edit MANIFEST.MF in the package's META-INF directory and change the class path to the following:
Bundle-ClassPath: classes/,lib/hadoop-core.jar,lib/commons-cli-1.2.jar,lib/commons-httpclient-3.0.1.jar,lib/jackson-core-asl-1.0.1.jar,lib/jackson-mapper-asl-1.0.1.jar,lib/commons-configuration-1.6.jar,lib/commons-lang-2.4.jar
This completes the modification of hadoop-eclipse-plugin-0.20.203.0.jar.
Finally, copy hadoop-eclipse-plugin-0.20.203.0.jar into Eclipse's plugins directory.
Note: the steps above apply equally to "hadoop-1.0.0".
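The Bundle-ClassPath value from the MANIFEST.MF step is just "classes/" followed by one "lib/<jar>" entry per bundled JAR, joined with commas. A minimal sketch of that rule, useful for checking the value when the set of JARs differs between Hadoop versions, might look like this. The class name BundleClassPath and the method build are hypothetical names chosen for illustration and are not part of Hadoop or Eclipse.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BundleClassPath {
    // Hypothetical helper: builds the header value from the JAR file names
    // that sit in the plugin's lib directory.
    static String build(List<String> jarNames) {
        List<String> entries = new ArrayList<>();
        entries.add("classes/");           // the plugin's own compiled classes
        for (String jar : jarNames) {
            entries.add("lib/" + jar);     // each bundled dependency
        }
        return String.join(",", entries);
    }

    public static void main(String[] args) {
        // The two JARs already in the plugin plus the five copied from hadoop/lib.
        List<String> jars = Arrays.asList(
                "hadoop-core.jar",
                "commons-cli-1.2.jar",
                "commons-httpclient-3.0.1.jar",
                "jackson-core-asl-1.0.1.jar",
                "jackson-mapper-asl-1.0.1.jar",
                "commons-configuration-1.6.jar",
                "commons-lang-2.4.jar");
        System.out.println("Bundle-ClassPath: " + build(jars));
    }
}
```

Every name passed to build should correspond to a file that actually exists in the plugin's lib directory; an entry without a matching JAR leaves the class loading problem in place.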
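4.2 The "Permission denied" problem
I tried many suggestions found online. Some mention "hadoop fs -chmod 777 /user/hadoop", some mention setting the value of the "dfs.permissions" configuration item to false, and some mention "hadoop.job.ugi", but none of them had any effect.
References:
URL 1: http://www.cnblogs.com/acmy/archive/2011/10/28/2227901.html
URL 2: http://sunjun041640.blog.163.com/blog/static/25626832201061751825292/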
Error: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=*********, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
Solution:
My solution was simply to rename the system administrator account on the Windows machine to the name of the user that runs Hadoop on your cluster.
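The error above shows a directory owned by "hadoop" with mode rwxr-xr-x, so only the owner may write to it. Assuming the client identifies itself to the cluster with the local operating system account name (my understanding of how Hadoop 1.x behaves when Kerberos is not enabled), a quick check like the following sketch shows whether the two names match. The class name ClientUserCheck, the method matchesClusterUser, and the cluster user "hadoop" are assumptions for illustration.

```java
class ClientUserCheck {
    // Hypothetical helper: true when the local account name equals the cluster user name.
    static boolean matchesClusterUser(String localUser, String clusterUser) {
        return localUser != null && localUser.equals(clusterUser);
    }

    public static void main(String[] args) {
        // Assumption: "hadoop" is the account that runs Hadoop on the cluster.
        String clusterUser = "hadoop";
        // The account under which this JVM (and Eclipse) is running.
        String localUser = System.getProperty("user.name");

        if (matchesClusterUser(localUser, clusterUser)) {
            System.out.println("Local user matches cluster user: " + localUser);
        } else {
            System.out.println("Local user '" + localUser + "' differs from cluster user '"
                    + clusterUser + "'; writes under /user/" + clusterUser + " may be denied.");
        }
    }
}
```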
4.3 The "Failed to set permissions of path" problem
Reference: https://issues.apache.org/jira/browse/HADOOP-8089
The error message is as follows:
ERROR security.UserGroupInformation: PriviledgedActionException as: hadoop cause:java.io.IOException Failed to set permissions of path:\usr\hadoop\tmp\mapred\staging\hadoop753422487\.staging to 0700
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \usr\hadoop\tmp\mapred\staging\hadoop753422487\.staging to 0700
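Solution:
Configuration conf = new Configuration();
// Submit the job to the cluster's JobTracker rather than running it locally.
conf.set("mapred.job.tracker", "[server]:9001");
In "[server]:9001", "[server]" is the IP address of the Hadoop cluster's Master.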
4.4 The "hadoop mapred staging directory permissions" problem
Reference: http://blog.****.net/azhao_dn/article/details/6921398
The error message is as follows:
job Submission failed with exception 'java.io.IOException(The ownership/permissions on the staging directory /tmp/hadoop-hadoop-user1/mapred/staging/hadoop-user1/.staging is not as expected. It is owned by hadoop-user1 and permissions are rwxrwxrwx. The directory must be owned by the submitter hadoop-user1 or by hadoop-user1 and permissions must be rwx------)
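Change the permissions of the staging directory so that they match what the error message requires (rwx------).
This resolves the problem.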
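The exact command used in the original post is not reproduced here. The error message itself states the current mode (rwxrwxrwx) and the required mode (rwx------), so the following sketch only converts those symbolic modes into the octal form that chmod-style commands, such as the "hadoop fs -chmod" mentioned in section 4.2, expect. The class name PermissionOctal and the method toOctal are hypothetical.

```java
class PermissionOctal {
    // Converts a 9-character symbolic mode such as "rwx------" into octal digits.
    static String toOctal(String symbolic) {
        if (symbolic.length() != 9) {
            throw new IllegalArgumentException("expected 9 characters: " + symbolic);
        }
        StringBuilder octal = new StringBuilder();
        // Process owner, group, and other in groups of three characters.
        for (int i = 0; i < 9; i += 3) {
            int digit = 0;
            if (symbolic.charAt(i) == 'r') digit += 4;
            if (symbolic.charAt(i + 1) == 'w') digit += 2;
            if (symbolic.charAt(i + 2) == 'x') digit += 1;
            octal.append(digit);
        }
        return octal.toString();
    }

    public static void main(String[] args) {
        System.out.println(toOctal("rwxrwxrwx")); // mode reported in the error: 777
        System.out.println(toOctal("rwx------")); // mode the error requires: 700
    }
}
```

Whether the staging directory lives on the local file system or on HDFS depends on the cluster configuration, so the target mode 700 should be applied with the tool that matches where the directory actually is.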
Remember: the Eclipse version must be the one at http://mirror.bit.edu.cn/eclipse/technology/epp/downloads/release/indigo/SR2/eclipse-jee-indigo-SR2-win32-x86_64.zip
Otherwise you will run into problems such as: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.kingdee.hadoop.WordCount$TokenizerMapper