Pig UDF in Java: ERROR 1066: Unable to open iterator for alias
Problem description:
I am new to Pig. My input data is:
(Message,NIL,2015-07-01,22:58:53.66,E,machine.com.name,12,0xd6,String,String,0,0.0,key=value&key=123456789&key=value&key=US&key=Company&key=Message&key=123456789&key=String&key=String&Key=String&Key=String)
The Java UDF I wrote is as follows:
package com.pig.udf;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class PigUDF extends EvalFunc<Map> {

    @Override
    public Map<String, String> exec(Tuple input) throws IOException {
        // If tuple is null, has fewer than 3 values, or has an even number of
        // values
        if (input == null || input.size() < 3 || (input.size() % 2 == 0)) {
            throw new IOException("Incorrect number of values.");
        }
        String source = (String) input.get(0);
        System.out.println("input Source" + source);
        String delim = (input.size() > 1) ? (String) input.get(1) : "&";
        int length = (input.size() > 2) ? (Integer) input.get(2) : 0;
        if (source == null || delim == null) {
            return null;
        }
        String[] splits = source.split(delim, length);
        System.out.println("Splits" + splits);
        ArrayList<String> arrayList = new ArrayList<String>(
                Arrays.asList(splits));
        Map<String, String> map = new HashMap<String, String>();
        for (String keyValue : arrayList) {
            int end = keyValue.indexOf('=');
            if (end != -1) {
                map.put(keyValue.substring(0, end), keyValue.substring(end + 1));
            }
        }
        System.out.println("map" + map);
        return map;
    }
}
When I run my Pig script with the Java UDF above to parse the last string of the input data, I get the following error:
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias C
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias C
at org.apache.pig.PigServer.openIterator(PigServer.java:892)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:607)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
at org.apache.pig.PigServer.openIterator(PigServer.java:884)
... 13 more
Application Log
-------------------------------------------------------------------
Application application_1436453941326_0020 failed 2 times due to AM Container for appattempt_1436453941326_0020_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://quickstart.cloudera:8088/proxy/application_1436453941326_0020/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1436453941326_0020_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
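My script runs fine without the Java UDF and produces an output file. The problem appears only when I include the Java UDF in my Pig script. There is no Java version mismatch between my Java UDF and the machine running Pig. Any pointers would be appreciated.
Pig script: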
Register '/home/cloudera/Pig/PigUDF_1.7.jar';
Register '/home/cloudera/Pig/pig.jar';
A= Load 'Logs_message.txt' using PigStorage(',') as (component:chararray,Nil:chararray,date:chararray,time:chararray,E:chararray,machine_address:chararray,number1:chararray,hex_number:chararray,cal_type:chararray,cal_name:chararray,number2:chararray,number3:chararray,data:chararray)
B = filter A by cal_name matches 'CHANGEDMESSAGE';
C = foreach B generate cal_name ,com.pig.udf.PigUDF(data) as dataMap;
dump C ;
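Answer
I see 3 problems with your code:
- You missed a semicolon on the first line (the Load statement). I'm not sure how it ran at all, so I assume this is a mistake from copying it to StackOverflow.
- You named a field "E", which is a reserved word. I'm not sure what effect that has, but to be safe I wouldn't do it. See the Pig documentation for the list of reserved keywords.
- (This is probably what causes the error.) Your validation doesn't make sense. It looks like you wrote a split function that takes three or fewer arguments (the string to split, the delimiter, and the maximum number of splits). However, your check rejects any input with fewer than three arguments, and also any input with an even number of arguments. Your script calls the UDF with a single argument (data), so the check throws an IOException every time. This looks like validation meant for the string after it is split, not for the arguments before.
It should be something like this: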
if (input == null || input.size() == 0 || input.size() > 3) {
    throw new IOException("Incorrect number of values.");
}
//...
if (splits.length % 2 != 0) {
    throw new IOException("Invalid key value pairs");
}
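To try the corrected argument check without a cluster, here is a minimal sketch in plain Java with no Pig dependency. The class name KeyValueParser and the parse method are hypothetical; the List<Object> stands in for the fields of the Pig Tuple. Two choices here are my assumptions and not part of the snippet above: the delimiter is treated literally via Pattern.quote, and instead of checking the parity of the split count, each token is rejected if it contains no '='.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class KeyValueParser {

    // Hypothetical stand-in for the UDF's exec(): args holds the source
    // string, an optional delimiter, and an optional split limit.
    public static Map<String, String> parse(List<Object> args) {
        // Validate the argument count first: between 1 and 3 values.
        if (args == null || args.isEmpty() || args.size() > 3) {
            throw new IllegalArgumentException("Incorrect number of values.");
        }
        String source = (String) args.get(0);
        String delim = (args.size() > 1) ? (String) args.get(1) : "&";
        int limit = (args.size() > 2) ? (Integer) args.get(2) : 0;
        if (source == null || delim == null) {
            return null;
        }
        Map<String, String> map = new HashMap<>();
        // Assumption: treat the delimiter as a literal string, not a regex.
        for (String token : source.split(Pattern.quote(delim), limit)) {
            int eq = token.indexOf('=');
            // Assumption: a token without '=' is an invalid pair.
            if (eq == -1) {
                throw new IllegalArgumentException("Invalid key value pair: " + token);
            }
            map.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return map;
    }

    public static void main(String[] args) {
        // One argument, as in the script's call PigUDF(data).
        Map<String, String> m = parse(Arrays.<Object>asList("country=US&id=123456789"));
        System.out.println(m.get("country")); // prints US
        System.out.println(m.get("id"));      // prints 123456789
    }
}
```

With a single argument the check now passes and the default delimiter "&" is used. Passing four or more values is rejected, which matches the intent of a function that takes at most a string, a delimiter, and a limit.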
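I suggest you don't run programs on Hadoop in the cloud until you have debugged them; get them working locally first. If you use the PigServer class, you can debug the UDF on your development machine from Eclipse or another IDE.
How are you calling the UDF? Also, please look for more detailed logs. – Frederic
Can you paste the Pig script where you call the UDF? I think the problem is in your Pig script. – Abhi
Hi @Fred, where can I find more detailed logs? – Divya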