通过水槽将事件数据写入HDFS时出错
问题描述:
我正在使用cdh3更新4 tarball进行开发。我已经运行起来了。现在,我也从cloudera viz 1.1.0下载了等价的flume tarball,并试图使用hdfs-sink将日志文件尾部写入hdfs。当我运行flume agent时,它会开始好,但当它试图将新的事件数据写入hdfs时会出错。我无法找到更好的组来发布此问题比stackoverflow。 这里是水槽配置我使用通过水槽将事件数据写入HDFS时出错
agent.sources=exec-source
agent.sinks=hdfs-sink
agent.channels=ch1
agent.sources.exec-source.type=exec
agent.sources.exec-source.command=tail -F /locationoffile
agent.sinks.hdfs-sink.type=hdfs
agent.sinks.hdfs-sink.hdfs.path=hdfs://localhost:8020/flume
agent.sinks.hdfs-sink.hdfs.filePrefix=apacheaccess
agent.channels.ch1.type=memory
agent.channels.ch1.capacity=1000
agent.sources.exec-source.channels=ch1
agent.sinks.hdfs-sink.channel=ch1
而且,这是一个被显示在控制台当收到新的事件数据,并试图将其写入到HDFS错误的小片段。
13/03/16 17:59:21 INFO hdfs.BucketWriter: Creating hdfs://localhost:8020/user/hdfs-user/flume/apacheaccess.1363436060424.tmp
13/03/16 17:59:22 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Failed on local exception: java.io.IOException: Broken pipe; Host Details : local host is: "sumit-HP-Pavilion-dv3-Notebook-PC/127.0.0.1"; destination host is: "localhost":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
at org.apache.hadoop.ipc.Client.call(Client.java:1164)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy9.create(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy9.create(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:192)
at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1298)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1317)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1215)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1173)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:272)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:261)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:78)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:805)
at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1060)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:369)
at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:65)
at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:49)
at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:190)
at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:50)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:157)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:154)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:127)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:154)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:316)
at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:718)
at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:715)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
at sun.nio.ch.IOUtil.write(IOUtil.java:71)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:62)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:153)
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:114)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:861)
at org.apache.hadoop.ipc.Client.call(Client.java:1141)
... 37 more
13/03/16 17:59:27 INFO hdfs.BucketWriter: Creating hdfs://localhost:8020/user/hdfs-user/flume/apacheaccess.1363436060425.tmp
13/03/16 17:59:27 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Failed on local exception: java.io.IOException: Broken pipe; Host Details : local host is: "sumit-HP-Pavilion-dv3-Notebook-PC/127.0.0.1"; destination host is: "localhost":8020;
答
随着人们Cloudera的邮件列表suggest,有此错误的可能的理由:
- HDFS的安全模式已开启。尝试运行
hadoop fs -safemode leave
并查看错误是否消失。 - Flume和Hadoop版本不匹配。要检查这个问题,请将hadoop-core.jar替换为hadoop安装文件夹中的hadoop-core.jar。
该命令实际上是'hadoop dfsadmin -safemode leave'。在我看来,这是问题的瓶子。 – maksimov 2013-07-16 16:28:25