Debugging a MapReduce program locally with Eclipse on Windows (64-bit)

I. Environment setup

Java, Eclipse, and Hadoop 2.x (on Windows)

The Hadoop package I used: https://pan.baidu.com/s/1230HUG2HluDsP1FT-tXa-g (password: sodt). All the files in this package have already been replaced.

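First, download the 64-bit winutils.exe and hadoop.dll and copy them into the hadoop/bin directory, then replace the native libraries under lib with their Windows builds. Next, create the system environment variable HADOOP_HOME, pointing to the Hadoop directory, and add its bin directory to Path.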
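
Before starting Eclipse, it can help to confirm that the files from this step are where Hadoop will look for them. The following is a minimal sketch that uses only the JDK; the class name HadoopEnvCheck and the method missingFiles are hypothetical, and it checks only the two files named above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: checks that the Windows binaries
// mentioned above are present under <hadoopHome>/bin.
class HadoopEnvCheck {

    // Returns the names of the required files that are missing from hadoopHome/bin.
    static List<String> missingFiles(Path hadoopHome) {
        String[] required = {"winutils.exe", "hadoop.dll"};
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!Files.isRegularFile(hadoopHome.resolve("bin").resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null || home.isEmpty()) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        List<String> missing = missingFiles(Paths.get(home));
        System.out.println(missing.isEmpty()
                ? "HADOOP_HOME looks complete: " + home
                : "Missing from bin under " + home + ": " + missing);
    }
}
```

Note that Eclipse only sees a newly created environment variable after it has been restarted.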


At this point, hadoop\bin contains:

[Screenshot: contents of hadoop\bin]

and lib contains:

[Screenshot: contents of lib]

II. Running MapReduce locally

The example here is a simple wordcount program. The code is as follows.

1.WordcountMapper:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;



public class WordcountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Convert the current line to a String
        String line = value.toString();
        // Split the line into words
        String[] words = line.split(" ");
        // Iterate over the array and emit <word, 1>
        for (String word : words) {
            context.write(new Text(word), new IntWritable(1));
        }
    }
}

2.WordcountReducer

import java.io.IOException;


import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;




public class WordcountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // key: a word; values: e.g. [1, 1], exposed as an Iterable<IntWritable>
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Define a counter
        int count = 0;
        // Iterate over all values in this key group and add them to count
        for (IntWritable value : values) {
            count += value.get();
        }
        context.write(key, new IntWritable(count));
    }
}
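
To see what the mapper and reducer compute together without starting Hadoop, the same logic can be mimicked with plain collections. This is only an illustrative sketch: WordCountSketch is a hypothetical class, the input lines are made up, and a TreeMap stands in for the shuffle, which groups values by key and passes keys to the reducer in sorted order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the job: no Hadoop classes are involved.
class WordCountSketch {

    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // "map" step: split on a single space, as WordcountMapper does
            for (String word : line.split(" ")) {
                // "reduce" step: add the emitted 1 to the running total for this key
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up input lines, not the contents of the author's input file
        System.out.println(count(Arrays.asList("hello world", "hello hadoop")));
        // prints {hadoop=1, hello=2, world=1}
    }
}
```

Because split(" ") splits on every single space, a line containing two consecutive spaces yields an empty-string token, which is counted like any other word.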

3.WordcountDriver:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;



public class WordcountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Whether the job runs in local mode depends on whether this value is "local";
        // "local" is also the default
        conf.set("mapreduce.framework.name", "local");
        // Use the local file system instead of HDFS
        conf.set("fs.defaultFS", "file:///");

        Job job = Job.getInstance(conf);
        // Locate the job jar from this class
        job.setJarByClass(WordcountDriver.class);

        // Specify the mapper and reducer classes for this job
        job.setMapperClass(WordcountMapper.class);
        job.setReducerClass(WordcountReducer.class);

        // Specify the key/value types of the mapper output
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Specify the key/value types of the final output
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Specify the directory that holds the job's input files
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        // Specify the directory for the job's output
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean res = job.waitForCompletion(true);
        System.exit(res ? 0 : 1);
    }
}
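
FileOutputFormat refuses to start a job whose output directory already exists, so a second debug run with the same arguments fails until that directory is removed. Since everything here lives on the local file system, one option is to clear the directory with plain JDK calls before each run. The sketch below is an assumption-laden illustration: OutputDirCleaner is a hypothetical class, and it deletes whatever path it is given, so it should be pointed only at a disposable output directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop.
class OutputDirCleaner {

    // Deletes dir and everything under it; does nothing if dir does not exist.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order lists children before their parent directory
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: OutputDirCleaner <output dir>");
            return;
        }
        // args[0] is assumed to be the directory passed to the driver as args[1]
        deleteRecursively(Paths.get(args[0]));
    }
}
```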

Note: the MR program must be configured to run locally:

conf.set("mapreduce.framework.name", "local");
conf.set("fs.defaultFS", "file:///");

III. Debugging with Eclipse

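Right-click the main class --> Debug Configurations, and set the program arguments: the input path (args[0]) and the output directory (args[1]). In this run they point to locations under D:/flowsum, as the console output shows.

[Screenshot: Eclipse debug configuration]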

Once this is configured, start debugging.

Console output:

2018-03-05 21:24:33,151 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1019)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2018-03-05 21:24:33,155 INFO  [main] jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
2018-03-05 21:24:33,505 WARN  [main] mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(150)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2018-03-05 21:24:33,507 WARN  [main] mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(259)) - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2018-03-05 21:24:33,776 INFO  [main] input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 1
2018-03-05 21:24:33,837 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(396)) - number of splits:1
2018-03-05 21:24:33,934 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(479)) - Submitting tokens for job: job_local2074308044_0001
2018-03-05 21:24:33,966 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2368)) - file:/tmp/hadoop-Administrator/mapred/staging/hadoop2074308044/.staging/job_local2074308044_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2018-03-05 21:24:33,972 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2368)) - file:/tmp/hadoop-Administrator/mapred/staging/hadoop2074308044/.staging/job_local2074308044_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2018-03-05 21:24:34,125 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2368)) - file:/tmp/hadoop-Administrator/mapred/local/localRunner/hadoop/job_local2074308044_0001/job_local2074308044_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2018-03-05 21:24:34,131 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2368)) - file:/tmp/hadoop-Administrator/mapred/local/localRunner/hadoop/job_local2074308044_0001/job_local2074308044_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2018-03-05 21:24:34,138 INFO  [main] mapreduce.Job (Job.java:submit(1289)) - The url to track the job: http://localhost:8080/
2018-03-05 21:24:34,139 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1334)) - Running job: job_local2074308044_0001
2018-03-05 21:24:34,140 INFO  [Thread-4] mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(471)) - OutputCommitter set in config null
2018-03-05 21:24:34,152 INFO  [Thread-4] mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(489)) - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2018-03-05 21:24:34,203 INFO  [Thread-4] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(448)) - Waiting for map tasks
2018-03-05 21:24:34,203 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner (LocalJobRunner.java:run(224)) - Starting task: attempt_local2074308044_0001_m_000000_0
2018-03-05 21:24:34,242 INFO  [LocalJobRunner Map Task Executor #0] util.ProcfsBasedProcessTree (ProcfsBasedProcessTree.java:isAvailable(181)) - ProcfsBasedProcessTree currently is supported only on Linux.
2018-03-05 21:24:34,585 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task (Task.java:initialize(587)) -  Using ResourceCalculatorProcessTree : [email protected]
2018-03-05 21:24:34,590 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:runNewMapper(733)) - Processing split: file:/D:/flowsum/input/w.txt:0+47
2018-03-05 21:24:34,632 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:createSortingCollector(388)) - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2018-03-05 21:24:34,675 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:setEquator(1182)) - (EQUATOR) 0 kvi 26214396(104857584)
2018-03-05 21:24:34,675 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(975)) - mapreduce.task.io.sort.mb: 100
2018-03-05 21:24:34,675 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(976)) - soft limit at 83886080
2018-03-05 21:24:34,676 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(977)) - bufstart = 0; bufvoid = 104857600
2018-03-05 21:24:34,676 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(978)) - kvstart = 26214396; length = 6553600
2018-03-05 21:24:34,687 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) -
2018-03-05 21:24:34,687 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:flush(1437)) - Starting flush of map output
2018-03-05 21:24:34,687 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:flush(1455)) - Spilling map output
2018-03-05 21:24:34,687 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:flush(1456)) - bufstart = 0; bufend = 84; bufvoid = 104857600
2018-03-05 21:24:34,688 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:flush(1458)) - kvstart = 26214396(104857584); kvend = 26214360(104857440); length = 37/6553600
2018-03-05 21:24:34,717 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:sortAndSpill(1641)) - Finished spill 0
2018-03-05 21:24:34,731 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task (Task.java:done(1001)) - Task:attempt_local2074308044_0001_m_000000_0 is done. And is in the process of committing
2018-03-05 21:24:34,788 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - map
2018-03-05 21:24:34,788 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task (Task.java:sendDone(1121)) - Task 'attempt_local2074308044_0001_m_000000_0' done.
2018-03-05 21:24:34,788 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner (LocalJobRunner.java:run(249)) - Finishing task: attempt_local2074308044_0001_m_000000_0
2018-03-05 21:24:34,789 INFO  [Thread-4] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(456)) - map task executor complete.
2018-03-05 21:24:34,796 INFO  [Thread-4] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(448)) - Waiting for reduce tasks
2018-03-05 21:24:34,797 INFO  [pool-3-thread-1] mapred.LocalJobRunner (LocalJobRunner.java:run(302)) - Starting task: attempt_local2074308044_0001_r_000000_0
2018-03-05 21:24:34,808 INFO  [pool-3-thread-1] util.ProcfsBasedProcessTree (ProcfsBasedProcessTree.java:isAvailable(181)) - ProcfsBasedProcessTree currently is supported only on Linux.
2018-03-05 21:24:35,142 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1355)) - Job job_local2074308044_0001 running in uber mode : false
2018-03-05 21:24:35,143 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1362)) -  map 100% reduce 0%
2018-03-05 21:24:35,252 INFO  [pool-3-thread-1] mapred.Task (Task.java:initialize(587)) -  Using ResourceCalculatorProcessTree : [email protected]
2018-03-05 21:24:35,284 INFO  [pool-3-thread-1] mapred.ReduceTask (ReduceTask.java:run(362)) - Using ShuffleConsumerPlugin: [email protected]
2018-03-05 21:24:35,333 INFO  [pool-3-thread-1] reduce.MergeManagerImpl (MergeManagerImpl.java:<init>(193)) - MergerManager: memoryLimit=1321893888, maxSingleShuffleLimit=330473472, mergeThreshold=872449984, ioSortFactor=10, memToMemMergeOutputsThreshold=10
2018-03-05 21:24:35,365 INFO  [localfetcher#1] reduce.LocalFetcher (LocalFetcher.java:copyMapOutput(140)) - localfetcher#1 about to shuffle output of map attempt_local2074308044_0001_m_000000_0 decomp: 106 len: 110 to MEMORY
2018-03-05 21:24:35,368 INFO  [EventFetcher for fetching Map Completion Events] reduce.EventFetcher (EventFetcher.java:run(61)) - attempt_local2074308044_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
2018-03-05 21:24:40,808 INFO  [communication thread] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - reduce > copy
2018-03-05 21:24:42,990 INFO  [localfetcher#1] reduce.InMemoryMapOutput (InMemoryMapOutput.java:shuffle(100)) - Read 106 bytes from map-output for attempt_local2074308044_0001_m_000000_0
2018-03-05 21:24:43,808 INFO  [communication thread] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - reduce > copy
2018-03-05 21:24:46,809 INFO  [communication thread] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - reduce > copy
2018-03-05 21:24:48,472 INFO  [localfetcher#1] reduce.MergeManagerImpl (MergeManagerImpl.java:closeInMemoryFile(307)) - closeInMemoryFile -> map-output of size: 106, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->106
2018-03-05 21:24:48,949 INFO  [EventFetcher for fetching Map Completion Events] reduce.EventFetcher (EventFetcher.java:run(76)) - EventFetcher is interrupted.. Returning
2018-03-05 21:24:48,951 INFO  [pool-3-thread-1] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - 1 / 1 copied.
2018-03-05 21:24:48,951 INFO  [pool-3-thread-1] reduce.MergeManagerImpl (MergeManagerImpl.java:finalMerge(667)) - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
2018-03-05 21:24:48,995 INFO  [pool-3-thread-1] mapred.Merger (Merger.java:merge(591)) - Merging 1 sorted segments
2018-03-05 21:24:48,995 INFO  [pool-3-thread-1] mapred.Merger (Merger.java:merge(690)) - Down to the last merge-pass, with 1 segments left of total size: 99 bytes
2018-03-05 21:24:48,997 INFO  [pool-3-thread-1] reduce.MergeManagerImpl (MergeManagerImpl.java:finalMerge(742)) - Merged 1 segments, 106 bytes to disk to satisfy reduce memory limit
2018-03-05 21:24:48,997 INFO  [pool-3-thread-1] reduce.MergeManagerImpl (MergeManagerImpl.java:finalMerge(772)) - Merging 1 files, 110 bytes from disk
2018-03-05 21:24:48,998 INFO  [pool-3-thread-1] reduce.MergeManagerImpl (MergeManagerImpl.java:finalMerge(787)) - Merging 0 segments, 0 bytes from memory into reduce
2018-03-05 21:24:48,998 INFO  [pool-3-thread-1] mapred.Merger (Merger.java:merge(591)) - Merging 1 sorted segments
2018-03-05 21:24:48,999 INFO  [pool-3-thread-1] mapred.Merger (Merger.java:merge(690)) - Down to the last merge-pass, with 1 segments left of total size: 99 bytes
2018-03-05 21:24:48,999 INFO  [pool-3-thread-1] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - 1 / 1 copied.
2018-03-05 21:24:49,045 INFO  [pool-3-thread-1] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1019)) - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
2018-03-05 21:24:49,096 INFO  [pool-3-thread-1] mapred.Task (Task.java:done(1001)) - Task:attempt_local2074308044_0001_r_000000_0 is done. And is in the process of committing
2018-03-05 21:24:49,097 INFO  [pool-3-thread-1] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - 1 / 1 copied.
2018-03-05 21:24:49,098 INFO  [pool-3-thread-1] mapred.Task (Task.java:commit(1162)) - Task attempt_local2074308044_0001_r_000000_0 is allowed to commit now
2018-03-05 21:24:49,100 INFO  [pool-3-thread-1] output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_local2074308044_0001_r_000000_0' to file:/D:/flowsum/output/_temporary/0/task_local2074308044_0001_r_000000
2018-03-05 21:24:49,102 INFO  [pool-3-thread-1] mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(591)) - reduce > reduce
2018-03-05 21:24:49,103 INFO  [pool-3-thread-1] mapred.Task (Task.java:sendDone(1121)) - Task 'attempt_local2074308044_0001_r_000000_0' done.
2018-03-05 21:24:49,103 INFO  [pool-3-thread-1] mapred.LocalJobRunner (LocalJobRunner.java:run(325)) - Finishing task: attempt_local2074308044_0001_r_000000_0
2018-03-05 21:24:49,103 INFO  [Thread-4] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
2018-03-05 21:24:49,145 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1362)) -  map 100% reduce 100%
2018-03-05 21:24:49,145 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1373)) - Job job_local2074308044_0001 completed successfully
2018-03-05 21:24:49,154 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1380)) - Counters: 33
    File System Counters
        FILE: Number of bytes read=640
        FILE: Number of bytes written=473924
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=4
        Map output records=10
        Map output bytes=84
        Map output materialized bytes=110
        Input split bytes=93
        Combine input records=0
        Combine output records=0
        Reduce input groups=10
        Reduce shuffle bytes=110
        Reduce input records=10
        Reduce output records=10
        Spilled Records=20
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=0
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=464388096
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=47
    File Output Format Counters
        Bytes Written=76

Debugging succeeded. Here is the output directory:

[Screenshot: contents of the output directory]
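
The driver does not set an output format, so the default TextOutputFormat applies: the single reduce task writes a file named part-r-00000 in which each line holds a key and a value separated by a tab. A small sketch for reading such a file back with plain JDK calls follows; OutputReader is a hypothetical class, and the path to the result file is supplied as an argument rather than assumed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop.
class OutputReader {

    // Parses "word<TAB>count" lines into a map that keeps the file's line order.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // skip lines that do not look like key<TAB>value
            }
            counts.put(line.substring(0, tab), Integer.parseInt(line.substring(tab + 1)));
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: OutputReader <path to part-r-00000>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        parse(lines).forEach((word, n) -> System.out.println(word + " -> " + n));
    }
}
```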

OK!