MapReduce Error Collection: Insufficient JVM Heap Space on the Map Side

Task: INSERT_ADD_BD_DW_GENERAL_PUSH

Script contents:

hive -v -e "
use db_ecar;
set hive.map.aggr.hash.percentmemory = 0.25;


INSERT INTO TABLE BD_DW_GENERAL_PUSH  
SELECT  t4.USER_ID
  ,t1.TERMINAL_ID
  ,t4.OPEN_ID
  ,substr(t1.LASTUPDATE_TIME,1,10)                      AS TIME_ID
  ,t3.CHANNEL_TYPE                                     AS PUSH_ID
  ,t1.CONTENT_TYPE                                     AS CONTENT_ID
  ,sum(case when t1.flag in (1,2) then 1 else 0 end )  AS PUSH_SUCC_NUM
  ,sum(case when t1.flag = 3 then 1 else 0 end )       AS PUSH_FAIL_NUM
  ,sum(case when t1.flag in (0,1,2,3,4) then 1 else 0 end) AS PUSH_NUM
  ,sum(case when t1.flag = 2 then 1 else 0 end )       AS RESPONSE_NUM
  ,sum(case when t1.flag = 2 then t1.LASTUPDATE_TIME - t1.PUSH_TIME else 0 end )  AS RESPONSE_TIME
  
FROM (
  SELECT * 
  FROM MYSQL1_EOC_PUSH_SEVER_RECORD
  WHERE YEAR = substr(date_sub('$data_date',1),1,4)
    AND MONTH = substr(date_sub('$data_date',1),6,2)
    AND DAY = substr(date_sub('$data_date',1),9,2)
) t1
join BD_DW_BASIC_USER_INFO t4 on t1.TERMINAL_ID = t4.TERMINAL_ID
left join MYSQL1_EOC_PUSH_SCENES t3 on t1.SCENES_ID = t3.SCENES_ID
GROUP BY t4.USER_ID,t1.TERMINAL_ID,t4.OPEN_ID,substr(t1.LASTUPDATE_TIME,1,10),t3.CHANNEL_TYPE,t1.CONTENT_TYPE 
;  
"

Error log:

[Screenshot: MapReduce error, insufficient JVM heap space on the map side]

Error analysis:

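  The error message "Java heap space" means the JVM ran out of heap. Checking the configuration shows that mapreduce.map.java.opts (the map-side JVM heap size) is 1 GB and mapreduce.task.io.sort.mb (the map-side sort buffer) is 512 MB. The sort buffer is allocated inside the map task's heap, so only 512 MB remains for everything else. The table is fairly large, and that remaining memory is not enough.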
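
The memory budget described above can be checked with a few lines of shell arithmetic. This is only a sketch: the variable names are made up for illustration, and the two values are the ones reported in the analysis (1 GB heap, 512 MB sort buffer).

```shell
#!/bin/sh
# Hypothetical variable names; values are taken from the analysis above.
heap_mb=1024   # map-side JVM heap (mapreduce.map.java.opts), 1 GB
sort_mb=512    # map-side sort buffer (mapreduce.task.io.sort.mb)

# The sort buffer lives inside the heap, so subtract it to get
# what is left for the rest of the map task.
remaining_mb=$((heap_mb - sort_mb))

echo "heap left after the sort buffer: ${remaining_mb} MB"
```

With these numbers, half of the heap is reserved for sorting before the query processes a single row.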

 

Solution: set mapreduce.task.io.sort.mb to 100. (Note: the value is already in megabytes, so do not append "M" after 100.)
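
One possible way to apply this is a session-level "set" statement at the top of the same hive -e script. The sketch below assumes the setting is applied per session rather than in the cluster configuration files; SORT_MB and HIVE_SETTINGS are hypothetical names, and the block only calls hive when the command is available.

```shell
#!/bin/sh
# Hypothetical helper variables. The value is in MB and has no unit suffix.
SORT_MB=100
HIVE_SETTINGS="set mapreduce.task.io.sort.mb=${SORT_MB};"

if command -v hive >/dev/null 2>&1; then
  # Apply the override for this session, then print the effective value
  # ("set <key>;" with no value displays the current setting).
  hive -v -e "
  use db_ecar;
  ${HIVE_SETTINGS}
  set mapreduce.task.io.sort.mb;
  "
else
  echo "hive not found; would run: ${HIVE_SETTINGS}"
fi
```

In the original script, the same statement would sit next to the existing hive.map.aggr.hash.percentmemory line, before the INSERT. With a 1 GB heap, a 100 MB sort buffer leaves about 924 MB for the rest of the map task.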