MapReduce error notes: insufficient JVM heap space on the map side
Task: INSERT_ADD_BD_DW_GENERAL_PUSH
Script content:
hive -v -e "use db_ecar;
set hive.map.aggr.hash.percentmemory = 0.25;
INSERT INTO TABLE BD_DW_GENERAL_PUSH
SELECT t4.USER_ID
,t1.TERMINAL_ID
,t4.OPEN_ID
,substr(t1.LASTUPDATE_TIME,1,10) AS TIME_ID
,t3.CHANNEL_TYPE AS PUSH_ID
,t1.CONTENT_TYPE AS CONTENT_ID
,sum(case when t1.flag in (1,2) then 1 else 0 end ) AS PUSH_SUCC_NUM
,sum(case when t1.flag = 3 then 1 else 0 end ) AS PUSH_FAIL_NUM
,sum(case when t1.flag in (0,1,2,3,4) then 1 else 0 end) AS PUSH_NUM
,sum(case when t1.flag = 2 then 1 else 0 end ) AS RESPONSE_NUM
,sum(case when t1.flag = 2 then t1.LASTUPDATE_TIME - t1.PUSH_TIME else 0 end ) AS RESPONSE_TIME
FROM (
SELECT *
FROM MYSQL1_EOC_PUSH_SEVER_RECORD
WHERE YEAR = substr(date_sub('$data_date',1),1,4)
AND MONTH = substr(date_sub('$data_date',1),6,2)
AND DAY = substr(date_sub('$data_date',1),9,2)
) t1
join BD_DW_BASIC_USER_INFO t4 on t1.TERMINAL_ID = t4.TERMINAL_ID
left join MYSQL1_EOC_PUSH_SCENES t3 on t1.SCENES_ID = t3.SCENES_ID
GROUP BY t4.USER_ID,t1.TERMINAL_ID,t4.OPEN_ID,substr(t1.LASTUPDATE_TIME,1,10),t3.CHANNEL_TYPE,t1.CONTENT_TYPE
;
"
Error log:
Error analysis:
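The error message "Java heap space" means the JVM ran out of heap. Checking the configuration showed that mapreduce.map.java.opts gives the map-side JVM a 1 GB heap, while mapreduce.task.io.sort.mb (the map-side sort buffer) was set to 512 MB. The sort buffer is allocated from the map task's heap, so only about 512 MB was left for everything else, and that was not enough for a table of this size.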
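The memory budget described above can be checked with a few lines of shell arithmetic. This is only a rough sketch: the variable names are made up for illustration, the 1024 and 512 figures come from the analysis, and 100 is the value the solution settles on. It ignores every other consumer of heap inside the map task.

```shell
# Values in MB, taken from this note (variable names are illustrative).
map_heap_mb=1024      # heap from mapreduce.map.java.opts (1 GB)
sort_mb_before=512    # original mapreduce.task.io.sort.mb
sort_mb_after=100     # mapreduce.task.io.sort.mb after the fix

# Heap left for the rest of the map task once the sort buffer is reserved.
remaining_before=$((map_heap_mb - sort_mb_before))
remaining_after=$((map_heap_mb - sort_mb_after))

echo "heap left with sort.mb=${sort_mb_before}: ${remaining_before} MB"
echo "heap left with sort.mb=${sort_mb_after}: ${remaining_after} MB"
```

With the smaller sort buffer, roughly 924 MB of the 1 GB heap stays available instead of 512 MB.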
Solution: lower mapreduce.task.io.sort.mb to 100. (Note: the value is a bare number of megabytes; do not append "M" after the 100.)
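One way to apply the change is per job, inside the script itself. The fragment below is a sketch that assumes the setting is placed next to the existing set statement in the hive -e string; the note does not say whether the change was actually made in the script or in the cluster configuration.

```sql
set mapreduce.task.io.sort.mb=100;
set hive.map.aggr.hash.percentmemory = 0.25;
```

Setting it in the script limits the change to this task and leaves other jobs on the cluster default.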