Hive多个子查询和组由
问题描述:
我正在将统计数据从MySQL切换到Amazon DynamoDB和Elastic MapReduce。Hive多个子查询和组由
我有查询波纹管与MySQL的工作,我有蜂巢上的同一个表,并需要在MySQL(last_week,last_month和last_year的产品视图)相同的结果。
SELECT product_id,
SELECT COUNT(product_id) from dev_product_views_hive as P2 where P2.product_id=P.product_id and created >= DATE_SUB(NOW(), INTERVAL 1 WEEK) as weekly,
SELECT count(product_id) from dev_product_views_hive as P3 where P3.product_id=P.product_id and created >= DATE_SUB(NOW(), INTERVAL 1 MONTH) as monthly,
SELECT count(product_id) from dev_product_views_hive as P4 where P4.product_id=P.product_id and created >= DATE_SUB(NOW(), INTERVAL 1 YEAR) as yearly
from dev_product_views_hive as P group by product_id;
我想通了,如何获得具有蜂巢例如结果上个月:
SELECT product_id, COUNT(product_id) as views from dev_product_views_hive WHERE created >= UNIX_TIMESTAMP(CONCAT(DATE_SUB(FROM_UNIXTIME(UNIX_TIMESTAMP()), 31)," ","00:00:00")) GROUP BY product_id;
,但我需要分组结果就像我得到与MySQL:
product_id views_last_week views_last_month views_last_year
2 564 2460 29967
4 980 3986 54982
是否有可能与蜂巢做到这一点?
谢谢你在前进,
阿米尔
答
您可以case when
和sum()
或count()
例如,做到这一点。
select product_id,
sum(case when created >= concat(date_sub(to_date(from_unixtime(unix_timestamp())), 7)," 00:00:00") then 1 else 0 end) as weekly,
sum(case when created >= concat(date_sub(to_date(from_unixtime(unix_timestamp())), 31)," 00:00:00") then 1 else 0 end) as monthly,
sum(case when created >= concat(date_sub(to_date(from_unixtime(unix_timestamp())), 365)," 00:00:00") then 1 else 0 end) as yearly
from dev_product_views_hive
group by product_id;
concat(date_sub(to_date(from_unixtime(unix_timestamp())), days)," 00:00:00")
将返回当前时间过去的格式化字符串。创建>=
了您的预期
您也可以与蜂巢内置函数做count()
只计算那些行返回非NULL
和
case when
将reterun 1
你试过了吗? – 2013-03-04 12:39:26
是的,总是得到解析错误,我不知道如何在hiveql中正确编写这个MySQL子查询。 Hiveql不像MySQL那样的语法。 – trkich 2013-03-04 14:07:25
hive ql仅支持子句 – 2013-03-04 17:25:16