大数据开发面试题
1、一个股票的表里有如下字段id,code,time,price价格,请编写一个SQL统计改股票的所有波峰和波谷。(面试官口述的)
当时面试的时候我没有思路,在回来的路上突然想到一个思路也不知道是否正确。做了一个实验如下
hive> CREATE TABLE IF NOT EXISTS stock(
id int ,
code bigint ,
time string ,
price double )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS textfile
;
1 | 1 | 2017-12-14 12:03:06 | 12 | |
2 | 2 | 1 | 2017-12-14 12:03:07 | 11 |
3 | 3 | 1 | 2017-12-14 12:03:08 | 13 |
4 | 4 | 1 | 2017-12-14 12:03:09 | 10 |
5 | 5 | 1 | 2017-12-14 12:03:10 | 15 |
6 | 6 | 2 | 2017-12-14 12:03:11 | 18 |
7 | 7 | 2 | 2017-12-14 12:03:12 | 16 |
8 | 8 | 2 | 2017-12-14 12:03:13 | 19 |
select a.id,a.price,
case when b.price is null then "未知"
when c.price is null then "未知"
when a.price>b.price and a.price>c.price then "波峰"
when a.price<b.price and a.price<c.price then "波谷"
else "未知" end as mark
from default.stock a
left join
default.stock b on a.code=b.code and b.id=a.id-1
left join
default.stock c on a.code=c.code and c.id=a.id+1
或者
select code,
time,price,
case when g_price is null or d_price is null
then 'unknown'
when price<g_price and price<d_price
then 'lower'
when price>g_price and price>d_price
then 'hight'
ELSE NULL END as p
from (select code,time,price,lag(price,1)over(order by time asc) as g_price,lead(price,1)over(order by time asc) as d_price
from stock
) a
2、表数据如何变成
实现如下结果
code p1
1 12,11 ,13,10 ,15
2 18,16,19
select code,concat_ws(',',collect_set(price)) p1 from stock group by code
3.如何把上面的结果
code p1
1 12,11 ,13,10 ,15
2 18,16,19 变回如下结果
select code,add_new from (select code,concat_ws(',',collect_set(price)) p1 from stock group by code) a lateral view explode(split(a.p1,',')) ad as add_new
大数据运维群584912368,欢迎运维同学加入