1.题目
A,B两组产品的月平均值,月平均值是当月的前三个月值的一个平均值,注意月份是不连续的,如果当月的前面的月份不存在,则为0。如A组2023-04的月平均值为2023年1月的数据加2023-02月的数据的平均值,因为没有其他月份则需要再加一个0,再求平均值。要求:求出每个月的月平均值。
‘A’,‘2023-01’,3030
‘A’,‘2023-02’,5464
‘A’,‘2023-04’,5467
‘A’,‘2023-05’,4646
‘A’,‘2023-06’,8546
‘B’,‘2022-01’,9846
‘B’,‘2022-02’,1562
‘B’,‘2022-03’,2733
‘B’,‘2022-05’,8833
‘B’,‘2022-06’,8787
2.建表
create table if not exists non_continuous_time(
product string comment '产品号',
pro_time string comment '时间',
pro_values int comment '值'
)comment '非连续时间表'
insert into non_continuous_time values
('A','2023-01',3030),
('A','2023-02',5464),
('A','2023-04',5467),
('A','2023-05',4646),
('A','2023-06',8546),
('B','2022-01',9846),
('B','2022-02',1562),
('B','2022-03',2733),
('B','2022-05',8833),
('B','2022-06',8787)
3.思路
使用lag窗口函数,lag的偏移量可以锁定前三个月的数据,没有的显示为0;
select
product,
pro_time,
pro_values,
coalesce(lag(pro_values,1) over(partition by product order by pro_time),0) lg_one,
coalesce(lag(pro_values,2) over(partition by product ORDER BY pro_time),0) lg_two,
coalesce(lag(pro_values,3) over(partition by product ORDER BY pro_time),0) lg_thr
from non_continuous_time
最终结果:
select
a.product,
a.pro_time,
(lg_one+lg_two+lg_thr)/3 sum_values
from
(
select
product,
pro_time,
pro_values,
coalesce(lag(pro_values,1) over(partition by product order by pro_time),0) lg_one,
coalesce(lag(pro_values,2) over(partition by product ORDER BY pro_time),0) lg_two,
coalesce(lag(pro_values,3) over(partition by product ORDER BY pro_time),0) lg_thr
from non_continuous_time
)a