Note: reference article:
SQL consecutive growth problem -- HQL interview question 35 (CSDN blog): https://blog.csdn.net/godlovedaniel/article/details/119080882
0 Requirements analysis
We have an order table shop_order with the fields shop_id, order_id, order_time, order_amt. We need to find the shops (shop_id) whose sales amount grew for at least 3 consecutive days.
1 Data preparation
create table shop_order (
    shop_id    int,
    order_amt  int,
    order_time string
)
row format delimited fields terminated by '\t';
load data local inpath "/opt/module/hive_data/shop_order.txt" into table shop_order;
2 Data analysis
The complete code is as follows:
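with tmp as (
    -- daily sales total per shop
    select shop_id,
           to_date(order_time) as dt,
           sum(order_amt) as amt
    from shop_order
    group by shop_id, to_date(order_time)
)
select shop_id
from (
    select *,
           -- check whether the dates are consecutive:
           -- consecutive dates give the same value of dt minus row number
           date_sub(dt, row_number() over (partition by shop_id order by dt)) as order_date_diff
    from (
        select shop_id,
               dt,
               amt,
               -- check whether sales grew: difference between the current row's
               -- amount and the previous row's (a shop's first day is compared with the default 0)
               amt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_diff
        from tmp
    ) t1
    -- a difference greater than 0 means sales grew
    where order_amt_diff > 0
) t2
group by shop_id, order_date_diff
having count(1) >= 3;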
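The output is shop_id 2.
Walkthrough of the code above:
step1: Find, for each shop, the records whose sales amount grew relative to the shop's previous record.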
with tmp as (
    select shop_id,
           to_date(order_time) as dt,
           sum(order_amt) as amt
    from shop_order
    group by shop_id, to_date(order_time)
)
select *
from (
    select shop_id,
           dt,
           amt,
           -- check whether sales grew: difference between the current row's
           -- amount and the previous row's
           amt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_diff
    from tmp
) t1
-- a difference greater than 0 means sales grew
where order_amt_diff > 0;
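To make step1 concrete, here is a minimal Python sketch of the same lag-and-difference logic for a single shop. The helper name `lag` and the daily totals are hypothetical, chosen only for illustration; they are not taken from the original data set.

```python
from datetime import date


def lag(values, n=1, default=None):
    """For each position i, return values[i - n], or `default` when no such row exists."""
    return [values[i - n] if i - n >= 0 else default for i in range(len(values))]


# Hypothetical daily totals for one shop, already sorted by date.
daily = [
    (date(2021, 5, 10), 100),
    (date(2021, 5, 11), 150),
    (date(2021, 5, 12), 120),
    (date(2021, 5, 13), 200),
]

amounts = [amt for _, amt in daily]
# Same idea as: amt - lag(amt, 1, 0) over (partition by shop_id order by dt)
diffs = [amt - prev for amt, prev in zip(amounts, lag(amounts, 1, 0))]
# Same idea as: where order_amt_diff > 0
growth_rows = [(dt, amt, diff) for (dt, amt), diff in zip(daily, diffs) if diff > 0]
```

Because the default is 0, the first day is kept as a growth row whenever its amount is positive. That is worth keeping in mind when deciding what "3 consecutive days of growth" should count.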
step2: Find the shops with at least 3 consecutive days of growing sales. On top of step1, the dates dt must also be consecutive.
with tmp as (
    select shop_id,
           to_date(order_time) as dt,
           sum(order_amt) as amt
    from shop_order
    group by shop_id, to_date(order_time)
)
select *,
       -- check whether the dates are consecutive:
       -- consecutive dates give the same value of dt minus row number
       date_sub(dt, row_number() over (partition by shop_id order by dt)) as order_date_diff
from (
    select shop_id,
           dt,
           amt,
           -- check whether sales grew: difference between the current row's
           -- amount and the previous row's
           amt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_diff
    from tmp
) t1
-- a difference greater than 0 means sales grew
where order_amt_diff > 0;
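The date-minus-row-number trick in step2 can be checked by hand with a short Python sketch. The function name `date_anchors` and the sample dates are assumptions made for this illustration only.

```python
from datetime import date, timedelta


def date_anchors(dates):
    """Pair each date with (date - 1-based row number); `dates` must be sorted ascending."""
    return [(d, d - timedelta(days=i)) for i, d in enumerate(dates, start=1)]


# Hypothetical growth dates for one shop: a 3-day run, a gap, then a 2-day run.
growth_dates = [
    date(2021, 5, 10),
    date(2021, 5, 11),
    date(2021, 5, 12),
    date(2021, 5, 14),
    date(2021, 5, 15),
]

anchors = date_anchors(growth_dates)
```

Within a run of consecutive dates, both the date and the row number advance by one per row, so their difference stays constant. A gap in the dates shifts the difference, which starts a new group.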
step3: Group by the shop shop_id and the date difference order_date_diff to get the final result.
with tmp as (
    select shop_id,
           to_date(order_time) as dt,
           sum(order_amt) as amt
    from shop_order
    group by shop_id, to_date(order_time)
)
select shop_id
from (
    select *,
           -- check whether the dates are consecutive
           date_sub(dt, row_number() over (partition by shop_id order by dt)) as order_date_diff
    from (
        select shop_id,
               dt,
               amt,
               -- check whether sales grew: difference between the current row's
               -- amount and the previous row's
               amt - lag(amt, 1, 0) over (partition by shop_id order by dt) as order_amt_diff
        from tmp
    ) t1
    -- a difference greater than 0 means sales grew
    where order_amt_diff > 0
) t2
group by shop_id, order_date_diff
having count(1) >= 3;
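As a cross-check of the three steps together, the following Python sketch mirrors the query end to end. It is not the Hive execution itself. The function name `shops_with_growth_streak` and the sample orders are hypothetical, and the sketch deliberately keeps the query's `lag(amt, 1, 0)` default.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def shops_with_growth_streak(orders, min_days=3):
    """orders: iterable of (shop_id, order_amt, 'YYYY-MM-DD HH:MM:SS') tuples."""
    # tmp: daily sales total per (shop_id, dt)
    daily = defaultdict(int)
    for shop_id, amt, ts in orders:
        dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()
        daily[(shop_id, dt)] += amt

    # partition by shop_id order by dt
    per_shop = defaultdict(list)
    for (shop_id, dt), amt in sorted(daily.items()):
        per_shop[shop_id].append((dt, amt))

    result = set()
    for shop_id, rows in per_shop.items():
        prev = 0  # same default as lag(amt, 1, 0)
        growth_dates = []
        for dt, amt in rows:
            if amt - prev > 0:  # where order_amt_diff > 0
                growth_dates.append(dt)
            prev = amt
        # group by (dt - row_number) and count rows per group
        runs = Counter(dt - timedelta(days=i) for i, dt in enumerate(growth_dates, start=1))
        if any(n >= min_days for n in runs.values()):  # having count(1) >= 3
            result.add(shop_id)
    return sorted(result)


# Hypothetical orders: shop 1 dips on the second day, shop 2 grows three days in a row.
sample_orders = [
    (1, 100, "2021-05-10 10:03:54"),
    (1, 101, "2021-05-10 10:04:54"),
    (1, 150, "2021-05-11 09:00:00"),
    (1, 300, "2021-05-12 09:00:00"),
    (2, 100, "2021-05-10 09:00:00"),
    (2, 200, "2021-05-11 09:00:00"),
    (2, 300, "2021-05-12 09:00:00"),
]
qualifying = shops_with_growth_streak(sample_orders)
```

Returning a set of shop ids also avoids listing the same shop twice when it has more than one qualifying run.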
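3 Summary
date_sub (date subtraction function)
- Syntax: date_sub(string startdate, int days)
- Return type: string (date since Hive 2.1.0)
- Description: returns the date that is days days before the start date startdate
- Example: select date_sub('2024-02-01', 3) ---> 2024-01-29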
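For readers who want to verify date_sub results outside Hive, a rough Python equivalent is sketched below. It assumes the input is an ISO `YYYY-MM-DD` string and is only an approximation of the Hive function, not a reimplementation of it.

```python
from datetime import date, timedelta


def date_sub(startdate: str, days: int) -> str:
    """Return the ISO date that is `days` days before `startdate` ('YYYY-MM-DD')."""
    return (date.fromisoformat(startdate) - timedelta(days=days)).isoformat()


example = date_sub("2024-02-01", 3)  # "2024-01-29", matching the example above
```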
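lag
- Syntax: lag(column, n, default) over (partition by ... order by ...)
- Description: returns the value of column from the row n rows before the current one; if that row does not exist, returns the default value default
For "consecutive dates" problems, the usual approach is: first compute date_sub(dt, row_number() over (partition by shop_id order by dt)) as dt_diff, then group by dt_diff and take count().
For "consecutive growth of xx" problems, the usual approach is: use the offset functions lag or lead to fetch the previous/next row, compute the difference diff between the two, label each row with a conditional such as if(diff > 0, 1, 0) as flag, and then do the subsequent grouping and aggregation on that flag.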
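The flag-based approach can be sketched in Python as follows. The function name `longest_growth_run` is hypothetical. The sketch assumes the amounts are already one row per consecutive day for a single shop, and it gives no flag to the first row instead of comparing it with a default.

```python
from itertools import groupby


def longest_growth_run(amounts):
    """Length of the longest run of consecutive rows whose amount exceeds the previous row's."""
    # flag = if(diff > 0, 1, 0), computed for every row that has a previous row
    flags = [1 if cur - prev > 0 else 0 for prev, cur in zip(amounts, amounts[1:])]
    # group adjacent equal flags and measure the runs of 1s
    return max(
        (sum(1 for _ in grp) for flag, grp in groupby(flags) if flag == 1),
        default=0,
    )


# Hypothetical daily totals: two increases, one drop, then three increases.
run_length = longest_growth_run([100, 120, 150, 140, 160, 170, 180])
```

A run of k flagged rows spans k + 1 days, so the threshold to compare against depends on whether the first day of a streak is meant to count.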