首页 > 数据库 >【Leetcode 2474. 购买量严格增加的客户】[MySQL 用户变量/Pandas]面向过程编程解决严格递增连续子序列问题

【Leetcode 2474. 购买量严格增加的客户】[MySQL 用户变量/Pandas]面向过程编程解决严格递增连续子序列问题

时间:2024-01-22 10:06:49浏览次数:37  
标签:customer 2474 t1 id 严格 year prev my Pandas

目录

  • 题目地址
  • MySQL代码
  • 等效pandas代码

题目地址

https://leetcode.cn/problems/customers-with-strictly-increasing-purchases/description/

MySQL代码

# Write your MySQL query statement below

with t1 as(
    select customer_id, year(order_date) as my_year
    , sum(price) as total_spend
    from Orders
    group by customer_id,my_year
)
,
t2 as(
    select
    *

    # --------------------------只需要改动这里的逻辑,其他不要动。注意里面的语句是“顺序执行的”----------------------
    
    -- 如果customer_id切换了
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @reg:=0 end)
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @prev_total_spend:=0 end)
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @prev_my_year:= my_year - 1 end)  -- 当前年的上一年,哨兵
    
    , @prev_my_year as prev_my_year  -- 没啥用,纯粹调试用的,看这个用户变量@prev_my_year是否符合预期。



    -- 年购买量严格增加的年数(连续年)该任务是否成功,ok不一定成功,failure一定是失败了
    ,(case when total_spend>@prev_total_spend and @prev_my_year + 1 =my_year then "flag=ok" else "flag=failure" end) as flag

    -- 这是维护一个【严格递增连续子序列的长度】
    ,(case when total_spend>@prev_total_spend and @prev_my_year + 1 =my_year then @reg:=@reg+1 else @reg:=1 end)

    ,@reg as reg   -- 没啥用,纯粹调试用的,看这个用户变量@reg是否符合预期。



    -- 准备好处理下一行了,一些@prev_*用户变量更新一下
    ,@prev_total_spend:=total_spend
    ,@prev_customer_id:=customer_id
    ,@prev_my_year:=my_year

    # --------------------------只需要改动这里的逻辑,其他不要动。注意里面的语句是“顺序执行的”----------------------

    from (select * from t1 order by  customer_id asc, my_year asc ) soerted_t1,
    (select @reg := 0    -- 某用户年购买量严格增加的年数(连续年)
    ,@prev_total_spend:=0  ,@prev_customer_id := null, @prev_my_year :=null ) vars
)



select distinct customer_id
from t2
where customer_id not in (select customer_id from t2 where flag="flag=failure")
order by customer_id asc, my_year asc 


-- 下面注释掉的代码是用来调试的
# select 
# customer_id,my_year,reg,total_spend,prev_my_year,flag
# from t2
# where customer_id=6
# order by customer_id asc, my_year asc

等效pandas代码

import pandas as pd

def find_specific_customers(orders: pd.DataFrame) -> pd.DataFrame:
    # First, we mimic the 't1' CTE by grouping and summing the price per customer per year
    orders['my_year'] = pd.DatetimeIndex(orders['order_date']).year
    t1 = orders.groupby(['customer_id', 'my_year'], as_index=False)['price'].sum().rename(columns={'price': 'total_spend'})
    
    # Sort the DataFrame similarly to the 'soerted_t1' in the query
    t1_sorted = t1.sort_values(by=['customer_id', 'my_year'])
    
    # Now we need to perform the logic that's done in 't2' CTE
    # We will use the shift operation to compare the current row with the previous one
    t1_sorted['prev_customer_id'] = t1_sorted['customer_id'].shift(1)
    t1_sorted['prev_my_year'] = t1_sorted['my_year'].shift(1)
    t1_sorted['prev_total_spend'] = t1_sorted['total_spend'].shift(1)
    
    # Initialize the reg (registration) to track the strictly increasing sequence length
    t1_sorted['reg'] = 1
    t1_sorted['flag'] = 'flag=ok'
    
    # Loop through the DataFrame to calculate conditional variables
    for i in range(1, len(t1_sorted)):
        row = t1_sorted.iloc[i]
        prev_row = t1_sorted.iloc[i - 1]
        
        if row['customer_id'] == prev_row['customer_id']:
            if row['total_spend'] > prev_row['total_spend'] and row['prev_my_year'] + 1 == row['my_year']:
                t1_sorted.at[i, 'reg'] = prev_row['reg'] + 1
            else:
                t1_sorted.at[i, 'flag'] = 'flag=failure'
                t1_sorted.at[i, 'reg'] = 1

    # Identify customers with no 'flag=failure'
    customers_without_failure = t1_sorted[~t1_sorted['customer_id'].isin(t1_sorted.loc[t1_sorted['flag'] == 'flag=failure', 'customer_id'])]['customer_id'].unique()
    
    # Return a DataFrame with the customer_id's that meet the criteria
    result = pd.DataFrame({'customer_id': customers_without_failure})
    
    return result.sort_values('customer_id').reset_index(drop=True)



标签:customer,2474,t1,id,严格,year,prev,my,Pandas
From: https://blog.51cto.com/u_15247503/9360648

相关文章

  • 孩子从小生长在知识分子的家庭,或者传统的书香门第,家里规矩多管教严格,会对孩子性格造成
    农家子弟 单亲家庭 农耕之家    工薪阶层 农民工  中国旧社会的人总爱是“穷有穷根,富有富根”、“龙生龙,凤生凤,老鼠儿子会打洞”,现在这些老话都被推翻了,都说这是封建血统论和封建门第观。可是事实胜过强辩,我们发现民国学者大多是出身于书香门第和富贵之家。孩......
  • Excel中这日期老是出来00:00:00,怎么用Pandas把这个去除
    大家好,我是皮皮。一、前言前几天在Python白银交流群【上海新年人】问了一个Pandas数据筛选的问题。问题如下:这日期老是出来00:00:00,怎么把这个去除。二、实现过程后来【论草莓如何成为冻干莓】给了一个思路和代码如下:pd.to_excel之前把这一列转换成字符串,或者你在Excel上设置......
  • pandas
    语句定义格式importpandasaspds=pd.Series(data,index,dtype,copy)data输入的数据,可以是列表、常量、ndarray数组等。index索引值必须是惟一的,如果没有传递索引,则默认为np.arrange(n)。dtypedtype表示数据类型,如果没有提供,则会自动判断得出。copy表示对da......
  • 一个excel文件,我用pandas如何取行,只能用pandas?
    大家好,我是皮皮。一、前言前几天在Python白银交流群粉丝问了一个Pandas数据筛选的问题。问题如下:我突然想到,如果一个excel文件,我用pandas如何取行,只能用pandas。二、实现过程后来【月神】给了一个思路和代码如下:其实粉丝之前也问了一个类似的花式索引的问题,可能时就间太久......
  • Python Pandas 数据分析项目实例
    ​ 假设我们有一个电子商务公司的销售数据,包含了客户购买记录、商品价格、购买日期等信息。我们的目标是分析这些数据,提取有价值的信息,例如总销售额、最畅销的产品、销售趋势等。1、安装引用Pandas如没有安装Pandas,可以通过pip安装它。参考下面的文档。然后在Python脚本......
  • pandas典型应用
    #header默认为0,即以第一列为列名,这里设为None,意味不设置第一行为列名;df11=pd.read_table(url,header=None)df11.head()#names=***,可以自定义设置列名user_cols=['id','quantity','name','choice','price']df=pd.read_table(url,header=None......
  • 【LeetCode 2701. 连续递增交易】MySQL用户变量编程得到严格递增连续子序列的开始位置
    题目地址https://leetcode.cn/problems/consecutive-transactions-with-increasing-amounts/代码#WriteyourMySQLquerystatementbelowwitht1as(select*#--------------------------只需要改动这里的逻辑,其他不要动。注意里面的语句是“顺序执行的......
  • 【Leetcode 2474. 购买量严格增加的客户】MySQL用户变量编程解决严格递增连续子序列问
    题目地址https://leetcode.cn/problems/customers-with-strictly-increasing-purchases/description/代码#WriteyourMySQLquerystatementbelowwitht1as(selectcustomer_id,year(order_date)asmy_year,sum(price)astotal_spendfromOrders......
  • 在 Pandas 中,`and` 和 `or` 是 Python 中的关键字,用于逻辑运算。但是在 Pandas 中,我们
    在Pandas中,`and`和`or`是Python中的关键字,用于逻辑运算。但是在Pandas中,我们使用`&`和`|`来表示逻辑与和逻辑或。这是因为在Pandas中,`and`和`or`会产生歧义,而`&`和`|`则不会。因此,在Pandas中,我们应该使用`&`和`|`来表示逻辑与和逻辑或,而不是`and`和......
  • # yyds干货盘点 # pandas单独设一个新列,譬如从2023-11-1到2023-11-31怎么搞法?
    大家好,我是皮皮。一、前言前几天在Python白银交流群【上海新年人】问了一个Pandas日期数据生成的问题,问题如下:大佬们,如果我想单独设一个新列,譬如从2023-11-1到2023-11-31怎么搞法?或者这个数据有11行,可不可以自行设置成2023-11-15到2023-11-26这样子。二、实现过程后来【论草莓如何......