首页 > 数据库 >【LeetCode 2701. 连续递增交易】[MySQL 用户变量/Pandas]面向过程编程得到严格递增连续子序列的开始位置,结束位置,长度等统计信息

【LeetCode 2701. 连续递增交易】[MySQL 用户变量/Pandas]面向过程编程得到严格递增连续子序列的开始位置,结束位置,长度等统计信息

时间:2024-01-22 10:07:17浏览次数:43  
标签:customer transaction 递增 2701 id date consecutive prev Pandas

目录

  • 题目地址
  • MySQL代码
  • 等效pandas代码

题目地址

https://leetcode.cn/problems/consecutive-transactions-with-increasing-amounts/

MySQL代码

# Write your MySQL query statement below

with t1 as(
    select 
    *

    # --------------------------只需要改动这里的逻辑,其他不要动。注意里面的语句是“顺序执行的”----------------------

    -- 如果customer_id切换了
    ,(case when (@prev_customer_id is null or @prev_customer_id<>customer_id) then @interval_id := @interval_id +1 end) -- 这句看你需要,如果跨顾客需要interval_id额外间隔1,你应该保留这一句。这句注释或者不注释都能AC
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @reg:=0 end)
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @prev_amoutn:=0 end)
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @consecutive_start:=transaction_date end)
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @consecutive_end:=transaction_date end)
    ,(case when @prev_customer_id is null or @prev_customer_id<>customer_id  then @prev_transaction_date:= transaction_date - interval 1 day end)  -- 当前日的上一日,哨兵
    
    , @prev_transaction_date as prev_transaction_date  -- 把这个用户变量@prev_transaction_date存到列里。



     -- 这是维护一个【严格递增连续子序列的长度】
     ,(case when amount>@prev_amount and @prev_transaction_date + interval 1 day =transaction_date then @reg:=@reg+1 else @reg:=1 end)

     ,@reg as reg   -- 把这个用户变量@reg存到列里。

     -- 更新一下consecutive_start
     ,(case when customer_id=@prev_customer_id and amount>@prev_amount and @prev_transaction_date + interval 1 day =transaction_date 
        then @consecutive_start:=@consecutive_start 
     else @consecutive_start:=transaction_date end) as consecutive_start


     -- 不管什么情况,这么更新consecutive_end总是对的
     ,@consecutive_end:=transaction_date as consecutive_end


     -- 更新一下interval_id
     ,(case when customer_id=@prev_customer_id and  amount>@prev_amount and @prev_transaction_date + interval 1 day =transaction_date 
        then @interval_id:=@interval_id 
     else @interval_id:=@interval_id+1 end) as interval_id



    -- 准备好处理下一行了,一些@prev_*用户变量更新一下
    ,@prev_amount:=amount
    ,@prev_customer_id:=customer_id
    ,@prev_transaction_date:=transaction_date
    ,@prev_amoutn:=amount

    # --------------------------只需要改动这里的逻辑,其他不要动。注意里面的语句是“顺序执行的”----------------------

    from 
    (select * from Transactions order by customer_id asc,  transaction_date asc) sorted_ticks,
    (select @reg:=0 -- 某用户每日购买量严格增加的天数(连续日)
    ,@prev_amount:=null, @prev_customer_id:=null, @prev_transaction_date:=null,@prev_amoutn:=null
    ,@consecutive_start:=null , @consecutive_end:=null,@interval_id := 0)vars
)


-- 对于每一个interval_id,找到reg最大的那一行,就是该区间最终的统计信息

select customer_id,consecutive_start,consecutive_end
from (select *, row_number() over(partition by interval_id order by reg desc) as rn from t1) tmp1
where rn=1 and reg>=3
order by customer_id,consecutive_start,consecutive_end



-- 下面注释掉的代码是用来调试的
# select 
# customer_id,transaction_date,reg,amount,prev_transaction_date,interval_id
# ,consecutive_start,consecutive_end
# from t1
# order by customer_id asc, transaction_date asc

等效pandas代码

import pandas as pd

def consecutive_increasing_transactions(transactions: pd.DataFrame) -> pd.DataFrame:
    # Ensure the DataFrame is sorted appropriately
    transactions = transactions.sort_values(by=['customer_id', 'transaction_date'])

    # Create the 'interval_id' and 'reg' columns
    transactions['interval_id'] = 0
    transactions['reg'] = 1

    # Initialize variables to track the previous row values
    prev_customer_id = None
    prev_amount = None
    prev_transaction_date = None
    interval_id = 0

    # Lists to hold the processed column data
    consecutive_starts = []
    consecutive_ends = []
    regs = []
    interval_ids = []

    for index, row in transactions.iterrows():
        customer_id = row['customer_id']
        amount = row['amount']
        transaction_date = row['transaction_date']

        # Check if it's a new customer or a new increasing transaction sequence
        if prev_customer_id != customer_id or amount <= prev_amount or \
           prev_transaction_date + pd.Timedelta(days=1) != transaction_date:
            interval_id += 1
            reg = 1
            consecutive_start = transaction_date
        else:
            reg += 1
        
        # Append the new values to the lists
        consecutive_starts.append(consecutive_start)
        consecutive_ends.append(transaction_date)
        regs.append(reg)
        interval_ids.append(interval_id)

        # Update the previous row values
        prev_customer_id = customer_id
        prev_amount = amount
        prev_transaction_date = transaction_date

    # Assign the processed lists to the DataFrame
    transactions['consecutive_start'] = consecutive_starts
    transactions['consecutive_end'] = consecutive_ends
    transactions['reg'] = regs
    transactions['interval_id'] = interval_ids

    # Filter the DataFrame to only include sequences with a 'reg' of at least 3
    transactions = transactions[transactions['reg'] >= 3]

    # Group by 'interval_id' and take the row with the maximum 'reg' in each group
    transactions = transactions.loc[transactions.groupby('interval_id')['reg'].idxmax()]

    # Select the required columns and return
    return transactions[['customer_id', 'consecutive_start', 'consecutive_end']].reset_index(drop=True)



标签:customer,transaction,递增,2701,id,date,consecutive,prev,Pandas
From: https://blog.51cto.com/u_15247503/9360642

相关文章

  • 【Leetcode 2474. 购买量严格增加的客户】[MySQL 用户变量/Pandas]面向过程编程解决严
    目录题目地址MySQL代码等效pandas代码题目地址https://leetcode.cn/problems/customers-with-strictly-increasing-purchases/description/MySQL代码#WriteyourMySQLquerystatementbelowwitht1as(selectcustomer_id,year(order_date)asmy_year,sum(price)......
  • Vue3 Diff算法之最长递增子序列,学不会来砍我!
    专栏分享:vue2源码专栏,vue3源码专栏,vuerouter源码专栏,玩具项目专栏,硬核......
  • Excel中这日期老是出来00:00:00,怎么用Pandas把这个去除
    大家好,我是皮皮。一、前言前几天在Python白银交流群【上海新年人】问了一个Pandas数据筛选的问题。问题如下:这日期老是出来00:00:00,怎么把这个去除。二、实现过程后来【论草莓如何成为冻干莓】给了一个思路和代码如下:pd.to_excel之前把这一列转换成字符串,或者你在Excel上设置......
  • pandas
    语句定义格式importpandasaspds=pd.Series(data,index,dtype,copy)data输入的数据,可以是列表、常量、ndarray数组等。index索引值必须是惟一的,如果没有传递索引,则默认为np.arrange(n)。dtypedtype表示数据类型,如果没有提供,则会自动判断得出。copy表示对da......
  • 一个excel文件,我用pandas如何取行,只能用pandas?
    大家好,我是皮皮。一、前言前几天在Python白银交流群粉丝问了一个Pandas数据筛选的问题。问题如下:我突然想到,如果一个excel文件,我用pandas如何取行,只能用pandas。二、实现过程后来【月神】给了一个思路和代码如下:其实粉丝之前也问了一个类似的花式索引的问题,可能时就间太久......
  • Python Pandas 数据分析项目实例
    ​ 假设我们有一个电子商务公司的销售数据,包含了客户购买记录、商品价格、购买日期等信息。我们的目标是分析这些数据,提取有价值的信息,例如总销售额、最畅销的产品、销售趋势等。1、安装引用Pandas如没有安装Pandas,可以通过pip安装它。参考下面的文档。然后在Python脚本......
  • pandas典型应用
    #header默认为0,即以第一列为列名,这里设为None,意味不设置第一行为列名;df11=pd.read_table(url,header=None)df11.head()#names=***,可以自定义设置列名user_cols=['id','quantity','name','choice','price']df=pd.read_table(url,header=None......
  • 【LeetCode 2701. 连续递增交易】MySQL用户变量编程得到严格递增连续子序列的开始位置
    题目地址https://leetcode.cn/problems/consecutive-transactions-with-increasing-amounts/代码#WriteyourMySQLquerystatementbelowwitht1as(select*#--------------------------只需要改动这里的逻辑,其他不要动。注意里面的语句是“顺序执行的......
  • 【Leetcode 2474. 购买量严格增加的客户】MySQL用户变量编程解决严格递增连续子序列问
    题目地址https://leetcode.cn/problems/customers-with-strictly-increasing-purchases/description/代码#WriteyourMySQLquerystatementbelowwitht1as(selectcustomer_id,year(order_date)asmy_year,sum(price)astotal_spendfromOrders......
  • 在 Pandas 中,`and` 和 `or` 是 Python 中的关键字,用于逻辑运算。但是在 Pandas 中,我们
    在Pandas中,`and`和`or`是Python中的关键字,用于逻辑运算。但是在Pandas中,我们使用`&`和`|`来表示逻辑与和逻辑或。这是因为在Pandas中,`and`和`or`会产生歧义,而`&`和`|`则不会。因此,在Pandas中,我们应该使用`&`和`|`来表示逻辑与和逻辑或,而不是`and`和......