首页 > 数据库 >SQLZOO记录 2:窗口函数

SQLZOO记录 2:窗口函数

时间:2022-09-18 19:25:26浏览次数:100  
标签:窗口 函数 whn SELECT party constituency ORDER SQLZOO name

SQL 窗口函数包括 LAG、 LEAD、 RANK 和 NTILE。这些函数在行的“窗口”上运行——通常是表中某种意义上相邻的行。

9- Window functions

  1. Show the lastName, party and votes for the constituency 'S14000024' in 2017.
SELECT lastName, party, votes
  FROM ge
 WHERE constituency = 'S14000024' AND yr = 2017
ORDER BY votes DESC
  1. You can use the RANK function to see the order of the candidates. If you RANK using (ORDER BY votes DESC) then the candidate with the most votes has rank 1.

Show the party and RANK for constituency S14000024 in 2017. List the output by party

学习 RANK() 用法

SELECT party, votes,
       RANK() OVER (ORDER BY votes DESC) AS posn
  FROM ge
 WHERE constituency = 'S14000024' AND yr = 2017
ORDER BY party

结果显示系统错误???

  1. The 2015 election is a different PARTITION to the 2017 election. We only care about the order of votes for each year.

Use PARTITION to show the ranking of each party in S14000021 in each year. Include yr, party, votes and ranking (the party with the most votes is 1).

SELECT yr, party, votes,
      RANK() OVER (PARTITION BY yr ORDER BY votes DESC) AS posn
  FROM ge
 WHERE constituency = 'S14000021'
ORDER BY party, yr
  1. Edinburgh constituencies are numbered S14000021 to S14000026.

Use PARTITION BY constituency to show the ranking of each party in Edinburgh in 2017. Order your results so the winners are shown first, then ordered by constituency.

SELECT constituency, party, votes, RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) AS posn
  FROM ge
 WHERE constituency BETWEEN 'S14000021' AND 'S14000026'
   AND yr  = 2017
ORDER BY posn, constituency
  1. You can use SELECT within SELECT to pick out only the winners in Edinburgh.

Show the parties that won for each Edinburgh constituency in 2017.

SELECT constituency, party
  FROM (SELECT constituency, party, votes, RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) AS posn
          FROM ge
         WHERE constituency BETWEEN 'S14000021' AND 'S14000026'
           AND yr  = 2017) AS x
 WHERE x.posn = 1
  1. Show how many seats for each party in Scotland in 2017.

You can use COUNT and GROUP BY to see how each party did in Scotland. Scottish constituencies start with 'S'

SELECT party, COUNT(*)
  FROM (SELECT constituency, party, votes,
        RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) AS posn
          FROM ge
         WHERE constituency LIKE 'S%'
           AND yr = 2017) AS x
 WHERE x.posn = 1
GROUP BY x.party

SELECT party, COUNT(*)
  FROM ge x
 WHERE constituency LIKE 'S%'
   AND yr  = 2017
   AND votes = (SELECT MAX(votes)
                  FROM ge y
                 WHERE y.constituency = x.constituency
                   AND yr = 2017)
GROUP BY party

9+ Window LAG - COVID 19

  1. The example uses a WHERE clause to show the cases in 'Italy' in March 2020.

Modify the query to show data from Spain

SELECT name, DAY(whn), confirmed, deaths, recovered
  FROM covid
 WHERE name = 'Spain'
   AND MONTH(whn) = 3 AND YEAR(whn) = 2020
ORDER BY whn
  1. The LAG function is used to show data from the preceding row or the table. When lining up rows the data is partitioned by country name and ordered by the data whn. That means that only data from Italy is considered.

Modify the query to show confirmed for the day before.

SELECT name, DAY(whn), confirmed, LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn) AS dbf
  FROM covid
 WHERE name = 'Italy'
   AND MONTH(whn) = 3
   AND YEAR(whn) = 2020
ORDER BY whn

从结果中了解 dbf 列中的值如何与左上方对角线上方的行的值匹配
image

  1. The number of confirmed case is cumulative - but we can use LAG to recover the number of new cases reported for each day.

Show the number of new cases for each day, for Italy, for March.

SELECT name, DAY(whn), (confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn)) AS new
  FROM covid
 WHERE name = 'Italy'
   AND MONTH(whn) = 3
   AND YEAR(whn) = 2020
ORDER BY whn
  1. The data gathered are necessarily estimates and are inaccurate. However by taking a longer time span we can mitigate some of the effects.

You can filter the data to view only Monday's figures WHERE WEEKDAY(whn) = 0.

Show the number of new cases in Italy for each week in 2020 - show Monday only.

SELECT name, DATE_FORMAT(whn,'%Y-%m-%d'), (confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn)) AS newThisWeek
  FROM covid
 WHERE name = 'Italy'
   AND WEEKDAY(whn) = 0
   AND YEAR(whn) = 2020
ORDER BY whn
  1. You can JOIN a table using DATE arithmetic. This will give different results if data is missing.

Show the number of new cases in Italy for each week - show Monday only.

In the sample query we JOIN this week tw with last week lw using the DATE_ADD function.

SELECT tw.name, DATE_FORMAT(tw.whn,'%Y-%m-%d'), tw.confirmed - lw.confirmed
  FROM covid tw LEFT JOIN covid lw ON (DATE_ADD(lw.whn, INTERVAL 1 WEEK) = tw.whn)
   AND (tw.name = lw.name)
 WHERE tw.name = 'Italy'
   AND WEEKDAY(tw.whn) = 0
ORDER BY tw.whn
  1. The query shown shows the number of confirmed cases together with the world ranking for cases.

United States has the highest number, Spain is number 2...

Notice that while Spain has the second highest confirmed cases, Italy has the second highest number of deaths due to the virus.

Include the ranking for the number of deaths in the table.

SELECT name, confirmed, RANK() OVER (ORDER BY confirmed DESC) rc, deaths, RANK() OVER (ORDER BY deaths DESC) rc
  FROM covid
 WHERE whn = '2020-04-20'
ORDER BY confirmed DESC
  1. The query shown includes a JOIN t the world table so we can access the total population of each country and calculate infection rates (in cases per 100,000).

Show the infect rate ranking for each country. Only include countries with a population of at least 10 million.

有结果但错误
SELECT world.name, ROUND(100000 * confirmed / population) AS infectionRate, RANK() OVER (ORDER BY confirmed / population DESC) rk
  FROM covid JOIN world ON (covid.name = world.name)
 WHERE whn = '2020-04-20'
   AND population >= 10000000
ORDER BY rk

没有给出正确输出

  1. For each country that has had at last 1000 new cases in a single day, show the date of the peak number of new cases.
结果和答案一样但错误
WITH temp1 AS (
  SELECT *, (confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn)) day_count
  FROM covid
),
 temp2 AS (
  SELECT name, MAX(day_count) peak_cases
  FROM temp1
  GROUP BY name
  HAVING peak_cases > 1000
)
SELECT temp2.name, DATE_FORMAT(whn, '%Y-%m-%d') date_, peak_cases
FROM temp2 LEFT JOIN temp1 ON (temp2.name = temp1.name) AND (temp2.peak_cases = temp1.day_count)
ORDER BY date

标签:窗口,函数,whn,SELECT,party,constituency,ORDER,SQLZOO,name
From: https://www.cnblogs.com/Jojo-L/p/16705151.html

相关文章

  • 多对一查找四种函数公式
    问题:多对一查找函数解决:{=INDEX(C:C,MATCH(F2&G2,A:A&B:B,))}=XLOOKUP(F2&G2,A:A&B:B,C:C,"查无此妖")=FILTER(C$1:C$7,(A$1:A$7=F2)*(B$1:B$7=G2),"查无此妖")=SUMI......
  • 9.2.2 信号函数signal详解
    信号处理函数的定义为:voidsignal_handler(intsignum)可以理解为:参数为int型,返回值为void型的函数;信号函数signal()定义如下:void(*signal(intsignum,void(*ha......
  • 【Azure 应用服务】Azure Durable Function(持久函数)在执行Activity Function时候,因
    问题描述在使用AzureDurableFunction函数,调用函数链模式来调用多个ActivityFunction。函数链:https://docs.azure.cn/zh-cn/azure-functions/durable/durable-function......
  • 【C#】设置窗口随意移动
     窗口标题栏隐藏后无法在拖动需要自己获取移动坐标来设置窗口位置  //外部需要两个集变量privatePointmouseOff;//鼠标移动位置变量......
  • 【java8新特性】01:函数式编程及Lambda入门
    我们首先需要先了解什么是函数式编程、函数式编程是一种结构化编程范式、类似于数学函数、它关注的重点在于数据操作、或者说它所提倡的思想是做什么,而不是如何去做。自J......
  • Python函数-2
    一.导入$$f(x,y)=2x+3y$$上面括号里面的就是数学公式里的自变量,自变量就相当于函数里的参数。二.为什么要有参数如果一个大楼里有两种尺寸不一的窗户,显然在......
  • 修改-Python函数-2
    一.导入$$f(x,y)=2x+3y$$上面括号里面的就是数学公式里的自变量,自变量就相当于函数里的参数。二.为什么要有参数如果一个大楼里有两种尺寸不一的窗户,显然在......
  • 了解 SQL 排名函数:Row_Number、Rank 和 Dense_Rank
    了解SQL排名函数:Row_Number、Rank和Dense_RankPhotoby卡斯帕卡米尔鲁宾on不飞溅自从我开始学习SQL以来,一直很困惑,无法清楚地理解RankingFunctions之间的......
  • 《UNIX环境高级编程》系统函数open read write lseek stat等
    linux下文件权限设置中的数字表示:https://blog.csdn.net/u013063153/article/details/53113979chmodABCfile其中A、B、C各为一个数字,分别表示User、Group、及Other......
  • Flink的时间和窗口
    Flink中的时间及时流处理是有状态流处理的扩展,实现及时流处理的时间起到了很大的作用。在Flink的时间概念中主要分为下面两种:事件时间:事件时间是每个单独事件在其生......