1. 聚合函数介绍
- 什么是聚合函数
- 聚合函数作用于一组数据,并对一组数据返回一个值
- 聚合函数类型
- AVG()
- SUM()
- MAX()
- MIN()
- COUNT()
1.1 AVG和SUM函数
- 可以对数值型数据使用AVG 和 SUM 函数
mysql> SELECT AVG(salary), MAX(salary),MIN(salary), SUM(salary)
-> FROM employees WHERE job_id LIKE '%REP%';
+-------------+-------------+-------------+-------------+
| AVG(salary) | MAX(salary) | MIN(salary) | SUM(salary) |
+-------------+-------------+-------------+-------------+
| 8272.727273 | 11500.00 | 6000.00 | 273000.00 |
+-------------+-------------+-------------+-------------+
1 row in set (0.00 sec)
mysql>
1.2 MIN和MAX函数
- 可以对任意数据类型的数据使用 MIN 和 MAX 函数
mysql> SELECT MIN(hire_date), MAX(hire_date) FROM employees;
+----------------+----------------+
| MIN(hire_date) | MAX(hire_date) |
+----------------+----------------+
| 1987-06-17 | 2000-04-21 |
+----------------+----------------+
1 row in set (0.00 sec)
mysql>
1.3 COUNT函数
- COUNT(*)返回表中记录总数,适用于任意数据类型
- COUNT(expr) 返回expr不为空的记录总数
mysql> SELECT COUNT(employee_id), COUNT(salary), COUNT(1), COUNT(commission_pct) FROM employees;
+--------------------+---------------+----------+-----------------------+
| COUNT(employee_id) | COUNT(salary) | COUNT(1) | COUNT(commission_pct) |
+--------------------+---------------+----------+-----------------------+
| 107 | 107 | 107 | 35 |
+--------------------+---------------+----------+-----------------------+
1 row in set (0.00 sec)
mysql>
# COUNT(expr) 返回expr不为空的记录总数
mysql> SELECT COUNT(commission_pct) FROM employees WHERE department_id = 50;
+-----------------------+
| COUNT(commission_pct) |
+-----------------------+
| 0 |
+-----------------------+
1 row in set (0.00 sec)
mysql>
- 查询表中有多少条记录的实现方式:
- 方式1:COUNT(*)
- 方式1:COUNT(1) # 常数
- 方式1:COUNT(具体字段) :返回的个数不一定对!因为计算指定字段出现的个数时,是不计算NULL值的
# 公式:AVG = SUM / COUNT
mysql> SELECT AVG(salary), SUM(salary)/COUNT(salary),
-> AVG(commission_pct), SUM(commission_pct)/COUNT(commission_pct),SUM(commission_pct)/107
-> FROM employees;
+-------------+---------------------------+---------------------+-------------------------------------------+-------------------------+
| AVG(salary) | SUM(salary)/COUNT(salary) | AVG(commission_pct) | SUM(commission_pct)/COUNT(commission_pct) | SUM(commission_pct)/107 |
+-------------+---------------------------+---------------------+-------------------------------------------+-------------------------+
| 6461.682243 | 6461.682243 | 0.222857 | 0.222857 | 0.072897 |
+-------------+---------------------------+---------------------+-------------------------------------------+-------------------------+
1 row in set (0.00 sec)
mysql>
2. 分组
2.1 基本使用
- 可以使用GROUP BY子句将表中的数据分成若干组
SELECT column, group_function(column)
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[ORDER BY column];
明确:
WHERE一定放在FROM后面
- 在SELECT列表中所有未包含在组函数中的列都应该包含在 GROUP BY子句中
mysql> SELECT department_id, AVG(salary) FROM employees GROUP BY department_id ;
+---------------+--------------+
| department_id | AVG(salary) |
+---------------+--------------+
| NULL | 7000.000000 |
| 10 | 4400.000000 |
| 20 | 9500.000000 |
| 30 | 4150.000000 |
| 40 | 6500.000000 |
| 50 | 3475.555556 |
| 60 | 5760.000000 |
| 70 | 10000.000000 |
| 80 | 8955.882353 |
| 90 | 19333.333333 |
| 100 | 8600.000000 |
| 110 | 10150.000000 |
+---------------+--------------+
12 rows in set (0.00 sec)
mysql>
- 包含在 GROUP BY 子句中的列不必包含在SELECT 列表中
2.2 使用多个列分组
SELECT中出现的非组函数的字段必须声明再GROUP BY中
- 反之,GROUP BY中声明的字段可以不出现在SELECT中
mysql> SELECT department_id dept_id, job_id, SUM(salary) FROM employees GROUP BY department_id, job_id;
+---------+------------+-------------+
| dept_id | job_id | SUM(salary) |
+---------+------------+-------------+
| 90 | AD_PRES | 24000.00 |
| 90 | AD_VP | 34000.00 |
| 60 | IT_PROG | 28800.00 |
| 100 | FI_MGR | 12000.00 |
| 100 | FI_ACCOUNT | 39600.00 |
| 30 | PU_MAN | 11000.00 |
| 30 | PU_CLERK | 13900.00 |
| 50 | ST_MAN | 36400.00 |
| 50 | ST_CLERK | 55700.00 |
| 80 | SA_MAN | 61000.00 |
| 80 | SA_REP | 243500.00 |
| NULL | SA_REP | 7000.00 |
| 50 | SH_CLERK | 64300.00 |
| 10 | AD_ASST | 4400.00 |
| 20 | MK_MAN | 13000.00 |
| 20 | MK_REP | 6000.00 |
| 40 | HR_REP | 6500.00 |
| 70 | PR_REP | 10000.00 |
| 110 | AC_MGR | 12000.00 |
| 110 | AC_ACCOUNT | 8300.00 |
+---------+------------+-------------+
20 rows in set (0.01 sec)
mysql>
- GROUP BY 声明在FROM后面、WHERE后面,ORDER BY 前面、LIMIT前面
- 使用 WITH ROLLUP 关键字之后,在所有查询出的分组记录之后增加一条记录,该记录计算查询出的所有记录的总和,即统计记录数量。
mysql> SELECT department_id,AVG(salary)
-> FROM employees
-> GROUP BY department_id;
+---------------+--------------+
| department_id | AVG(salary) |
+---------------+--------------+
| NULL | 7000.000000 |
| 10 | 4400.000000 |
| 20 | 9500.000000 |
| 30 | 4150.000000 |
| 40 | 6500.000000 |
| 50 | 3475.555556 |
| 60 | 5760.000000 |
| 70 | 10000.000000 |
| 80 | 8955.882353 |
| 90 | 19333.333333 |
| 100 | 8600.000000 |
| 110 | 10150.000000 |
+---------------+--------------+
12 rows in set (0.00 sec)
mysql>
mysql> SELECT department_id,AVG(salary)
-> FROM employees
-> GROUP BY department_id WITH ROLLUP;
+---------------+--------------+
| department_id | AVG(salary) |
+---------------+--------------+
| NULL | 7000.000000 |
| 10 | 4400.000000 |
| 20 | 9500.000000 |
| 30 | 4150.000000 |
| 40 | 6500.000000 |
| 50 | 3475.555556 |
| 60 | 5760.000000 |
| 70 | 10000.000000 |
| 80 | 8955.882353 |
| 90 | 19333.333333 |
| 100 | 8600.000000 |
| 110 | 10150.000000 |
| NULL | 6461.682243 |
+---------------+--------------+
13 rows in set (0.00 sec)
mysql>
注意:
当使用ROLLUP时,不能同时使用ORDER BY子句进行结果排序,即ROLLUP和ORDER BY是互相排斥的
3. HAVING(用来过滤数据的)
3.1 基本使用
- 过滤分组:HAVING子句
- 行已经被分组
- 使用了聚合函数
- 满足HAVING 子句中条件的分组将被显示
- HAVING 不能单独使用,必须要跟 GROUP BY 一起使用
# 查询各个部门中最高工资比10000高的部门信息
# 错误的写法:
mysql> SELECT department_id,MAX(salary)
-> FROM employees
-> WHERE MAX(salary) > 10000
-> GROUP BY department_id;
ERROR 1111 (HY000): Invalid use of group function
mysql>
#正确的写法:
mysql> SELECT department_id,MAX(salary)
-> FROM employees
-> GROUP BY department_id
-> HAVING MAX(salary) > 10000;
+---------------+-------------+
| department_id | MAX(salary) |
+---------------+-------------+
| 20 | 13000.00 |
| 30 | 11000.00 |
| 80 | 14000.00 |
| 90 | 24000.00 |
| 100 | 12000.00 |
| 110 | 12000.00 |
+---------------+-------------+
6 rows in set (0.00 sec)
mysql>
3.2 WHERE和HAVING的对比
- 区别1:从适用范围上来讲,HAVING的适用范围更广
- 区别2:如果过滤条件中没有聚合函数:这种情况下,WHERE的执行效率要高于HAVING
- 开发中的选择:
- WHERE 和 HAVING 也不是互相排斥的,我们可以在一个查询里面同时使用 WHERE 和 HAVING。
- 包含分组统计函数的条件用 HAVING,普通条件用 WHERE。
- 这样,我们就既利用了 WHERE 条件的高效快速,又发挥了 HAVING 可以使用包含分组统计函数的查询条件的优点。当数据量特别大的时候,运行效率会有很
大的差别。
# 练习:查询部门id为10,20,30,40这4个部门中最高工资比10000高的部门信息
# 方式1:推荐,执行效率高于方式2.
mysql> SELECT department_id,MAX(salary)
-> FROM employees
-> WHERE department_id IN (10,20,30,40)
-> GROUP BY department_id
-> HAVING MAX(salary) > 10000;
+---------------+-------------+
| department_id | MAX(salary) |
+---------------+-------------+
| 20 | 13000.00 |
| 30 | 11000.00 |
+---------------+-------------+
2 rows in set (0.01 sec)
mysql>
# 方式2:
mysql> SELECT department_id,MAX(salary)
-> FROM employees
-> GROUP BY department_id
-> HAVING MAX(salary) > 10000 AND department_id IN (10,20,30,40);
+---------------+-------------+
| department_id | MAX(salary) |
+---------------+-------------+
| 20 | 13000.00 |
| 30 | 11000.00 |
+---------------+-------------+
2 rows in set (0.00 sec)
mysql>
4. SELECT的执行过程
4.1 SELECT 语句的完整结构
- sql92语法:
#sql92语法:
SELECT ....,....,....(存在聚合函数)
FROM ...,....,....
WHERE 多表的连接条件 AND 不包含聚合函数的过滤条件
GROUP BY ...,....
HAVING 包含聚合函数的过滤条件
ORDER BY ....,...(ASC / DESC )
LIMIT ...,....
- sql99语法:
#sql99语法:
SELECT ....,....,....(存在聚合函数)
FROM ... (LEFT / RIGHT)JOIN ....ON 多表的连接条件
(LEFT / RIGHT)JOIN ... ON ....
WHERE 不包含聚合函数的过滤条件
GROUP BY ...,....
HAVING 包含聚合函数的过滤条件
ORDER BY ....,...(ASC / DESC )
LIMIT ...,....
4.2 SQL语句的执行过程
- 关键字的顺序是不能颠倒的
- SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ... ORDER BY ... LIMIT...
- SELECT 语句的执行顺序(在 MySQL 和 Oracle 中,SELECT 执行顺序基本相同)
- FROM ...,...-> ON -> (LEFT/RIGNT JOIN) -> WHERE -> GROUP BY -> HAVING -> SELECT -> DISTINCT -> ORDER BY -> LIMIT