
Conditionally replacing values in pandas (where & mask)

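In day-to-day analysis we often need to filter and transform data. loc and iloc select the rows or columns to work on; this post introduces a more advanced kind of conditional selection: where and mask.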

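pd.where (shorthand here for the where method of a Series or DataFrame): replaces the values where the condition is False

pd.mask (shorthand for the mask method of a Series or DataFrame): replaces the values where the condition is True

np.where: chooses between two values by condition, much like a ternary expression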

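# Where the condition is False, the value is replaced with other
# (errors and try_cast belong to older pandas releases; pandas 2.x no longer accepts them)
DataFrame.where(cond, other=nan, inplace=False,
                axis=None, level=None, errors='raise', try_cast=False)
# Where the condition is True, the value is replaced with other
DataFrame.mask(cond, other=nan, inplace=False,
               axis=None, level=None, errors='raise', try_cast=False)
# x where the condition is True, y where it is False
np.where(condition, x, y)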
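
To make the difference between the three calls concrete, here is a minimal sketch on a small Series. The name `s`, its values and the replacement value -1 are made up for the example and are not part of the score table used later.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only
s = pd.Series([10, 70, 55, 90])
cond = s > 60

# where: keep values where cond is True, replace the rest with other
kept = s.where(cond, -1)

# mask: replace values where cond is True, keep the rest
masked = s.mask(cond, -1)

# np.where: pick x where cond is True and y otherwise; returns a NumPy array
picked = np.where(cond, "high", "low")

print(kept.tolist())    # [-1, 70, -1, 90]
print(masked.tolist())  # [10, -1, 55, -1]
print(picked.tolist())  # ['low', 'high', 'low', 'high']
```

where and mask start from existing values and overwrite some of them, while np.where builds a new array from two alternatives.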

First, simulate a table of student scores:

import pandas as pd
import numpy as np

# Subjects
subjects = ['math', 'chinese', 'english', 'history']

# Students
students = ['Tom', 'Alice', 'Bobby', 'Candy', 'David', 'Eva', 'Frank', 'Grace', 'Howard', 'Ivy',
            'John', 'Karen', 'Larry', 'Marie', 'Nancy', 'Oscar', 'Peter', 'Queen', 'Robert', 'Susan']

# Generate random scores (high is exclusive, so scores fall in 0..99)
score = np.random.randint(low=0, high=100, size=(len(students), len(subjects)))

# Build the DataFrame
df = pd.DataFrame(
    score,
    columns=subjects,
    index=students
)

df

	math	chinese	english	history
Tom	24	57	60	44
Alice	92	25	64	26
Bobby	96	61	94	96
Candy	36	87	10	38
David	29	73	37	64
Eva	94	40	30	81
Frank	24	44	40	14
Grace	37	70	50	5
Howard	82	86	46	10
Ivy	24	7	30	30
John	39	32	97	48
Karen	68	29	34	11
Larry	82	5	3	78
Marie	96	83	73	63
Nancy	25	33	37	53
Oscar	2	65	49	73
Peter	9	19	11	67
Queen	44	19	85	23
Robert	75	35	47	77
Susan	71	6	10	82
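
Because the scores are drawn without a seed, a fresh run will not reproduce the table above. A possible variant, sketched here with NumPy's seeded `default_rng` generator, gives the same scores on every run. The seed 42, the shortened student list and the name `df_seeded` are assumptions for illustration, and the resulting numbers will differ from the table above.

```python
import numpy as np
import pandas as pd

subjects = ['math', 'chinese', 'english', 'history']
students = ['Tom', 'Alice', 'Bobby']  # shortened list, for illustration only

# A seeded generator returns the same scores on every run (42 is an arbitrary seed)
rng = np.random.default_rng(42)
score = rng.integers(low=0, high=100, size=(len(students), len(subjects)))

df_seeded = pd.DataFrame(score, columns=subjects, index=students)

# Re-creating the generator with the same seed reproduces the same scores
again = np.random.default_rng(42).integers(low=0, high=100, size=score.shape)
print((score == again).all())  # True
```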

1 pd.where

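where(condition, pd.NA)

Value replacement: with the pandas where method, a value is kept where the condition is True and replaced with other where it is False.

Add a math_pass column: a math score above 60 is '及格' (pass), anything else is '不及格' (fail).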

df1 = df.copy()
# Default every row to '及格' (pass), then overwrite the rows that fail the condition
df1['math_pass'] = '及格'
df1['math_pass'] = df1['math_pass'].where(df1['math'] > 60, '不及格')

df1

	math	chinese	english	history	math_pass
Tom	24	57	60	44	不及格
Alice	92	25	64	26	及格
Bobby	96	61	94	96	及格
Candy	36	87	10	38	不及格
David	29	73	37	64	不及格
Eva	94	40	30	81	及格
Frank	24	44	40	14	不及格
Grace	37	70	50	5	不及格
Howard	82	86	46	10	及格
Ivy	24	7	30	30	不及格
John	39	32	97	48	不及格
Karen	68	29	34	11	及格
Larry	82	5	3	78	及格
Marie	96	83	73	63	及格
Nancy	25	33	37	53	不及格
Oscar	2	65	49	73	不及格
Peter	9	19	11	67	不及格
Queen	44	19	85	23	不及格
Robert	75	35	47	77	及格
Susan	71	6	10	82	及格
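
where also works on a whole DataFrame, and when other is omitted the replaced cells become NaN. The sketch below uses the first three rows of the score table; the names `df_small` and `passed` are introduced only for this example.

```python
import pandas as pd

# First three rows of the score table shown earlier
df_small = pd.DataFrame(
    {'math': [24, 92, 96], 'chinese': [57, 25, 61],
     'english': [60, 64, 94], 'history': [44, 26, 96]},
    index=['Tom', 'Alice', 'Bobby'],
)

# No `other` given: scores that fail the condition become NaN
passed = df_small.where(df_small > 60)

# Count how many subjects each student passed
print(passed.notna().sum(axis=1).tolist())  # [0, 2, 4]
```

Tom's english score of exactly 60 does not satisfy `> 60`, so it is also replaced.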

2 np.where

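where in NumPy behaves differently from the pandas method: it takes both alternatives explicitly and returns a new array.

# x where the condition is True, y where it is False
np.where(condition, x, y)

Add a math_pass2 column: a math score above 60 is '及格' (pass), anything else is '不及格' (fail).

df2 = df.copy()
# '及格' where the condition holds, '不及格' otherwise
df2['math_pass2'] = np.where(df2['math'] > 60, '及格', '不及格')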

df2

	math	chinese	english	history	math_pass2
Tom	24	57	60	44	不及格
Alice	92	25	64	26	及格
Bobby	96	61	94	96	及格
Candy	36	87	10	38	不及格
David	29	73	37	64	不及格
Eva	94	40	30	81	及格
Frank	24	44	40	14	不及格
Grace	37	70	50	5	不及格
Howard	82	86	46	10	及格
Ivy	24	7	30	30	不及格
John	39	32	97	48	不及格
Karen	68	29	34	11	及格
Larry	82	5	3	78	及格
Marie	96	83	73	63	及格
Nancy	25	33	37	53	不及格
Oscar	2	65	49	73	不及格
Peter	9	19	11	67	不及格
Queen	44	19	85	23	不及格
Robert	75	35	47	77	及格
Susan	71	6	10	82	及格
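
Since np.where returns a plain NumPy array, calls can be nested to produce more than two outcomes. The following is a minimal sketch using three math scores from the table; the 85 cut-off and the English labels are assumptions for illustration and do not come from the original example.

```python
import numpy as np
import pandas as pd

math = pd.Series([24, 92, 68], index=['Tom', 'Alice', 'Karen'])

# The inner np.where handles the pass/fail split; the outer one adds a top band.
# The 85 cut-off and the labels are hypothetical.
grade = np.where(math >= 85, 'excellent',
                 np.where(math > 60, 'pass', 'fail'))

print(type(grade).__name__)  # ndarray
print(grade.tolist())        # ['fail', 'excellent', 'pass']

# Wrap the array in a Series to get the index back
grade_s = pd.Series(grade, index=math.index)
```

Assigning the array straight to a DataFrame column, as in the math_pass2 example, works because the array has one entry per row in the same order.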

3 pd.mask

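Value replacement: with the pandas mask method, a value is replaced with other where the condition is True and kept where it is False.

Add a math_pass3 column: a math score above 60 is '及格' (pass), anything else is '不及格' (fail).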

df3 = df.copy()
df3['math_pass3'] = '不及格'
df3['math_pass3'] = df3['math_pass3'].mask(df3['math'] > 60, '及格')

df3

	math	chinese	english	history	math_pass3
Tom	24	57	60	44	不及格
Alice	92	25	64	26	及格
Bobby	96	61	94	96	及格
Candy	36	87	10	38	不及格
David	29	73	37	64	不及格
Eva	94	40	30	81	及格
Frank	24	44	40	14	不及格
Grace	37	70	50	5	不及格
Howard	82	86	46	10	及格
Ivy	24	7	30	30	不及格
John	39	32	97	48	不及格
Karen	68	29	34	11	及格
Larry	82	5	3	78	及格
Marie	96	83	73	63	及格
Nancy	25	33	37	53	不及格
Oscar	2	65	49	73	不及格
Peter	9	19	11	67	不及格
Queen	44	19	85	23	不及格
Robert	75	35	47	77	及格
Susan	71	6	10	82	及格
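
mask is the mirror image of where: masking on a condition gives the same result as calling where on the negated condition. A small sketch with three math scores from the table (the variable names are chosen only for this example):

```python
import pandas as pd

math = pd.Series([24, 92, 68], index=['Tom', 'Alice', 'Karen'])
labels = pd.Series('不及格', index=math.index)  # default every row to fail
cond = math > 60

# Replace where cond is True ...
via_mask = labels.mask(cond, '及格')
# ... which is the same as keeping where cond is False
via_where = labels.where(~cond, '及格')

print(via_mask.equals(via_where))  # True
print(via_mask.tolist())           # ['不及格', '及格', '及格']
```

Which of the two to use is mostly a matter of which default reads more naturally for the column being built.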

From： https://www.cnblogs.com/itelephant/p/17147902.html