DataFrame
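Split-apply-combine
apply() operates on whole rows or columns: the function receives one row or one column at a time.
applymap() operates on every individual element (deprecated since pandas 2.1 in favor of DataFrame.map).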
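A minimal sketch of the difference, using a made-up two-column DataFrame (the column names a and b are assumptions for illustration). Because applymap() is deprecated in newer pandas, the sketch falls back to it only when DataFrame.map is not available.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 (the default): the function receives each column as a Series
col_sums = df.apply(lambda col: col.sum())

# axis=1: the function receives each row as a Series
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Element-wise: the function receives one cell value at a time
if hasattr(pd.DataFrame, "map"):
    doubled = df.map(lambda x: x * 2)        # pandas 2.1 and later
else:
    doubled = df.applymap(lambda x: x * 2)   # older pandas
```

Here col_sums and row_sums are Series, while doubled is a DataFrame with the same shape as df.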
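For a DataFrame object, the syntax of apply is:
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)
For a Series object, the syntax of apply is:
Series.apply(func, convert_dtype=True, args=(), **kwds)
args: extra positional arguments to pass to the function.
**kwds: extra keyword arguments to pass to the function.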
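As a sketch of how args and **kwds reach the function, assuming a hypothetical helper named scale (not part of pandas or of the original notes):

```python
import pandas as pd

def scale(value, factor, offset=0):
    """Hypothetical helper: multiply by factor, then add offset."""
    return value * factor + offset

s = pd.Series([1, 2, 3])

# args supplies the extra positional arguments that follow the element itself
scaled = s.apply(scale, args=(10,))

# Extra keyword arguments are forwarded to the function through **kwds
shifted = s.apply(scale, args=(10,), offset=5)
```

Each element of s becomes the first argument (value); factor comes from args and offset from the keyword argument.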
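Custom functions
Definition: the first parameter receives a row or column of the DataFrame; the second parameter can be an extra argument supplied through args.
Usage: pass the function to apply() by name, without parentheses; supply the other parameters through args, e.g. args=(2,)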
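A small sketch of this pattern with DataFrame.apply, assuming a hypothetical function weighted_total and made-up score columns:

```python
import pandas as pd

def weighted_total(row, weight):
    """Hypothetical helper: row is a Series holding one row of the DataFrame."""
    return (row["math"] + row["english"]) * weight

scores = pd.DataFrame({"math": [80, 90], "english": [70, 50]})

# Pass the function object itself (weighted_total, not weighted_total()).
# args=(2,) is a one-element tuple, so weight receives 2.
scores["total"] = scores.apply(weighted_total, axis=1, args=(2,))
```

The trailing comma in args=(2,) matters: (2) is just the integer 2, not a tuple.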
"Split-apply-combine" is a good description of the whole process of a group operation.
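The three stages can be sketched with groupby, assuming a made-up table of scores per class (class_id and score are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({
    "class_id": ["c1", "c1", "c2", "c2", "c2"],
    "score": [80, 90, 60, 70, 80],
})

# split:   groupby partitions the rows by class_id
# apply:   the function runs once on each group's "score" Series
# combine: the per-group results are assembled into one Series indexed by class_id
score_range = df.groupby("class_id")["score"].apply(lambda s: s.max() - s.min())
```

The result has one entry per group, holding the spread between the highest and lowest score in that class.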
Code example
import os.path

import pandas as pd


def check_effect_detail(data, check_type=1):
    """Return the keywords that occur in the string form of data.

    check_type == 1 checks the field keywords; any other value checks the reason keywords.
    """
    effect_type_list = []
    effect_fields_type_words = ['A', 'B']
    effect_reason_type_words = ['C', 'D']
    if check_type == 1:
        effect_key_words = effect_fields_type_words
    else:
        effect_key_words = effect_reason_type_words
    for effect_type in effect_key_words:
        if effect_type in str(data):
            effect_type_list.append(effect_type)
    return effect_type_list


def combine_effect_info(data):
    """Merge the three keyword lists of one row into a string of unique keywords."""
    info = data.iloc[0] + data.iloc[1] + data.iloc[2]
    set_data = set(info)
    # sorted() keeps the output order stable; a bare set has no guaranteed order
    out = ' '.join(sorted(set_data))
    return out


if __name__ == "__main__":
    input_dir = r"C:\Desktop"
    input_file_nm = r"effect_info.xlsx"
    out_file_nm = r"effect_judge.xlsx"
    input_file_path = os.path.join(input_dir, input_file_nm)
    out_file_path = os.path.join(input_dir, out_file_nm)
    # Read the Excel file
    df = pd.read_excel(input_file_path)
    # Show the DataFrame contents
    # print(df["班级编号"],df["班级内容"],df["功课内容"],df["功课结论"])

    # Series.apply: find the field keywords in each text column
    df['班级内容_field'] = df["班级内容"].apply(check_effect_detail, args=(1,))
    df['功课内容_field'] = df["功课内容"].apply(check_effect_detail, args=(1,))
    df['结论内容_field'] = df["功课结论"].apply(check_effect_detail, args=(1,))
    # DataFrame.apply with axis=1: merge the three lists row by row
    df['field'] = df[['班级内容_field', '功课内容_field', '结论内容_field']].apply(combine_effect_info, axis=1)

    # Series.apply: find the reason keywords in each text column
    df['班级内容_reason'] = df["班级内容"].apply(check_effect_detail, args=(2,))
    df['功课内容_reason'] = df["功课内容"].apply(check_effect_detail, args=(2,))
    df['结论内容_reason'] = df["功课结论"].apply(check_effect_detail, args=(2,))
    # DataFrame.apply with axis=1: merge the three lists row by row
    df['reason'] = df[['班级内容_reason', '功课内容_reason', '结论内容_reason']].apply(combine_effect_info, axis=1)

    # Keep only the original columns plus the two merged result columns
    data_out_df = df[["班级编号", "班级内容", "功课内容", "功课结论", "field", "reason"]]
    data_out_df.to_excel(out_file_path, index=False)
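To try the same two-step pattern without the Excel file, the following sketch builds a small in-memory DataFrame instead. The sample rows, the helper names find_keywords and merge_lists, and the loop over columns are assumptions for illustration; only the column names and the A/B keywords come from the example above.

```python
import pandas as pd

KEYWORDS = ["A", "B"]  # same field keywords as in the example above

def find_keywords(data, keywords):
    """Return the keywords that occur in the string form of data."""
    return [kw for kw in keywords if kw in str(data)]

def merge_lists(row):
    """Concatenate the lists in one row and join the unique items."""
    merged = [item for cell in row for item in cell]
    return " ".join(sorted(set(merged)))

# Made-up sample rows standing in for effect_info.xlsx
df = pd.DataFrame({
    "班级内容": ["A first", "nothing"],
    "功课内容": ["B here", "A again"],
    "功课结论": ["A and B", None],
})

# Step 1: Series.apply, one keyword list per cell
field_cols = []
for col in ["班级内容", "功课内容", "功课结论"]:
    df[col + "_field"] = df[col].apply(find_keywords, args=(KEYWORDS,))
    field_cols.append(col + "_field")

# Step 2: DataFrame.apply with axis=1, one merged string per row
df["field"] = df[field_cols].apply(merge_lists, axis=1)
```

Passing the keyword list through args, rather than selecting it with a numeric check_type, is one possible variation; the original flag-based version works the same way from the point of view of apply().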
From: https://www.cnblogs.com/ytwang/p/18213058