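My problem: I have a batch of data whose values begin with some text string, and I need to convert them to numbers.
Originally I used a brute-force approach: build a dictionary first, then pass it to map as the mapping rule.
But the strings are so long that the dictionary is error-prone, and the mapping produced many NaN values, so I switched to a new method.
The new method uses a function as the mapping rule. The function only needs to check how each string begins, which improved accuracy a lot.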
# These columns appear to hold the occupations of the students' parents. Consider whether they
# need to be converted to numbers at all, or whether the broad parental-occupation category
# that comes later would be enough.
# Working through this data is itself a learning process.
def fx_asbg20(x):
    # Match on the start of the string only, so the long "Includes ..." tail does not matter.
    if x.startswith('Not applicable'):
        return 0
    elif x.startswith('Has never worked'):
        return 1
    elif x.startswith('Small Business Owner'):
        return 2
    elif x.startswith('Clerical Worker Includes'):
        return 3
    elif x.startswith('Service or Sales Worker'):
        return 4
    elif x.startswith('Skilled Agricultural'):
        return 5
    elif x.startswith('Craft or Trade Worker'):
        return 6
    elif x.startswith('Plant or Machine Operator'):
        return 7
    elif x.startswith('General Laborers'):
        return 8
    elif x.startswith('Corporate Manager'):
        return 9
    elif x.startswith('Professional Includes scientists'):
        return 10
    elif x.startswith('Technician or Associate Professional'):
        return 11
    # A value that matches no prefix falls through and returns None (a missing value after map).

#dfASH_avg['ASBH20A']=dfASH_avg['ASBH20A'].map(mapASBH20)
dfASH_avg['ASBH20A']=dfASH_avg['ASBH20A'].map(fx_asbg20)
#dfASH_avg['ASBH20B']=dfASH_avg['ASBH20B'].map(mapASBH20)
dfASH_avg['ASBH20B']=dfASH_avg['ASBH20B'].map(fx_asbg20)
dfASH_avg.head()
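One possible refinement of the function above, as a rough sketch rather than the code I actually ran: keep the prefixes in a table and loop over it. The names PREFIX_CODES and code_by_prefix, and the default of -1 for unmatched values, are my own assumptions for illustration. This version also tolerates non-string cells (such as NaN) and stray leading spaces, which would otherwise raise an error or fall through silently.

```python
# Prefix -> code table, in the same order as the elif chain above.
PREFIX_CODES = [
    ('Not applicable', 0),
    ('Has never worked', 1),
    ('Small Business Owner', 2),
    ('Clerical Worker Includes', 3),
    ('Service or Sales Worker', 4),
    ('Skilled Agricultural', 5),
    ('Craft or Trade Worker', 6),
    ('Plant or Machine Operator', 7),
    ('General Laborers', 8),
    ('Corporate Manager', 9),
    ('Professional Includes scientists', 10),
    ('Technician or Associate Professional', 11),
]


def code_by_prefix(value, default=-1):
    """Return the code of the first matching prefix, or `default` if none matches."""
    if not isinstance(value, str):
        # NaN, None, and numbers have no startswith method; treat them as unmatched.
        return default
    text = value.strip()  # ignore stray leading/trailing whitespace
    for prefix, code in PREFIX_CODES:
        if text.startswith(prefix):
            return code
    return default


# Small made-up sample, not taken from the real data set.
sample = [
    'Not applicable',
    ' Service or Sales Worker Includes travel attendants',
    float('nan'),
    'Something unexpected',
]
codes = [code_by_prefix(v) for v in sample]
print(codes)
```

It should be usable the same way as before, for example dfASH_avg['ASBH20A'].map(code_by_prefix), and counting the -1 values afterwards shows how many cells matched nothing.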
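Here is the complicated dictionary I wrote originally. It took a lot of time and effort and still worked poorly, so I am keeping it here as a memento.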
mapASBH20={'Has never worked for pay':1,
           'Small Business Owner Includes owners of small businesses (fewer than 25 employees) such as retail shops, services, resta':2,
           'Clerical Worker Includes office clerks; secretaries; typists; data entry operators;customer service cler':3,
           ' Service or Sales Worker Includes travel attendants; restaurant service workers;personal care workers; protective service workers; junior military; salespersons; street vend':4,
           'Skilled Agricultural or Fishery Worker Includes farmers; forestry workers; fishery workers; hunters and trapper':5,
           'Craft or Trade Worker Includes builders, carpenters, plumbers, electricians,metal workers; machine mechanics; handicraft workers':6,
           ' Plant or Machine Operator Includes plant and machine operators;assembly-line operators; motor-vehicle drivers':7,
           'General Laborers Includes domestic helpers and cleaners; building caretakers;messengers, porters, and doorkeepers; farm, fishery,agricultural, and construction workers':8,
           'Corporate Manager or Senior Official Includes corporate managers such as managers of large companies (25 or more employees) or managers of departments within large companies; legislators or senior government officials; senior officials of special-interest organizations; military officers':9,
           'Professional Includes scientists; mathematicians; computer scientists;architects; engineers; life science and health professionals;teachers; legal professionals; police officers; social scientists;writers and artists; religious professionals':10,
           'Technician or Associate Professional Includes science, engineering, and computer associates and technicians; life science and health technicians and assistants;teacher aides; finance and sales associate professionals;business service agents; administrative assistants':11,
           'Not applicable':0
          }
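A dictionary lookup needs an exact key match, so a single stray space or a truncated label is enough to miss, and with map a miss becomes NaN. The sketch below is a made-up miniature of that situation (small_map, observed, and unmatched_values are hypothetical names, not from my notebook) showing one way to list the values that have no exact key.

```python
def unmatched_values(values, mapping):
    """Return the distinct string values that have no exact key in `mapping`."""
    return sorted({v for v in values if isinstance(v, str) and v not in mapping})


# Hypothetical miniature: the second key carries a stray leading space.
small_map = {'Not applicable': 0, ' Service or Sales Worker': 4}
observed = ['Not applicable', 'Service or Sales Worker', 'Not applicable']

missing = unmatched_values(observed, small_map)
print(missing)  # the values a dictionary-based map would fail to translate
```

With pandas, a similar check could probably be written as dfASH_avg['ASBH20A'][~dfASH_avg['ASBH20A'].isin(list(mapASBH20))].unique(), run before the column is overwritten.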
From: https://www.cnblogs.com/bojiandkake/p/17120387.html