An Introduction to Series in pandas

Date: 2023-02-06 19:33:09
Tags: index, series, tutorial, custom, Series, print, fandango, pandas


# coding=utf-8


import pandas as pd
import numpy as np
from pandas import Series
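'''
Series attributes and methods
Attributes:
values   the data values of the Series, returned as a numpy.ndarray
index    the index (row labels) of the Series

Methods:
Series(values, index=labels)  # values and labels must correspond one-to-one; labels may be strings
index.tolist()   return the index labels as a Python list
sort_index()     sort by index labels
sort_values()    sort by values
'''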
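
# A minimal sketch of the attributes and methods listed above. The film labels
# and scores are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Values and index labels must have the same length; labels may be strings
demo = pd.Series([74, 85, 54], index=['Film C', 'Film A', 'Film B'])

demo_values = demo.values          # the data as a numpy.ndarray
demo_labels = demo.index.tolist()  # the index labels as a plain Python list
by_label = demo.sort_index()       # ordered by label: Film A, Film B, Film C
by_value = demo.sort_values()      # ordered by value: 54, 74, 85

print(type(demo_values))
print(demo_labels)
print(by_label)
print(by_value)
```

# Both sort methods return a new Series and leave `demo` unchanged.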


#Series (collection of values)
#DataFrame (collection of Series objects)
#Panel (collection of DataFrame objects; removed in pandas 0.25)

#A Series object can hold many data types, including
#float - for representing float values
#int - for representing integer values
#bool - for representing Boolean values
#datetime64[ns] - for representing date & time, without time-zone
#datetime64[ns, tz] - for representing date & time, with time-zone
#timedelta64[ns] - for representing differences in dates & times (seconds, minutes, etc.)
#category - for representing categorical values
#object - for representing String values

#FILM - film name
#RottenTomatoes - Rotten Tomatoes critics average score
#RottenTomatoes_User - Rotten Tomatoes user average score
#RT_norm - Rotten Tomatoes critics average score (normalized to a 0 to 5 point system)
#RT_user_norm - Rotten Tomatoes user average score (normalized to a 0 to 5 point system)
#Metacritic - Metacritic critics average score
#Metacritic_User - Metacritic user average score


#A Series corresponds to a single row or a single column of a DataFrame

fandango = pd.read_csv('fandango_score_comparison.csv')
series_film = fandango['FILM'] #select the FILM column
#print(type(series_film)) #class 'pandas.core.series.Series'
#print(series_film[0:5])
series_rt = fandango['RottenTomatoes']
#print(series_rt[0:5])

film_names = series_film.values #get the underlying values as a numpy.ndarray
#print(type(film_names))
rt_scores = series_rt.values
series_custom = Series(rt_scores , index=film_names)
series_custom[['Minions (2015)', 'Leviathan (2014)']] #select by label, using film names from the index
fiveten = series_custom.iloc[5:10] #select by position: items 5 through 9
#print(fiveten)

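#reindex() is less about changing the index of a pandas object than about reordering it.
# Any requested label that does not exist in the original index gets NaN as its value.
# reindex() does not modify the original object; assign the result to keep it.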
original_index = series_custom.index.tolist()
#print(original_index)
sorted_index = sorted(original_index)
#print(sorted_index)
sorted_by_index = series_custom.reindex(sorted_index)
#print(sorted_by_index)
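
# The reindex() behaviour described above can be sketched on a small made-up
# Series. 'Film Z' is a hypothetical label that is deliberately missing from the index.

```python
import pandas as pd

demo = pd.Series([74, 85, 54], index=['Film C', 'Film A', 'Film B'])

# 'Film Z' does not exist in demo, so its value is filled with NaN
reindexed = demo.reindex(['Film A', 'Film B', 'Film Z'])

print(reindexed)
print(demo.index.tolist())  # the original Series keeps its own index
```

# Because NaN is a float, the reindexed Series is upcast from int64 to float64.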



sc2 = series_custom.sort_index()
#print(sc2[0:10])
sc3 = series_custom.sort_values()
#print(sc3[0:10])

#The values in a Series object are stored as an ndarray, the core data type in NumPy,
#so NumPy functions can be applied to a Series directly (numpy is already imported as np above)
# Add the Series to itself element-wise
print(np.add(series_custom, series_custom)) #element-wise sum of the values; the index is preserved
# Apply sine function to each value
np.sin(series_custom) #returns a new Series with sin applied to each value
# Return the highest value (will return a single value not a Series)
print(np.max(series_custom)) #maximum value


# Select the values strictly between 50 and 75
#will actually return a Series object with a boolean value for each film
series_custom > 50
series_greater_than_50 = series_custom[series_custom > 50]

criteria_one = series_custom > 50
criteria_two = series_custom < 75
both_criteria = series_custom[criteria_one & criteria_two]
print(both_criteria)
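
# A minimal sketch of combining boolean masks, using made-up scores.
# The names `scores` and `mask` are illustrative only.

```python
import pandas as pd

scores = pd.Series([40, 60, 74, 90],
                   index=['Film A', 'Film B', 'Film C', 'Film D'])

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds more tightly than > and <.
mask = (scores > 50) & (scores < 75)
between_50_75 = scores[mask]

print(mask)
print(between_50_75)
```

# Only the items whose mask value is True are kept, together with their labels.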


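#Data alignment: arithmetic between two Series matches values by index label
#Compute the mean of the critic score and the user score for each film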
rt_critics = Series(fandango['RottenTomatoes'].values, index=fandango['FILM'])
rt_users = Series(fandango['RottenTomatoes_User'].values, index=fandango['FILM'])
rt_mean = (rt_critics + rt_users)/2
print(rt_mean)
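
# In the code above both Series share the same index, so every film lines up.
# The sketch below uses two made-up Series whose indexes only partly overlap,
# to show what alignment does with labels that appear on one side only.

```python
import pandas as pd

critics = pd.Series([74, 85], index=['Film A', 'Film B'])
users = pd.Series([90, 80], index=['Film B', 'Film C'])

# Values are paired by label, not by position. Labels present in only
# one of the two Series produce NaN in the result.
mean_scores = (critics + users) / 2

print(mean_scores)
```

# Only 'Film B' exists in both inputs, so it is the only label with a numeric mean.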
# coding=utf-8

import pandas as pd


#set_index() returns a new DataFrame that is indexed by the values in the specified column.
#By default it also drops that column from the DataFrame;
#passing drop=False keeps the FILM column as a regular column as well.
fandango = pd.read_csv('fandango_score_comparison.csv')
print(type(fandango))
fandango_films = fandango.set_index('FILM', drop=False) #use the FILM column as the index
print(fandango_films.index)


# Slice using either bracket notation or loc[]
fandango_films["Avengers: Age of Ultron (2015)":"Hot Tub Time Machine 2 (2015)"]
fandango_films.loc["Avengers: Age of Ultron (2015)":"Hot Tub Time Machine 2 (2015)"]

# Specific movie
fandango_films.loc['Kumiko, The Treasure Hunter (2015)']

# Selecting list of movies
movies = ['Kumiko, The Treasure Hunter (2015)', 'Do You Believe? (2015)', 'Ant-Man (2015)']
fandango_films.loc[movies]
#When selecting multiple rows, a DataFrame is returned,
#but when selecting an individual row, a Series object is returned instead




#-------------------------
#The apply() method in pandas lets us run our own Python logic on a DataFrame.
#It takes a function and calls it on each column (the default, axis=0)
#or on each row (axis=1), passing each one in as a Series object.
import numpy as np

# returns the data types as a Series
types = fandango_films.dtypes
#print(types)
# filter data types to just floats, index attributes returns just column names
float_columns = types[types.values == 'float64'].index
# use bracket notation to filter columns to just float columns
float_df = fandango_films[float_columns]
#print(float_df)
# `x` is a Series object representing a column
deviations = float_df.apply(lambda x: np.std(x))
print(deviations)



rt_mt_user = float_df[['RT_user_norm', 'Metacritic_user_nom']]
rt_mt_user.apply(lambda x: np.std(x), axis=1) # axis=1: `x` is a row, so this is computed per film across the two columns
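
# A minimal sketch of the difference between axis=0 and axis=1 in apply().
# The DataFrame and its column names are made up, and the value range
# (max minus min) stands in for the standard deviation to keep the arithmetic easy to follow.

```python
import pandas as pd

norm = pd.DataFrame(
    {'critic_norm': [4.0, 3.0, 2.0], 'user_norm': [3.0, 3.0, 4.0]},
    index=['Film A', 'Film B', 'Film C'],
)

# Default axis=0: the function receives one column at a time
col_range = norm.apply(lambda col: col.max() - col.min())

# axis=1: the function receives one row at a time
row_range = norm.apply(lambda row: row.max() - row.min(), axis=1)

print(col_range)
print(row_range)
```

# The column-wise result is indexed by column name, the row-wise result by film.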



From: https://blog.51cto.com/u_15955675/6040308
