Pandas Memo

Posted: 2024-08-22 20:18:54

Tags: df, column, DataFrame, memo, dataframe, print, my, Pandas

The DataFrame is the central data structure in the pandas API. It is like a spreadsheet, with numbered rows and named columns.

To make the examples below easy to run, import the required modules first.

import numpy as np
import pandas as pd

The following code instantiates the pd.DataFrame class to create a DataFrame.

# Create and populate a 5x2 NumPy array.
my_data = np.array([[0, 3], [10, 7], [20, 9], [30, 14], [40, 15]])

# Create a Python list that holds the names of the two columns.
my_column_names = ['temperature', 'activity']

# Create a DataFrame.
my_dataframe = pd.DataFrame(data=my_data, columns=my_column_names)

# Print the entire DataFrame.
print(my_dataframe)

You may add a new column to an existing pandas DataFrame just by assigning values to a new column name.

# Create a new column named adjusted.
my_dataframe["adjusted"] = my_dataframe["activity"] + 2

# Print the entire DataFrame.
print(my_dataframe)

Pandas provides multiple ways to isolate specific rows, columns, slices, or cells in a DataFrame.

print("Rows #0, #1, and #2:")
print(my_dataframe.head(3), '\n')

print("Row #2 as a DataFrame:")
print(my_dataframe.iloc[[2]], '\n')  # A list of positions returns a DataFrame.

print("Row #2 as a Series:")
print(my_dataframe.iloc[2], '\n')  # A single position returns a Series.

print("Rows #1, #2, and #3:")
print(my_dataframe[1:4], '\n')  # The slice starts at position 1, i.e. the second row, not the first.

print("Column 'temperature':")
print(my_dataframe['temperature'])
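
The selections above can also be written with the explicit .iloc (position-based) and .loc (label-based) indexers, or with a boolean mask. The following is a minimal sketch that rebuilds the same data; the variable names by_position, by_label, and busy_rows are illustrative only and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same data as the example above.
my_dataframe = pd.DataFrame(
    data=np.array([[0, 3], [10, 7], [20, 9], [30, 14], [40, 15]]),
    columns=["temperature", "activity"],
)

# Position-based: rows 1..3 (end exclusive), first column only.
by_position = my_dataframe.iloc[1:4, 0]

# Label-based: rows labeled 1..3 (end INCLUSIVE), one named column.
by_label = my_dataframe.loc[1:3, "temperature"]

# Boolean mask: keep only the rows where activity exceeds 8.
busy_rows = my_dataframe[my_dataframe["activity"] > 8]

print(by_position.tolist())  # [10, 20, 30]
print(by_label.tolist())     # [10, 20, 30]
print(busy_rows)
```

Note that a .loc slice includes its end label while an .iloc slice excludes its end position, which is why 1:3 and 1:4 select the same rows here.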

Q: What's the difference between Series and DataFrame?

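A: A Series is a one-dimensional labeled array, while a DataFrame is a two-dimensional table. Each column of a DataFrame is a Series. A single row selected with .iloc[2] is also returned as a Series, indexed by the column names, which is probably why Google Gemini describes a Series as a row.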
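
A quick way to check this is to look at the types pandas returns. The sketch below uses a small made-up DataFrame (its values are not from the original post) and only standard pandas indexing.

```python
import pandas as pd

df = pd.DataFrame({"temperature": [0, 10, 20], "activity": [3, 7, 9]})

column = df["temperature"]   # one column
row = df.iloc[1]             # one row
table = df[["temperature"]]  # a list of column names still gives a DataFrame

print(type(column).__name__)  # Series
print(type(row).__name__)     # Series
print(type(table).__name__)   # DataFrame

# A column Series is indexed by the row labels;
# a row Series is indexed by the column names.
print(list(column.index))  # [0, 1, 2]
print(list(row.index))     # ['temperature', 'activity']
```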

How to index a particular cell of the DataFrame?

# Create a Python list that holds the names of the four columns.
my_column_names = ['Eleanor', 'Chidi', 'Tahani', 'Jason']

# Create a 3x4 NumPy array, each cell populated with a random integer from 0 to 100.
my_data = np.random.randint(low=0, high=101, size=(3, 4))

# Create a DataFrame.
df = pd.DataFrame(data=my_data, columns=my_column_names)

# Print the entire DataFrame.
print(df)

# Print the value in row #1 of the Eleanor column.
print("\nSecond row of the Eleanor column: %d\n" % df['Eleanor'][1])  # Chained indexing
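
Besides chained indexing, pandas offers dedicated accessors for reading a single cell. The following sketch compares them on a small DataFrame with fixed, made-up values (used here instead of random ones so the printed result is predictable).

```python
import pandas as pd

# Fixed, hypothetical values so the result is reproducible.
df = pd.DataFrame({"Eleanor": [11, 22, 33], "Chidi": [44, 55, 66]})

chained = df["Eleanor"][1]      # select the column, then index the Series
by_label = df.at[1, "Eleanor"]  # row label, column label
by_position = df.iat[1, 0]      # row position, column position
via_loc = df.loc[1, "Eleanor"]  # label-based; also accepts slices and lists

print(chained, by_label, by_position, via_loc)  # 22 22 22 22
```

For reading, all four return the same value; .at and .iat are limited to a single cell, while .loc and .iloc also handle ranges.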

The following code shows how to add a new column to an existing DataFrame by combining other columns row by row:

# Create a column named Janet whose contents are the sum
# of two other columns.
df['Janet'] = df['Tahani'] + df['Jason']

# Print the enhanced DataFrame.
print(df)

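There are two different ways to "duplicate" a DataFrame:

Referencing: assign the DataFrame to another variable. The two names stay linked to the same object, so a change made through one is visible through the other.
Copying: call DataFrame.copy(). The copy is independent, so a change to one does not affect the other.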

# Create a reference by assigning df to a new variable.
print("Experiment with a reference:")
reference_to_df = df

# Print the starting value of a particular cell.
print("  Starting value of df: %d" % df['Jason'][1])
print("  Starting value of reference_to_df: %d\n" % reference_to_df['Jason'][1])

# Modify a cell in df with .at (chained indexing is avoided for assignment).
df.at[1, 'Jason'] = df['Jason'][1] + 5
print("  Updated df: %d" % df['Jason'][1])
print("  Updated reference_to_df: %d\n\n" % reference_to_df['Jason'][1])

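.iloc, .at, and chained indexing differ in important ways. .iloc selects by integer position, .at reads or writes a single cell by row and column label, and chained indexing such as df['Jason'][1] performs two separate indexing operations. For reading a value, chained indexing and .at produce the same output. For assignment, chained indexing is not reliable: the first indexing step may return a temporary copy, in which case the write never reaches the original DataFrame, and pandas emits a warning about it.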
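
The assignment pitfall can be reproduced with a chained write on a filtered DataFrame. This is a minimal sketch on made-up values, not the original author's experiment; the warning pandas emits is silenced only to keep the output short.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"Tahani": [1, 2, 3], "Jason": [10, 60, 90]})

# Chained assignment on a filtered frame: the first [] builds a new
# object, so the write lands on a temporary and df is left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas warns here; silenced for brevity
    df[df["Jason"] > 50]["Jason"] = 0
print(df["Jason"].tolist())  # [10, 60, 90]

# A single indexing call writes into df itself.
df.loc[df["Jason"] > 50, "Jason"] = 0      # many cells, selected by a mask
df.at[0, "Jason"] = df.at[0, "Jason"] + 5  # one cell, selected by label
print(df["Jason"].tolist())  # [15, 0, 0]
```

Using one indexer call per assignment (.at for a single cell, .loc for a selection) avoids having to reason about whether an intermediate result was a view or a copy.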

The following code starts the same experiment with a copy (to be finished):

copy_of_my_dataframe = my_dataframe.copy()
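
One possible way to carry out the copy experiment is sketched below. It uses a small made-up DataFrame rather than the random one above, and the name copy_of_df is an assumption introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Tahani": [1, 2, 3], "Jason": [10, 20, 30]})

reference_to_df = df    # a second name for the same object
copy_of_df = df.copy()  # an independent object (deep copy by default)

# Modify one cell through the original name.
df.at[1, "Jason"] = df.at[1, "Jason"] + 5

print(reference_to_df is df)           # True
print(reference_to_df.at[1, "Jason"])  # 25, the reference sees the change
print(copy_of_df.at[1, "Jason"])       # 20, the copy does not
```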

From: https://www.cnblogs.com/ArmRoundMan/p/18360508
