Tags: dtype, arrays, np, basic operations, array, Numpy, ndarray, axis

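Introduction to NumPy

NumPy (Numerical Python) is an open-source scientific computing library for Python, built for fast processing of arrays of any dimension.

NumPy supports the common array and matrix operations. For the same numerical task, NumPy code is much more concise than plain Python.

NumPy uses the ndarray object to handle multidimensional arrays. It is a fast, flexible container for large datasets.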

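Performance

ndarray supports vectorized operations: one operation is applied to a whole array at once instead of element by element in a Python loop.

ndarray stores its elements in a single contiguous block of memory, which makes bulk operations on array elements faster.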
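As a rough illustration of the vectorization point, the sketch below computes the same sum of squares once with a Python loop and once with a single ndarray expression. The variable names and the array size are made up for the example, and no timing is claimed; actual speedups depend on the machine.

```python
import numpy as np

values = np.arange(1000, dtype=np.int64)

# Pure Python: the interpreter handles one element per loop iteration
loop_total = 0
for v in values.tolist():
    loop_total += v * v

# Vectorized: one expression operates on the whole array
vector_total = int(np.sum(values * values))

# The array's elements live in one contiguous block of memory
is_contiguous = values.flags["C_CONTIGUOUS"]

print(loop_total == vector_total, is_contiguous)  # True True
```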

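Introduction to the data collection

NumPy provides an N-dimensional array type, ndarray, which describes a collection of "items" of the same type.

Code example:

This is a two-dimensional array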

import numpy as np

# Create an ndarray
score = np.array([[80, 89, 86, 67, 79],
                  [78, 97, 89, 67, 81],
                  [90, 94, 78, 67, 74],
                  [91, 91, 90, 67, 69],
                  [76, 87, 75, 67, 86],
                  [70, 79, 84, 67, 84],
                  [94, 92, 93, 67, 64],
                  [86, 85, 83, 67, 80]])

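ndarray attributes

Array attributes describe information inherent to the array itself.

Attribute    Meaning
ndarray.shape    Tuple of the array dimensions
ndarray.ndim    Number of array dimensions
ndarray.size    Number of elements in the array
ndarray.itemsize    Length of one array element, in bytes
ndarray.dtype    Type of the array elements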
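The attributes can be read directly from any array. The sketch below uses a shortened two-row version of the score array and passes dtype=np.int64 explicitly (an assumption made here so that itemsize does not depend on the platform).

```python
import numpy as np

# dtype is given explicitly so that itemsize is the same on every platform
score = np.array([[80, 89, 86, 67, 79],
                  [78, 97, 89, 67, 81]], dtype=np.int64)

print(score.shape)     # (2, 5)
print(score.ndim)      # 2
print(score.size)      # 10
print(score.itemsize)  # 8
print(score.dtype)     # int64
```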

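ndarray types

>>> type(score.dtype)

<class 'numpy.dtype'>

(The exact class name depends on the NumPy version; newer releases print a dtype subclass such as numpy.dtypes.Int64DType.)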

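Name    Description    Code
np.bool_    Boolean (True or False) stored in one byte    '?'
np.int8    Integer, one byte, -128 to 127    'i1'
np.int16    Integer, -32768 to 32767    'i2'
np.int32    Integer, -2 ** 31 to 2 ** 31 - 1    'i4'
np.int64    Integer, -2 ** 63 to 2 ** 63 - 1    'i8'
np.uint8    Unsigned integer, 0 to 255    'u1'
np.uint16    Unsigned integer, 0 to 65535    'u2'
np.uint32    Unsigned integer, 0 to 2 ** 32 - 1    'u4'
np.uint64    Unsigned integer, 0 to 2 ** 64 - 1    'u8'
np.float16    Half-precision float: 16 bits (1 sign bit, 5 exponent bits, 10 mantissa bits)    'f2'
np.float32    Single-precision float: 32 bits (1 sign bit, 8 exponent bits, 23 mantissa bits)    'f4'
np.float64    Double-precision float: 64 bits (1 sign bit, 11 exponent bits, 52 mantissa bits)    'f8'
np.complex64    Complex number, real and imaginary parts stored as two 32-bit floats    'c8'
np.complex128    Complex number, real and imaginary parts stored as two 64-bit floats    'c16'
np.object_    Python object    'O'
np.bytes_    Byte string (named np.string_ before NumPy 2.0)    'S'
np.str_    Unicode string (named np.unicode_ before NumPy 2.0)    'U'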
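One way to check a short code against a full type name is to build a dtype from each and compare them. The sketch below assumes the standard one-character-plus-size codes ('?', 'i1', 'u1', 'f4') and uses np.iinfo to print the range of a 32-bit integer.

```python
import numpy as np

# A short code and the full type name refer to the same dtype
same_bool = np.dtype("?") == np.dtype(np.bool_)
same_int8 = np.dtype("i1") == np.dtype(np.int8)
same_uint8 = np.dtype("u1") == np.dtype(np.uint8)
same_float32 = np.dtype("f4") == np.dtype(np.float32)

# np.iinfo reports the representable range of an integer type
int32_info = np.iinfo(np.int32)
print(int32_info.min, int32_info.max)  # -2147483648 2147483647
```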

Specifying the type when creating an array

>>> a = np.array([[1, 2, 3],[4, 5, 6]], dtype=np.float32)
>>> a.dtype
dtype('float32')

Creating arrays of zeros and ones

Methods
empty(shape[, dtype, order])
empty_like(a[, dtype, order, subok])
eye(N[, M, k, dtype, order])
identity(n[, dtype])
ones(shape[, dtype, order])
ones_like(a[, dtype, order, subok])
zeros(shape[, dtype, order])
zeros_like(a[, dtype, order, subok])
full(shape, fill_value[, dtype, order])
full_like(a, fill_value[, dtype, order, subok])


Example:
>>> zero = np.zeros([3, 4])
>>> zero
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
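A few of the other constructors from the list work the same way. This is a minimal sketch; the shapes and the fill value 7 are arbitrary choices for illustration.

```python
import numpy as np

ones = np.ones((2, 3))             # 2x3 array filled with 1.0
like = np.zeros_like(ones)         # same shape and dtype as `ones`, filled with 0
ident = np.eye(3, dtype=np.int64)  # 3x3 identity matrix
sevens = np.full((2, 2), 7)        # 2x2 array filled with 7

print(ones.shape, like.shape, ident.shape, sevens.shape)
```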

Creating arrays from existing arrays

Methods
array(object[, dtype, copy, order, subok, ndmin])
asarray(a[, dtype, order])
asanyarray(a[, dtype, order])
ascontiguousarray(a[, dtype])
asmatrix(data[, dtype])
copy(a[, order])

Example:

a = np.array([[1,2,3],[4,5,6]])
# Create a new array from the existing one (the data is copied)
a1 = np.array(a)
# Works like a reference: no new array is created, a2 is the same object as a
a2 = np.asarray(a)
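The difference between the two calls shows up once the original array is modified. The sketch below assumes the input is already an ndarray and no dtype is requested, which is the case in which np.asarray returns the input unchanged.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
a1 = np.array(a)    # independent copy of the data
a2 = np.asarray(a)  # same object as `a`, because `a` is already an ndarray

a[0, 0] = 100       # modify the original

print(a1[0, 0])     # 1   (the copy is unaffected)
print(a2[0, 0])     # 100 (a2 reflects the change)
print(a2 is a)      # True
```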

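Creating arrays over a fixed range

Method: np.linspace(start, stop, num, endpoint, retstep, dtype)

start: the first value of the sequence
stop: the last value of the sequence; it is included in the sequence when endpoint is True
num: the number of evenly spaced samples to generate, 50 by default
endpoint: whether the sequence includes stop, True by default
retstep: if True, return the samples together with the step between consecutive values
dtype: the data type of the output ndarray

Example:

# Generate an evenly spaced array
np.linspace(0, 100, 10)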
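The effect of endpoint and retstep is easiest to see on small inputs. The values below are chosen for illustration so that the spacing comes out round.

```python
import numpy as np

# endpoint=True (the default): stop is the last sample
with_stop = np.linspace(0, 100, 11)

# endpoint=False: stop is excluded
without_stop = np.linspace(0, 1, 5, endpoint=False)

# retstep=True: the spacing is returned alongside the samples
samples, step = np.linspace(0, 1, 5, retstep=True)

print(with_stop[-1], without_stop[-1], step)
```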

Other functions:

numpy.arange(start, stop, step, dtype)
numpy.logspace(start, stop, num, endpoint, base, dtype)

Example:

np.arange(10, 50, 2)
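As a small sketch of both functions: np.arange excludes stop, like Python's range, and np.logspace spaces its samples evenly in the exponent (base 10 by default). The logspace arguments are picked for illustration.

```python
import numpy as np

evens = np.arange(10, 50, 2)       # 10, 12, ..., 48 (stop is excluded)
powers = np.logspace(0, 2, num=3)  # 10**0, 10**1, 10**2

print(evens[0], evens[-1], len(evens))  # 10 48 20
print(powers)
```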

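Creating random arrays: the np.random module

Uniform distribution
- np.random.rand(d0, d1, ..., dn)
  
  Returns uniformly distributed values in [0.0, 1.0).
- np.random.uniform(low=0.0, high=1.0, size=None)
  
  Purpose: draw random samples from a uniform distribution over [low, high). The interval is closed on the left and open on the right: it includes low and excludes high.
  
  Parameters:
  
  low: lower bound of the samples, float, default 0;
  
  high: upper bound of the samples, float, default 1;
  
  size: output shape, an int or a tuple; for example, size=(m, n, k) draws m*n*k samples. When omitted, a single value is returned.
  
  Returns: an ndarray whose shape matches the size argument.
- np.random.randint(low, high=None, size=None, dtype=int)
  
  Draws random integers from a uniform distribution and returns a single integer or an N-dimensional integer array. If high is not None, values are drawn from [low, high); otherwise they are drawn from [0, low).

Note: the uniform distribution

The uniform distribution is one of the important distributions in probability and statistics. As the name suggests, "uniform" means that every outcome is equally likely. Uniform distributions are rare in nature; a cultivated plant community with fixed spacing between plants and rows is one example.

Example:

# Generate uniformly distributed random numbers
x1 = np.random.uniform(-1, 1, 100000000)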
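The half-open intervals can be checked on a smaller sample. This sketch fixes the seed and uses much smaller sizes than the example above; both choices are assumptions made to keep the run quick and repeatable.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable

x = np.random.uniform(-1, 1, size=10_000)  # floats in [-1, 1)
r = np.random.rand(2, 3)                   # floats in [0, 1), shape (2, 3)
k = np.random.randint(0, 10, size=(3, 4))  # integers in [0, 10)

print(x.min() >= -1, x.max() < 1)  # True True
print(r.shape, k.shape)            # (2, 3) (3, 4)
```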

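Normal distribution

- np.random.randn(d0, d1, ..., dn)
  
  Purpose: return one or more samples from the standard normal distribution.
- np.random.normal(loc=0.0, scale=1.0, size=None)
  
  loc: float. The mean of the distribution (the centre of the distribution).
  
  scale: float. The standard deviation of the distribution (its width: a larger scale gives a flatter, wider curve; a smaller scale gives a taller, narrower one).
  
  size: int or tuple of ints. The output shape; the default, None, returns a single value.
- np.random.standard_normal(size=None)
  
  Returns an array of the given shape drawn from the standard normal distribution.

Example:

# Generate normally distributed random numbers
x2 = np.random.normal(1.75, 1, 100000000)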
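To see loc and scale at work, the sample mean and standard deviation can be compared with the requested values. The sketch uses a fixed seed and a smaller sample (both assumptions for illustration), so the estimates are close to 1.75 and 1 but not exact.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable

x = np.random.normal(loc=1.75, scale=1.0, size=100_000)
z = np.random.randn(3, 2)                    # standard normal, shape (3, 2)
s = np.random.standard_normal(size=(3, 2))   # same distribution, shape passed as a tuple

sample_mean = x.mean()  # close to loc
sample_std = x.std()    # close to scale
print(round(sample_mean, 2), round(sample_std, 2))
```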

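Indexing and slicing arrays

# Sample data: random changes for 8 stocks over 10 days
stock_change = np.random.normal(0, 1, (8, 10))
# A 2-D array has two dimensions: take row 0, columns 0 to 2
stock_change[0, 0:3]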
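With random data the result of a slice is hard to read, so the sketch below applies the same row-and-column indexing to a small array of known values (the array arr is made up for the example).

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

first_row_part = arr[0, 0:3]  # row 0, columns 0..2 -> [0 1 2]
second_col = arr[:, 1]        # every row, column 1 -> [1 5 9]
block = arr[1:, 2:]           # rows 1..2, columns 2..3 -> [[6 7] [10 11]]

print(first_row_part, second_col, block.shape)
```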

Changing shape

ndarray.reshape(shape[, order]) Returns an array containing the same data with a new shape.

ndarray.T Transpose of the array

Swaps the rows and columns of the array

stock_change.shape
(8, 10)
stock_change.T.shape
(10, 8)

ndarray.resize(new_shape[, refcheck]) Change shape and size of array in-place.
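The practical difference between the three is whether the original array changes. The sketch below uses made-up six-element arrays: reshape and T return a new array object, while resize modifies the array it is called on and returns None.

```python
import numpy as np

a = np.arange(6)           # [0 1 2 3 4 5]

b = a.reshape(2, 3)        # new array object; `a` keeps its own shape
t = b.T                    # transpose: shape (3, 2)

c = np.arange(6)
result = c.resize((3, 2))  # changes `c` itself and returns None

print(a.shape, b.shape, t.shape, c.shape, result)
# (6,) (2, 3) (3, 2) (3, 2) None
```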

Changing type

ndarray.astype(type)

ndarray.tobytes([order]) Construct Python bytes containing the raw data bytes in the array. (ndarray.tostring([order]) is an older, deprecated name for the same method.)
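A short sketch of both calls, using a made-up float array: astype returns a converted copy, and tobytes returns the raw buffer, whose length is the element count times the item size.

```python
import numpy as np

a = np.array([[1.7, 2.2, 3.9]], dtype=np.float64)

as_int = a.astype(np.int32)  # converted copy; fractional parts are truncated
raw = as_int.tobytes()       # raw bytes: 3 elements x 4 bytes each

print(as_int)    # [[1 2 3]]
print(len(raw))  # 12
```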

Removing duplicates from an array

>>> temp = np.array([[1, 2, 3, 4],[3, 4, 5, 6]])
>>> np.unique(temp)
array([1, 2, 3, 4, 5, 6])
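np.unique can also report how often each value occurs through its return_counts parameter. A minimal sketch on the same data:

```python
import numpy as np

temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])

# The input is flattened; values come back sorted, with one count per value
values, counts = np.unique(temp, return_counts=True)

print(values)  # [1 2 3 4 5 6]
print(counts)  # [1 1 2 2 1 1]
```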

Logical operations

>>> stock_change = np.random.normal(0, 1, (8, 10))
>>> stock_change = stock_change[0:5, 0:5]
# Logical test: mark a change greater than 0.5 as True, otherwise False
>>> stock_change > 0.5
array([[ True, False, False,  True, False],
       [ True,  True, False, False, False],
       [ True, False,  True, False,  True],
       [False,  True, False, False, False],
       [False, False, False,  True,  True]])
# Boolean indexing: assign a given value to every element that meets the condition
>>> stock_change[stock_change > 0.5] = 1
>>> stock_change
array([[ 1.        , -0.72404879, -1.33045773,  1.        ,  0.3869043 ],
       [ 1.        ,  1.        ,  0.20815446, -1.67860823,  0.06612823],
       [ 1.        ,  0.42753488,  1.        , -0.24375089,  1.        ],
       [-0.971945  ,  1.        , -0.95444661, -0.2602084 , -0.48736497],
       [-0.32183056, -0.92544956, -0.42126604,  1.        ,  1.        ]])
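Because the session above uses random data, its numbers differ on every run. The sketch below repeats the same steps on a small fixed array (the values are made up) so the result can be followed by hand.

```python
import numpy as np

change = np.array([[0.8, -0.7, 0.3],
                   [0.6, 0.9, -1.2]])

mask = change > 0.5      # boolean array with the same shape as `change`
selected = change[mask]  # 1-D copy of the values where mask is True
change[mask] = 1         # overwrite those positions in place

print(mask.tolist())      # [[True, False, False], [True, True, False]]
print(selected.tolist())  # [0.8, 0.6, 0.9]
print(change.tolist())    # [[1.0, -0.7, 0.3], [1.0, 1.0, -1.2]]
```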

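General-purpose test functions

# Check whether every value in stock_change[0:2, 0:5] is a gain
>>> np.all(stock_change[0:2, 0:5] > 0)
False

# Check whether any of the first five stocks gained during this period
>>> np.any(stock_change[0:5, :] > 0 )
True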
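Both functions also accept an axis argument, which gives one answer per row or per column instead of a single value. A small sketch with made-up values:

```python
import numpy as np

change = np.array([[0.8, -0.7, 0.3],
                   [0.6, 0.9, 1.2]])

all_up = np.all(change > 0)              # False: one value is negative
any_up = np.any(change > 0)              # True: at least one value is positive
row_all_up = np.all(change > 0, axis=1)  # one result per row

print(all_up, any_up, row_all_up.tolist())  # False True [False, True]
```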

np.where (ternary operator)

# For the first four stocks over the first four days: set changes greater than 0 to 1, otherwise 0
temp = stock_change[:4, :4]
np.where(temp > 0, 1, 0)
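The "ternary operator" comparison can be made concrete by writing the same rule with Python's own conditional expression. The sketch uses a small made-up array in place of the random data.

```python
import numpy as np

temp = np.array([[0.8, -0.7],
                 [-0.1, 0.4]])

# np.where picks 1 where the condition holds and 0 elsewhere
flags = np.where(temp > 0, 1, 0)

# The same rule written element by element with Python's ternary expression
manual = [[1 if v > 0 else 0 for v in row] for row in temp.tolist()]

print(flags.tolist())  # [[1, 0], [0, 1]]
```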

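Compound conditions are built with np.logical_and and np.logical_or

# First four stocks, first four days: changes greater than 0.5 and less than 1 become 1, otherwise 0
np.where(np.logical_and(temp > 0.5, temp < 1), 1, 0)
# First four stocks, first four days: changes greater than 0.5 or less than -0.5 become 1, otherwise 0
np.where(np.logical_or(temp > 0.5, temp < -0.5), 1, 0)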
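On fixed values the two conditions give results that are easy to verify. The sketch below also shows the operator forms & and |, which need parentheses around each comparison; the array is made up for the example.

```python
import numpy as np

temp = np.array([0.7, 1.2, -0.8, 0.1])

both = np.where(np.logical_and(temp > 0.5, temp < 1), 1, 0)
either = np.where(np.logical_or(temp > 0.5, temp < -0.5), 1, 0)

# Operator form of the same conditions
both_op = np.where((temp > 0.5) & (temp < 1), 1, 0)
either_op = np.where((temp > 0.5) | (temp < -0.5), 1, 0)

print(both.tolist())    # [1, 0, 0, 0]
print(either.tolist())  # [1, 1, 1, 0]
```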

Statistical operations

np.min(a[, axis, out, keepdims])
Return the minimum of an array or minimum along an axis.

np.max(a[, axis, out, keepdims])
Return the maximum of an array or maximum along an axis.

np.median(a[, axis, out, overwrite_input, keepdims])
Compute the median along the specified axis.

np.mean(a[, axis, dtype, out, keepdims])
Compute the arithmetic mean along the specified axis.

np.std(a[, axis, dtype, out, ddof, keepdims])
Compute the standard deviation along the specified axis.

np.var(a[, axis, dtype, out, ddof, keepdims])
Compute the variance along the specified axis.

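Example:

# Compute some statistics on the 4 days of data for these 4 stocks
# axis=1 computes along each row, giving one value per stock
print("Largest gain of each of the first four stocks over the first four days: {}".format(np.max(temp, axis=1)))
# Use min, std, mean
print("Largest loss of each of the first four stocks over the first four days: {}".format(np.min(temp, axis=1)))
print("Volatility of each of the first four stocks over the first four days: {}".format(np.std(temp, axis=1)))
print("Average change of each of the first four stocks over the first four days: {}".format(np.mean(temp, axis=1)))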
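The role of axis is easier to see on a small array with known values. In this sketch (the data is made up), axis=1 reduces each row to one value, axis=0 reduces each column, and omitting axis reduces the whole array.

```python
import numpy as np

temp = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 9.0]])

row_max = np.max(temp, axis=1)    # one value per row    -> [3. 9.]
col_max = np.max(temp, axis=0)    # one value per column -> [4. 5. 9.]
row_mean = np.mean(temp, axis=1)  # -> [2. 6.]
overall_median = np.median(temp)  # median of all six values -> 3.5
row_std = np.std(temp, axis=1)    # population standard deviation per row

print(row_max, col_max, row_mean, overall_median)
```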

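np.argmax(a[, axis, out])

np.argmin(a[, axis, out])

Example:

# Find the index of the day on which each stock had its largest gain
print("Day of the largest gain for each of the first four stocks: {}".format(np.argmax(temp, axis=1)))
print("Stock with the largest gain on each of the first four days: {}".format(np.argmax(temp, axis=0)))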
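np.argmax and np.argmin return positions, not values. A small sketch with made-up numbers:

```python
import numpy as np

temp = np.array([[0.1, 0.9, 0.3],
                 [0.7, 0.2, 0.5]])

per_row = np.argmax(temp, axis=1)  # column index of the maximum in each row
per_col = np.argmax(temp, axis=0)  # row index of the maximum in each column
flat_min = np.argmin(temp)         # without axis: index into the flattened array

print(per_row.tolist())  # [1, 0]
print(per_col.tolist())  # [1, 0, 1]
print(flat_min)          # 0
```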

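Broadcasting in arithmetic

Broadcasting applies when two ndarrays take part in an element-wise operation. Its purpose is to make arithmetic between ndarrays (the core data structure of the NumPy library) of different shapes convenient.

When operating on two arrays, NumPy compares their shapes (tuples) dimension by dimension, starting from the last one. The two arrays can be combined only if every pair of dimensions satisfies one of the following:

The dimensions are equal
One of the two dimensions is 1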
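The two conditions can be tried on small arrays. In the sketch below (shapes chosen for illustration), (3, 1) and (1, 4) are compatible because each pair of dimensions contains a 1, while (2, 3) and (2,) are not, because the trailing dimensions 3 and 2 differ and neither is 1.

```python
import numpy as np

col = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([[1, 2, 3, 4]])     # shape (1, 4)
grid = col + row                   # broadcast to shape (3, 4)

a = np.ones((2, 3))
b = np.ones(2)
try:
    a + b                          # trailing dimensions 3 and 2 do not match
    compatible = True
except ValueError:
    compatible = False

print(grid.shape, compatible)  # (3, 4) False
```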

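What is a matrix

A matrix differs from an array in that a matrix must be 2-dimensional, whereas an array can have any number of dimensions.

np.mat()
Converts an array to the matrix type (np.mat was removed in NumPy 2.0; np.asmatrix performs the same conversion)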

a = np.array([[80, 86],
              [82, 80],
              [85, 78],
              [90, 90],
              [86, 82],
              [82, 90],
              [78, 80],
              [92, 94]])
b = np.array([[0.3], [0.7]])

# np.asmatrix is the same conversion as np.mat, which NumPy 2.0 removed
np.asmatrix(a)
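A brief sketch of the 2-dimensional rule, using np.asmatrix (the conversion function that is available across NumPy versions) and made-up data: a 1-D input is promoted to a 1x3 matrix, and * between matrix objects means matrix multiplication rather than element-wise multiplication.

```python
import numpy as np

m = np.asmatrix(np.array([[1, 2], [3, 4]]))
v = np.asmatrix(np.array([1, 2, 3]))  # 1-D input becomes a 1x3 matrix

product = m * m                       # matrix product, not element-wise

print(m.ndim, v.shape)   # 2 (1, 3)
print(product.tolist())  # [[7, 10], [15, 22]]
```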

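Matrix multiplication

Two key points of matrix multiplication:

How the shape changes: (M, N) x (N, L) -> (M, L)
The computation rule: each entry of the result is the dot product of a row of the first matrix and a column of the second

Matrix multiplication APIs:

np.matmul
np.dot

>>> a = np.array([[80, 86],
...               [82, 80],
...               [85, 78],
...               [90, 90],
...               [86, 82],
...               [82, 90],
...               [78, 80],
...               [92, 94]])
>>> b = np.array([[0.7], [0.3]])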

>>> np.matmul(a, b)
array([[81.8],
       [81.4],
       [82.9],
       [90. ],
       [84.8],
       [84.4],
       [78.6],
       [92.6]])
>>> np.dot(a,b)
array([[81.8],
       [81.4],
       [82.9],
       [90. ],
       [84.8],
       [84.4],
       [78.6],
       [92.6]])
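For 2-D inputs the two functions agree, as the output above shows. The sketch below repeats the calculation on the first two rows, adds the @ operator form, and shows one difference between the functions: np.dot accepts a scalar operand, while np.matmul rejects it.

```python
import numpy as np

a = np.array([[80, 86], [82, 80]])  # shape (2, 2)
b = np.array([[0.7], [0.3]])        # shape (2, 1)

by_matmul = np.matmul(a, b)         # (2, 2) x (2, 1) -> (2, 1)
by_operator = a @ b                 # @ is the operator form of matmul
by_dot = np.dot(a, b)

scaled = np.dot(a, 2)               # np.dot accepts a scalar operand
try:
    np.matmul(a, 2)                 # np.matmul does not
    matmul_accepts_scalar = True
except (ValueError, TypeError):
    matmul_accepts_scalar = False

print(by_matmul.shape, matmul_accepts_scalar)  # (2, 1) False
```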

Merging and splitting

Merging

numpy.hstack(tup) Stack arrays in sequence horizontally (column wise).
numpy.vstack(tup) Stack arrays in sequence vertically (row wise).
numpy.concatenate((a1, a2, ...), axis=0) Join a sequence of arrays along an existing axis.
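A minimal sketch of the three functions on two made-up 2x2 arrays. For 2-D inputs, concatenate with axis=0 gives the same result as vstack, and axis=1 the same as hstack.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

h = np.hstack((a, b))                # side by side: shape (2, 4)
v = np.vstack((a, b))                # one above the other: shape (4, 2)
c0 = np.concatenate((a, b), axis=0)  # same as vstack for 2-D arrays
c1 = np.concatenate((a, b), axis=1)  # same as hstack for 2-D arrays

print(h.tolist())  # [[1, 2, 5, 6], [3, 4, 7, 8]]
print(v.tolist())  # [[1, 2], [3, 4], [5, 6], [7, 8]]
```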

From： https://www.cnblogs.com/hanzeng1993/p/16587091.html
