
Data Analysis, Presentation, and R Language Study Notes (1)

Date: 2022-10-30 18:34:18


> x1=c(1,2,3,4,5,6,7,8,9)  # c() creates a vector
> x1
[1] 1 2 3 4 5 6 7 8 9
> mode(x1)
[1] "numeric"
> length(x1)
[1] 9
> rbind(x1,x1)  # combine two vectors row-wise into a matrix
   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
x1    1    2    3    4    5    6    7    8    9
x1    1    2    3    4    5    6    7    8    9
> cbind(x1,x1)  # combine two vectors column-wise into a matrix
     x1 x1
[1,]  1  1
[2,]  2  2
[3,]  3  3
[4,]  4  4
[5,]  5  5
[6,]  6  6
[7,]  7  7
[8,]  8  8
[9,]  9  9
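> mean(x1)  # mean
[1] 5
> sum(x1)  # sum
[1] 45
> x2=1:100  # assumed: x2 is never defined in these notes; 1:100 is consistent with the outputs below
> max(x2)  # maximum
[1] 100
> min(x1)  # minimum
[1] 1
> var(x1)  # sample variance
[1] 7.5
> prod(x1)  # product of all elements
[1] 362880
> prod(x2)
[1] 9.332622e+157
> sd(x2)  # standard deviation
[1] 29.01149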
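
As a cross-check on var() and sd(), here is a minimal sketch that recomputes both by hand. It assumes the same x1 as above; the names manual_var and manual_sd are made up for this illustration. The point it shows is that R divides by n - 1 (sample variance), not by n.

```r
# Recompute the sample variance and standard deviation of x1 by hand.
x1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
n <- length(x1)

# Sum of squared deviations from the mean, divided by n - 1 (not n).
manual_var <- sum((x1 - mean(x1))^2) / (n - 1)
manual_sd <- sqrt(manual_var)

print(manual_var)                             # 7.5, the same value var(x1) returned above
print(isTRUE(all.equal(manual_var, var(x1))))
print(isTRUE(all.equal(manual_sd, sd(x1))))
```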

Some syntax

> 1:10
[1] 1 2 3 4 5 6 7 8 9 10
> 1:10-1  # ":" is evaluated before "-", so this is (1:10)-1
[1] 0 1 2 3 4 5 6 7 8 9
> 2:60*2+1
[1] 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
[19] 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75
[37] 77 79 81 83 85 87 89 91 93 95 97 99 101 103 105 107 109 111
[55] 113 115 117 119 121
> 1:10*2
[1] 2 4 6 8 10 12 14 16 18 20
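
The results above depend on operator precedence: ":" binds more tightly than the binary operators +, -, * and /. A small sketch of the difference parentheses make (the names s1, s2, s3 are only for this example):

```r
# ":" is applied before binary arithmetic, so the sequence is built first.
s1 <- 1:10 - 1      # same as (1:10) - 1, i.e. 0, 1, ..., 9
s2 <- 1:(10 - 1)    # parentheses force the subtraction first: 1, 2, ..., 9
s3 <- 2:60 * 2 + 1  # same as ((2:60) * 2) + 1: the odd numbers 5, 7, ..., 121

print(length(s1))   # 10
print(length(s2))   # 9
print(range(s3))    # 5 121
```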
> a=2:60*2+1
> a
[1] 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43
[21] 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83
[41] 85 87 89 91 93 95 97 99 101 103 105 107 109 111 113 115 117 119 121
> a[5]
[1] 13
> a[-5]  # negative index: every element except the 5th
[1] 5 7 9 11 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45
[21] 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85
[41] 87 89 91 93 95 97 99 101 103 105 107 109 111 113 115 117 119 121
> a[1:5]
[1] 5 7 9 11 13
> a[-(1:5)]  # every element except the first five
[1] 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53
[21] 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93
[41] 95 97 99 101 103 105 107 109 111 113 115 117 119 121
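> a[c(2,4,7)]  # elements 2, 4 and 7
[1] 7 11 17
> a[3:8]
[1] 9 11 13 15 17 19
> a[a<20]  # logical indexing: the elements smaller than 20
[1] 5 7 9 11 13 15 17 19
> a[a[3]]  # a[3] is 9, so this is the same as a[9]
[1] 21
> a[9]
[1] 21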
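
The three indexing styles used above (positive, negative, and logical) can be summarized in a short script. This is only a sketch on the same vector a; the result names are invented for the example.

```r
a <- 2:60 * 2 + 1            # odd numbers 5, 7, ..., 121 (59 elements)

drop5  <- a[-5]              # negative index: drop element 5, keep the rest
small  <- a[a < 20]          # logical index: keep elements where the test is TRUE
nested <- a[a[3]]            # a[3] is 9, so this selects a[9]

print(length(drop5))         # 58
print(small)                 # 5 7 9 11 13 15 17 19
print(nested)                # 21
```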
> seq(5,20)  # generate a sequence; the step (by) or the length can also be specified
[1] 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> seq(5,121,by=2)
[1] 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43
[21] 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83
[41] 85 87 89 91 93 95 97 99 101 103 105 107 109 111 113 115 117 119 121
> seq(5,121,length=10)
[1] 5.00000 17.88889 30.77778 43.66667 56.55556 69.44444 82.33333 95.22222
[9] 108.11111 121.00000
> letters[1:30]  # indices beyond 26 return NA
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
[21] "u" "v" "w" "x" "y" "z" NA NA NA NA
> letters  # built-in constant holding the 26 lowercase letters
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
[21] "u" "v" "w" "x" "y" "z"
> a=seq(2,40)
> which.max(a)  # position of the largest element
[1] 39
> a=1:20
> a
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> which(a==2)  # positions where the condition is TRUE
[1] 2
> which(a>5)
[1] 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> rev(a)  # reverse
[1] 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
> sort(a)  # sort in ascending order
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> rev(sort(a))  # sort in descending order
[1] 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
> a1=c(1:12)
> matrix(a1,nrow=4,ncol=3)  # build a matrix; filled column by column by default
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
> matrix(a1,nrow=3,ncol=4)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> matrix(a1,nrow=4,ncol=3,byrow=T)  # fill row by row instead
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
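
The difference between the default column-wise fill and byrow can be checked directly. A minimal sketch, with by_col and by_row as names invented for this example; it also shows that a row-filled 4x3 matrix is the transpose of a column-filled 3x4 matrix.

```r
a1 <- 1:12

by_col <- matrix(a1, nrow = 4, ncol = 3)                # default: fill down columns
by_row <- matrix(a1, nrow = 4, ncol = 3, byrow = TRUE)  # fill across rows

print(by_col[2, 3])  # 10
print(by_row[2, 3])  # 6

# Filling 4x3 by row gives the same entries as transposing a 3x4 column fill.
same <- all(by_row == t(matrix(a1, nrow = 3, ncol = 4)))
print(same)          # TRUE
```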

Matrix transpose, addition, and subtraction

> a=b=matrix(a1,nrow=4,ncol=3,byrow=T)
> a
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
> b
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
> t(a)  # transpose
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> a+b  # addition
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   12
[3,]   14   16   18
[4,]   20   22   24
> a-b  # subtraction
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0
[4,]    0    0    0

Matrix multiplication

> a=matrix(1:12,nrow=3,ncol=4)
> b=matrix(1:12,nrow=4,ncol=3)
> a
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> b
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
> a%*%b  # matrix product: (3x4) times (4x3) gives 3x3
     [,1] [,2] [,3]
[1,]   70  158  246
[2,]   80  184  288
[3,]   90  210  330
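
Note that %*% is the matrix product, while a plain * multiplies element by element and needs operands of the same shape. A short sketch contrasting the two on the same a and b (p, e and first are names chosen for this example):

```r
a <- matrix(1:12, nrow = 3, ncol = 4)
b <- matrix(1:12, nrow = 4, ncol = 3)

p <- a %*% b     # matrix product: 3x4 times 4x3 gives 3x3
e <- a * t(b)    # element-wise product: both operands are 3x4

# Entry [1,1] of the product is row 1 of a times column 1 of b.
first <- sum(a[1, ] * b[, 1])

print(dim(p))    # 3 3
print(dim(e))    # 3 4
print(first)     # 70
```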

Diagonal of a square matrix

> a=matrix(1:16,nrow=4)
> a
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16
> diag(a)  # the result is a vector
[1]  1  6 11 16
> diag(diag(a))  # build a matrix with the given vector on its diagonal
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    6    0    0
[3,]    0    0   11    0
[4,]    0    0    0   16
> diag(4)  # 4x4 identity matrix
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
> diag(seq(1,5))  # build a matrix with the given vector on its diagonal
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    2    0    0    0
[3,]    0    0    3    0    0
[4,]    0    0    0    4    0
[5,]    0    0    0    0    5
> rnorm(16)  # draw 16 random numbers from the standard normal distribution
 [1]  0.79027687  1.14167897  1.27162428 -1.13071815 -1.46295346 -0.33647679
 [7] -0.20166697  0.02592894  0.20498691  1.51331875  1.35167580  1.40470721
[13] -0.16802030 -0.35107031 -0.51437608 -0.09406821
> a=matrix(rnorm(16),4,4)
> a
           [,1]       [,2]       [,3]        [,4]
[1,] -1.4205777  0.3643621 0.82097989  1.03121963
[2,]  0.1486225 -0.7520685 0.68004193 -0.03371108
[3,] -1.4458179 -0.8287518 1.48177576  0.09116119
[4,] -1.3000649 -0.1764955 0.02366358 -0.06364255
> solve(a)  # inverse matrix (this really is not easy to compute; the machine visibly paused for a moment)
            [,1]       [,2]        [,3]       [,4]
[1,] -0.02472174  0.2219122 -0.07808618 -0.6299696
[2,] -0.31118268 -2.6935027  1.43350614 -1.5621103
[3,] -0.27601213 -1.4377140  1.51227778 -1.5445831
[4,]  1.26536035  2.4019986 -1.81803407  0.9137964

> a
           [,1]       [,2]       [,3]        [,4]
[1,] -1.4205777  0.3643621 0.82097989  1.03121963
[2,]  0.1486225 -0.7520685 0.68004193 -0.03371108
[3,] -1.4458179 -0.8287518 1.48177576  0.09116119
[4,] -1.3000649 -0.1764955 0.02366358 -0.06364255
> b=c(1:4)
> b
[1] 1 2 3 4
> solve(a,b)  # solve the linear system a %*% x = b for x
[1] -2.335034 -7.646111 -4.792939  4.270441
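
Both forms of solve() can be verified by multiplying back. The sketch below uses its own random matrix, so the numbers differ from the session above; the seed value and the 1e-8 tolerance are arbitrary choices for this illustration.

```r
set.seed(1)                      # arbitrary seed so the example is reproducible
a <- matrix(rnorm(16), 4, 4)
b <- 1:4

a_inv <- solve(a)                # inverse of a
x <- solve(a, b)                 # solution of a %*% x = b

# Multiplying back should reproduce the identity matrix and b,
# up to floating-point rounding error.
err_inv <- max(abs(a %*% a_inv - diag(4)))
err_sys <- max(abs(as.vector(a %*% x) - b))

print(err_inv < 1e-8)
print(err_sys < 1e-8)
```

Solving with solve(a, b) directly is generally preferable to computing solve(a) %*% b, since it avoids forming the inverse explicitly.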

Eigenvalues and eigenvectors of a matrix (those who prepared for the graduate entrance exam can take a moment to recall this)

> a=diag(4)+1
> a
     [,1] [,2] [,3] [,4]
[1,]    2    1    1    1
[2,]    1    2    1    1
[3,]    1    1    2    1
[4,]    1    1    1    2
> a.e=eigen(a,symmetric=T)
> a.e
$values
[1] 5 1 1 1

$vectors
     [,1]       [,2]       [,3]       [,4]
[1,] -0.5  0.8660254  0.0000000  0.0000000
[2,] -0.5 -0.2886751 -0.5773503 -0.5773503
[3,] -0.5 -0.2886751 -0.2113249  0.7886751
[4,] -0.5 -0.2886751  0.7886751 -0.2113249
> a.e$vectors%*%diag(a.e$values)%*%t(a.e$vectors)  # yes, that formula: vectors times diag(values) times t(vectors) rebuilds a
     [,1] [,2] [,3] [,4]
[1,]    2    1    1    1
[2,]    1    2    1    1
[3,]    1    1    2    1
[4,]    1    1    1    2



The above covers two data types, vectors and matrices; arrays come next.

> x=c(1:6)
> is.vector(x)
[1] TRUE
> is.array(x)
[1] FALSE
> is.matrix(x)
[1] FALSE
> dim(x)=c(2,3)
> is.vector(x)
[1] FALSE
> is.array(x)
[1] TRUE
> is.matrix(x)  # this shows that a matrix is simply a 2-dimensional array
[1] TRUE
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
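
Setting dim to three numbers gives an array that is no longer a matrix. A minimal sketch, assuming a 2x3x4 shape chosen only for illustration:

```r
x <- 1:24
dim(x) <- c(2, 3, 4)      # 2 rows, 3 columns, 4 layers

print(is.array(x))        # TRUE
print(is.matrix(x))       # FALSE: a matrix has exactly two dimensions

# Elements are stored column by column within each layer, layer after layer.
print(x[1, 2, 1])         # 3
print(x[2, 3, 4])         # 24
```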

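Data frame: matrix-like in form, but the columns may be of different types.

Each column is a variable (so single-bracket indexing such as x3[1] selects one column, i.e. the values of one variable), and each row is an observation (a sample).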

> x1=seq(6,10)
> x2=seq(19,23)
> x1
[1] 6 7 8 9 10
> x2
[1] 19 20 21 22 23
> x3=data.frame(x1,x2)  # combine x1 and x2 into a data frame
> x3
  x1 x2
1  6 19
2  7 20
3  8 21
4  9 22
5 10 23
> x3[1]
  x1
1  6
2  7
3  8
4  9
5 10
> x3[2]
  x2
1 19
2 20
3 21
4 22
5 23
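> x=data.frame("重量"=x1,"运费"=x2)  # name the columns: 重量 = weight, 运费 = shipping fee
> x
  重量 运费
1    6   19
2    7   20
3    8   21
4    9   22
5   10   23
> plot(x)  # scatter plot of the values in data frame x; two columns means two variables, so the plot is 2-D with one axis per variable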
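
Columns of different types and the usual ways to pick out columns and rows can be sketched as follows. The data frame df and its column names weight, fee and label are hypothetical stand-ins for this example, not part of the original session.

```r
# A data frame whose columns have different types (hypothetical names).
df <- data.frame(weight = seq(6, 10),
                 fee    = seq(19, 23),
                 label  = c("a", "b", "c", "d", "e"))

one_col_df <- df[1]       # single brackets: still a data frame, with one column
one_col    <- df[[1]]     # double brackets: the column as a plain vector
same_col   <- df$weight   # $ also returns the column as a vector
second_row <- df[2, ]     # row index before the comma: one observation

# Rows can be filtered with a logical condition on a column.
high_fee <- df[df$weight > 8, "fee"]
print(high_fee)           # 22 23
```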



for loop

> a=numeric(59)  # added so the loop is self-contained; the original session reused an existing a
> for(i in 1:59) {a[i]=i*2+3}
> a
[1] 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43
[21] 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83
[41] 85 87 89 91 93 95 97 99 101 103 105 107 109 111 113 115 117 119 121

while loop

> a[1]=5
> i=1
> while(a[i]<121){i=i+1;a[i]=a[i-1]+2}  # keep appending the previous value plus 2 until 121 is reached
> a
[1] 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43
[21] 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83
[41] 85 87 89 91 93 95 97 99 101 103 105 107 109 111 113 115 117 119 121
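
Both loops build the same vector that the earlier sections produced without a loop. A small sketch comparing the three approaches (a_loop, a_vec and a_seq are names chosen for this example); in R the vectorized forms are usually the more idiomatic choice.

```r
# Loop version, with the result vector allocated up front.
a_loop <- numeric(59)
for (i in 1:59) {
  a_loop[i] <- i * 2 + 3
}

a_vec <- (1:59) * 2 + 3          # vectorized arithmetic on the whole sequence
a_seq <- seq(5, 121, by = 2)     # the same values described as a sequence

print(all(a_loop == a_vec))      # TRUE
print(all(a_vec == a_seq))       # TRUE
```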


Generating vectors from various distributions

> x1=round(runif(100,min=80,max=100))  # 100 uniform random numbers between 80 and 100, rounded
> x1
[1] 98 87 85 92 86 90 96 91 90 95 94 93 99 84 93 81 81 94 92 81
[21] 84 82 85 88 81 94 97 99 81 89 85 100 90 82 87 87 83 96 83 92
[41] 94 97 88 94 88 92 82 93 92 81 85 86 87 98 84 97 91 95 86 81
[61] 85 83 81 94 90 81 89 96 92 86 96 89 100 81 97 82 87 96 91 86
[81] 92 97 96 99 86 99 82 89 96 94 86 98 91 99 95 98 100 92 87 85
> x1=round(rnorm(100,mean=80,sd=7))  # 100 normal random numbers with mean 80 and sd 7, rounded
> x1
[1] 99 89 72 92 72 88 68 78 80 84 76 74 74 90 78 86 74 76 80 74 58 91 69 90 72 88 78
[28] 90 79 76 82 77 80 85 77 74 70 89 85 91 67 82 64 85 75 82 82 88 85 84 86 71 69 87
[55] 90 70 87 61 74 76 76 78 79 89 79 73 85 82 78 77 82 75 78 82 84 94 69 92 72 73 93
[82] 76 90 77 76 82 87 82 81 82 75 78 77 88 68 74 78 82 74 90
> x1[which(x1>100)]=100  # cap any value above 100 at 100
> x1
[1] 99 89 72 92 72 88 68 78 80 84 76 74 74 90 78 86 74 76 80 74 58 91 69 90 72 88 78
[28] 90 79 76 82 77 80 85 77 74 70 89 85 91 67 82 64 85 75 82 82 88 85 84 86 71 69 87
[55] 90 70 87 61 74 76 76 78 79 89 79 73 85 82 78 77 82 75 78 82 84 94 69 92 72 73 93
[82] 76 90 77 76 82 87 82 81 82 75 78 77 88 68 74 78 82 74 90
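
Capping values with an index assignment, as above, can also be written with pmin(), and pmax() handles a lower bound the same way. A minimal sketch; the seed, the name scores, and the lower bound of 0 are assumptions made for this example.

```r
set.seed(42)                                   # arbitrary seed for reproducibility
scores <- round(rnorm(100, mean = 80, sd = 7))

# Index-assignment version, as in the notes.
capped_idx <- scores
capped_idx[capped_idx > 100] <- 100

# pmin() takes the element-wise minimum of each score and 100.
capped_pmin <- pmin(scores, 100)

# Clamp to the range [0, 100] by combining pmin() and pmax().
clamped <- pmax(pmin(scores, 100), 0)

print(all(capped_idx == capped_pmin))
print(max(clamped) <= 100)
```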



Generating a data frame and writing it to a file

> a1=round(rnorm(100,mean=90,sd=10))
> a1
[1] 90 89 94 108 72 97 88 99 98 89 74 104 81 84 77 72 83 94 100 95
[21] 100 94 87 90 87 92 81 97 78 98 100 101 90 82 94 92 83 80 103 91
[41] 83 84 90 92 97 99 108 68 72 84 78 93 98 91 85 117 103 71 87 100
[61] 96 88 93 89 90 130 90 94 75 85 87 105 94 75 88 96 104 88 86 102
[81] 109 83 87 95 114 91 88 94 82 90 104 80 83 79 87 95 99 92 78 87
> a2=round(rnorm(100,mean=90,sd=50))
> a2
[1] 94 14 -23 80 119 149 83 105 91 189 192 67 173 34 69 65 207 144 69 115
[21] 145 95 49 103 29 59 103 126 66 137 112 104 84 167 137 78 123 2 63 60
[41] 127 62 69 149 35 52 136 84 23 48 110 68 58 151 59 123 -20 157 161 48
[61] 20 138 118 44 54 197 67 175 180 28 31 23 25 94 61 80 144 58 85 79
[81] 67 117 26 -7 82 138 130 80 40 59 157 126 93 -60 48 123 37 75 18 230
> a3=round(rnorm(100,mean=90,sd=50))
> a3=round(rnorm(100,mean=90,sd=2))
> a3
[1] 89 91 88 89 90 89 91 88 92 87 90 89 92 89 88 92 89 92 88 88 88 90 89 95 91 87 91
[28] 94 90 90 94 93 91 89 91 92 93 88 89 87 91 88 90 91 94 90 89 88 93 92 88 91 88 90
[55] 90 93 92 91 89 89 85 94 90 87 89 88 86 93 94 87 91 88 88 89 90 91 93 90 90 92 88
[82] 89 91 89 91 89 90 93 92 91 90 89 90 89 91 90 86 92 92 93
> a4=round(rnorm(100,mean=90,sd=60))
> a4
[1] 87 13 125 109 179 29 40 27 152 187 68 5 120 105 123 186 148 167 110 115
[21] 114 8 119 64 29 107 120 6 123 206 141 124 96 66 -19 192 33 163 195 156
[41] 167 116 72 69 45 146 98 54 127 102 83 68 43 129 26 83 138 53 92 218
[61] 245 98 132 36 93 46 44 -21 15 87 143 6 112 143 79 145 41 69 99 0
[81] 196 15 96 120 -4 126 104 63 156 62 -58 37 104 136 71 213 59 152 8 102
> a=data.frame(a1,a2,a3,a4)  # build a data frame from the vectors

> write.table(a,file="d:\\mark.txt",col.names=F,row.names=F,sep=" ")  # write the data frame to a file
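
To confirm what write.table() produced, the file can be read back with read.table(). The sketch below writes to a temporary file rather than d:\\mark.txt and keeps the column names so the round trip is easy to check; the three-row data frame is a small stand-in using the first few a1 and a2 values shown above.

```r
# Small stand-in data frame (first three a1 and a2 values from the notes).
df <- data.frame(a1 = c(90, 89, 94), a2 = c(94, 14, -23))

path <- tempfile(fileext = ".txt")   # temporary file instead of a fixed path

# Write with a header line so the column names survive the round trip.
write.table(df, file = path, col.names = TRUE, row.names = FALSE, sep = " ")

back <- read.table(path, header = TRUE, sep = " ")
round_trip_ok <- all(names(back) == names(df)) && all(back == df)
print(round_trip_ok)

unlink(path)                         # remove the temporary file
```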



a is a data frame


> colMeans(a)  # mean of each column
   a1    a2    a3    a4
90.80 88.58 90.10 94.37
> colMeans(a)[c("a1","a2","a3")]  # select columns by name
   a1    a2    a3
90.80 88.58 90.10


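The powerful apply function: the first argument is a data frame (or matrix); the second is the margin, 1 to process rows and 2 to process columns; the third is the function to apply, such as max, min, mean, or sum.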
> apply(x,1,mean)
[1] 12.5 13.5 14.5 15.5 16.5
> apply(x,1,max)
[1] 19 20 21 22 23
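
The margin argument is easiest to see side by side. In this sketch the data frame uses the English column names weight and fee as stand-ins for the Chinese names above, and the last call passes an anonymous function, which apply also accepts.

```r
x <- data.frame(weight = seq(6, 10), fee = seq(19, 23))

row_means <- apply(x, 1, mean)   # margin 1: one result per row
col_means <- apply(x, 2, mean)   # margin 2: one result per column

# Any function of a vector works, including one defined inline.
col_range <- apply(x, 2, function(v) max(v) - min(v))

print(as.vector(row_means))      # 12.5 13.5 14.5 15.5 16.5
print(col_means)                 # weight 8, fee 21
print(col_range)                 # weight 4, fee 4
```

For plain row and column means, rowMeans(x) and colMeans(x) give the same numbers and are the more direct choice.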



From: https://blog.51cto.com/xichenguan/5807701
