
PVE Synology NAS Repair Notes

Posted: 2024-01-16 09:45:04
Tags: Matomo DB 29 var NAS PVE Synology root pveproxy

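title: PVE Synology NAS Repair Notes
tags: [NAS, home broadband, docker, docker-compose, linux, pve]
Updated original: https://query.carlzeng.top:3/appsearch?q=PVE群晖NAS修复笔记
Copyright: All articles on this blog are licensed under BY-NC-SA unless otherwise stated. Please credit the source when reposting!
date: 2023-12-29 08:51:31
categories: NAS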

A mischievous friend broke the NAS. Here is the full repair process.

Summary of the troubleshooting and repair results

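  1. Repairing MariaDB required fixing permissions on the two bind-mounted directories: ownership given to the mysql user, and 777 directory permissions.
  2. The yourls SQL database still could not be repaired. The only option was a full reinstall followed by importing the data with the import/export plugin, which lost several weeks of data.
  3. Deleted all stale log files and large redundant log files on PVE and Debian, and set up new rules for log files.
  4. Learned more about how PVE handles disks and partitions. The next step is to repartition the disk with a more sensible allocation, and I can't wait to start.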
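
Item 1 can also be approached more narrowly than 777. The following is a minimal sketch, not the fix actually applied in these notes: it hands a bind-mounted data directory to a specific UID:GID and limits access to owner and group. The function name, the example path, and the 999:999 IDs are assumptions; check the real IDs of the mysql user inside the image before using it.

```shell
#!/bin/sh
# Hypothetical helper: give a bind-mounted data directory to a given UID:GID
# and restrict it to owner/group access instead of 777.
fix_datadir_perms() {
    dir=$1 uid=$2 gid=$3
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    chown -R "$uid:$gid" "$dir" || return 1
    # u=rwX,g=rX,o= yields 750 on directories and 640 on non-executable files
    chmod -R u=rwX,g=rX,o= "$dir"
}

# Example (assumed path and IDs, adjust to the actual compose volumes):
# fix_datadir_perms /srv/matomo/db 999 999
```

The trade-off is that the UID on the host must match the UID inside the container, which 777 sidesteps at the cost of making the database files world-writable.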

What this is for

Troubleshooting runtime errors of the NAS running under PVE. Console error message (it prevents Linux from booting normally):

sata boot support on this platform is experimental

After shutting down the VM and trying to restart it, PVE reports:

WARN: no efidisk configured! Using temporary efivars disk.
Warning: unable to close filehandle GEN7208 properly: No space left on device at /usr/share/perl5/PVE/Tools.pm line 254.
TASK ERROR: unable to write '/tmp/105-ovmf.fd.tmp.29425' - No space left on device

Current consequences (2023.12.29)

All the Docker containers running on the NAS went down as well:

  Book
  emby
  aria2

Their data, and the data on the NAS itself, are now orphaned.

Actions taken

qm list 
qm stop 105

cd /
du -sh *

102G    var                                                                                                   

The var directory takes up 102G, so keep digging inside it
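
The repeated "du, then cd, then du again" drill-down can be wrapped in a small helper. This is only a sketch; the function name `top_usage` and its defaults are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the N largest entries directly under a directory.
# -x keeps du on one filesystem, -s gives one total per entry,
# -k reports KiB so that a plain numeric sort works.
top_usage() {
    dir=${1:-/} n=${2:-10}
    du -xsk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# Example: top_usage /var 5, then repeat on the largest entry it prints
```

Hidden entries (names starting with a dot) are not matched by the `*` glob, so a directory that looks small here may still hide data in dot-directories.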

100G    lib       
root@lgkdz:/var/lib/vz/images# du -sh *
26G     100
50G     101
656M    102
21G     105

# Sort by size
du -s /usr/share/* | sort -nr

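SSH into Debian and use tasksel to uninstall the desktop environment.

Disk usage is still 99.99%.

It looks like Debian's disk usage may be steadily eating into the 128G SSD: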

4.4G    usr                                                                                                   
18G     var                                                                                                   
5.8G    www 

At the last check (around 2023-12-15?, about 2 weeks ago, when uptime was 24 days versus 39 days now) var was 12G, so it has grown by 6G in 2 weeks.

root@Debian11:/var/lib/docker# du -sh *                                                                       
108K    buildkit                                                                                              
1.3G    containers                                                                                            
4.0K    engine-id                                                                                             
27M     image                                                                                                 
244K    network                                                                                               
15G     overlay2                                                                                              
16K     plugins                                                                                               
4.0K    runtimes                                                                                              
4.0K    swarm                                                                                                 
4.0K    tmp                                                                                                   
900K    volumes

Clean up Docker:

> docker system df                                                      
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE                                                     
Images          21        18        6.589GB   153.2MB (2%)                                                    
Containers      18        17        483.5MB   555.6kB (0%)                                                    
Local Volumes   0         0         0B        0B                                                              
Build Cache     0         0         0B        0B         

> docker system prune -a
y
Total reclaimed space: 392MB

> docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          17        17        6.197GB   7.335MB (0%)
Containers      17        17        482.9MB   0B (0%)
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B
sudo systemctl restart  docker   

Still 99.99% (108.18 GiB of 108.20 GiB)

# State of PVE
root@lgkdz:/# systemctl status pveproxy.service                                                                                  
● pveproxy.service - PVE API Proxy Server                                                                                        
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)                                             
     Active: active (running) since Sun 2023-11-19 19:25:04 CST; 1 month 9 days ago                                              
    Process: 989 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)                                
    Process: 991 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)                                               
    Process: 56738 ExecReload=/usr/bin/pveproxy restart (code=exited, status=0/SUCCESS)                                          
   Main PID: 993 (pveproxy)                                                                                                      
      Tasks: 4                                                                                                                   
     Memory: 162.3M                                                                                                              
        CPU: 2h 15min 3.182s                                                                                                     
     CGroup: /system.slice/pveproxy.service                                                                                      
             ├─  993 pveproxy                                                                                                    
             ├─12296 "pveproxy worker"                                                                                           
             ├─12300 "pveproxy worker"                                                                                           
             └─12302 "pveproxy worker"                                                                                           
                                                                                                                                 
Dec 29 10:17:46 lgkdz pveproxy[993]: worker 12296 started                                                                        
Dec 29 10:17:49 lgkdz pveproxy[12280]: worker exit                                                                               
Dec 29 10:17:49 lgkdz pveproxy[993]: worker 12280 finished                                                                       
Dec 29 10:17:49 lgkdz pveproxy[993]: starting 1 worker(s)                                                                        
Dec 29 10:17:49 lgkdz pveproxy[993]: worker 12300 started                                                                        
Dec 29 10:17:49 lgkdz pveproxy[993]: worker 12295 finished                                                                       
Dec 29 10:17:49 lgkdz pveproxy[993]: starting 1 worker(s)                                                                        
Dec 29 10:17:49 lgkdz pveproxy[993]: worker 12302 started                                                                        
Dec 29 10:17:49 lgkdz pveproxy[12300]: Warning: unable to close filehandle GEN5 properly: No space left on device at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1901.
Dec 29 10:17:49 lgkdz pveproxy[12300]: error writing access log 

Mount it to look at the data on the 500G Kingchuxing disk:

mount /dev/sda5 /mnt/sda5

Clean up and manage Linux logs

rm -rf /var/log/*.gz
rm -rf /var/log/*.1
journalctl --disk-usage       # show how much disk space the journal uses
Archived and active journals take up 2.5G in the file system.   

 
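# Shrink archived journals until they use at most 512M. This is a one-off cleanup; a persistent cap is set with SystemMaxUse= in /etc/systemd/journald.conf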
journalctl --vacuum-size=512M 

Vacuuming done, freed 2.0G of archived journals from /var/log/journal/2afbdd1662c14f99a11ce27fcda8ab85.
Vacuuming done, freed 0B of archived journals from /run/log/journal.

# Delete archived journal entries older than 2 days (one-off cleanup)
journalctl --vacuum-time=2d 
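
Both `--vacuum-*` calls clean up once and do not stop the journal from growing back. A persistent limit could be sketched as a journald drop-in like the one below. `SystemMaxUse=` and `MaxRetentionSec=` are journald.conf options; the function name, the drop-in file name, and the 512M / 2day values (chosen to mirror the commands above) are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: write a journald drop-in that caps journal size and age.
# The target directory is a parameter so the result can be inspected first.
write_journald_limits() {
    conf_dir=${1:-/etc/systemd/journald.conf.d}
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/size-limit.conf" <<'EOF'
[Journal]
SystemMaxUse=512M
MaxRetentionSec=2day
EOF
}

# Apply on the real system (as root):
# write_journald_limits && systemctl restart systemd-journald
```

A drop-in is used here rather than editing /etc/systemd/journald.conf directly so that package upgrades do not overwrite the setting.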

# Disk usage after this round of log cleanup:
98.33% (106.39 GiB of 108.20 GiB)

Try starting the NAS again

WARN: no efidisk configured! Using temporary efivars disk.
TASK WARNINGS: 1

Debian log cleanup and maintenance
> journalctl --disk-usage       # show how much disk space the journal uses
Archived and active journals take up 104.0M in the file system.     


1. Use find to locate files larger than 800M under the root directory

find / -size +800M -exec ls -lh {} \;

>root@lgkdz:/var/log# find / -size +800M -exec ls -lh {} \;                                                                   
-r-------- 1 root root 128T Nov 19 19:24 /proc/kcore                                                                         
find: ‘/proc/3193/task/3251/fd/34’: No such file or directory                                                                
find: ‘/proc/3193/task/3251/fd/35’: No such file or directory                                                                
find: ‘/proc/14759’: No such file or directory                                                                               
find: ‘/proc/14779’: No such file or directory                                                                               
find: ‘/proc/14780’: No such file or directory                                                                               
find: ‘/proc/14781/task/14781/fd/5’: No such file or directory                                                               
find: ‘/proc/14781/task/14781/fdinfo/5’: No such file or directory                                                           
find: ‘/proc/14781/fd/6’: No such file or directory                                                                          
find: ‘/proc/14781/fdinfo/6’: No such file or directory                                                                      
-rw-r--r-- 1 root root 1.3G Oct 14 20:50 /var/lib/vz/dump/vzdump-lxc-101-2023_10_14-20_48_21.tar.zst                         
-rw-r----- 1 root root 51G Dec 25 09:46 /var/lib/vz/images/100/vm-100-disk-0.qcow2                                           
-rw-r----- 1 root root 11G Dec 29 13:34 /var/lib/vz/images/102/vm-102-disk-0.qcow2                                           
-rw-r----- 1 root root 101G Dec 29 13:34 /var/lib/vz/images/105/vm-105-disk-2.qcow2                                          
-rw-r----- 1 root root 50G Dec 29 13:34 /var/lib/vz/images/101/vm-101-disk-0.raw                                             
-rw------- 1 root root 4.6G Nov  4 10:52 /core 
> root@Debian11:~# find / -size +800M -exec ls -lh {} \;
-rw-r----- 1 root root 1.2G Dec 29 13:43 /var/lib/docker/containers/a611cae746aa6c4b1e3bda308a7935180b79e0f684a757919104309891e2c979/a611cae746aa6c4b1e3bda308a7935180b79e0f684a757919104309891e2c979-json.log
-r-------- 1 root root 128T Nov 19 19:26 /proc/kcore 
-r-------- 1 root root 128T Dec 17 09:32 /dev/.lxc/proc/kcore
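
The /proc/kcore entries and the "No such file or directory" noise in the output above come from find descending into /proc, which is a separate pseudo-filesystem. A possible variant, sketched here with a made-up function name, stays on the starting filesystem and only looks at regular files:

```shell
#!/bin/sh
# Hypothetical helper: list regular files above a size threshold without
# leaving the starting filesystem (so /proc, /sys and other mounts are skipped).
find_large_files() {
    root=${1:-/} min=${2:-800M}
    find "$root" -xdev -type f -size "+$min" -exec ls -lh {} + 2>/dev/null
}

# Example: find_large_files / 800M
```

Because of `-xdev`, anything on a separately mounted disk is skipped too, so each mount point of interest needs its own call.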

Inspect the abnormally large log file
cd /var/lib/docker/containers/a611cae746aa6c4b1e3bda308a7935180b79e0f684a757919104309891e2c979

I believe a611cae746aa6c4b1e3bda308a7935180b79e0f684a757919104309891e2c979 is the frp container, so let's try deleting this 1.2G log file!
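
One caveat with deleting a log that a running process still has open: the name disappears, but the space is only released once the process closes the file. Emptying the file in place avoids that. The sketch below assumes the default json-file layout (`<id>-json.log` under /var/lib/docker/containers); the function name and the thresholds are hypothetical, and stopping the container first is the more cautious route.

```shell
#!/bin/sh
# Hypothetical helper: empty every *-json.log above a size threshold in place,
# leaving smaller logs and all other files untouched.
truncate_large_logs() {
    dir=${1:-/var/lib/docker/containers} min=${2:-500M}
    find "$dir" -type f -name '*-json.log' -size "+$min" \
        -exec sh -c 'for f; do : > "$f"; done' sh {} +
}

# Example: truncate_large_logs /var/lib/docker/containers 500M
```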

Removed it directly with rm and saw nothing unusual afterwards. How did such a large log file get generated in the first place??
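
A likely explanation is that Docker's default json-file log driver does not rotate or cap container logs unless told to. A minimal sketch of a daemon-wide limit follows; `log-driver`, `max-size` and `max-file` are real daemon.json settings, while the function name and the 10m / 3 values are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: write a Docker daemon config that rotates json-file logs.
# It refuses to overwrite an existing file so current daemon settings are kept.
write_docker_log_limits() {
    target=${1:-/etc/docker/daemon.json}
    [ -e "$target" ] && { echo "$target exists; merge by hand" >&2; return 1; }
    cat > "$target" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}
EOF
}

# Apply on the real system (as root):
# write_docker_log_limits && systemctl restart docker
```

These defaults only apply to containers created after the daemon restarts, so existing containers such as the frp one would need to be recreated to pick them up.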

4:28PM >

Analyze data usage across the 6 partitions on the two disks
Use df -h to view the filesystems and their space usage

> root@lgkdz:/var/log# df -h                                                                                                        
Filesystem            Size  Used Avail Use% Mounted on                                                                            
udev                  7.7G     0  7.7G   0% /dev                                                                                  
tmpfs                 1.6G  864K  1.6G   1% /run                                                                                  
/dev/mapper/pve-root  109G  107G     0 100% /                                                                                     
tmpfs                 7.8G   43M  7.7G   1% /dev/shm                                                                              
tmpfs                 5.0M     0  5.0M   0% /run/lock                                                                             
/dev/sdb2            1022M  352K 1022M   1% /boot/efi                                                                             
/dev/fuse             128M   16K  128M   1% /etc/pve                                                                              
tmpfs                 1.6G     0  1.6G   0% /run/user/0   

# df -T also shows the filesystem type
> df -T
Filesystem           Type     1K-blocks      Used Available Use% Mounted on
udev                 devtmpfs   8066408         0   8066408   0% /dev
tmpfs                tmpfs      1620188       864   1619324   1% /run
/dev/mapper/pve-root ext4     113455880 111702412         0 100% /
tmpfs                tmpfs      8100928     43680   8057248   1% /dev/shm
tmpfs                tmpfs         5120         0      5120   0% /run/lock
/dev/sdb2            vfat       1046508       352   1046156   1% /boot/efi
/dev/fuse            fuse        131072        16    131056   1% /etc/pve
tmpfs                tmpfs      1620184         0   1620184   0% /run/user/0
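
Since the root filesystem reached 100% unnoticed, a small check that can run from cron might help catch it earlier. This is a sketch under assumptions: both function names and the 90% default threshold are made up, and it relies on the POSIX output format of `df -P`, where column 5 is the usage percentage.

```shell
#!/bin/sh
# Hypothetical helper: print the usage percentage of the filesystem holding a path.
fs_usage_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed (exit 0) when usage is at or above a threshold.
fs_is_nearly_full() {
    [ "$(fs_usage_pct "$1")" -ge "${2:-90}" ]
}

# Example: fs_is_nearly_full / 90 && echo "root filesystem is above 90%"
```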

/dev/mapper/pve-root is the logical volume root in the pve volume group
 
 > root@lgkdz:/var/log# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb3
  VG Name               pve
  PV Size               118.24 GiB / not usable <3.32 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              30269
  Free PE               0
  Allocated PE          30269
  PV UUID               jEzPvE-ELri-mvlq-5Jpi-s96g-a44F-SWWM4N
  
  
  > root@lgkdz:/var/log# vgdisplay
  --- Volume group ---
  VG Name               pve
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  9
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <118.24 GiB
  PE Size               4.00 MiB
  Total PE              30269
  Alloc PE / Size       30269 / <118.24 GiB
  Free  PE / Size       0 / 0   
  VG UUID               aBqMlz-dH1H-PEif-LGT5-khnl-oEXf-sMtWTc
  
  > root@lgkdz:/var/log# lvdisplay
  --- Logical volume ---
  LV Path                /dev/pve/swap
  LV Name                swap
  VG Name                pve
  LV UUID                Wl0zSQ-Rlkj-4TLc-yyuM-Ntg1-1T27-QK3KOg
  LV Write Access        read/write
  LV Creation host, time proxmox, 2023-07-01 20:13:17 +0800
  LV Status              available
  # open                 2
  LV Size                8.00 GiB
  Current LE             2048
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
  --- Logical volume ---
  LV Path                /dev/pve/root
  LV Name                root
  VG Name                pve
  LV UUID                dFYnFo-1PQw-3qUM-sR9V-2eqf-BKn8-yTASwe
  LV Write Access        read/write
  LV Creation host, time proxmox, 2023-07-01 20:13:17 +0800
  LV Status              available
  # open                 1
  LV Size                <110.24 GiB
  Current LE             28221
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
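
The sizes in the three outputs above follow from the extent counts: every physical extent is 4 MiB, and the swap and root LVs together (2048 + 28221 = 30269 extents) use every extent in the volume group, which is why Free PE is 0. Note that "full" here refers to LVM allocation, not to how full the ext4 filesystem on root is. A small worked check, with a made-up function name:

```shell
#!/bin/sh
# Convert a count of LVM physical extents to GiB, given the extent size in MiB.
pe_to_gib() {
    awk -v n="$1" -v pe="${2:-4}" 'BEGIN { printf "%.2f\n", n * pe / 1024 }'
}

pe_to_gib 30269   # whole volume group -> 118.24
pe_to_gib 28221   # root LV            -> 110.24
pe_to_gib 2048    # swap LV            -> 8.00
```

With no free extents, the root LV cannot be extended unless another physical volume is added to the group or an existing LV is shrunk.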

lsblk lists all existing disk partitions (whether mounted or not)

root@lgkdz:/var/log# lsblk                                                                                    
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS                                                            
loop0          7:0    0    50G  0 loop                                                                        
sda            8:0    0 476.9G  0 disk                                                                        
├─sda1         8:1    0     8G  0 part                                                                        
├─sda2         8:2    0     2G  0 part                                                                        
├─sda3         8:3    0     1K  0 part                                                                        
└─sda5         8:5    0 466.7G  0 part                                                                        
sdb            8:16   0 119.2G  0 disk                                                                        
├─sdb1         8:17   0  1007K  0 part                                                                        
├─sdb2         8:18   0     1G  0 part /boot/efi                                                              
└─sdb3         8:19   0 118.2G  0 part                                                                        
  ├─pve-swap 253:0    0     8G  0 lvm  [SWAP]                                                                 
  └─pve-root 253:1    0 110.2G  0 lvm  / 

A partition at 100% usage can be fixed by cleaning up large files or directories, expanding the disk, or adding a new disk.

No way around it: Windows has to go

> scp [email protected]:/var/lib/vz/images/100/vm-100-disk-0.qcow2 .      
[email protected]'s password:                                                                                  
vm-100-disk-0.qcow2                                                         100%   50GB  45.1MB/s   18:55 
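
Before deleting the original 50GB image from the host, it may be worth confirming that the copy arrived intact. A minimal sketch, assuming `sha256sum` is available on both machines; the function name is hypothetical, and the expected digest would come from running `sha256sum` on the source file on the PVE host.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a local file matches an expected SHA-256.
verify_sha256() {
    file=$1 expected=$2
    actual=$(sha256sum < "$file" | awk '{print $1}')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example (digest placeholder is to be filled from the host's own output):
# verify_sha256 vm-100-disk-0.qcow2 "<digest printed on the PVE host>" && echo "copy OK"
```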

Matomo and yourls were shut down uncleanly, which caused errors when the database started:

Matomo-DB  | 2023-12-29 10:46:40+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'               
Matomo-DB  | 2023-12-29 10:46:40+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:11.2.2+maria~ubu2204 started.
Matomo-DB  | 2023-12-29 10:46:40+00:00 [Note] [Entrypoint]: Initializing database files                       
Matomo-DB  | 2023-12-29 10:46:40 0 [Warning] Can't create test file '/var/lib/mysql/6be765c4a185.lower-test' (Errcode: 13 "Permission denied")
Matomo-DB  | /usr/sbin/mariadbd: Can't change dir to '/var/lib/mysql/' (Errcode: 13 "Permission denied")      
Matomo-DB  | 2023-12-29 10:46:40 0 [ERROR] Aborting                                                           
Matomo-DB  |                                                                                                  
Matomo-DB  | Installation of system tables failed!  Examine the logs in                                       
Matomo-DB  | /var/lib/mysql/ for more information.                                                            
Matomo-DB  |                                                                                                  
Matomo-DB  | The problem could be conflicting information in an external                                      
Matomo-DB  | my.cnf files. You can ignore these by doing:                                                     
Matomo-DB  |                                                                                                  
Matomo-DB  |     shell> /usr/bin/mariadb-install-db --defaults-file=~/.my.cnf                                 
Matomo-DB  |                                                                                                  
Matomo-DB  | You can also try to start the mariadbd daemon with:                                              
Matomo-DB  |                                                                                                  
Matomo-DB  |     shell> /usr/sbin/mariadbd --skip-grant-tables --general-log &                                
Matomo-DB  |                                                                                                  
Matomo-DB  | and use the command line tool /usr/bin/mariadb                                                   
Matomo-DB  | to connect to the mysql database and look at the grant tables:                                   
Matomo-DB  |                                                                                                  
Matomo-DB  |     shell> /usr/bin/mariadb -u root mysql                                                        
Matomo-DB  |     MariaDB> show tables;                                                                        
Matomo-DB  |                                                                                                  
Matomo-DB  | Try '/usr/sbin/mariadbd --help' if you...

......

From: https://www.cnblogs.com/backuper/p/17966925
