How to Sync Files with an Offline Storage Using Python

Date: 2022-09-29 15:00:21
Tags: Files, files, log, 04, Python, Storage, file, folder, backup

Guide to making a program for syncing files with offline storage

https://python.plainenglish.io/the-offline-syncing-files-with-python-71d7178de485

Photo by Samsung Memory on Unsplash

It has been a while since I last wrote on Medium. Today I want to walk you through my new Python project, which syncs files to offline storage such as an external hard drive or a shared folder.

The objectives of this program are:

1. The program must have a config file for saving the path of the main folder and the backup folder.
2. The program must copy any files from the main folder to the backup folder. Any modification in the main folder must be updated in the backup folder.
3. The syncing process must run every 5 minutes.
4. The program must log its activity.

Time to Develop

Here are my steps for making this program.

Import the modules

The function for writing the log
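A minimal sketch of this step, mirroring the log function in the full script. The optional path parameter is an assumption added here so the function can be pointed at a different file; the original always writes to log.txt.

```python
import time

LOG = 'log.txt'  # default log file name, as in the full script


def log(message, path=LOG):
    """Append one timestamped line to the log file.

    `path` is a hypothetical extra parameter, not part of the original script.
    """
    now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    with open(path, 'a', encoding='utf-8') as f:
        f.write(f'[{now}]{message}\n')
```

Calling log('Start') appends a line such as [2022-04-29 07:59:52]Start to log.txt.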

The function for comparing two files
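The script compares two files by their MD5 hashes. The sketch below shows one possible variant that reads each file in chunks, so a large file does not have to fit in memory. The helper name file_md5 and the chunk_size default are assumptions for illustration, not part of the original script.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read() until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def compare2file(file1, file2):
    """Return True when both files have the same content hash."""
    return file_md5(file1) == file_md5(file2)
```

MD5 is used here only to detect changes, not for security, so its known cryptographic weaknesses should not matter for this purpose.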

The function for comparing two folders
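Comparing two folders means checking that both hold the same file names and that each pair of files has the same content. The sketch below swaps the script's MD5 comparison for the standard library's filecmp.cmp with shallow=False, which compares file contents directly. The function name folders_match is hypothetical, and the sketch assumes flat folders that contain only regular files, as the original script does.

```python
import filecmp
import os


def folders_match(folder, backup):
    """Return True when both folders hold the same names with the same content."""
    names = set(os.listdir(folder))
    if names != set(os.listdir(backup)):
        return False  # a file is missing or extra on one side
    # shallow=False compares file contents instead of trusting os.stat() data
    return all(
        filecmp.cmp(os.path.join(folder, name),
                    os.path.join(backup, name),
                    shallow=False)
        for name in names
    )
```

Comparing the name sets first catches the case where both folders have the same number of files but under different names.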

The function for checking the config file
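The config file holds two lines, folder:<path> and backup:<path>. The sketch below shows one way to read and write that format. The function names read_config and write_config are hypothetical. Splitting on the first colon only is a deliberate choice, so that a path which itself contains a colon (for example a Windows drive letter) is kept whole.

```python
def write_config(folder, backup, path='config.txt'):
    """Write the two paths in the 'key:value' format used by the script."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write(f'folder:{folder}\nbackup:{backup}\n')


def read_config(path='config.txt'):
    """Return (folder, backup) read from the config file."""
    values = {}
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            if ':' in line:
                # maxsplit=1: only the first ':' separates key and value
                key, value = line.split(':', 1)
                values[key.strip()] = value.strip()
    return values['folder'], values['backup']
```

A missing folder or backup line raises KeyError, which makes a damaged config file fail loudly instead of silently syncing the wrong path.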

The main looping script
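The main loop repeats one sync pass and then sleeps for 5 minutes. A single pass can be sketched as below. The names sync_once and run_forever are hypothetical, shutil.copy2 stands in for the script's cp call, and filecmp.cmp stands in for its MD5 comparison. The sketch assumes flat folders that contain only regular files.

```python
import filecmp
import os
import shutil
import time


def sync_once(folder, backup):
    """Mirror folder into backup once; return (synced, updated, deleted)."""
    source = set(os.listdir(folder))
    target = set(os.listdir(backup))
    synced = updated = deleted = 0

    # files that no longer exist in the main folder are removed from the backup
    for name in target - source:
        os.remove(os.path.join(backup, name))
        deleted += 1

    for name in source:
        src = os.path.join(folder, name)
        dst = os.path.join(backup, name)
        if name in target and filecmp.cmp(src, dst, shallow=False):
            synced += 1              # already identical
        else:
            shutil.copy2(src, dst)   # new or changed file
            updated += 1
    return synced, updated, deleted


def run_forever(folder, backup, interval=300):
    """Run a sync pass every `interval` seconds (300 s = 5 minutes)."""
    while True:
        print(sync_once(folder, backup))
        time.sleep(interval)
```

Keeping the single pass in its own function makes it possible to test the sync logic without waiting for the sleep.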

Run the program

The program will output the prompt below on the first run.

config file: NOT FOUND
put your path in the computer:/home/user/folder
put your flashdisk path:/mnt/my_usb

The program then creates config.txt in the current working directory, which is the directory of sync-files.py if you run the script from there.

If you would rather not type out the script step by step, the full version is below:

Full script

import hashlib
import os
import shutil
import sys
import time

LOG = 'log.txt'

# NOTE: only regular files directly inside the two folders are handled;
# subdirectories are not supported.


# function area
# -------------------------------------------------------------

# log function
def log(message):
    # append one timestamped line to the log file
    now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    with open(LOG, 'a') as f:
        f.write('[' + now + ']' + message + '\n')


def compare2file(file1, file2):
    # compare 2 files by their MD5 hash; return True if the content is identical
    with open(file1, 'rb') as f1, open(file2, 'rb') as f2:
        return hashlib.md5(f1.read()).hexdigest() == hashlib.md5(f2.read()).hexdigest()


def compareHashFolder(folder, backup):
    # return True if both folders hold the same files with the same content
    # return False if any file is missing, extra, or different
    files = os.listdir(folder)
    files_backup = os.listdir(backup)
    if len(files) != len(files_backup):
        return False

    for file in files:
        if file not in files_backup:
            return False
        if not compare2file(os.path.join(folder, file), os.path.join(backup, file)):
            return False
    return True

#------------------------------------------------------------------
log('Start')

if os.path.isfile('config.txt'):
    print("config file: OK")
    log('config file: OK')
    # get variables from config
    with open('config.txt', 'r') as f:
        lines = f.readlines()
        # split on the first ':' only, so paths that contain ':' stay intact
        folder = lines[0].split(':', 1)[1].strip()
        backup = lines[1].split(':', 1)[1].strip()

else:
    log('config file: NOT FOUND')
    print("config file: NOT FOUND")
    # register folder
    folder = input('put your path in the computer:')
    backup = input('put your flashdisk path:')
    # check that the main folder exists
    if not os.path.isdir(folder):
        log('folder: NOT FOUND')
        print('folder does not exist')
        sys.exit(1)
    # check that the backup folder exists
    if not os.path.isdir(backup):
        log('backup: NOT FOUND')
        print('backup does not exist')
        sys.exit(1)
    # write config
    with open('config.txt', 'w') as f:
        f.write('folder:' + folder)
        f.write('\n')
        f.write('backup:' + backup)
    print('config file: CREATED')
    log('config file: CREATED')

# run loop every 5 minutes

while True:
    # make sure both folders are reachable before reading them;
    # os.listdir() would raise if the drive has been unplugged
    if not os.path.isdir(folder):
        print('folder does not exist')
        log('folder does not exist')
        print('please check your config file')
        break

    if not os.path.isdir(backup):
        print('backup does not exist')
        log('backup does not exist')
        print('please check your config file')
        break

    # check if folder is the same as backup
    if compareHashFolder(folder, backup):
        now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
        print(f'[{now}] file is up to date')
        log('file is up to date')
        # sleep for 5 minutes
        time.sleep(300)
        continue

    print('folder: ONLINE')
    log('folder: ONLINE')
    print('backup: ONLINE')
    log('backup: ONLINE')

    # counters for the summary line
    countSync = 0
    updateFile = 0
    deleteFile = 0

    # get all files in folder
    files = os.listdir(folder)
    # get all files in backup
    files_backup = os.listdir(backup)

    # update or delete the files that already exist in backup
    for file in files_backup:
        src = os.path.join(folder, file)
        dst = os.path.join(backup, file)
        if file in files:
            if compare2file(src, dst):
                log(f'{file} is up to date')
                countSync += 1
            else:
                # overwrite the outdated copy in backup
                updateFile += 1
                shutil.copy2(src, dst)
                log(f'{file} is updated')
        else:
            # file no longer exists in folder: delete it from backup
            log(f'{file} is deleted')
            deleteFile += 1
            os.remove(dst)

    # copy the files that exist only in folder
    for file in files:
        if file not in files_backup:
            updateFile += 1
            log(f'{file} is copied')
            shutil.copy2(os.path.join(folder, file), os.path.join(backup, file))

    now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    print(f'[{now}] sync: {countSync}; update: {updateFile}; delete: {deleteFile};')

    # sleep for 5 minutes
    time.sleep(300)

 


After the program has been running for a while, log.txt looks like this:

[2022-04-29 07:59:52]Start
[2022-04-29 07:59:52]config file: OK
[2022-04-29 07:59:52]file is up to date
[2022-04-29 08:04:52]folder: ONLINE
[2022-04-29 08:04:52]backup: ONLINE
[2022-04-29 08:04:52]test is deleted
[2022-04-29 08:04:52]coba2 is updated
[2022-04-29 08:09:52]folder: ONLINE
[2022-04-29 08:09:52]backup: ONLINE
[2022-04-29 08:09:52]coba2.txt is deleted
[2022-04-29 08:09:52]coba2 is copied
[2022-04-29 08:09:52]testing.txt is copied
[2022-04-29 08:14:52]file is up to date
[2022-04-29 08:19:52]file is up to date
[2022-04-29 08:24:52]file is up to date
[2022-04-29 08:29:52]file is up to date
[2022-04-29 08:34:52]file is up to date
[2022-04-29 08:39:53]file is up to date

Conclusion

This program is useful whenever you want to automate syncing files to an offline backup. One important caveat is that the sync is one-directional. The backup folder is updated to match the main folder and never the other way around, so files that exist only in the backup folder are deleted.

How the program works:

demo

Thanks for reading.

From: https://www.cnblogs.com/z-cm/p/16741535.html
