• Managing large files in Git


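    On every commit, Git stores a complete copy of each file that changed (unlike some other version control systems, which store only per-file deltas), plus references to the files that did not change. Packfiles can delta-compress those copies later, but already-compressed binaries such as images and video rarely shrink much. If the tree contains large binary files, the repo therefore grows quickly, especially when those binaries receive frequent small changes, and cloning it takes longer and longer. Is there a way to mitigate this?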

    One workable approach is git-fat: https://github.com/jedbrown/git-fat

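    How it works: the binary files themselves live on a shared file system, and the Git repo stores only a small piece of metadata for each one (a placeholder holding the file's SHA-1).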
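    To make the idea concrete, here is a minimal sketch of the kind of substitution involved: hash the file's contents, keep the real bytes in a store directory keyed by that hash, and leave a one-line placeholder behind. It imitates the placeholder format visible in the session further down (`#$# git-fat <sha1>`). The `fat_placeholder` function and the store layout are assumptions for illustration, not git-fat's actual code.

```shell
# Hypothetical helper: stash a file in a store directory and print its placeholder.
# Usage: fat_placeholder <file> <store-dir>
fat_placeholder() {
    file=$1
    store=$2
    # SHA-1 of the contents (sha1sum on Linux; fall back to shasum elsewhere).
    if command -v sha1sum >/dev/null 2>&1; then
        digest=$(sha1sum < "$file" | cut -d' ' -f1)
    else
        digest=$(shasum -a 1 < "$file" | cut -d' ' -f1)
    fi
    mkdir -p "$store" || return 1
    cp "$file" "$store/$digest" || return 1   # the real bytes live outside the repo
    printf '#$# git-fat %s\n' "$digest"       # this small line is what Git would store
}

# Demonstration on a throwaway file.
workdir=$(mktemp -d)
printf 'hello' > "$workdir/big.bin"
placeholder=$(fat_placeholder "$workdir/big.bin" "$workdir/store")
printf '%s\n' "$placeholder"
```

    However large the original file is, the placeholder stays a few dozen bytes, which is why the repo itself stays small.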

    1. Install: git-fat is a single Python script. Download it, make it executable, and put it in a directory on your PATH.
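    Installation can be sketched as below. The raw download URL and the `$HOME/bin` target are assumptions, so check the project page for the current location of the script. `install_git_fat` is a hypothetical helper, and the download itself is left commented out so the block has no network side effects.

```shell
# Assumed raw URL of the script; verify it on the project page before use.
GIT_FAT_URL="https://raw.githubusercontent.com/jedbrown/git-fat/master/git-fat"

# Hypothetical helper: copy a downloaded script into a directory and make it executable.
# Usage: install_git_fat <downloaded-script> <bin-dir>
install_git_fat() {
    mkdir -p "$2" || return 1
    cp "$1" "$2/git-fat" || return 1
    chmod 755 "$2/git-fat"
}

# Typical use (uncomment to run):
#   curl -fsSL "$GIT_FAT_URL" -o /tmp/git-fat
#   install_git_fat /tmp/git-fat "$HOME/bin"
#   export PATH="$HOME/bin:$PATH"
# Git runs any executable named git-<name> on PATH as `git <name>`,
# which is what makes `git fat ...` work afterwards.
```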

    2. Usage: create a .gitattributes file that lists which file patterns should be treated as binary (fat) files:

    $ cd path-to-your-repository
    $ cat >> .gitattributes
    *.png filter=fat -crlf
    *.jpg filter=fat -crlf
    *.gz  filter=fat -crlf
    ^D

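    Run git fat init to activate the filter for those extensions. From then on you can git add and git commit .png, .gz, and .jpg files like any other file, while the file contents themselves are stored outside the repo.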
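    To double-check which paths the patterns above will route through the fat filter, Git's own `git check-attr` can be asked directly. The sketch below uses a throwaway repository and made-up file names (`logo.png`, `notes.txt`); it does not need git-fat installed.

```shell
# Throwaway repository with the same patterns as the .gitattributes example.
demo=$(mktemp -d)
git init -q "$demo"
printf '%s\n' \
    '*.png filter=fat -crlf' \
    '*.jpg filter=fat -crlf' \
    '*.gz  filter=fat -crlf' > "$demo/.gitattributes"

# Ask Git which filter applies to each path (the files need not exist).
attrs=$(git -C "$demo" check-attr filter -- logo.png notes.txt)
printf '%s\n' "$attrs"
```

    A path that matches one of the patterns reports `filter: fat`; anything else reports `filter: unspecified` and is stored by Git as usual.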

    If the files are kept on a shared server, create a .gitfat file with the following content:

    [rsync]
    remote = your.remote-host.org:/share/fat-store
    sshuser = yourusername
    options = -avzW
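
    The .gitfat file appears to follow Git's own config-file syntax, so `git config --file` can read it back, which is a convenient way to catch typos before the first push. A small sketch, using a temporary copy of the example above:

```shell
# Temporary copy of the example .gitfat content.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[rsync]
remote = your.remote-host.org:/share/fat-store
sshuser = yourusername
options = -avzW
EOF

# Read individual keys back as section.key.
remote=$(git config --file "$cfg" --get rsync.remote)
sshuser=$(git config --file "$cfg" --get rsync.sshuser)
printf 'remote=%s\nsshuser=%s\n' "$remote" "$sshuser"
```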

    The following session walks through the workflow and the relevant commands, using a store on the local machine:

    $ git init repo
    Initialized empty Git repository in /tmp/repo/.git/
    $ cd repo
    $ git fat init
    $ cat > .gitfat
    [rsync]
    remote = localhost:/tmp/fat-store
    $ mkdir -p /tmp/fat-store               # make sure the remote directory exists
    $ echo '*.gz filter=fat -crlf' > .gitattributes
    $ git add .gitfat .gitattributes
    $ git commit -m'Initial repository'
    [master (root-commit) eb7facb] Initial repository
     2 files changed, 3 insertions(+)
     create mode 100644 .gitattributes
     create mode 100644 .gitfat
    $ curl https://nodeload.github.com/jedbrown/git-fat/tar.gz/master -o master.tar.gz
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  6449  100  6449    0     0   7741      0 --:--:-- --:--:-- --:--:--  9786
    $ git add master.tar.gz
    git-fat filter-clean: caching to /tmp/repo/.git/fat/objects/b3489819f81603b4c04e8ed134b80bace0810324
    $ git commit -m'Added master.tar.gz'
    [master b85a96f] Added master.tar.gz
    git-fat filter-clean: caching to /tmp/repo/.git/fat/objects/b3489819f81603b4c04e8ed134b80bace0810324
     1 file changed, 1 insertion(+)
     create mode 100644 master.tar.gz
    $ git show --pretty=oneline HEAD
    918063043a6156172c2ad66478c6edd5c7df0217 Add master.tar.gz
    diff --git a/master.tar.gz b/master.tar.gz
    new file mode 100644
    index 0000000..12f7d52
    --- /dev/null
    +++ b/master.tar.gz
    @@ -0,0 +1 @@
    +#$# git-fat 1f218834a137f7b185b498924e7a030008aee2ae
    $ git fat push
    Pushing to localhost:/tmp/fat-store
    building file list ...
    1 file to consider
    
    sent 61 bytes  received 12 bytes  48.67 bytes/sec
    total size is 6449  speedup is 88.34

    Once these steps are done, the binary file is safely stored. How do you get it back later?

    $ cd ..
    $ git clone repo repo2
    Cloning into 'repo2'...
    done.
    $ cd repo2
    $ git fat init                          # don't forget: run this right after cloning; otherwise git fat push will not upload files you modify later to the file server
    $ ls -l                                 # file is just a placeholder
    total 4
    -rw-r--r--  1 jed  users  53 Nov 25 22:42 master.tar.gz
    $ cat master.tar.gz                     # holds the SHA1 of the file
    #$# git-fat 1f218834a137f7b185b498924e7a030008aee2ae
    $ git fat pull
    receiving file list ...
    1 file to consider
    1f218834a137f7b185b498924e7a030008aee2ae
            6449 100%    6.15MB/s    0:00:00 (xfer#1, to-check=0/1)
    
    sent 30 bytes  received 6558 bytes  4392.00 bytes/sec
    total size is 6449  speedup is 0.98
    Restoring 1f218834a137f7b185b498924e7a030008aee2ae -> master.tar.gz
    git-fat filter-smudge: restoring from /tmp/repo2/.git/fat/objects/1f218834a137f7b185b498924e7a030008aee2ae
    $ git status
    git-fat filter-clean: caching to /tmp/repo2/.git/fat/objects/1f218834a137f7b185b498924e7a030008aee2ae
    # On branch master
    nothing to commit, working directory clean
    $ ls -l                                 # recovered the full file
    total 8
    -rw-r--r-- 1 jed users 6449 Nov 25 17:10 master.tar.gz
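
    Since a forgotten `git fat init` fails silently, it can help to check a fresh clone before committing. The sketch below assumes that `git fat init` works by registering `filter.fat.clean` and `filter.fat.smudge` in the repository's Git config (the `git-fat filter-clean` and `git-fat filter-smudge` lines in the output above point that way). `fat_filter_configured` is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if both fat filter commands are configured.
# Usage: fat_filter_configured <repo-dir>
fat_filter_configured() {
    git -C "$1" config --get filter.fat.clean >/dev/null &&
    git -C "$1" config --get filter.fat.smudge >/dev/null
}

# Demonstration on a throwaway repository.
repo=$(mktemp -d)
git init -q "$repo"
if fat_filter_configured "$repo"; then before=yes; else before=no; fi

# Assumed equivalent of what `git fat init` sets up.
git -C "$repo" config filter.fat.clean "git-fat filter-clean"
git -C "$repo" config filter.fat.smudge "git-fat filter-smudge"
if fat_filter_configured "$repo"; then after=yes; else after=no; fi

printf 'configured before init: %s, after init: %s\n' "$before" "$after"
```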

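    If you see errors like the following: a Permission denied (13) error may be related to files with mode 600 on the remote side, in which case running rsync with sudo may help; a No such file or directory (2) error suggests that some of the requested files are missing from the store.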

    rsync -zr userA@remoteServer:/var/www/website/ /home/user/Documents/webSiteBackup/website/www/
    rsync: send_files failed to open "/var/www/website/wp-config.php": Permission denied (13)
    rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [generator=3.1.0]
    rsync: link_stat "/home/gitfat/gitfatlibs/da39a3ee5e6b4b0d3255bfef95601890afd80709" failed: No such file or directory (2)                                                  
    0 files to consider                                                                                                                                                        
    rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]   
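
    One detail in the link_stat error is worth checking: `da39a3ee5e6b4b0d3255bfef95601890afd80709` is the SHA-1 of empty input, so the object rsync failed to find corresponds to a zero-byte file. That may mean an empty file went through the filter; treat it as a hint, not a diagnosis. The hash can be confirmed locally:

```shell
# SHA-1 of zero bytes of input (sha1sum on Linux; fall back to shasum elsewhere).
if command -v sha1sum >/dev/null 2>&1; then
    empty_sha1=$(printf '' | sha1sum | cut -d' ' -f1)
else
    empty_sha1=$(printf '' | shasum -a 1 | cut -d' ' -f1)
fi
printf '%s\n' "$empty_sha1"
```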

    How do you keep a log when running rsync?

    rsync -avz --log-file="$HOME/.rsyncd.log" -e ssh /home/adm/ adm@plog01:/home/adm

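    If you see the following error, consider regenerating git-fat by copying the raw script from the official git-fat repository and making it executable (the original suggests mode 777; 755 is enough to run it):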

    [cabox@box-codeanywhere gitfattest]$ git fat init                                                                                                                          
    : No such file or directory                                                                                                                                                
    fatal: 'fat' appears to be a git command, but we were not                                                                                                                  
    able to execute it. Maybe git-fat is broken?  
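
    A bare `: No such file or directory` like this is often what appears when a script's first line ends in a Windows-style carriage return, so the interpreter named there cannot be found; that would also explain why re-copying the raw file helps. This is an assumption about the cause, not something the original confirms. A sketch for detecting and stripping carriage returns, with hypothetical helper names:

```shell
# Succeed if the first line of the file ends with a carriage return (CR).
has_crlf_shebang() {
    cr=$(printf '\r')
    case "$(head -n 1 "$1")" in
        *"$cr") return 0 ;;
        *)      return 1 ;;
    esac
}

# Write a CR-free copy of the script next to the original.
strip_cr() {
    tr -d '\r' < "$1" > "$1.fixed"
}

# Demonstration on a throwaway file with CRLF line endings.
script=$(mktemp)
printf '#!/usr/bin/env python\r\nprint("hi")\r\n' > "$script"
if has_crlf_shebang "$script"; then
    strip_cr "$script"
    echo "CR found; cleaned copy written to $script.fixed"
fi
```

    For a real installation, point the helpers at the installed script (for example the path printed by `command -v git-fat`), then replace the original with the cleaned copy and restore its executable bit.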
  • Original article: https://www.cnblogs.com/jiangzhaowei/p/8185527.html