Re: [Hampshire] Replicating directory tree and filenames

Author: Hugo Mills
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] Replicating directory tree and filenames

gpg: Signature made Mon Aug 10 15:21:53 2009 BST
On Mon, Aug 10, 2009 at 03:00:14PM +0100, Keith Edmunds wrote:
> I want to replicate a huge (multiple TBs) directory tree such that the
> replica has the same files, same GIDs/UIDs as the original, same paths, but
> with all the files 0 bytes. In other words, copy the directory and file
> structure but not the data. It feels as if this should be easy to do, but
> I haven't thought of an easy way yet...


I'm assuming that your terabytes of stuff consists of a large (>1e6
or so) number of smallish files, rather than a few large files.

I can think of a couple of ways of doing this via some bash
scripts, but doing it purely in bash is going to involve invoking at
least one external application per file, and you'll have to swallow a
relatively large overhead for process initialisation each time. So,
for performance reasons, I'd suggest doing it all in something a bit
more capable.

My (sketch) attempt in Python is below. It won't recreate fifos,
sockets, or links (hard or symbolic), and won't cope with symlink
loops at all well, but it should do what you want provided you have a
fairly sane and boring filesystem with mostly just files and
directories. It's untested, so use at your peril.

Hugo.

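#!/usr/bin/python

import os
import os.path
import stat

def process_dir(oldroot, path, newroot):
    # 'path' is relative to both roots; "" means the top of the tree.
    for name in os.listdir(os.path.join(oldroot, path)):
        newpath = os.path.join(path, name)
        srcname = os.path.join(oldroot, newpath)
        destname = os.path.join(newroot, newpath)
        # os.stat() follows symlinks, so a link loop will recurse forever.
        st = os.stat(srcname)
        if stat.S_ISDIR(st.st_mode):
            os.mkdir(destname)
            # Set the owner before the mode: chown can clear setuid/setgid bits.
            os.chown(destname, st.st_uid, st.st_gid)
            os.chmod(destname, stat.S_IMODE(st.st_mode))
            process_dir(oldroot, newpath, newroot)
        elif stat.S_ISREG(st.st_mode):
            # Create an empty file, then copy owner and mode from the source.
            f = open(destname, "w")
            f.close()
            os.chown(destname, st.st_uid, st.st_gid)
            os.chmod(destname, stat.S_IMODE(st.st_mode))
        else:
            print("%s has unhandled mode %o" % (destname, st.st_mode))


# The destination root must already exist; run as root so chown can succeed.
process_dir("/path/to/source", "", "/path_to_destination")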
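
If symlinks do matter, one possible variation is sketched below. It is
not part of the script above: replicate_skeleton and try_chown are
hypothetical names made up for illustration. The assumptions are that
os.walk() is left at its default of not descending into symlinked
directories, that os.lstat() is used so links are seen as links and
recreated with os.symlink(), and that directory ownership and modes are
applied only after the walk so a read-only source directory doesn't
block filling in its copy. Fifos, sockets and device nodes are still
only reported, and hard links come out as separate empty files.

```python
import os
import stat
import sys


def try_chown(chown, dest, st):
    """Attempt to copy uid/gid; report failure instead of aborting."""
    try:
        chown(dest, st.st_uid, st.st_gid)
    except OSError as err:
        # Usually means we are not root and the owner differs from ours.
        sys.stderr.write("cannot chown %s: %s\n" % (dest, err))


def replicate_skeleton(src_root, dest_root):
    """Mirror src_root under dest_root with zero-length regular files."""
    if not os.path.isdir(dest_root):
        os.makedirs(dest_root)
    # Directories are finalised (owner, mode) once the walk has finished.
    dirs = [(dest_root, os.stat(src_root))]
    for dirpath, dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        dest_dir = os.path.normpath(os.path.join(dest_root, rel))
        for name in dirnames + filenames:
            src = os.path.join(dirpath, name)
            dest = os.path.join(dest_dir, name)
            st = os.lstat(src)  # lstat: do not follow symlinks
            if stat.S_ISLNK(st.st_mode):
                # Recreate the link itself, pointing at the same target.
                os.symlink(os.readlink(src), dest)
                try_chown(os.lchown, dest, st)
            elif stat.S_ISDIR(st.st_mode):
                os.mkdir(dest)
                dirs.append((dest, st))
            elif stat.S_ISREG(st.st_mode):
                open(dest, "w").close()  # empty file, no data copied
                try_chown(os.chown, dest, st)
                os.chmod(dest, stat.S_IMODE(st.st_mode))
            else:
                sys.stderr.write("%s has unhandled mode %o\n"
                                 % (dest, st.st_mode))
    # Deepest directories first; owner before mode, as for files.
    for dest, st in reversed(dirs):
        try_chown(os.chown, dest, st)
        os.chmod(dest, stat.S_IMODE(st.st_mode))
```

You'd call it as replicate_skeleton("/path/to/source",
"/path_to_destination"), as root if the ownership is to carry over.
Note that os.walk() silently skips directories it can't read unless you
pass it an onerror callback.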



-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
      --- Eighth Army Push Bottles Up Germans -- WWII newspaper ---      
                     headline (possibly apocryphal)