07 Jul 2008
Linux Filesystem Backups

Backup is an essential responsibility that comes with owning a computer, but it is more honored in the breach than in the observance.

Echoing what I said in MySQL Backups, some situations may require more elaborate techniques, but these scripts are “good enough” for my needs. I hope you find these scripts useful, and welcome comments, critiques, or suggestions for alternative methods. Note: these are just examples. If you use them, you do so at your own risk. You’ll want to adjust them for your own situation.

I run these scripts ad hoc (via the at command), rather than via cron, for a couple of reasons. First, they run on my laptop, which may be down, or the backup media may not be available. Second, my personal schedule is very irregular and I prefer to back up when the system is quiescent. I’ve found the most convenient time for incremental backups is while I’m getting ready for work. I do full backups some time over the weekend.
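As a rough sketch of the ad hoc approach, a run can be queued by hand once the backup media is confirmed to be mounted. The helper names media_ready and queue_backup are hypothetical and are not part of the scripts listed later; the sketch also assumes the at daemon is running.

```shell
#!/bin/bash
# Sketch only: media_ready and queue_backup are hypothetical helper names.

# Succeed only if the backup destination exists and is writable.
media_ready() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Queue a script with at(1) if the destination is usable.
# $1 = script to run, $2 = backup destination, $3 = at(1) time spec
queue_backup() {
    local script="$1" dest="$2" when="${3:-now + 1 minute}"
    if ! media_ready "$dest"; then
        echo "backup media $dest is not available" >&2
        return 1
    fi
    # $when is deliberately unquoted so "now + 1 minute" splits into words
    echo "$script" | at $when
}
```

For example, queue_backup "$HOME/bin/incr-backup" /media/MX200702082108 "now + 5 minutes" would queue an incremental run five minutes out, or refuse if the media is not there. at normally mails the job's output to the user who queued it.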

My schedule, though not rigidly adhered to, is to run incrementals every morning. Because the name pattern includes the day of the week, I generally have seven copies available, although not necessarily the most recent seven days. Currently, they contain any files modified within 30 days.
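The naming pattern can be illustrated with a small function. This is only a sketch: archive_name is a hypothetical helper, and the pattern is inferred from the file names that appear in the sample log.

```shell
#!/bin/bash
# Sketch only: archive_name is a hypothetical helper, not part of the original scripts.

# Build an archive name from host, backup type, day of week, and volume number.
# The day defaults to today's abbreviated weekday, the volume to 0000.
archive_name() {
    local host="$1" btype="$2" dow="${3:-$(date +%a)}" vol="${4:-0000}"
    echo "$host-$btype-$dow-v$vol.cpio"
}
```

Calling archive_name joule incr Wed 0000 prints joule-incr-Wed-v0000.cpio. Because only the weekday varies, each run overwrites the archive from the same weekday of the previous week, which is what limits the set to seven copies. Note that date +%a follows the current locale, so the abbreviations are not always English.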

Full backups use a similar naming scheme, which I should fix at some point. I pick one of the weekend backups each month to preserve long term, and rename it to indicate the specific date. I try to keep about a year’s worth of full backups available.
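One possible way to do the rename for long-term keeping is sketched below. It assumes GNU date (for the -r option) and the weekday-based names described above; dated_name is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: dated_name is a hypothetical helper; assumes GNU date.

# Print the archive path with the weekday replaced by the file's
# modification date, e.g. joule-full-Sat-v0000.cpio becomes
# joule-full-2008-07-05-v0000.cpio for a file last written on that date.
dated_name() {
    local path="$1" stamp
    stamp=$(date -r "$path" +%Y-%m-%d) || return 1
    echo "$path" | sed -E "s/-(Sun|Mon|Tue|Wed|Thu|Fri|Sat)-/-$stamp-/"
}
```

The rename itself would then be something like mv "$f" "$(dated_name "$f")". Using the modification time rather than today's date means the name reflects when the backup was actually written.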

The first file backed up is /backid.txt; it is also included in the emailed logfile. The create-backid-file script (below) writes into it several pieces of information useful when recovering a system, such as /etc/fstab and the output of fdisk -l. Because I use cpio, which stores files uncompressed, this text can be viewed directly from the archive file with less, head, or even cat. That could be handy if you have limited resources to work with during a recovery.
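A minimal sketch of peeking at the start of an archive without any cpio tooling follows. peek_archive is a hypothetical name, and the 8192-byte default is an arbitrary assumption; the real id file may be longer.

```shell
#!/bin/bash
# Sketch only: peek_archive is a hypothetical helper.

# Show the first bytes of an archive as text, dropping NUL padding and
# other non-printable bytes (tabs, newlines, and carriage returns are kept).
# $1 = archive file, $2 = number of bytes to read (default 8192)
peek_archive() {
    head -c "${2:-8192}" "$1" | tr -cd '\11\12\15\40-\176'
}
```

Something like peek_archive /media/MX200702082108/joule-full-Sat-v0000.cpio | less would show the cpio header fields followed by the contents of backid.txt.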

After the backup itself completes, a verify-crc pass is made over the archive. This may not be as reassuring as an actual file-by-file comparison, but it seems a good tradeoff between reliability and time in my situation. An occasional file restore should still be done to confirm that what you think is getting backed up really is. I’ve also done bare-metal restores from these archives without problems.
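An occasional spot check of a single file might look like the sketch below. It assumes GNU cpio; spot_check is a hypothetical helper name. The member name must match the archive listing exactly (the .toc file shows the stored names), and depending on the cpio version a leading ./ may or may not be stored.

```shell
#!/bin/bash
# Sketch only: spot_check is a hypothetical helper; assumes GNU cpio.

# Restore one member into a scratch directory and compare it with a live file.
# $1 = archive (absolute path), $2 = member name as stored, $3 = file to compare
spot_check() {
    local archive="$1" member="$2" original="$3" scratch rc
    scratch=$(mktemp -d) || return 1
    # -d creates leading directories as needed; --quiet drops the block count
    ( cd "$scratch" && cpio -i -d --quiet -I "$archive" "$member" ) ||
        { rm -rf "$scratch"; return 1; }
    if cmp -s "$scratch/$member" "$original"; then rc=0; else rc=1; fi
    rm -rf "$scratch"
    return $rc
}
```

A mismatch is not necessarily a bad backup, since the live file may simply have changed since the archive was written, so this works best on files that rarely change.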

The actual scripts are listed later, but first I’d like to point out a few things from the log file.

  • the backid.txt file is included
  • line counts from the backup pass and the verify pass are included; the vast majority of the time, these are the same
  • the block counts from the two passes are included; these should always be the same
  • the size of the archive is calculated
  • the logs are kept locally for review as well as being copied to the backup media
  • timestamps are used at important steps so I can develop a feel for how the scripts perform
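The block-count comparison in the list above is done by eye in the log, but it could also be checked mechanically. A minimal sketch, assuming the last line of each file has the form "1145922 blocks" as in the sample log; block_count and blocks_match are hypothetical names.

```shell
#!/bin/bash
# Sketch only: block_count and blocks_match are hypothetical helpers.

# Print the block count from the last line of a cpio log ("1145922 blocks").
block_count() {
    tail -n 1 "$1" | cut -d' ' -f1
}

# Succeed only if both files end with the same, non-empty block count.
blocks_match() {
    local a b
    a=$(block_count "$1")
    b=$(block_count "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

A script could then run blocks_match "$WRKDIR/$LOG" "$WRKDIR/$TOC" || echo "WARNING: block counts differ" before compressing the logs.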

Here’s a sample log:

  ==============================================================================
  Script:   incr-backup
  Started:  Wed Jul  2 06:57:03 CDT 2008
  ==============================================================================

  [ … deleted most of log … ]

  Wed Jul  2 08:14:19 CDT 2008 : testing/listing done

  line counts from logs
     96944 /tmp/joule-incr-Wed-v0000.log
     96945 /tmp/joule-incr-Wed-v0000.toc

  block counts (size=32768) from logs
  ==> /tmp/joule-incr-Wed-v0000.log <==
  1145922 blocks

  ==> /tmp/joule-incr-Wed-v0000.toc <==
  1145922 blocks

  1145922 blocks @ 32768/block = 34.97 GB

  Wed Jul  2 08:14:20 CDT 2008 : compressing logs

  Wed Jul  2 08:14:22 CDT 2008 : moving logs to /usr/local/data/bcklogs
  Wed Jul  2 08:14:22 CDT 2008 : copying logs to /media/MX200702082108

  ==============================================================================
  Script:    incr-backup
  Started:   Wed Jul  2 06:57:03 CDT 2008
  Finished:  Wed Jul  2 08:14:22 CDT 2008
  Usage:     34.97 GB; 96945 files/dirs
  ==============================================================================

It’s kinda large because of a VMware disk image and a couple of DVD .iso files downloaded but not yet burned.

This is the full-backup script (text). Things to note:

  • the DIRS variable specifies which filesystems to back up; the id file should be first in the list
  • NUL-terminated paths (find’s -print0, cpio’s -va0) are used to avoid issues with unusual filenames
  • the VOLNBR is just a holdover from tape and could be removed
  1. #!/bin/bash
  2. # @(#) $Id$
  3.  
  4. BAR="=============================================================================="
  5. MyScript="`basename $0`"
  6. MyHost="`hostname -s`"
  7. MyStart="`date`"
  8.  
  9. # ----------------------------------------------------------------------------
  10.  
  11. VOLNBR="${1:-"0000"}"
  12.  
  13. DOW="`date +%a`"
  14. # Sun, Mon, …
  15.  
  16. BTYPE=full
  17. BCK="$MyHost-$BTYPE-$DOW-v$VOLNBR.cpio"
  18. LOG="$MyHost-$BTYPE-$DOW-v$VOLNBR.log"
  19. TOC="$MyHost-$BTYPE-$DOW-v$VOLNBR.toc"
  20.  
  21. WRKDIR="/tmp"
  22. DSTDIR="/media/MX200702082108"
  23. ARCDIR="/usr/local/data/bcklogs"
  24.  
  25. BSIZE=32768
  26. IDFILE="backid.txt" # really /backid.txt
  27. $HOME/bin/create-backid-file
  28.  
  29. # ----------------------------------------------------------------------------
  30.  
  31. # directories to be backed up; command option specifies not to cross filesystems
  32. # work is done after a "cd /", so the "./" prefix is relative to "/"
  33.  
  34. ### just for testing…
  35. ### DIRS="$IDFILE ./boot"
  36.  
  37. DIRS="./$IDFILE ./boot ./ ./home ./data ./usr/local ./usr ./opt ./var"
  38. # omitted: /tmp /media
  39.  
  40. # ----------------------------------------------------------------------------
  41.  
  42. echo ""
  43. echo "$BAR"
  44. echo "Script:   $MyScript"
  45. echo "Started:  $MyStart"
  46. echo "$BAR"
  47.  
  48. echo ""
  49. echo "Contents of /$IDFILE:"
  50. cat /$IDFILE
  51.  
  52. echo ""
  53. echo "Mounted file systems:"
  54. df -h
  55.  
  56. # ----------------------------------------------------------------------------
  57.  
  58. echo ""
  59. echo "`date` : creating backup"
  60. echo "  targets: $DIRS"
  61.  
  62. # stdout is empty (always?) when using the -O option of cpio
  63. # all content comes from stderr being redirected to stdout
  64.  
  65. cd / &&
  66. find $DIRS -xdev -depth -print0 |
  67. cpio -o -va0 -H crc -C $BSIZE -O $DSTDIR/$BCK >$WRKDIR/$LOG 2>&1
  68.  
  69. # ----------------------------------------------------------------------------
  70.  
  71. echo ""
  72. echo "`date` : testing and listing backup"
  73.  
  74. # block count is written to stderr, but can't just send stderr to stdout
  75. # because the count appears to be emitted at random within stdout stream
  76.  
  77. cpio -i -vt -H crc --only-verify-crc -C $BSIZE -I $DSTDIR/$BCK >$WRKDIR/$TOC 2>$WRKDIR/$TOC.err
  78.  
  79. cat $WRKDIR/$TOC.err >>$WRKDIR/$TOC
  80. rm -f $WRKDIR/$TOC.err
  81.  
  82. echo ""
  83. echo "`date` : testing/listing done"
  84.  
  85. # ----------------------------------------------------------------------------
  86.  
  87. echo ""
  88. echo "line counts from logs"
  89. wc -l $WRKDIR/$LOG $WRKDIR/$TOC | head -2
  90.  
  91. echo ""
  92. echo "block counts (size=$BSIZE) from logs"
  93. tail -n 1 $WRKDIR/$LOG $WRKDIR/$TOC
  94.  
  95. echo ""
  96. fdcnt=`wc -l $WRKDIR/$TOC | sed 's/^ *//' | cut -d' ' -f1`
  97. blkcnt=`tail -n 1 $WRKDIR/$TOC | cut -d' ' -f1`
  98. gbcnt=`echo "scale=2; $blkcnt * $BSIZE / 1024 / 1024 / 1024" | bc`
  99. echo "$blkcnt blocks @ $BSIZE/block = $gbcnt GB"
  100.  
  101. # ----------------------------------------------------------------------------
  102.  
  103. echo ""
  104. echo "`date` : compressing logs"
  105. gzip $WRKDIR/$LOG $WRKDIR/$TOC
  106.  
  107. echo ""
  108. echo "`date` : moving logs to $ARCDIR"
  109. [ ! -d $ARCDIR ] && mkdir $ARCDIR
  110. mv -f $WRKDIR/$LOG.gz $WRKDIR/$TOC.gz $ARCDIR/
  111. chmod u=rw,go= $ARCDIR/$LOG.gz $ARCDIR/$TOC.gz
  112.  
  113. echo "`date` : copying logs to $DSTDIR"
  114. cp $ARCDIR/$LOG.gz $ARCDIR/$TOC.gz $DSTDIR/
  115.  
  116. # ----------------------------------------------------------------------------
  117.  
  118. MyFinish="`date`"
  119.  
  120. echo ""
  121. echo "$BAR"
  122. echo "Script:    $MyScript"
  123. echo "Started:   $MyStart"
  124. echo "Finished:  $MyFinish"
  125. echo "Usage:     $gbcnt GB; $fdcnt files/dirs"
  126. echo "$BAR"
  127.  
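  128. # ----------------------------------------------------------------------------
  129. # end
  130. # ----------------------------------------------------------------------------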
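The NUL-terminated paths noted before the script matter because a filename may legally contain a newline, which breaks anything that treats one line as one file. A small sketch showing the difference (the function names are made up for illustration):

```shell
#!/bin/bash
# Sketch only: count_by_lines and count_by_nul are hypothetical helpers.

# Count entries under a directory by counting output lines; a filename
# containing a newline is counted more than once.
count_by_lines() {
    find "$1" -mindepth 1 | wc -l | tr -d ' '
}

# Count entries by counting NUL terminators; every name is counted once.
count_by_nul() {
    find "$1" -mindepth 1 -print0 | tr -cd '\0' | wc -c | tr -d ' '
}
```

In a directory holding two files, one of which has a newline in its name, the first function reports three entries and the second reports two. The same reasoning applies to the list of names piped from find into cpio.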

The incr-backup script (text) differs in only two lines; next time I edit the scripts I’ll move the interval option to a variable.

  $ diff full-backup incr-backup
  16c16
  < BTYPE=full
  ---
  > BTYPE=incr
  66c66
  < find $DIRS -xdev -depth -print0 |
  ---
  > find $DIRS -xdev -depth -mtime -30 -print0 |
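Moving the interval into a variable could go roughly like this, which would also let one script serve both backup types. This is a sketch of one option, not the eventual change; INTERVAL and mtime_opt are hypothetical names.

```shell
#!/bin/bash
# Sketch only: INTERVAL and mtime_opt are hypothetical names.

# Print the find(1) age test for a backup type.
# Incrementals get "-mtime -N"; full backups get nothing.
# $1 = backup type (full or incr), $2 = interval in days (default 30)
mtime_opt() {
    local btype="$1" interval="${2:-30}"
    [ "$btype" = "incr" ] && echo "-mtime -$interval"
    return 0
}
```

The find line would then become find $DIRS -xdev -depth $(mtime_opt "$BTYPE" "$INTERVAL") -print0, with the command substitution left unquoted so that it expands to either two words or none.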

Finally, the create-backid-file script (text). I’ve stripped the extra noise from the script for display in this post; select the text link to get the whole thing. The boot loader configuration (grub.conf, or the lilo equivalent) should probably be added as well.

  uname -a
  cat /etc/fstab
  df -h
  fdisk -l
  chkconfig --list
  ps -ef
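For illustration, a version with labeled sections and the boot loader configuration added might look like the sketch below. It is not the full script behind the text link: section and create_backid are hypothetical names, and the /boot/grub/grub.conf path is an assumption that varies by distribution.

```shell
#!/bin/bash
# Sketch only: section and create_backid are hypothetical names.

# Print a header, then the command's output, or a note if the command
# is missing or fails.
section() {
    local title="$1"
    shift
    echo "== $title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "($1 not available)"
    fi
    echo ""
}

# Write the id file; $1 = output path (default /backid.txt).
create_backid() {
    {
        section "kernel"      uname -a
        section "fstab"       cat /etc/fstab
        section "filesystems" df -h
        section "partitions"  fdisk -l
        # assumed path; adjust for the boot loader actually in use
        section "boot loader" cat /boot/grub/grub.conf
    } >"${1:-/backid.txt}"
}
```

The headers make it easier to find a given section when paging through the start of an archive during a recovery, and a missing command leaves a note rather than a silent gap.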

As an aside, check out the full-size version of the lead image; it’s high-res and quite interesting, as is the Wikipedia article Hard disk drive where I found it.

image: Paul R. Potts, SixHardDriveFormFactors.jpg, Wikimedia Commons

Category: programming

Site last updated 2015-01-12 @ 13:31:07; This content last updated 2009-09-08 @ 06:13:33