GMbackup -- Gentle Music's backup tools

GMbackup consists of scripts that support making local incremental backups, local backups to a separate disk, and remote backups. The scripts are written in the Groovy programming language and run in a Gradle environment. Groovy and Gradle rely on the Java platform.

The GMbackup tools do not cover backing up Windows itself. The recommendation is to let these backup scripts handle the home directory's important subdirectories, like OneDrive and Documents. When backing up Windows, exclude the directories already backed up by GMbackup before starting the Windows backup, to minimize the size of the Windows backup.

Still to be documented

Assumptions

Directory creation or modification dates do not have to be considered.

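A file is considered to be moved if a file with the same size and the same last modification date is found under a different path or name.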

File and directory names are in UTF-8.

A large USB disk is required to create local backups, probably 50 to 100 percent larger than the amount of data to be backed up.

Another large USB disk is required to create remote backups, as large as the local backup disk.

If running Windows, neither the Cygwin tree nor the MSYS2 tree may be subject to GMbackup, because bad symlinks in these trees cause problems for programs like rsync.

The remote system must run the same Unix system as the local system, because the remote system runs scripts generated by the local system.

The file name syntax in this document is Unix-like. A file named /e/gmbackup in this document has the native name E:\gmbackup on Windows and the name /cygdrive/e/gmbackup in the Cygwin environment.
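The naming convention above can be sketched as two small shell functions. The function names are hypothetical and not part of GMbackup; the sketch only handles paths of the form /<drive-letter>/<rest>.

```shell
# Hypothetical helpers (not part of GMbackup): translate this document's
# Unix-like notation into the Windows and Cygwin forms.

# /e/gmbackup -> E:\gmbackup
to_windows_path() {
    local path="$1"
    local drive rest
    drive=$(printf '%s' "$path" | cut -c2 | tr 'a-z' 'A-Z')   # drive letter, uppercased
    rest=$(printf '%s' "$path" | cut -c3- | tr '/' '\\')      # remainder with backslashes
    printf '%s:%s\n' "$drive" "$rest"
}

# /e/gmbackup -> /cygdrive/e/gmbackup
to_cygwin_path() {
    printf '/cygdrive%s\n' "$1"
}
```

On a real Cygwin installation, the cygpath utility performs such conversions and is likely the better choice.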

Introductory View

The four columns of disks represent:

  1. the source trees subject to backup
  2. the GMbackup data store including incremental backups
  3. local backups
  4. remote backups

Concepts

This section introduces concepts used in documenting GMbackup.

Source files are files subject to backup.

Installation directory is an arbitrary directory where all the GMbackup scripts are installed and executed. If run on Windows, most scripts require administrator privileges.

Incremental backups are tar backup archives holding the files that are new or modified since the last complete backup or the last incremental backup. The scripts working with incremental backups mark files as incrementally backed up by creating empty marker files, dated today, in a shadow directory. The tar backup archives are stored in the GMbackup data store's top directory, like /e/gmbackup.

The GMbackup directory is this tool's data store, holding information required by GMbackup as well as incremental backups. An example of such a directory is /e/gmbackup. Its structure is detailed in the section "Layout of the GMbackup data store".

Local backups are backups made to the local backup disk. When a local backup is carried out, the incremental empty marker files and the incremental tar files are removed. After a local backup, the file trees on the local backup disk are identical to the source file trees.

Remote backup sets are tar backup archives and some scripts that are delivered to a remote site and unpacked there. After the tar archives of such a backup set have been unpacked, the remote disk is a copy of the local backup disk.

List files -- Essential to backing up only new and modified files, GMbackup creates list files containing detailed information about a file tree or a disk. Here's a fragment of such a list file.

# L:\hostx
# /l/hostx

.:

  2025-06-15 22:00.279              76 nmdsdcid

cmd:

  2023-01-09 10:11.297              45 scripts.txt
  2023-04-24 15:49.214            1582 pvr-backup.sh

documents/old/Site_files:

  2020-05-15 13:19.422            6977 ajax_view.js.download

NOTE: File creation and modification times have millisecond granularity. Some older files, or files not created in Windows or Linux, may have the second as their smallest unit, giving a time such as 1998-05-23 13:14.000.
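As an illustration of how such a list file can be read, here is a minimal sketch that prints the size and relative path of every file entry. The function name list_file_paths is hypothetical, and the format rules in the comments are inferred from the fragment above rather than taken from GMbackup's own parser.

```shell
# Sketch: read a list file on standard input and print "<size> <relative path>".
# Assumed format, inferred from the fragment above:
#   - lines starting with "#" are comments
#   - an unindented line ending in ":" names a directory ("." is the tree root)
#   - indented lines hold date, time, size and file name
list_file_paths() {
    awk '
        /^#/ || /^[ \t]*$/ { next }            # comments and blank lines
        /^[^ \t].*:$/ {                        # directory header such as "cmd:"
            dir = substr($0, 1, length($0) - 1)
            next
        }
        {                                      # file entry
            size = $3
            name = $0
            # drop date, time and size; the rest is the name (may contain spaces)
            sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+[^ \t]+[ \t]+/, "", name)
            printf "%s %s\n", size, (dir == "." ? name : dir "/" name)
        }
    '
}
```

For the fragment above, the entry for scripts.txt would be printed as "45 cmd/scripts.txt".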

Once complete backups have been made by GMbackup, these list files are handy when looking for files on a computer's various disks and file trees backed up by GMbackup, particularly when using the Bash script find.sh.

These list files are crucial when GMbackup's scripts compare file structures.

Scripts calling scripts -- All logic is implemented using Gradle/Groovy and Bash scripts. In the usual case, a Bash script calls a Gradle script. The Gradle script produces a Bash script with backup actions as regular Bash commands. This implies that backup actions can be reviewed and checked for consistency before being executed.

Prerequisites

For a working GMbackup environment, the following software and disks are expected:

Included in the GMbackup distribution is the Gradle wrapper. The Gradle wrapper is a set of Java archives capable of downloading a minimal Gradle runtime environment required to run the GMbackup Groovy scripts.

The Bash command executable must be version 4.4.12(3)-release or higher. The GMbackup package has been successfully tested with Bash version 5.2.21(1)-release.
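A version requirement like this can be checked before running any script. The following is a minimal sketch; version_at_least is a hypothetical helper that compares three-part version numbers and ignores the build suffix such as "(3)-release".

```shell
# Sketch: succeed when the actual version ($2) is at least the required one ($1).
# Assumption: versions have the form major.minor.patch; anything from the first
# character that is not a digit or a dot (for example "(3)-release") is ignored.
version_at_least() {
    local required="${1%%[!0-9.]*}"
    local actual="${2%%[!0-9.]*}"
    local r1 r2 r3 a1 a2 a3
    IFS=. read -r r1 r2 r3 <<EOF
$required
EOF
    IFS=. read -r a1 a2 a3 <<EOF
$actual
EOF
    [ "${a1:-0}" -gt "${r1:-0}" ] && return 0
    [ "${a1:-0}" -lt "${r1:-0}" ] && return 1
    [ "${a2:-0}" -gt "${r2:-0}" ] && return 0
    [ "${a2:-0}" -lt "${r2:-0}" ] && return 1
    [ "${a3:-0}" -ge "${r3:-0}" ]
}
```

A script could then start with a line such as: version_at_least '4.4.12' "$BASH_VERSION" || exit 1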

Environment

Locally, there are four directories in use and a large backup disk. The local directories used are:

  1. The script directory -- This is the working directory where all GMbackup scripts are located.

  2. The GMbackup directory -- This is where incremental backups and the GMbackup exclusion files are stored. The directory contains a full tree copy of the source file tree, but all files are empty and are called marker files. More on the GMbackup directory tree is found in the section "Layout of the GMbackup data store".

  3. The local backup disk or directory -- Where complete local backups are stored. Normally this is a separate and detachable disk, but it can be a directory on a local disk, too.

  4. The directory for remote backup sets -- This is where incremental remote backup archives and scripts are to be stored before being transferred to the remote site.

On the remote computer hosting the remote disk, there are two directory trees of importance:

  1. The working directory for remote backup increments -- Where incremental tar backup archives are received during transfer.

  2. The backup disk or directory -- Where the local backup disk or directory is mirrored.

On Windows, most scripts require Administrator privileges.

GMbackup properties

In the working script directory, a properties file called gmbackup-<hostname>-<os>.properties is expected. It describes some locations required by GMbackup. Here's an example:

    gmbackupDirectory=/e/gmbackup
    localBackupRoot=/l/hostx
    localRemoteTarget=/e/not-to-be-backed-up
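For illustration, a properties file of this shape can be read from Bash with a few lines. This is a sketch only: get_property is a hypothetical helper, and it assumes one key=value pair per line without leading whitespace.

```shell
# Sketch: print the value of one key from a key=value properties file.
# Assumptions: no leading whitespace, the value is everything after the first "=",
# and the first matching line wins. Returns 1 when the key is not found.
get_property() {
    local file="$1" key="$2" line
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "$key="*)
                printf '%s\n' "${line#"$key"=}"
                return 0
                ;;
        esac
    done < "$file"
    return 1
}
```

With the example above stored in gmbackup-hostx-c.properties, the call get_property gmbackup-hostx-c.properties localBackupRoot would print /l/hostx.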

Backup operations files

Backup operations files are text files in which each line describes a disk or file tree to be backed up. Scripts process such a file line by line. These files reside in the working (script) directory, too. The syntax is shown by the following example, named hostx-operations.trees:

    # to: a relative path, possibly under E:/gmbackup and L:/hostx
    from=~/cmd to=cmd
    from=~/Desktop to=desktop
    from=~/Documents to=documents
    from=~/Pictures to=pictures
    from=~/Videos to=videos
    from=/e/dev to=e-dev
    from=/w to=w-wd12tb

An instruction line has the following syntax:

Example:

from=/e/dev to=e-dev

This line tells the scripts that the directory tree /e/dev is subject to backups.

The example's to= parameter (a relative path), here set to e-dev, designates two top directories:

  1. /e/gmbackup/e-dev: the location in the GMbackup data store, whose root is set by the property gmbackupDirectory

  2. /l/hostx/e-dev: the location on the local backup disk, whose root is set by the property localBackupRoot

In detail:

  1. The GMbackup directory, possibly /e/gmbackup, will have a subdirectory e-dev where all files from /e/dev will be shadowed as empty files dated the day of an incremental backup. The directory /e/gmbackup/e-dev may also contain the files backup.file.exclusions and backup.directory.exclusions describing directories (by relative paths) and files to be excluded from the backup. The root, /e/gmbackup, was stated as the gmbackupDirectory property in the properties file above.

  2. The backup disk, here /l (or possibly a backup directory tree such as /e/local-backups), will hold the directory e-dev, where files from /e/dev are backed up and mirrored. The root, /l/hostx, was stated as the localBackupRoot property in the properties file above. Following this example, the backup disk's top directory will be /l/hostx/e-dev.
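The mapping from an operations line to its two target directories can be sketched as follows. The function operation_targets is hypothetical, and the sketch assumes that neither value contains spaces or wildcard characters.

```shell
# Sketch: split one operations line into its from= and to= values and print
# the directories derived from it.
#   $1  one line from an operations file, e.g. "from=/e/dev to=e-dev"
#   $2  the data store root (property gmbackupDirectory)
#   $3  the local backup root (property localBackupRoot)
# Returns 1 for lines lacking from= or to= (comments, blank lines).
operation_targets() {
    local line="$1" store_root="$2" backup_root="$3"
    local word from="" to=""
    for word in $line; do                    # unquoted on purpose: split on blanks
        case "$word" in
            from=*) from="${word#from=}" ;;
            to=*)   to="${word#to=}" ;;
        esac
    done
    [ -n "$from" ] && [ -n "$to" ] || return 1
    printf 'source=%s\n' "$from"
    printf 'store=%s/%s\n' "$store_root" "$to"
    printf 'backup=%s/%s\n' "$backup_root" "$to"
}
```

Called with the example line and the roots /e/gmbackup and /l/hostx, it prints the source /e/dev, the data store directory /e/gmbackup/e-dev and the backup directory /l/hostx/e-dev.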

Using this example, let's assume there is a file called /e/dev/expenses-2024.docx, dated 2024-10-31 11:45.

NOTE: The from= and to= strings are used in subsequent script commands.

Scripts supporting incremental backups

to be run manually.

Main script for local backups and remote backup preparation

All required steps to make a local backup and to prepare for a remote backup are compiled into one script called full-backup-cycle.sh. The script facilitates the use of the GMbackup set of scripts.

Here's how to read the script's internal documentation:

$ ./full-backup-cycle.sh hostx-operations.trees ?
##
## The steps are:
##
##01  1. Compare source directories (the "from" parameter) to the local disk's
##01     corresponding directories (the "to" parameter). looking for directory
##01     and file moves, creating the detected move scripts
##
##02  2. Review the move script created in phase 1.
##
##03  3. Execute the detected moves script, that is, make the corresponding
##03     moves on the local backup disk
##
##04  4. Create the script that backs up the source file trees from the input
##04     file in $1.
##
##05  5. Run the backup by executing the backup and purge scripts from phase 4.
##
##06  6. Create this backup's local backup disk list file,
##06     local-after-backup-<backup-no>-<os>.list
##
##07  7. Apply the moves found in the detected moves script to last remote
##07     disk's list file
##
##08  8. Create the remote backup incremental kit script. The created script
##08     will include the same moves to be made on the remote site.
##
##09  9. Create the remote backup incremental kit.
##
##10 10. Increment the backup number.
##

Steps 1 and 4 iterate over the lines in $1 (the first parameter), hostx-operations.trees used as an example in this document.

These are the details of each step:

  1. This first step looks for files and directories that have been moved within each set of file trees to be backed up. The analysis is implemented in detect-moves-in-disk-tree.gradle and is carried out once per line in hostx-operations.trees, that is, once per source file tree. The script detected-moves-<backup-no>-<os>.sh is generated. At its first invocation, the output script move-files-on-target-<backup-no>.sh is created. During the remaining invocations of the Gradle script, the file is appended to.

  2. Step 2 is implemented to review the suggested moves on the local backup disk determined in step 1. Simply, the generated script for moved files, detected-moves-<backup-no>-<os>.sh, is displayed.

  3. Step 3 executes the script detected-moves-<backup-no>-<os>.sh, hence moving files and directories on the local backup disk, just as they were moved in the source tree or disk. If the remote backup scheme is used, the same files and directories will later be moved on the remote disk, too.

  4. Step 4 creates the scripts backup-to-local-backup-disk-<backup-no>-<os>.sh and purge-local-backup-disk-after-backup-<backup-no>-<os>.sh, by iterating over the lines and file trees in hostx-operations.trees. The analysis is implemented in make-backup-to-backup-disk-scripts.gradle. At the first invocation, the output script is created. At the remaining invocations of the Gradle script, the file is appended to.

  5. Step 5 runs the newly generated scripts backup-to-local-backup-disk-<backup-no>-<os>.sh and purge-local-backup-disk-after-backup-<backup-no>-<os>.sh.

  6. Now at step 6, the local backup disk reflects the various source file trees and the local backup is complete. This step is the first step to enable the comparison between the local backup disk and the remote disk. A complete listing of the local backup disk's contents is run calling reflect-tree.gradle, creating local-after-backup-<backup-no>-<os>.list. The resulting .list file will be used in step 8.

  7. In step 7, the former remote backup disk's last list file (backup number - 1) is read and the moves of directories and files detected during step 1 are applied to the list file, as if the moves in the remote backup disk had already been carried out. The Gradle script making this modification to the list file of last remote backup is called apply-moves-to-list-file.gradle and the output list file is called remote-after-backup-<backup-no - 1>-with-moves.list and is used in step 8.

  8. In step 8, the list file from the current local backup disk is compared to the list file from step 7, that is, a list of all files on the remote backup disk reflecting the moves as if they had already been made. The script make-remote-backup.gradle is called, creating the shell script backup-to-remote-kit-<backup-no>-<os>.sh and one or many tar lists describing the incremental tar files to be created in step 9.

  9. In step 9, the script created in step 8, backup-to-remote-kit-<backup-no>-<os>.sh, is run. The remote backup kit is created in the directory given by the property localRemoteTarget. The kit consists of one or more tar files and generated scripts to be run at the remote site. The text file holding the current backup number, next-backup, is also added to the kit. With the configuration example above, the remote backup kit will be stored in the directory /e/not-to-be-backed-up/remote-backup-<backup-no>.

  10. Step 10 is the final step. Here all generated scripts are moved to the older subdirectory and the backup number is incremented by one in the file next-backup, hence completing the local backup cycle and preparing for the next backup, whenever it is carried out.
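The counter update in step 10 can be sketched as below. This is not the code from full-backup-cycle.sh; increment_backup_number is a hypothetical helper, and it assumes that next-backup holds a single non-negative integer without leading zeros on its first line.

```shell
# Sketch: read the backup number from the given file, add one, write it back.
# Returns 1 if the file is unreadable or does not hold a plain number.
increment_backup_number() {
    local file="$1" current=""
    IFS= read -r current < "$file" || [ -n "$current" ] || return 1
    case "$current" in
        ''|*[!0-9]*) return 1 ;;             # reject empty or non-numeric content
    esac
    printf '%s\n' "$((current + 1))" > "$file"
}
```

For example, if next-backup contains 7, then after increment_backup_number next-backup it contains 8.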

Here follows an enumeration of all Gradle scripts.

apply-moves-to-list-file.gradle -- Deals with moved (or renamed) files, taking the output from detect-moves-in-disk-tree.gradle as input. It is called during step 7 of full-backup-cycle.sh. The purpose of the script is to modify the remote disk's list file from the previous backup (remote-disk-after-backup-<n>.list) so that files recently moved in a source tree are moved in the list file, too. These edits to the remote disk's list file prepare for creating the remote backup and keep moved files out of the remote backup update kit.

detect-moves-in-disk-tree.gradle -- Compares two (local) trees looking for files that have been moved. The script looks for files in the two structures with the same modification date and the same size. If such a pair of files is found and the paths differ, the file is considered to have been moved, and the move is to be repeated on the target disk. A Bash script move-files-on-target-.sh is created with move commands to be executed on the local backup disk as well as the remote backup disk. This generated Bash script also becomes input to the script reflect-moves-in-list-file.gradle. This script is called in step 1 of full-backup-cycle.sh.

find-dupes.gradle -- Looks for duplicates in list files. This utility is useful for detecting multiple copies of files but is not part of the GMbackup procedure.

find-empty-dirs.gradle -- This utility script looks for empty directories in a disk or a file tree. It creates the Bash script dirs-to-be-removed.sh. The script is not part of the GMbackup procedure.

make-backup-to-backup-disk-scripts.gradle -- Analyzes source disks (from backup description .trees files like hostx-operations.trees), compares them to the corresponding local backup disk (or local backup directory) and generates two scripts, backup-to-local-backup-disk-<date>-<os>.sh and purge-local-disk-after-backup-<date>-<os>.sh, for manual analysis before being run. In the GMbackup directory tree, there is one empty shadow file with the same name as the file backed up and with the date set to the time of the total backup. Upon an incremental backup, a marker file with the suffix ,i is created for each file in the incremental backup. This script is called in step 4 of full-backup-cycle.sh.

make-incr-backup-scripts-for-nas.gradle -- Creates a Bash script, backup-incrementally-<date>-<os>.sh, that creates empty files ("marker files") in the GMbackup directory tree as well as tar backup archives with the files being backed up. The incremental tar backup files, one for each line in a backup operations file, are stored in the GMbackup directory.

make-move-script-for-target-disk.gradle -- Compares two disk structures looking for files and directories that have moved. The output from the script is a Bash script, move-files-on-target-<date>-<os>.sh, to be run prior to a total local backup. This script has similarities with detect-moves-in-disk-tree.gradle. While detect-moves-in-disk-tree.gradle works with disk structures, this script works with and analyzes list files.

reflect-moves-in-list-file.gradle -- Takes the output from detect-moves-in-disk-tree.gradle, move-files-on-target-<backup-no>.sh, and modifies the list file representing the contents of the remote backup, as if the moves had been carried out on the remote site. This functionality was added so that large files that have merely been renamed do not have to be copied to the remote site again. By modifying the remote site's list file, the script make-remote-backup.gradle only has to add files to the remote kit that are new, and the moved or renamed files do not have to be copied.

reflect-tree.gradle -- Creates a complete listing of all files in a directory tree or an entire disk and is a significant part of GMbackup's logic for detecting new, modified and moved files.

utilities/file-name-processing.gradle -- Common methods related to file name processing, in particular converting Windows file names to and from Cygwin syntax.

utilities/file-processing.gradle -- Common methods related to file processing, including symbolic link detection.

utilities/list-file-processing.gradle -- Common methods related to creating and parsing .list files, that is, files completely describing a directory tree with respect to name, size and last modification date.

Script supporting remote backups

make-remote-backup.gradle -- Creates a kit of tar files containing all files that are on the local backup disk or directory and not yet on the remote site. It also creates a script, synch-on-remote-.sh, to be run on the remote site to unpack ("untar") the tar backup files and to delete files that are no longer on the local backup disk.

Scripts on the remote backup site

The remote kit created by make-remote-backup.gradle has the following contents:

  1. next-backup -- Holds the current backup number

  2. local-after-backup-<backup-no>-<os>.list -- The complete list of files on the local backup disk after the local backup has completed, kept for later comparison.

  3. remote-site-actions.sh -- This generic script implements four actions:

    1. Unpacking of the kit's scripting environment.
    2. Running the remote site backup script synch-on-remote-<backup-no>.sh, which optionally moves files that were moved on the local backup disk, "untars" files that were new to the local backup disk and, finally, deletes files that no longer exist on the local backup disk.
    3. Making a list of all files on the remote backup disk for comparison: remote-after-backup-<backup-no>-<os>.list
    4. Comparing the two list files as a consistency check. The file remote-after-backup-<backup-no>-<os>.list must be copied back to the local system's working directory and is subject to comparison when the next backup is to be run.
  4. remote-toolkit.tar -- The scripting environment required on the remote site:

    1. The Gradle wrapper
    2. GMbackup's Gradle scripts
    3. detected-moves-<backup-no>-<os>.sh -- The moves of files and directories carried out on the local backup disk
    4. synch-on-remote-<backup-no>-<os>.sh -- The generated main script to be run on the remote site, calling tar and removing files that have already been removed on the local backup disk.
    5. Tar lists of the included backup tar savesets
  5. to-remote-<backup-no>-<nn>-<os>.tar.gz -- One or many tar savesets containing files to be copied to the remote backup disk.

Wrapper scripts

Common Groovy files

The following Groovy files are common object definitions used by multiple Gradle scripts. They are located in the buildSrc/src/main/groovy/se/gentlemusic/common subdirectory.

Layout of the GMbackup data store

The GMbackup data store contains:

  1. directory trees keeping track of what has been backed up incrementally and locally
  2. descriptions of files and directories to be omitted during backups
  3. incremental backup tar savesets

Here's an example of a GMbackup data store:

  1. /e/gmbackup -- top directory defined in gmbackup-<hostname>-<os>.properties
    1. documents -- backup directory for C:/userx/Documents named in hostx-operations.trees.
      1. backup.directory.exclusions -- Enumerates directories to be excluded from being backed up, one directory per line
      2. backup.file.exclusions -- Enumerates files to be excluded from the backup, one filename per line
      3. backup.dirs-with-dupes.exclusions -- Enumerates directories to be excluded from duplicates lookup, one directory per line
      4. aaa -- subdirectory matching C:/userx/Documents/aaa to store empty marker files mapping to the source aaa directory
        1. payroll.xlsx -- empty marker file matching C:/userx/Documents/aaa/payroll.xlsx
      5. letter.docx -- empty marker file matching C:/userx/Documents/letter.docx. GMbackup uses this file's date to see if the source file has been updated or not when creating a local backup.
      6. letter.docx,i -- empty marker file stating that the source file has been incrementally backed up. The date depicts when it was incrementally backed up.
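The date comparison between a source file and its marker file can be illustrated with the shell's "newer than" test. This is a sketch of the idea only, not GMbackup's own logic; needs_backup is a hypothetical name.

```shell
# Sketch: decide whether a source file should be backed up by comparing its
# modification time with that of its empty marker file.
#   $1  the source file, e.g. ~/Documents/letter.docx
#   $2  its marker file, e.g. /e/gmbackup/documents/letter.docx
# Succeeds (returns 0) when a backup is needed.
needs_backup() {
    local source_file="$1" marker_file="$2"
    [ -e "$marker_file" ] || return 0          # no marker: never backed up
    [ "$source_file" -nt "$marker_file" ]      # modified after the marker's date
}
```

The -nt operator is supported by Bash's test builtin; it compares modification times only, so file size and content play no role in this sketch.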

Getting started

This section describes the steps required to establish a working environment.

Unpack the distribution to the designated working directory.

cd workingDir
tar -xvf gmbackup-1.0.tar

Rename gmbackup-template.properties to gmbackup-<hostname>-<os>.properties.

Currently, os is

(The letter is the first letter from the uname command in lowercase.)

mv gmbackup-template.properties gmbackup-hostx-c.properties
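The file name can also be derived rather than typed. Here is a minimal sketch under the naming rule stated above; properties_file_name is a hypothetical helper, and the host name is used exactly as passed in.

```shell
# Sketch: build the properties file name from a host name and a uname string.
# The <os> part is the first letter of the uname string in lowercase.
properties_file_name() {
    local host="$1" uname_string="$2" os
    os=$(printf '%s' "$uname_string" | cut -c1 | tr 'A-Z' 'a-z')
    printf 'gmbackup-%s-%s.properties\n' "$host" "$os"
}
```

On the machine itself, the call would typically be properties_file_name "$(hostname)" "$(uname)".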

Carefully edit the properties file (gmbackup-<hostname>-<os>.properties):

  1. Set the gmbackupDirectory property to the directory where to place the GMbackup data store.
  2. Set the localBackupRoot property to the local backup disk's root directory. The property can be a disk or a directory.
  3. Set the localRemoteTarget property to the directory where remote incremental backups are to be stored locally.

Create the file that enumerates the backup operations, that is, the list of source trees to be backed up and the top name of each tree on the target backup disk. A typical name for this file is "hostx-operations.trees".

Source directories under the login directory can be specified with a tilde (~).

Here's an example line, assuming that the hostname is hostx and the source tree is OneDrive. The target directory is set to a string that indicates the origin and that is used as the top directory both in the GMbackup data store and on the local backup disk:

from=~/OneDrive to=onedrive

Remarks:

  1. The hostname is optional but nice to have if the local and remote disks are to hold backups from different computers.
  2. The from= and to= strings are mandatory.
  3. Comments can be written to lines starting with a hashmark (#).

The GMbackup data store has to be created as well as the local backup disk roots. The script setup-gmbackup.sh:

  1. creates the GMbackup data store root directory, like /e/gmbackup.
  2. creates each target root directory based on the to= properties (one per line) in the source operations file. Example: /e/gmbackup/onedrive
  3. creates three empty exclusion description files. Example: /e/gmbackup/onedrive/backup.directory.exclusions
  4. creates each target root directory on the local backup disk. Example: /l/hostx/onedrive
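For one to= name, the directory setup listed above could look like the sketch below. It is not the real setup-gmbackup.sh; setup_target is a hypothetical function, and the three exclusion file names are taken from the section on the layout of the GMbackup data store.

```shell
# Sketch: create the data store and backup disk directories for one to= name,
# plus the three (initially empty) exclusion files.
#   $1  data store root, e.g. /e/gmbackup
#   $2  local backup root, e.g. /l/hostx
#   $3  the to= name, e.g. onedrive
setup_target() {
    local store_root="$1" backup_root="$2" to="$3" name
    mkdir -p "$store_root/$to" "$backup_root/$to" || return 1
    for name in backup.directory.exclusions \
                backup.file.exclusions \
                backup.dirs-with-dupes.exclusions; do
        # create the file only if it is missing, so existing exclusions survive
        [ -e "$store_root/$to/$name" ] || : > "$store_root/$to/$name" || return 1
    done
}
```

Because mkdir -p and the existence check are both harmless on repeated runs, the function can be called again without overwriting exclusions that have been edited in the meantime.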

Handling of moved files

This section details how files moved (or renamed) in a source tree are to be moved/renamed on the local backup disk and remote disk, too.

In the first step of full-backup-cycle.sh, the script detect-moves-in-disk-tree.gradle is called to look for moves in a source disk tree, by comparing the source tree to the last backup's list file (local-after-backup-<backup-no>-<os>.list), which lists the files on the backup disk.

If a file with a certain size and a certain last modification date appears in a different directory or with a different name in the source tree compared to the list file, it is considered to have been moved. Example:

The file payload.docx was moved from the source tree's root to a directory called docs.

$ mv payload.docx docs

The "detect moves" script will create a script snippet looking like this:

safelyMove "payload.docx" "docs/payload.docx"

safelyMove is a Bash procedure moving a file with additional error handling.
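The detection rule itself (same size and modification date, different path) can be sketched in a few lines of shell and awk. This is an illustration, not the logic of detect-moves-in-disk-tree.gradle: detect_moves is a hypothetical function, and it works on two simplified listings with one "<date> <time> <size> <path>" entry per line instead of real list files. When several files share size and date, the sketch simply takes the first match.

```shell
# Sketch: print a safelyMove command for every file that appears to have moved.
#   $1  listing of the backup disk (state before the moves)
#   $2  listing of the source tree (state after the moves)
# Assumptions: both listings are non-empty and paths contain no double quotes.
detect_moves() {
    awk '
        function entry_path(line) {
            sub(/^ *[^ ]+ +[^ ]+ +[^ ]+ +/, "", line)   # drop date, time, size
            return line
        }
        { key = $1 " " $2 " " $3; path = entry_path($0) }
        NR == FNR {                              # first file: backup listing
            n++; bkey[n] = key; bpath[n] = path; in_backup[path] = 1
            next
        }
        {                                        # second file: source listing
            in_source[path] = 1
            if (!(key in source_by_key)) source_by_key[key] = path
        }
        END {
            for (i = 1; i <= n; i++) {
                if (bpath[i] in in_source) continue         # still in place
                if (!(bkey[i] in source_by_key)) continue   # deleted, not moved
                target = source_by_key[bkey[i]]
                if (target in in_backup) continue           # target already backed up
                printf "safelyMove \"%s\" \"%s\"\n", bpath[i], target
            }
        }
    ' "$1" "$2"
}
```

For the payload.docx example, a backup listing holding payload.docx and a source listing holding docs/payload.docx with the same date and size yield the safelyMove line shown above.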

Imagine that this move was found in the Documents directory. Following the example from the Backup operations files section above, the local backup disk's target directory is /<backup-drive>/hostx/documents. The safelyMove function calls will be preceded by a change directory (cd) command to /<backup-drive>/hostx/documents:

cd /<backup-drive>/hostx/documents
safelyMove "payload.docx" "docs/payload.docx"
safelyMove ...

This generated Bash code is used when updating both the local and the remote backup disks.
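The implementation of safelyMove is not shown in this document. As a rough idea of what "moving with additional error handling" can mean, here is a hypothetical stand-in; the actual procedure shipped with GMbackup may check different conditions.

```shell
# Hypothetical stand-in for safelyMove (the real procedure may differ).
# It refuses to move when the source is missing or the target already exists,
# and creates the target's parent directory when needed.
safelyMove() {
    local from="$1" to="$2"
    if [ ! -e "$from" ]; then
        printf 'safelyMove: source missing: %s\n' "$from" >&2
        return 1
    fi
    if [ -e "$to" ]; then
        printf 'safelyMove: target already exists: %s\n' "$to" >&2
        return 1
    fi
    mkdir -p "$(dirname "$to")" && mv "$from" "$to"
}
```

Refusing to overwrite an existing target is a deliberate choice in this sketch: on a backup disk, a skipped move can be repeated later, while an overwritten file is lost.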

To make the script make-remote-backup.gradle aware that the moves will be carried out before the remote backup kit is actually created, the script apply-moves-to-list-file.gradle is called. It applies the changes to the list file representing the current state of the remote disk (remote-after-backup-<n-1>-<os>.list).

The following edit matches this move example:

Before:

documents:

  2024-11-16 16:28.029            3459 payload.docx
  2025-12-11 23:04.121             433 example.txt
  ...

After:

documents:

  2025-12-11 23:04.121             433 example.txt
  ...

documents/docs:

  2024-11-16 16:28.029            3459 payload.docx

The changes are written to remote-after-backup-<n-1>-<os>-with-moves.list.

This file is then compared to the list file representing the local backup disk after the backup, local-after-backup-<n>-<os>.list when producing the remote backup kit.

Thanks to the edits of the remote disk's last list file, files that have been moved do not need to be added to the remote backup kit.

In order to be able to use the local and remote backup disks from different computers, the following structure is recommended:

and

as used in this document's examples.

[2025-08-25/gm]