A better Discourse backup to Amazon S3 with Backupninja and Duplicity

backup
amazon-s3
backupninja
duplicity

(Dmitry Fedyuk) #1

Discourse has a built-in backup capability to Amazon S3, but it has some disadvantages:

  • It is not incremental, so it eats your space, traffic, and money.
  • It runs only once a day, so you can lose up to a whole day of forum data.

So I built a better backup system for my Discourse forums.
The solution is tested on Debian 7/8.

Step 1.

Install Backupninja.

aptitude install backupninja

Step 2.

Install Duplicity.

aptitude install duplicity

Step 3.

Install boto.

aptitude install python-boto

Step 4.

Install S3cmd.

aptitude install s3cmd

Step 5.

Set up your Amazon S3 account and create a bucket.
Place your bucket in any region except Frankfurt: the Frankfurt region accepts only AWS Signature Version 4 requests, so for now S3cmd fails there with the message:

The authorization mechanism you have provided is not supported.
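If you prefer the command line, then after finishing the S3cmd configuration in Step 8 you can create the bucket with S3cmd itself. A sketch, assuming the Ireland region and a placeholder bucket name (the accepted location names vary slightly between S3cmd versions):

s3cmd mb --bucket-location=eu-west-1 s3://my-discourse-backups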

Step 6.

Set up an Access Key for external access to your Amazon S3 bucket.

Step 7.

Find out where GNU Privacy Guard is installed on your system:

which gpg

You will be asked for it in the next step.

Step 8.

Run the s3cmd --configure command.
You will be asked for:

  • the Access Key ID and Secret Access Key you obtained in Step 6;
  • the GNU Privacy Guard location (see the previous step) and a password (choose one and remember it).
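The wizard saves your answers to ~/.s3cfg; the relevant entries look roughly like this (the exact key names may vary between S3cmd versions):

access_key = <place here your Access Key Id>
secret_key = <place here your Secret Access Key>
gpg_command = /usr/bin/gpg
gpg_passphrase = <your GnuPG password>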

Step 9.

Set up /etc/backupninja.conf file.
My settings are:

loglevel = 3
reportemail = admin@magento-forum.ru
reportsuccess = no
reportinfo = no
reportwarning = yes
reportspace = yes
reporthost = 
reportuser = 
reportdirectory = 
admingroup = root
logfile = /var/backup/backupninja.log
configdirectory = /etc/backup.d
scriptdirectory = /usr/share/backupninja
libdirectory = /usr/lib/backupninja
usecolors = yes
vservers = no

Step 10.

Create the /etc/boto.conf file and fill it with your Amazon S3 credentials (note: boto looks for /etc/boto.cfg by default, so if the credentials are not picked up, try that path or point the BOTO_CONFIG environment variable at your file):

[Credentials]
aws_access_key_id = <place here your Access Key Id>
aws_secret_access_key = <place here your Secret Access Key>

Step 11.

Create the file /etc/backup.d/discourse--db--1.sh with the following contents:

when = hourly
dir=/var/backup/discourse/db
# Start every run from an empty dump directory.
rm -rf ${dir}
mkdir -p ${dir}
# Map each Discourse container name to its tunneled PostgreSQL port.
declare -A servers=( \
    [example-1.com]=14578 \
    [example-2.com]=14579 \
)
# Dump each forum's database into its own SQL file.
for domain in "${!servers[@]}"; do
    pg_dump \
        --schema=public \
        --no-owner \
        --no-privileges \
        --host=localhost \
        --port=${servers[$domain]} \
        --username=postgres \
        discourse > ${dir}/${domain}.sql
done

This script connects to each Discourse Docker container and dumps its database.

  • Specify your Discourse Docker container names instead of example-1.com and example-2.com.
  • Specify your tunneled PostgreSQL ports instead of 14578 and 14579.
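To verify that a dump is usable, you can restore it into a scratch database by hand. A sketch, reusing the placeholder port and file name from the script above:

createdb --host=localhost --port=14578 --username=postgres discourse_restore_test
psql --host=localhost --port=14578 --username=postgres --dbname=discourse_restore_test --file=/var/backup/discourse/db/example-1.com.sql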

Step 12.

Create the file /etc/backup.d/discourse--db--2.dup with the following contents:

when = hourly

options = --allow-source-mismatch --no-encryption --s3-european-buckets --s3-use-new-style --s3-unencrypted-connection --s3-use-multiprocessing --asynchronous-upload --volsize 50 --archive-dir /var/backup/archive --name discourse/db

testconnect = no

[gpg]
password = <your GnuPG password: see Step 8> 

[source]
include = /var/backup/discourse/db

[dest]
incremental = yes
keep = 60
increments = 10
keepincroffulls = 2
desturl = s3+http://<your Amazon S3 bucket name>/<folder path inside the bucket>
awsaccesskeyid = <place here your Access Key Id>
awssecretaccesskey = <place here your Secret Access Key>

This job prepares a backup archive with the databases and sends it to your Amazon S3 bucket.

  • Specify your GnuPG password: see Step 8.
  • Set the value of the desturl option.
  • Set the values of the awsaccesskeyid and awssecretaccesskey options.
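After a few runs you can inspect the resulting chains of full and incremental backups directly with Duplicity. A sketch, assuming the same desturl as above (Duplicity reads the credentials from environment variables):

export AWS_ACCESS_KEY_ID=<place here your Access Key Id>
export AWS_SECRET_ACCESS_KEY=<place here your Secret Access Key>
duplicity collection-status --s3-unencrypted-connection --s3-use-new-style --s3-european-buckets s3+http://<your Amazon S3 bucket name>/<folder path inside the bucket>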

Step 13.

This step has been removed; the numbering is kept as is.

Step 14.

Create the file /etc/backup.d/discourse--files.dup with the following contents:

when = everyday at 03:45
when = everyday at 09:45
when = everyday at 15:45
when = everyday at 21:45

options = --allow-source-mismatch --no-encryption --s3-european-buckets --s3-use-new-style --s3-unencrypted-connection --s3-use-multiprocessing --asynchronous-upload --volsize 2000 --archive-dir /var/backup/archive --name discourse/files

testconnect = no

[gpg]
password = <your GnuPG password: see Step 8> 

[source]
include = <path to Discourse>/shared/<your Discourse container 1>/uploads
include = <path to Discourse>/shared/<your Discourse container 2>/uploads

[dest]
incremental = yes
keep = 90
increments = 30
keepincroffulls = 2
desturl = s3+http://<your Amazon S3 bucket name>/<folder path inside the bucket>
awsaccesskeyid = <place here your Access Key Id>
awssecretaccesskey = <place here your Secret Access Key>

This job prepares a backup archive with the uploaded images and sends it to your Amazon S3 bucket.

  • Set the path to Discourse and specify your container names in the [source] section.
  • Specify your GnuPG password: see Step 8.
  • Set the value of the desturl option.
  • Set the values of the awsaccesskeyid and awssecretaccesskey options.
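To see which upload files actually made it into the latest backup set, Duplicity can list the archive contents. A sketch, again assuming your real desturl and credentials:

export AWS_ACCESS_KEY_ID=<place here your Access Key Id>
export AWS_SECRET_ACCESS_KEY=<place here your Secret Access Key>
duplicity list-current-files --s3-unencrypted-connection --s3-use-new-style --s3-european-buckets s3+http://<your Amazon S3 bucket name>/<folder path inside the bucket>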

Step 15.

Now you can manually test your backup plan:

backupninja --now --debug

Then check that the backup has appeared in your Amazon S3 bucket.
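For example, with the S3cmd configured in Step 8 (substitute your real bucket name and folder path):

s3cmd ls s3://<your Amazon S3 bucket name>/<folder path inside the bucket>/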

Step 16.

To restore a backup, you need to install a few more packages:

aptitude install python-paramiko python-gobject-2

Step 17.

Now you can test restoring the backup:

duplicity --s3-unencrypted-connection --s3-use-new-style --s3-european-buckets s3+http://<your Amazon S3 bucket name>/<folder path inside the bucket> <folder for restored files>
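If you need only a single file, or the state from several hours ago, Duplicity also supports partial and point-in-time restores. A sketch, assuming <path inside the backup> is a file path that exists inside the backup set:

duplicity --s3-unencrypted-connection --s3-use-new-style --s3-european-buckets --file-to-restore <path inside the backup> --time 6h s3+http://<your Amazon S3 bucket name>/<folder path inside the bucket> <folder for restored files>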

Step 18.

That’s all. Backupninja will run your backup plan automatically on schedule.