Howto: Using check_mk/WATO via ssh and jumphost

I thought I’d pen this down right here as it took me a bit to really figure this out.

Problem: I have some Hosts I would like to monitor but I cannot access them directly (VPN also isn’t an option in this case), so I would like to monitor them using SSH, some directly, some behind a jumphost.


check_mk usually relies on xinetd listening on port 6556, restricted only by only_from in the xinetd config and maybe iptables. This is fine in a closed, trusted environment, but not over public networks.

We could now either use a VPN or tunnel the port through SSH port forwarding, but I found it more convenient to simply use ssh as a datasource program.

Preparing the nodes

We surely won’t use passwords for this, but rather a key with very limited capabilities.

So, first go to the monitoring site (make sure to do this as the monitoring user) and create a key pair:

OMD[site]:$ ssh-keygen -t ed25519
Generating public/private ed25519 key pair.
Enter file in which to save the key (/omd/sites/site/.ssh/id_ed25519): /omd/sites/site/.ssh/id_check_mk
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /omd/sites/site/.ssh/id_check_mk.
Your public key has been saved in /omd/sites/site/.ssh/
The key fingerprint is:
The key's randomart image is:
+--[ED25519 256]--+
|. |
|o . . . . |
|.o . . . . . |
|oo. . . . .o |
|.o. . S. . +E |
| = . . o + ++ |
|= o o + .o +o+.|
| .+. = . . =o.+..|
| .+*o . o+Bo. |

Don’t use a passphrase. Now append the public key to the authorized_keys file on the monitored nodes. I am using ansible for this:

- name: add mod ssh-key
  authorized_key:
    user: root
    state: present
    key: "{{ lookup('file', '/root/.ssh/') }}"
    key_options: 'command="/usr/bin/check_mk_agent"'

This results in:

command="/usr/bin/check_mk_agent" ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAXXXXXXT9PMJYIN4Mjdc9gsSAAAAAAxAZMhblN4+Mn18wh

Now you can ssh to that node from the monitoring host using the key; you should then get the output of check_mk_agent.
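If you don't use ansible, the restricted entry can also be assembled by hand. The following is only a sketch: restricted_key_line is a made-up helper name, the key in the example is a placeholder, and the extra no-*-forwarding options are my own addition on top of the forced command shown above.

```shell
# Build an authorized_keys entry that pins a public key to one forced command.
# restricted_key_line is a hypothetical helper, not part of check_mk or OpenSSH.
restricted_key_line() {
    pubkey=$1                               # content of the .pub file
    forced=${2:-/usr/bin/check_mk_agent}    # the only command this key may run
    # The forwarding restrictions are an assumption added for illustration.
    printf 'command="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding %s\n' \
        "$forced" "$pubkey"
}

# Example with a placeholder key (not a real one):
restricted_key_line "ssh-ed25519 AAAAEXAMPLEKEY omd@monitoring"
```

The printed line is what you would append to root's authorized_keys on the monitored node.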


Case 1: Directly accessible node

In WATO, go to Host & Service Parameters => Datasource Programs => Individual program call instead of agent access.

We give a simple name and the command line using the <IP> macro.

ssh -tt -o StrictHostKeyChecking=no root@

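Some hosts have issues with tty handling, so I pass -tt, which forces pseudo-terminal allocation even though the monitoring process has no terminal of its own (-T would disable allocation instead). I also had some issues with the host keys, which made it necessary to disable StrictHostKeyChecking.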

I am using this rule for all monitored hosts, therefore I don’t need any conditions set up. If you only want to apply this rule to single hosts, just add their names below, or better, create a tag-based rule as shown in the jumphost example below.

Case 2: Nodes behind jumphost

We use the same approach with a jumphost, just adding that host in between.

We extend our ssh line by -J jump@jumphost. Make sure that you don’t use the root account on the jump host!

If you are using an older or different version of SSH which doesn’t support the -J switch, you need to do it using the old style -W version.
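To make the two variants concrete, here is a small sketch that prints the command line for either style. build_agent_cmd and its arguments are names I made up for illustration; the old-style variant uses ProxyCommand together with -W, which is my reading of "the old style" and should work on OpenSSH versions that predate -J.

```shell
# Print the datasource command line for a host, optionally via a jumphost.
# build_agent_cmd and its arguments are hypothetical names for illustration.
build_agent_cmd() {
    target=$1          # address of the monitored node (the <IP> macro in WATO)
    jump=$2            # e.g. jump@jumphost, empty for direct access
    style=${3:-J}      # J = use -J, W = old-style ProxyCommand with -W
    base="ssh -tt -o StrictHostKeyChecking=no"
    if [ -z "$jump" ]; then
        printf '%s root@%s\n' "$base" "$target"
    elif [ "$style" = "J" ]; then
        printf '%s -J %s root@%s\n' "$base" "$jump" "$target"
    else
        # %h and %p are expanded by ssh itself to the target host and port
        printf '%s -o ProxyCommand="ssh -W %%h:%%p %s" root@%s\n' "$base" "$jump" "$target"
    fi
}

build_agent_cmd '<IP>' jump@jumphost J
build_agent_cmd '<IP>' jump@jumphost W
```

Either printed line can be pasted into the command line field of the datasource rule.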

As I only want specific nodes to be monitored using the jumphost, I created another choice in the networking host tags, so all Hosts tagged with “No ping possible” are monitored using the jumphost.

Now another issue arises: ping fails, which means that all services are monitored properly while the host itself has the status “down”. So we need to change that too, with another rule that uses the same tag as above.

Go to Host & Service Parameters => Monitoring Configuration => Host Check Command and add a new rule.

Switch the command from PING to Use the status of the Check_MK Agent and choose the correct host tag.




Encrypted Backup-Server with Debian and Dirvish – LVM in LUKS

If you haven’t heard of Dirvish, it’s about time. It is a very neat, rsync and hardlink based backup solution, just like a pro version of Apple’s TimeMachine.

What I want

My setup is quite specific for my use case: Having used dirvish at work for a while I was still stuck with backup-manager on my own servers, which is quite neat, but it leaves you with differential tarballs of your system which makes it an annoying task finding a specific version of one file somewhere in the past.

Dirvish, running on the backup server, logs into the target system using ssh and syncs the whole filesystem (you can define excludes, of course) to a folder with a timestamp in its name on the backup server. The next time it runs (simple cronjob) it will create a new timestamped folder, sync only changed files and hardlink everything else.
So, in the end we get a kind of snapshot of each time dirvish ran, where we can simply copy files back or even roll back the complete system.

When setting up my new dirvish backup server (replacing the ancient tarball and ftp solution) I wanted it to save data incrementally and store it on an encrypted device. For ease of use I simply enter the encryption password manually after booting. Therefore only the dirvish partition (“bank”) is on an encrypted device; the rest of the machine is unencrypted.
Caveat: The ssh key for accessing the other machines is on the unencrypted partition, this needs some rework to be done …

System: LVM on LUKS on VD in VM on LVM. For sure!

My Backup server runs in a VM on one of my Xen systems. Actually, the whole dirvish-bank (i.e. the location of all backups) is also synced to another machine at home for real backup.

The VM has two virtual disks: the system disk, xvda, and (for historical reasons) xvdc as storage for dirvish. This disk is completely encrypted using LUKS. I won’t cover encryption in detail here.

Inside this encrypted container I will create an LVM volume group, and backups will be stored on volumes inside this group. The reason for these nested LVMs (the virtual disks already live in an LVM on the host) is that I will be able to create different “tiers” of backups if necessary: if space on the underlying host gets tight, I can move lower priority hosts to a second volume and don’t risk critical machines not being backed up.

So we basically have:

    LV_dirvish_system       ⇒ xvda
    LV_dirvish_bank         ⇒ xvdc
        LUKSVolume          ⇒ lukslvm
                LV_dirvish  ⇒ /home/dirvish


I did this on a freshly installed Debian Jessie, though it will quite surely work similarly on Ubuntu, SuSE or RedHatish systems. So first install cryptsetup:

apt-get install cryptsetup

Next, let’s encrypt our whole data disk:

cryptsetup -c aes-xts-plain64 -s 512 -h sha512 luksFormat /dev/xvdc

You need to answer some questions, most importantly for a password. Use a good one, but remember that you will have to type it each time you want to mount the volume. We had better create a backup of the LUKS header and set up a second, even more complex password. Store both in a very safe place:

cryptsetup luksHeaderBackup /dev/xvdc --header-backup-file luksDirvish
cryptsetup luksAddKey --key-slot 1 /dev/xvdc

We now have a completely encrypted disk and will make it available through device mapper under /dev/mapper/lukslvm:

cryptsetup luksOpen /dev/xvdc lukslvm

Let’s create an LVM setup within our container; the virtual disk has 210GB:

pvcreate /dev/mapper/lukslvm
vgcreate vg_dirvish /dev/mapper/lukslvm
lvcreate -L 200G -n dirvishbank vg_dirvish

Finally we create our mount point and an entry in /etc/fstab.

mkdir /home/dirvish
/dev/mapper/vg_dirvish-dirvishbank /home/dirvish ext4 noauto,noatime,nodiratime 0 1

We use noauto to prevent the system from stalling on boot because the encrypted container isn’t open yet. We also use noatime and nodiratime to speed up the rsync comparison.
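Since the volume is mounted by hand after each boot, the three steps can be wrapped in a small helper. This is a sketch only: run and unlock_bank are names I made up, the DRY_RUN switch is just there so you can preview the commands, and the explicit vgchange call is an assumption in case the volume group is not activated automatically.

```shell
# Open the LUKS container, activate the volume group and mount the bank.
# With DRY_RUN=1 the commands are only printed, which allows a safe preview.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

unlock_bank() {
    # asks for the passphrase interactively
    run cryptsetup luksOpen /dev/xvdc lukslvm || return 1
    # make the logical volume inside the container visible
    run vgchange -ay vg_dirvish || return 1
    # relies on the noauto entry in /etc/fstab
    run mount /home/dirvish
}

# Preview only; call unlock_bank without DRY_RUN as root to really do it.
DRY_RUN=1 unlock_bank
```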

Preparing dirvish

Now let’s get the backup running.

First you need to create a key pair for ssh login to the target machines. I won’t cover this here because there are trillions of pages about this. I called the keys id_rsa.dirvish, put them in /root/.ssh/ and copied the public key to all target machines (ansible is a very good friend!). Do yourself a favour and test if the login works.

Now install dirvish:

apt-get install dirvish

Now go to /etc/dirvish and edit the master.conf file.

expire-default: +5 days

We here define the bank (like “place where you store precious things”), which of course is our volume in the LUKS vault.

We also have some common excludes and a default expiry.

Debian provides a cronjob with some automatism, we won’t use that but rather create our own cronjob later in this process.

Backing up

Now let’s create our first backup. We will create a backup of the dirvish machine itself. Won’t help against failing disks, but against failing admins breaking stuff.

For each target machine we will create a folder hierarchy, the so called vault, which looks like this:


HOST will of course be the hostname of the target machine. Calling it *-root is just convention for backups of the root-tree. You could call it e.g. HOST-mailspool if you only backup the mail dir of a host.

client: dirvish
tree: /
xdev: 1
index: gzip
log: gzip
image-default: %Y%m%d_%H%M

Which means:

  • client: Hostname (needs to be resolvable!) of the target machine.
  • tree: Folder to backup.
  • xdev: 1 means that it will stay within the filesystem. Take care if you have /var or /home in different volumes (which I tend to have).
  • index: Type of compression of the index file (file list)
  • log: Type of compression of the log file saved within the folder.
  • image-default: Timestamp of the backup folder
  • exclude: Folders to be excluded.

There are a lot more config items, some of them which I use quite often:

  • pre-server: Path to a script on the dirvish server to be run before backup starts.
  • post-server: Path to a script on the dirvish server to be run after the backup has finished.
  • pre-client: Path to a script on the target machine to be run before backup starts. This is helpful for dumping sql databases before backing up.
  • post-client: Path to a script on the target machine to be run after the backup has finished.
  • speed-limit: Maximum transfer speed in Mbit/s
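As a sketch of how such a vault could be created: dirvish expects the configuration in a dirvish/ subfolder of the vault, as default.conf. The helper name create_vault and its arguments are made up for this example; the config values are the ones discussed above.

```shell
# Create the directory skeleton of a dirvish vault and its default.conf.
# create_vault is a hypothetical helper; bank path and vault name are arguments.
create_vault() {
    bank=$1      # e.g. /home/dirvish
    vault=$2     # e.g. localhost-root
    client=$3    # resolvable hostname of the target machine
    mkdir -p "$bank/$vault/dirvish" || return 1
    cat > "$bank/$vault/dirvish/default.conf" <<EOF
client: $client
tree: /
xdev: 1
index: gzip
log: gzip
image-default: %Y%m%d_%H%M
EOF
}

# Try it in a scratch directory first:
scratch=$(mktemp -d)
create_vault "$scratch" localhost-root dirvish
cat "$scratch/localhost-root/dirvish/default.conf"
```

Once the output looks right, point the first argument at the real bank instead of the scratch directory.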

When we are finished creating our vault, we need to initialise it. This creates the first complete sync.

dirvish --vault localhost-root --init

Depending on the size of the machine this can take quite a while. When done our vault will look like this:


tree now contains the folder hierarchy synced from the target, while the other files contain meta information.

The last step is creating a cronjob now. Simply add a line to /etc/crontab for each vault. Make sure to use different running times:

1 0 * * * root /usr/sbin/dirvish --vault=localhost-root
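With more than a handful of vaults, typing the lines by hand gets tedious. The following sketch prints one crontab line per vault and spaces them out; stagger_cron, the 30 minute gap and the extra vault names are assumptions for illustration, not something dirvish provides.

```shell
# Print one /etc/crontab line per vault, staggering the start times.
# stagger_cron and the 30 minute gap are assumptions for illustration.
stagger_cron() {
    minute=1
    hour=0
    for vault in "$@"; do
        printf '%d %d * * * root /usr/sbin/dirvish --vault=%s\n' "$minute" "$hour" "$vault"
        minute=$((minute + 30))
        if [ "$minute" -ge 60 ]; then
            minute=$((minute - 60))
            hour=$(( (hour + 1) % 24 ))
        fi
    done
}

stagger_cron localhost-root web-root mail-root
```

Review the printed lines before appending them to /etc/crontab; a long-running backup may still overlap with the next one.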

Ok, there’s one more step: Create any kind of monitoring facility, e.g. use post-server to send the summary and the log by mail, or parse these files and react to the results, or use your existing monitoring solution…
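A very small check along those lines could look like this. It is a sketch under one assumption: that the summary file inside each image folder contains a line starting with "Status: success" after a good run. check_summary is a made-up name and the return codes follow the usual Nagios convention.

```shell
# Report whether a dirvish image finished successfully by reading its summary.
# Assumes the summary file contains a line starting with "Status: success".
check_summary() {
    summary=$1
    if [ ! -r "$summary" ]; then
        echo "UNKNOWN: $summary not readable"
        return 3
    fi
    if grep -q '^Status: success' "$summary"; then
        echo "OK: backup succeeded"
        return 0
    fi
    echo "CRITICAL: $(grep '^Status:' "$summary" || echo 'no status line found')"
    return 2
}

# Demo against a throwaway file instead of a real image folder:
demo=$(mktemp)
printf 'client: dirvish\nStatus: success\n' > "$demo"
check_summary "$demo"
```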


Bridged Xen on Debian Wheezy on a Hetzner Server

Xen (not XenServer, btw!) seems to have taken a back seat recently, with RedHat/CentOS/Fedora concentrating on KVM and Debian silently neglecting it.

This is reflected in the documentation: there is a lot of outdated material around, especially about bridged setups. The same goes for packages, at least in Debian Wheezy (NB: I also tried testing, with the same results despite somewhat newer packages).

My aim was a virtual host which is directly connected to the internet without any external firewall, running different virtual machines which ARE thoroughly firewalled. In order to achieve this, I am running the quite decent Sophos UTM (formerly Astaro) as a VM; this is the only virtual machine with direct access to the external network interface. Its other interface, just like all other VMs, is connected to an internal bridge without any link to the rest of the world. This is why routing isn’t an option.

This article focusses on the Xen server and the bridging setup, maybe I will write another one later about Sophos UTM etc.



I am running this setup on some servers at Hetzner, though this should be working at most other hosters (some tend to drop the switch connection when they sense a pseudo ARP-spoofing, take care!), I am in no way affiliated to Hetzner.

My setup needs a secondary IP address (the main IP address is used for management of the host). I am assuming the following setup:

External Host-IP:

Secondary IP, used on UTM: At least at Hetzner, this IP address needs to have its own MAC address assigned; this can be done in their Robot tool.

Setting up the host

I want to have my host running directly on the (Software-)RAID, I personally don’t really like running the OS on LVM. But I also want to have my VMs live in an LVM realm in order to easily take snapshots, clone etc.

This means that Hetzner’s default setup isn’t very helpful. But they have an answer file based installation using the rescue system. Therefore: boot into the rescue system and run installimage.

Note: Preserve the temporary password for the rescue system, you will need it for the freshly installed system!

This lets you define your custom install file and then installs everything within a few minutes. I chose Debian Wheezy Minimal. The only two settings I changed were the hostname (I am using dome in this example) and the partition setup:


I chose 50GB for my root filesystem and 12GB swap.

After saving the file and starting the installation, I had to wait for about five minutes and was presented with a brand new Debian system.


After the first boot I changed the root password and added my own SSH key.

Note: This document doesn’t cover hardening your server, which you really should do!

First thing to do is updating all package sources:

root@dom2 ~ # apt-get update

I tend to install emacs23-nox as soon as possible, YMMV.

It is quite handy to add your domain, if you are using one, to /etc/resolv.conf and to /etc/hosts.

Next is changing the network setup, so edit /etc/network/interfaces :

# Loopback device:
auto lo
iface lo inet loopback

# device: eth0 => transfigured to a bridge!
auto  virbr0
iface virbr0 inet static
  bridge_ports eth0
  bridge_stp off
  bridge_fd 1
  bridge_hello 2
  bridge_maxage 12
  allow-hotplug virbr0

auto virbr1
iface virbr1 inet static
  bridge_ports none
  bridge_fd 1
  bridge_stp off
  bridge_hello 1
  down ifconfig virbr1 down
  allow-hotplug virbr1

So we transformed our (only) network card eth0 into a bridge called virbr0 and added a secondary bridge, virbr1.
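After bringing the interfaces up, it is worth checking that both bridges really exist. The sketch below looks for the bridge subfolder that the kernel exposes for bridge devices in sysfs. check_bridges is a made-up helper, and the sysfs root is a parameter only so the check can be tried against a fake tree first.

```shell
# Check that each named interface exists and is a bridge.
# sys_root is a parameter only so the check can be tried against a fake tree.
check_bridges() {
    sys_root=$1; shift
    rc=0
    for br in "$@"; do
        if [ -d "$sys_root/class/net/$br/bridge" ]; then
            echo "$br: ok"
        else
            echo "$br: missing or not a bridge"
            rc=1
        fi
    done
    return $rc
}

# Demo on a fake tree; on the real host use: check_bridges /sys virbr0 virbr1
fake=$(mktemp -d)
mkdir -p "$fake/class/net/virbr0/bridge" "$fake/class/net/virbr1/bridge"
check_bridges "$fake" virbr0 virbr1
```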

Set up Xen

First install the xen system (4.1 on Wheezy) and the xen-tools which are quite helpful setting up VMs.

root@dom2 ~ # apt-get install xen-system-amd64
root@dom2 ~ # apt-get install xen-tools

This will install xen and all the necessary tools.

In order to boot into a xen enabled hypervisor, we need to adapt GRUB:

root@dom2 ~ # dpkg-divert --divert /etc/grub.d/08_linux_xen --rename /etc/grub.d/20_linux_xen

Before we reboot we also adapt the boot command line in /etc/default/grub :

GRUB_CMDLINE_LINUX_DEFAULT="nomodeset dom0_max_vcpus=1 dom0_vcpus_pin dom0_mem=2048M nopat cgroup_enable=memory swapaccount=1"

This basically limits resources on Dom0.

Now update grub and reboot:

root@dom2 ~ # update-grub
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-4-amd64
Found initrd image: /boot/initrd.img-3.2.0-4-amd64
Found linux image: /boot/vmlinuz-3.2.0-4-amd64
Found initrd image: /boot/initrd.img-3.2.0-4-amd64
root@dom2 ~ # reboot

After reboot we can check if Xen is up and running.

root@dom2 ~ # xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  2047     1     r-----      7.3

Looks fine.

Now we need to set up the network, which is quite straightforward:

Edit the file /etc/xen/xend-config.sxp  and comment out everything about networking, routing and vif except this line:

(vif-script vif-bridge)

You may also fine-tune your Xen setup by changing the following lines:

# enable-dom0-ballooning below) and for xm mem-set when applied to dom0.
(dom0-min-mem 196)

# Whether to enable auto-ballooning of dom0 to allow domUs to be created.
# If enable-dom0-ballooning = no, dom0 will never balloon out.
(enable-dom0-ballooning no)

# xen kernel to reserve the memory for 32-bit paravirtual domains, default
# is "0" (0GB).
#(total_available_memory 0)

# In SMP system, dom0 will use dom0-cpus # of CPUS
# If dom0-cpus = 0, dom0 will take all cpus available
(dom0-cpus 1)

The first thing we changed tells Xen to run a script called vif-bridge, located in /etc/xen/scripts/, as soon as a virtual machine is being created. The script basically checks if the bridge exists and connects the VM’s virtual network card to the bridge.

Now we need to adapt this file to our naming convention, so let’s replace the occurrences of xenbr with virbr in the file /etc/xen/scripts/vif-bridge :

  # This lets old config files work without modification
  if [ ! -e "/sys/class/net/$bridge" ] && [ -z "${bridge##virbr*}" ]
     if [ -e "/sys/class/net/eth${bridge#virbr1}/bridge" ]
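If you prefer not to edit the script by hand, a blanket substitution does the renaming. This is a sketch: rename_bridge_prefix is a made-up helper, and it replaces every xenbr with virbr without any further hand-tuning, so compare the result with the excerpt above before relying on it.

```shell
# Replace every occurrence of xenbr with virbr, keeping a backup copy.
# rename_bridge_prefix is a hypothetical helper; pass the script path.
rename_bridge_prefix() {
    file=$1
    cp -p "$file" "$file.orig" || return 1
    # Writing back into the existing file keeps its permissions intact.
    sed 's/xenbr/virbr/g' "$file.orig" > "$file"
}

# Demo on a throwaway file instead of /etc/xen/scripts/vif-bridge:
demo=$(mktemp)
printf '%s\n' 'if [ -z "${bridge##xenbr*}" ]' > "$demo"
rename_bridge_prefix "$demo"
cat "$demo"
```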

Now restart xend (for some reason the service is called xen on Debian).

root@dom2 ~ # service xen restart
[ ok ] Restarting Xen daemons: xend xend xenconsoled.

Getting the first VM up and running

Using xen-create-image  from the xen-tools makes it a piece of cake installing our first VM:

root@dom2 ~ # xen-create-image --hostname=test --ip= --netmask --gateway= --bridge=virbr1 --lvm=vg0 --mirror= --memory 512m --swap 1000M --dist=wheezy


  You appear to have a missing vif-script, or network-script, in the
 Xen configuration file /etc/xen/xend-config.sxp.

  Please fix this and restart Xend, or your guests will not be able
 to use any networking!

General Information
Hostname       :  test
Distribution   :  wheezy
Mirror         :
Partitions     :  swap            1000M (swap)
                  /               4Gb   (ext3)
Image type     :  full
Memory size    :  512m
Kernel path    :  /boot/vmlinuz-3.2.0-4-amd64
Initrd path    :  /boot/initrd.img-3.2.0-4-amd64

Networking Information
IP Address 1   : [MAC: 00:16:3E:88:89:EC]
Netmask        :
Gateway        :

Creating swap on /dev/vg0/test-swap

Creating ext3 filesystem on /dev/vg0/test-disk
Installation method: debootstrap

Running hooks

No role scripts were specified.  Skipping

Creating Xen configuration file

No role scripts were specified.  Skipping
Setting up root password
Generating a password for the new guest.
All done
Logfile produced at:

Installation Summary
Hostname        :  test
Distribution    :  wheezy
IP-Address(es)  :
RSA Fingerprint :  76:ab:f1:50:4e:71:49:7e:06:13:87:5c:8a:1d:62:82
Root Password   :  XXXXXXXXXX

You can safely ignore the warning about vif-bridge.

Now there’s a little bug in the xen-tools:

root@dom2 ~ # xm create /etc/xen/test.cfg
Using config file "/etc/xen/test.cfg".
Error: invalid literal for int() with base 10: '512m'

So edit /etc/xen/test.cfg and remove the m from 512m:

memory      = '512m'

memory      = '512'
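When creating several guests, the same fix can be scripted. A minimal sketch, assuming the value is always meant in megabytes and written with a trailing m or M; fix_memory_unit is a made-up helper name.

```shell
# Strip a trailing m/M from the memory setting of a domU config file.
# fix_memory_unit is a hypothetical helper; a backup is kept as <file>.bak.
fix_memory_unit() {
    cfg=$1
    cp -p "$cfg" "$cfg.bak" || return 1
    sed "s/^\(memory[[:space:]]*=[[:space:]]*'[0-9][0-9]*\)[mM]'/\1'/" "$cfg.bak" > "$cfg"
}

# Demo on a throwaway file instead of /etc/xen/test.cfg:
demo=$(mktemp)
printf "memory      = '512m'\n" > "$demo"
fix_memory_unit "$demo"
cat "$demo"
```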

Now let’s run it:

root@dom2 ~ # xm create /etc/xen/test.cfg
Using config file "/etc/xen/test.cfg".
Started domain test (id=6)

We can now connect a console to the VM and see what’s going on (you can also create it with the -c parameter above …).

root@dom2 ~ # xm console test

Debian GNU/Linux 7 test hvc0

test login: root
Linux test 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

Hint: CTRL + 5 (or CTRL + ]) gets you out of the console again.


HowTo: Hetzner Backup-Server Automount

The problem: Hetzner offers 100GB of backup space with a root server, but it can only be accessed via FTP, SFTP or CIFS; direct rsync, for example, is not possible.

Originally I simply kept the backup space mounted permanently, but there were recurring problems with hanging handles; presumably the backup servers go into some power-saving mode or similar. So I switched to AutoFS, and that works nicely here (which I cannot say for larger environments, btw).

So here is a short guide on how to use the backup space very easily via CIFS and AutoFS.

Warning: All traffic between the root server and the backup server is unencrypted. Just wanted to get that off my chest.


A backup space must have been created in the Robot (Hetzner’s config tool); you then get a server name, a user name and a password.

The following data is used here:

Username: u12345
Password: secret

You also need the autofs package, which is available in the repositories of pretty much every distro.

I mount my backup server under /mnt/backup-server/.


First we create a new map file; we call it auto.backup and put it in /etc. It has the following content:

backup-server -fstype=cifs,iocharset=utf8,rw,credentials=/etc/backup-credentials.txt,uid=0,gid=0,file_mode=0660,dir_mode=0770 ://

In order, this line means:

  • Mount the share under backup-server/ relative to the parent mount point (see below)
  • The mount type is CIFS, i.e. SMB
  • The charset is UTF-8 (important if you use umlauts etc.)
  • rw, i.e. read and write
  • The credentials are stored in the file /etc/backup-credentials.txt. Caution: unencrypted here as well.
  • New files are created with mode 0660, new folders with 0770.
  • At the end comes the path on the server that is to be mounted. :// looks odd, but it is correct.

The /etc/backup-credentials.txt looks like this:
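As a sketch, such a file can be written like this. The username= and password= keys are the format mount.cifs reads from a credentials file; the helper name write_cifs_credentials is made up, and the values are the example data from above. Restricting the file to root is my own addition, since the password sits there in plain text.

```shell
# Write a CIFS credentials file that only its owner can read.
# write_cifs_credentials is a hypothetical helper for illustration.
write_cifs_credentials() {
    file=$1; user=$2; pass=$3
    ( umask 077 && printf 'username=%s\npassword=%s\n' "$user" "$pass" > "$file" ) || return 1
    chmod 600 "$file"
}

# Demo on a throwaway file instead of /etc/backup-credentials.txt:
demo=$(mktemp)
write_cifs_credentials "$demo" u12345 secret
cat "$demo"
```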


Finally we add the following line to /etc/auto.master :

/mnt	/etc/auto.backup --ghost

--ghost tells autofs not to delete the mount point when unmounting.

Now just start the automounter with service autofs start .

As soon as you access the folder /mnt/backup-server , it gets mounted. Nice.



checkrestart for Nagios and Check_mk

In case anyone is interested: I wrote a small wrapper for Nagios and Check_mk that makes it very easy to integrate checkrestart into your monitoring…


Update: Yay, now also on Nagios Exchange.
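To give an idea of what such a wrapper has to do, here is a minimal sketch. It is not the wrapper mentioned above, just an illustration with a made-up function name, and it assumes that checkrestart prints a summary line of the form "Found N processes using old versions of upgraded files".

```shell
# Turn checkrestart's summary line into a Nagios-style result.
# Assumes a line of the form "Found N processes using old versions ...".
checkrestart_status() {
    output=$1
    count=$(printf '%s\n' "$output" | sed -n 's/^Found \([0-9][0-9]*\) processes.*/\1/p' | head -n 1)
    if [ -z "$count" ]; then
        echo "UNKNOWN - could not parse checkrestart output"
        return 3
    elif [ "$count" -eq 0 ]; then
        echo "OK - no services need a restart"
        return 0
    fi
    echo "WARNING - $count processes still use outdated files"
    return 1
}

# Demo with a canned line; on a real host pass "$(checkrestart)" as root.
checkrestart_status "Found 0 processes using old versions of upgraded files"
```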





In case anyone doesn’t know them yet (I was one of those until a few days ago): the Debian goodies are highly recommended.

apt-get install debian-goodies

I am particularly taken with checkrestart …

  • checkrestart: Checks which services are still using old versions of packages that have since been upgraded.
  • dpigs: Shows which installed packages take up the most space.
  • which-pkg-broke <pkg>: Shows (fairly reliably) which package broke another one.
  • dhomepage <pkg>: Shows the link to the package’s website, if there is one.
  • debget <pkg>: Downloads the .deb of a package.