Running Local CoreOS Workers

We’re working with CoreOS here, both in the cloud and locally. For 24/7 workloads, local servers are far more cost-effective than “big iron” cloud instances like c3.8xlarge or r3.8xlarge. Those can cost around $2/hour, so a local machine used for things like continuous integration can pay for itself in a month or less.
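
As a rough back-of-the-envelope check of that payback claim, here is a small sketch of the arithmetic. The $2/hour rate is the figure quoted above; the 30-day month and the server price are illustrative assumptions, not real quotes.

```shell
#!/bin/sh
# Rough monthly cost of one always-on cloud instance versus a local server.
HOURLY_USD=2          # approximate hourly rate mentioned above
HOURS_PER_DAY=24
DAYS_PER_MONTH=30     # assumption: a 30-day month
SERVER_USD=1200       # assumption: hypothetical price of a local server

# Cost of running the cloud instance around the clock for one month.
monthly_cost() {
  echo $(( HOURLY_USD * HOURS_PER_DAY * DAYS_PER_MONTH ))
}

# Whole months until the local server has paid for itself, rounded up.
payback_months() {
  monthly=$(monthly_cost)
  echo $(( (SERVER_USD + monthly - 1) / monthly ))
}

echo "cloud cost per month: \$$(monthly_cost)"
echo "payback in about $(payback_months) month(s)"
```

This deliberately ignores power, cooling, and your own time, so treat the result as an order-of-magnitude estimate.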

Here’s how to boot a local worker using a CoreOS USB image and have it upgrade automatically.

Make a bootable USB from the stock CoreOS image.

  • Insert your USB stick and make a note of its device name. Here we’re using /dev/sdb.
  • Download the stable CoreOS .iso. Later, in your cloud-config, you can choose from stable, beta, or alpha.
wget http://stable.release.core-os.net/amd64-usr/current/coreos_production_iso_image.iso
  • Your cloud-config, which we’ll create below, will be served over HTTP. Create a syslinux.cfg that points the kernel at its URL:
cat <<EOF > syslinux.cfg
default coreos
prompt 1
timeout 15

label coreos
  menu default
  kernel /coreos/vmlinuz
  append initrd=/coreos/cpio.gz coreos.autologin cloud-config-url=http://your.domain.com/cloud-config.yml
EOF
  • Now, we’ll mount the stock image, format your USB, copy the stock image, and make it bootable with syslinux.
DEVICE=/dev/sdb # double-check this: everything on the device is erased
mkdir stock
sudo mount -oro,loop coreos_production_iso_image.iso ./stock
sudo dd if=/dev/zero of=${DEVICE} bs=512 count=1 # wipe the old partition table
sudo fdisk $DEVICE <<EOF
n
p
1

+2048M
t
6
a
1
w

EOF

sudo mkfs.msdos -F 16 ${DEVICE}1
mkdir vfat
sudo mount ${DEVICE}1 ./vfat/
cd vfat/
sudo cp -r ../stock/* .
sudo cp ../syslinux.cfg syslinux
cd ..
sudo umount vfat/
sudo syslinux --install ${DEVICE}1 --directory syslinux
sudo dd if=./stock/isolinux/mbr.bin of=${DEVICE} # see man syslinux and search for mbr
sync # you can never be too careful
sudo umount stock/ # release the loop-mounted ISO
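
Since the steps above erase whatever $DEVICE points at, it may be worth guarding them with a check along these lines. This is only a sketch: is_removable and require_removable are hypothetical helper names, and it assumes a Linux host where sysfs exposes the removable flag for the disk (some USB hard drives report 0 there, so a failed check is a prompt to look closer, not proof).

```shell
#!/bin/sh
# Succeeds only if the named block device is flagged removable in sysfs.
# SYSFS_ROOT can be overridden so the check can be tried without real hardware.
is_removable() {
  dev_name=$(basename "$1")   # /dev/sdb -> sdb
  flag_file="${SYSFS_ROOT:-/sys}/block/${dev_name}/removable"
  [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Prints a verdict and returns non-zero when the device does not look removable.
require_removable() {
  if is_removable "$1"; then
    echo "$1 looks like a removable disk"
  else
    echo "$1 is not flagged removable; refusing to continue" >&2
    return 1
  fi
}
```

You could then put require_removable "$DEVICE" || exit 1 right after the DEVICE= line, before the first dd.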

Create a cloud-config that auto-updates

Below, we’ve listed the complete cloud-config.yml. Here is a bit of explanation of the individual pieces.

  • You’ll want your local instance to use local storage, so add services that format the local disk as btrfs and label it so Docker can find it. Note that the format service runs on every boot, so anything on /dev/sda is wiped each time:
  - name: format-ephemeral.service
    command: start
    content: |
      [Unit]
      Description=Formats the ephemeral drive
      Before=var-lib-docker.mount
      [Service]
      Type=oneshot
      RemainAfterExit=yes
      ExecStart=/usr/sbin/wipefs -f /dev/sda
      ExecStart=/usr/sbin/mkfs.btrfs -LDOCKER -f /dev/sda
  - name: var-lib-docker.mount
    command: start
    content: |
      [Unit]
      Description=Mount ephemeral to /var/lib/docker
      Before=docker.service
      [Mount]
      What=LABEL=DOCKER
      Where=/var/lib/docker
      Type=btrfs
  • The stock CoreOS image has a file called /usr/.noupdate that interacts with several systemd service files and stops the instance from auto-updating. The culprit looks like this:
ConditionPathExists=!/usr/.noupdate
  • Rewrite those services without the ConditionPathExists:
  - name: locksmithd.service
    command: start
    content: |
      [Unit]
      Description=Cluster reboot manager
      Requires=update-engine.service
      After=update-engine.service
      ConditionVirtualization=!container
      [Service]
      CPUShares=16
      MemoryLimit=32M
      PrivateDevices=true
      EnvironmentFile=-/usr/share/coreos/update.conf
      EnvironmentFile=-/etc/coreos/update.conf
      ExecStart=/usr/lib/locksmith/locksmithd
      Restart=always
      RestartSec=10s
      [Install]
      WantedBy=multi-user.target
  - name: update-engine.service
    command: start
    content: |
      [Unit]
      Description=Update Engine
      ConditionVirtualization=!container
      [Service]
      Type=dbus
      BusName=com.coreos.update1
      ExecStart=/usr/sbin/update_engine -foreground -logtostderr -no_connection_manager
      BlockIOWeight=100
      Restart=always
      RestartSec=30
      [Install]
      WantedBy=default.target
  • Pick your channel:
coreos:
  update:
    reboot-strategy: best-effort
    group: alpha
  • You might want to schedule your reboots for a specific time of day when no one is likely to notice. Add some services to do that:
  - name: update-window.service
    content: |
      [Unit]
      Description=Reboot if an update has been downloaded
      [Service]
      ExecStart=/usr/bin/bash -c 'if update_engine_client -status | grep NEED_REBOOT; then reboot; fi'
  - name: update-window.timer
    command: start
    content: |
      [Unit]
      Description=Reboot if needed at 05:00 daily
      [Timer]
      OnCalendar=*-*-* 05:00:00
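
To make the one-liner in update-window.service easier to reason about, here is the same decision unpacked into a small sketch that reads the status text from standard input. needs_reboot and decide are hypothetical helper names, and the CURRENT_OP= line format is an assumption about what update_engine_client -status prints; only the NEED_REBOOT marker comes from the unit above.

```shell
#!/bin/sh
# Succeeds when the status text on stdin reports a downloaded update
# that is waiting for a reboot.
needs_reboot() {
  grep -q '^CURRENT_OP=.*NEED_REBOOT'
}

# Prints what the timer-triggered service would do for the status on stdin.
decide() {
  if needs_reboot; then
    echo reboot
  else
    echo skip
  fi
}
```

On a running worker, update_engine_client -status | decide should print skip until an update has been downloaded, and systemctl list-timers shows when update-window.timer will next fire.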

Here’s the complete cloud-config. Remember to add some SSH public keys so that you can log in easily. And don’t forget the #cloud-config comment on the first line of the file; CoreOS requires it.

#cloud-config
---
coreos:
  fleet:
    etcd_servers: http://etcdcluster1.grandrounds.com:4001, http://etcdcluster2.grandrounds.com:4001,
      http://etcdcluster3.grandrounds.com:4001
    metadata: role=worker
  update:
    reboot-strategy: best-effort
    group: alpha
  units:
  - name: locksmithd.service
    command: start
    content: |
      [Unit]
      Description=Cluster reboot manager
      Requires=update-engine.service
      After=update-engine.service
      ConditionVirtualization=!container
      [Service]
      CPUShares=16
      MemoryLimit=32M
      PrivateDevices=true
      EnvironmentFile=-/usr/share/coreos/update.conf
      EnvironmentFile=-/etc/coreos/update.conf
      ExecStart=/usr/lib/locksmith/locksmithd
      Restart=always
      RestartSec=10s
      [Install]
      WantedBy=multi-user.target
  - name: update-engine.service
    command: start
    content: |
      [Unit]
      Description=Update Engine
      ConditionVirtualization=!container
      [Service]
      Type=dbus
      BusName=com.coreos.update1
      ExecStart=/usr/sbin/update_engine -foreground -logtostderr -no_connection_manager
      BlockIOWeight=100
      Restart=always
      RestartSec=30
      [Install]
      WantedBy=default.target
  - name: fleet.service
    command: start
  - name: update-window.service
    content: |
      [Unit]
      Description=Reboot if an update has been downloaded
      [Service]
      ExecStart=/usr/bin/bash -c 'if update_engine_client -status | grep NEED_REBOOT; then reboot; fi'
  - name: update-window.timer
    command: start
    content: |
      [Unit]
      Description=Reboot if needed at 05:00 daily
      [Timer]
      OnCalendar=*-*-* 05:00:00
  - name: format-ephemeral.service
    command: start
    content: |
      [Unit]
      Description=Formats the ephemeral drive
      Before=var-lib-docker.mount
      [Service]
      Type=oneshot
      RemainAfterExit=yes
      ExecStart=/usr/sbin/wipefs -f /dev/sda
      ExecStart=/usr/sbin/mkfs.btrfs -LDOCKER -f /dev/sda
  - name: var-lib-docker.mount
    command: start
    content: |
      [Unit]
      Description=Mount ephemeral to /var/lib/docker
      Before=docker.service
      [Mount]
      What=LABEL=DOCKER
      Where=/var/lib/docker
      Type=btrfs
ssh_authorized_keys:
- |
  ssh-rsasomeone@grandrounds.com
- |
  ssh-rsaanother@grandrounds.com
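
Before booting from the USB stick, a quick sanity check of the file you are about to serve can save a trip to the server room. The sketch below only checks the first line and counts the unit entries; has_cloud_config_header and count_units are hypothetical helper names, and the unit count relies on the two-space indentation used in the listing above.

```shell
#!/bin/sh
# Succeeds if the first line of the given file is exactly "#cloud-config".
has_cloud_config_header() {
  [ "$(head -n 1 "$1")" = "#cloud-config" ]
}

# Prints how many "- name:" unit entries the file contains, as a quick
# check that no unit was lost while editing.
count_units() {
  grep -c '^  - name: ' "$1"
}
```

For example, fetch the file the same way the worker will, with curl -fsS http://your.domain.com/cloud-config.yml -o /tmp/cloud-config.yml, then run both helpers on /tmp/cloud-config.yml. For the complete config above, count_units should report 7.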

Done!

Now you’ve got a powerful CoreOS worker locally for a fraction of the 24/7 price at AWS or another cloud provider. Enjoy.