Contextualisation
The process of contextualisation configures a virtual machine after it has been installed. Typical examples are creating additional users, installing software packages or calling a configuration management system. These steps can be used to take a reference image and customise it further. Contextualisation runs only once, when the VM is created; further configuration steps should be performed by a configuration management system such as Puppet.
The benefit of using contextualisation rather than managing your own images is that CERN IT maintains an up-to-date image with appropriate bug fixes and security patches, and that the IT support lines for cloud infrastructure and operating system support will investigate issues encountered with the operating system. Private images can be uploaded, but there is no central IT support for VMs based on private images.
The standard Linux images provided by the CERN private cloud include a contextualisation feature based on the open source cloud-init package. Its use is documented below.
Note
The 'User Data' method of passing information to an instance should not be used for sensitive information. For this, the 'Config Drive' option described below or other means, such as secret stores offered by a configuration management system, should be used. See this OpenStack Security Note for more details.
Creating Virtual Machines with User Data
There exist different ways to launch instances with user data.
- From the OpenStack dashboard, in the "Configuration" tab, you can either load a user data file or fill in the "Customisation Script" text box.
- Using the openstack command line tool, the user data can be provided from a file.
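As a rough sketch of the command line route, the following writes a small user data script and passes it with the --user-data option of openstack server create. The file name, the script content and the flavor, image, key and VM names are placeholders to adapt to your project, and the check for the client is only there so that the snippet is harmless on a machine without it.

```shell
# Write a minimal user data script; the quoted 'EOF' stops the local shell
# from expanding anything inside the heredoc.
cat > hello-userdata.sh << 'EOF'
#!/bin/sh
echo "contextualised by cloud-init" > /tmp/hello.txt
EOF

# Placeholder values: replace them with a flavor, image and key from your project.
FLAVOR="m2.small"
IMAGE="CS8 - x86_64 [2021-11-01]"
KEY="my-key"

# Only attempt the launch where the openstack client is installed.
if command -v openstack >/dev/null 2>&1; then
    openstack server create --flavor "$FLAVOR" --image "$IMAGE" \
        --key-name "$KEY" --user-data hello-userdata.sh my-vm
else
    echo "openstack client not found; skipping launch" >&2
fi
```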
Verifying User Data
The user data for a VM can be retrieved from inside the VM using the following command. The "magic" IP address is fixed and should not be changed for different machines.
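A minimal sketch of such a check, assuming the EC2-compatible path /latest/user-data served by the metadata service at 169.254.169.254 (the OpenStack-native path /openstack/latest/user_data should return the same content). The timeout and the fallback message are additions for illustration.

```shell
# Link-local address of the metadata service; it is the same for every VM.
METADATA_URL="http://169.254.169.254/latest/user-data"

# Print the user data, or a hint when nothing comes back
# (for example when run outside a VM, or when no user data was set).
curl --silent --fail --max-time 5 "$METADATA_URL" \
    || echo "no user data returned (service unreachable or no user data set)" >&2
```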
Examples of user data
The upstream cloud-init documentation has some nice examples.
Create all users
$ cat > cern-config-users.txt << EOF
#!/bin/sh
yum install -y cern-config-users
/usr/sbin/cern-config-users --setup-all
EOF
Running a command
The same steps can also be expressed in cloud-config syntax:
$ cat > cern-config-users.txt << EOF
#cloud-config
runcmd:
- [ /usr/bin/yum, "install", -y, "cern-config-users" ]
- [ /usr/sbin/cern-config-users, --setup-all ]
EOF
Install the Folding @ Home client
This example shows how the CERN Cloud team used to deploy the Folding @ Home client on out-of-warranty resources:
$ cat > folding-at-home.sh << EOF
#!/bin/sh
# Install fahclient RPM as found on https://foldingathome.org/start-folding/
yum install https://download.foldingathome.org/releases/v7/public/fahclient/centos-6.7-64bit/release/fahclient-7.6.21-1.x86_64.rpm -y
# Join the CERN team
echo "EXTRA_OPTS=\"--user=CERN_Cloud --team=38188 --gpu=false --smp=true\"" >> /etc/default/fahclient
# Restart the service
/usr/sbin/service FAHClient restart
EOF
Don't grow the underlying partition
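One way to do this, sketched here under the assumption that the image runs cloud-init's standard growpart and resizefs modules, is to switch both off in a cloud-config file. The file name is hypothetical.

```shell
# Write a cloud-config file that disables partition and filesystem growth.
cat > no-growpart.txt << 'EOF'
#cloud-config
# Leave the root partition at its original size.
growpart:
  mode: off
# Do not resize the root filesystem either.
resize_rootfs: false
EOF
```

The file is then passed as user data in the same way as the other examples.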
Using a GZIP file
If the user data is larger than 16384 bytes, it can be provided as a gzip file:
$ cat > userdata4zip.txt <<EOF
#!/bin/sh
wget -O /tmp/large-userdata.txt http://mywebsite.cern.ch/large-userdata.txt
EOF
$ gzip -c userdata4zip.txt > userdata4zip.gz
Then use the openstack server create command to launch a new instance. Once it has booted, the file specified in the user data will have been downloaded to the /tmp directory.
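Before launching, it can be worth checking that the compressed file is intact. The sketch below recreates the two files from the example above, prints their sizes and confirms that decompressing the archive gives back the original script. The launch command is left as a comment, with placeholder flavor, image and key names.

```shell
# Recreate the user data script and its compressed copy.
cat > userdata4zip.txt << 'EOF'
#!/bin/sh
wget -O /tmp/large-userdata.txt http://mywebsite.cern.ch/large-userdata.txt
EOF
gzip -c userdata4zip.txt > userdata4zip.gz

# Byte counts of the plain and compressed files.
wc -c userdata4zip.txt userdata4zip.gz

# Decompress to stdout and compare with the original.
if gzip -dc userdata4zip.gz | cmp -s - userdata4zip.txt; then
    echo "round trip OK"
fi

# Launch with the compressed file (placeholder flavor, image and key):
# openstack server create --flavor m2.small --image "CS8 - x86_64 [2021-11-01]" \
#     --key-name my-key --user-data userdata4zip.gz my-vm
```

For a file as small as this one, compression gains little or nothing; the benefit only appears with user data that approaches the size limit.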
Include Directive
User data can also be provided by reference. Such user data starts with "#include" or "Content-Type: text/x-include-url" and contains a list of URLs, one per line. The user data served at each URL can be plain text, a gzip file or a MIME multi-part script. Here is an example:
#include
# entries are one url per line. comment lines beginning with '#' are allowed
# urls are passed to urllib.urlopen, so the format must be supported there
http://mywebsite.cern.ch/userdata.txt
userdata.txt contains user data which fetches robots.txt and puts it under the /tmp directory.
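A minimal sketch of what such a userdata.txt could look like, written as a shell script. The source URL is a hypothetical placeholder, since any site serving a robots.txt would do.

```shell
# Write the user data that the #include file points to.
cat > userdata.txt << 'EOF'
#!/bin/sh
# Hypothetical URL: fetch robots.txt and store it under /tmp.
wget -O /tmp/robots.txt http://mywebsite.cern.ch/robots.txt
EOF
```

The file then has to be published at the URL listed in the #include file.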
Then use the openstack server create command to launch a new instance. Once it has booted, the file specified in the user data will have been downloaded to the /tmp directory.
Multiple part
cloud-init supports "multiple part" user data, which combines several formats in one file, so you can use a user data script and cloud-config (or other formats recognised by cloud-init) at the same time. cloud-init provides a script, write-mime-multipart, to generate the final user data file. Here is a sample:
$ cat userdata4config
#cloud-config
runcmd:
- [ wget, "http://slashdot.org", -O, /tmp/index.html ]
$ cat userdata4include
#include
# entries are one url per line. comment lines beginning with '#' are allowed
# urls are passed to urllib.urlopen, so the format must be supported there
http://mywebsite.cern.ch/userdata.txt
Then use write-mime-multipart (from the cloud-utils RPM) to generate userdata4multi.txt:
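A sketch of that step, assuming write-mime-multipart is on the PATH. The tool infers the content type of each part from the first line of the file (#include, #! or #cloud-config), and the --output option names the combined file. userdata4script is recreated here from the resulting file shown below.

```shell
# Recreate the three parts.
cat > userdata4include << 'EOF'
#include
http://mywebsite.cern.ch/userdata.txt
EOF
cat > userdata4script << 'EOF'
#! /bin/bash
mkdir -p /tmp/rdu
echo "Hello World!" > helloworld.txt
EOF
cat > userdata4config << 'EOF'
#cloud-config
runcmd:
 - [ wget, "http://slashdot.org", -O, /tmp/index.html ]
EOF

# Combine them into one multi-part file when the tool is available.
if command -v write-mime-multipart >/dev/null 2>&1; then
    write-mime-multipart --output=userdata4multi.txt \
        userdata4include userdata4script userdata4config
else
    echo "write-mime-multipart not found (install cloud-utils)" >&2
fi
```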
The resulting file is:
Content-Type: multipart/mixed; boundary="===============1328186416458086896=="
MIME-Version: 1.0
--===============1328186416458086896==
Content-Type: text/x-include-url; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata4include"
#include
# entries are one url per line. comment lines beginning with '#' are allowed
# urls are passed to urllib.urlopen, so the format must be supported there
http://mywebsite.cern.ch/userdata.txt
--===============1328186416458086896==
Content-Type: text/x-shellscript; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata4script"
#! /bin/bash
mkdir -p /tmp/rdu
echo "Hello World!" > helloworld.txt
--===============1328186416458086896==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata4config"
#cloud-config
runcmd:
- [ wget, "http://slashdot.org", -O, /tmp/index.html ]
--===============1328186416458086896==--
Cloud boothook
A boothook starts with "#cloud-boothook" or "Content-Type: text/cloud-boothook". It is otherwise quite similar to a user data script, but cloud-init provides no mechanism to run it only once.
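A minimal sketch of a boothook that guards itself against repeated runs. It relies on the INSTANCE_ID environment variable that cloud-init passes to boothooks; the file name, the marker file path and the logged message are hypothetical.

```shell
# Write the boothook; the quoted 'EOF' keeps $INSTANCE_ID and $MARKER literal
# so that they are expanded on the VM, not on the local machine.
cat > userdata4boothook << 'EOF'
#cloud-boothook
#!/bin/sh
# Boothooks run on every boot, so keep a marker file per instance
# and exit early when it already exists.
MARKER="/var/lib/boothook-done-$INSTANCE_ID"
[ -e "$MARKER" ] && exit 0
echo "boothook ran for instance $INSTANCE_ID" >> /var/log/boothook.log
touch "$MARKER"
EOF
```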
References
For more information about cloud-init:
- Upstream code and online documentation for the cloud-init package
- Ubuntu documentation
- Windows cloud-init support
- Amazon documentation on contextualisation
Config Drive
cloud-init provides the most common contextualisation framework. There are others, such as amiconfig, which operate using similar techniques.
A different approach is a config drive, which defines a read-only image that is made available to the VM.
The directory structure is similar to that returned from the magic IP supplied by Amazon. The following example illustrates the technique. First, create a virtual machine using a standard image, with the parameter --config-drive=true and some additional data such as a file.
$ openstack server create --flavor m2.small --image "CS8 - x86_64 [2021-11-01]" --key-name my-key --config-drive=true my-vm --file motd=/etc/motd
Once the VM is created, this configuration drive can be mounted:
$ mkdir /mnt/config
$ mount /dev/disk/by-label/config-2 /mnt/config
mount: block device /dev/sr0 is write-protected, mounting read-only
The drive can then be inspected using du.
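For instance, assuming the drive is mounted on /mnt/config as above and follows the usual OpenStack config drive layout (an openstack/latest directory holding meta_data.json and, when user data was supplied, user_data), a quick inspection could look like this:

```shell
# Mount point used in the example above.
CONFIG_MNT="/mnt/config"

if [ -d "$CONFIG_MNT/openstack/latest" ]; then
    # Size of every file and directory on the drive.
    du -a "$CONFIG_MNT"
    # Instance metadata in JSON form.
    cat "$CONFIG_MNT/openstack/latest/meta_data.json"
else
    echo "no config drive mounted on $CONFIG_MNT" >&2
fi
```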