Adding cloud-init support to snf-image
How cloud-init works
Cloud-init is a service that runs on VMs and performs VM configuration at an
early boot stage. The configuration instructions are collected from the
configuration files and from the implemented datasources. The configuration
files of cloud-init are /etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d/*.cfg.
The datasources are sources of configuration data that typically come from the
user or from the IaaS service itself (e.g. the instance’s name). Since
different IaaS software uses different mechanisms to provide configuration data
to the instances (attaching configuration disks, using the kernel command line,
using magic IPs, etc.), multiple datasources are implemented.
Why add support to snf-image
If an image is configured to work with cloud-init, the service will overwrite some of the changes that snf-image makes. Since most distributions provide cloud images that are configured to work with cloud-init, supporting it lets us work with the official images. Additionally, a user will be able to further configure a VM created from a cloud-init enabled image by using cloud-init’s user data mechanism.
Enabling cloud-init
To enable snf-image’s cloud-init mode, the CLOUD_INIT image property must be set on the image.
Proposed changes
Datasources
Since we want snf-image to be able to work stand-alone, without having to depend on external services, the software could only make use of 2 datasources:
- NoCloud
- None
All the others expect to fetch data from an external entity (CDROM, Floppy, serial-port, web service) [1].
“None” is the fallback datasource when no other can be selected. It is the
equivalent of an empty datasource in that it provides an empty string as
user data and an empty dictionary as metadata. We could make use of this and let
snf-image-helper inject all its configuration into files under
/etc/cloud/cloud.cfg.d/. We could then allow the user to provide extra
configuration data by using a new OS parameter (e.g. cloud_userdata) whose
content would be injected into a file under /etc/cloud/cloud.cfg.d/. The
only problem with this approach is that the user may only provide YAML user
data, whereas cloud-init user data may also take other forms, such as a shell
script. This is not fully compatible with OpenStack. The Nova API allows the
user to provide user data upon server creation. This user data is made
available through the OpenStack Metadata service to be consumed by cloud-init.
Since we are using snf-image to build a stack that exposes an OpenStack compute
API, introducing an incompatibility by design is highly discouraged.
This leaves us with the “NoCloud” option. This datasource allows us to provide
user data and metadata to the instance in a number of different ways. The
most suitable for snf-image is the /var/lib/cloud/seed/nocloud-net directory.
The datasource expects to find the files meta-data and user-data, and
optionally vendor-data as well as network-config, under this directory.
snf-image could use a dedicated task to create those files and write the
following:
datasource_list: [NoCloud]
to the cloud-init configuration. The various tasks could then perform their
configuration by appending to the meta-data and vendor-data files or by adding
files to the cloud-init configuration directory. By definition, the best place
for snf-image to put the configuration would be the vendor-data file:
Vendordata is data provided by the entity that launches an instance (for example, the cloud provider). This data can be used to customize the image to fit into the particular environment it is being run in [2].
Unfortunately, older versions of cloud-init don’t play well with vendor-data,
which leaves us with meta-data and the files under /etc/cloud/cloud.cfg.d/.
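As a rough illustration, the dedicated task described above could look like the following sketch. It assumes the image's root file system is already mounted under a target directory. The function name initialize_datasource and the file name 99_snf_image.cfg are hypothetical, and the metadata is written as JSON on the assumption that cloud-init's YAML parser accepts JSON for simple data like this.

```python
import json
import os


def initialize_datasource(target, metadata, userdata=""):
    """Create the NoCloud seed files below the mounted image root `target`."""
    seed = os.path.join(target, "var/lib/cloud/seed/nocloud-net")
    os.makedirs(seed, exist_ok=True)

    # Assumption: simple JSON documents are also valid YAML, so no
    # third-party YAML library is needed to serialize the metadata.
    with open(os.path.join(seed, "meta-data"), "w") as f:
        f.write(json.dumps(metadata, indent=2) + "\n")

    # The datasource expects user-data to exist, even when it is empty.
    with open(os.path.join(seed, "user-data"), "w") as f:
        f.write(userdata)

    # Restrict cloud-init to the NoCloud datasource.
    cfg_dir = os.path.join(target, "etc/cloud/cloud.cfg.d")
    os.makedirs(cfg_dir, exist_ok=True)
    cfg = os.path.join(cfg_dir, "99_snf_image.cfg")  # hypothetical file name
    with open(cfg, "w") as f:
        f.write("datasource_list: [NoCloud]\n")
    return seed
```

Later tasks could then extend the meta-data file or drop additional files next to 99_snf_image.cfg.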
Configuration Tasks
snf-image configures the VM by running a number of configuration tasks on the
hard disk of the VM. On images that support cloud-init, instead of directly
altering the needed system files, snf-image-helper could add cloud-init
configuration into the meta-data file as well as files under
/etc/cloud/cloud.cfg.d/.
The proposed changes for the configuration tasks are:
10FixPartitionTable: This task should be prevented from running. Cloud-init
will automatically grow the partition to consume the available space using the
growroot module of cloud-initramfs-tools and the cc_growpart module.
20FilesystemResizeUnmounted: This task should be prevented from running.
Cloud-init will automatically enlarge the file system using the cc_resizefs
module.
30MountImage: No changes are needed for this task.
35InstallUnattend: This can stay intact or be prevented from running. It’s a Windows-specific task and won’t do anything against a Linux image.
40FilesystemResizeMounted: This should be prevented from running.
Cloud-init will use the cc_resizefs module to grow the file system.
50AddSwap: This task parses the content of the SWAP image property, which
comes in two forms (<partition id>:<size> or <disk letter>), and either
creates a new swap partition or formats a whole disk to be used as swap. In
cloud-init, the swap configuration is performed by the cc_mounts module.
This module has a swap config key that can be used like this:
swap:
    filename: <file>
    size: <"auto"/size in bytes>
    maxsize: <size in bytes>
With this key we can add a swap file but not a swap partition. Cloud-init does the partitioning itself, and using swap files is as efficient as using partitions. Therefore, when the SWAP image property is defined in the first form, it’s better to ignore the partition id and use the swap config key to create a swap file of the requested size instead.
When the SWAP property is defined in the second form and a swap disk is
requested, we could make use of the mounts key of the cc_mounts module to put
the appropriate entry in fstab:
mounts:
- [ /dev/ephemeral-1, /mnt, auto, "defaults,noexec" ]
- [ sdc, /opt/data ]
- [ xvdh, /opt/data, "auto", "defaults,nofail", "0", "0" ]
Unfortunately, this module does not support providing non-ephemeral device names (UUID for swap), and using the standard device naming is error-prone. Hence, for swap disks, it’s better if we bypass cloud-init and let snf-image directly format the disk and add the swap entry to fstab.
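The handling of the two SWAP forms could be sketched as follows. The function name swap_config and the /swapfile path are hypothetical, and the sketch assumes the size in the first form is given in MiB, which this document does not specify.

```python
def swap_config(swap_property):
    """Map the SWAP image property to a cloud-config fragment, or None.

    "<partition id>:<size>" -> swap file of the requested size
    "<disk letter>"         -> None (the disk is left to snf-image itself)
    """
    if ":" in swap_property:
        # The partition id is deliberately ignored; see the text above.
        _partition_id, size = swap_property.split(":", 1)
        size_bytes = int(size) * 1024 * 1024  # assumption: size is in MiB
        return {
            "swap": {
                "filename": "/swapfile",  # hypothetical path
                "size": size_bytes,
                "maxsize": size_bytes,
            }
        }
    return None
```

A None result signals the caller to fall back to the existing snf-image code path that formats the whole disk and edits fstab directly.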
50AssignHostname: Adding a new local-hostname key in the metadata file
should be enough to set the hostname. Alternatively, we could make use of the
cc_update_hostname module, which supports the following keys:
preserve_hostname: <true/false>
fqdn: <fqdn>
hostname: <fqdn/hostname>
We could ignore the fqdn key and use the other two.
50ChangeMachineId: We should probably leave this intact. Newer cloud-init versions will automatically change the machine ID to a random value, as this task does. Allowing this task to run does no harm and makes sure the machine ID is always altered, even on images that use older versions of cloud-init.
50ChangePassword: This task will change the password and inject SSH
authorized keys to a list of users defined in the USERS image property. For
changing the password of users, we could make use of the cc_set_passwords
module:
ssh_pwauth: <yes/no/unchanged>
password: password1
chpasswd:
    expire: <true/false>
chpasswd:
    list: |
        user1:password1
        user2:RANDOM
        user3:password3
        user4:R
The password key only works for the default user and is not present in
older versions of cloud-init, which leaves us with the chpasswd key. Using this
key we can define the list of user-password tuples.
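A minimal sketch of how the task could render such a list, assuming the users from the USERS image property have already been split into a Python list. The function name chpasswd_config is hypothetical.

```python
def chpasswd_config(users, password, expire=False):
    """Build a chpasswd fragment that sets `password` for every user."""
    lines = [
        "chpasswd:",
        "  expire: %s" % ("true" if expire else "false"),
        "  list: |",
    ]
    # One "user:password" tuple per line inside the literal block.
    lines += ["    %s:%s" % (user, password) for user in users]
    return "\n".join(lines) + "\n"
```

The result could be written to a file under /etc/cloud/cloud.cfg.d/. A password containing a newline would break the literal block, so the caller would need to reject such input.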
Injecting SSH authorized keys to a list of users is not that easy. We can make
the keys available to cloud-init by either setting the public-keys metadata
key or using the ssh_authorized_keys config key. The cc_ssh module will
inject the keys found there to root, as well as to the default user, if
defined. The preferred way to do it is through the metadata. This way
we leave the ssh_authorized_keys config key for the user to add extra keys.
50ConfigureNetwork: This task may make use of cloud-init’s net module.
Cloud-init supports 3 network configuration formats [3]:
- Network Configuration ENI (Legacy)
- Networking Config Version 1
- Networking Config Version 2
The first one is obsolete. We should probably use version 1 of the network config, which is supported by most cloud-init enabled images. This format looks like this:
network:
    version: 1
    config:
        - type: physical
          name: eth0
          subnets:
              - type: dhcp
snf-image should probably implement a new networking driver, cloud-init.sh,
that uses the same interface as the other networking drivers of snf-image
(freebsd.sh, ifcfg.sh, ifupdown.sh, netbsd.sh, nm.sh, openbsd.sh.in) and
creates the networking configuration in the version 1 format. The only problem
with this is that for IPv6 networks you cannot tell cloud-init whether SLAAC or
SLAAC+DHCPv6 is to be used. This information is available in the Router
Advertisement message and should be determined automatically by the OS upon
receiving one, but many OSes do not respect it. To make sure that networking
works as expected in as many cases as possible, it’s better if the cloud-init
network configuration is only used as a last resort. If snf-image detects that
it knows how to configure the instance’s OS without using cloud-init, it should
do so and instruct cloud-init to omit network configuration by appending the
following to cloud-init’s configuration:
network: {config: disabled}
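The "last resort" rule could be sketched like this. The function names and the NATIVE_DRIVERS set are hypothetical; the driver names are simply taken from the script names listed above, and how snf-image detects the right driver is outside the scope of the sketch.

```python
# Hypothetical identifiers derived from the existing driver scripts.
NATIVE_DRIVERS = frozenset(
    {"freebsd", "ifcfg", "ifupdown", "netbsd", "nm", "openbsd"}
)


def network_config(nics):
    """Return a version 1 network config for (name, subnet_type) tuples."""
    config = [
        {"type": "physical", "name": name, "subnets": [{"type": subnet_type}]}
        for name, subnet_type in nics
    ]
    return {"network": {"version": 1, "config": config}}


def cloud_init_network(driver, nics):
    """Use cloud-init networking only when no native driver applies."""
    if driver in NATIVE_DRIVERS:
        # snf-image configures the OS itself; cloud-init must stay away.
        return {"network": {"config": "disabled"}}
    return network_config(nics)
```

For example, cloud_init_network(None, [("eth0", "dhcp")]) yields the version 1 structure shown earlier, while any known driver yields the disabled configuration.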
50DeleteSSHKeys: This task shall set the ssh_deletekeys configuration
key of the cc_ssh module.
50DisableRemoteDesktopConnections: This can stay intact or be prevented from running. It’s a Windows-specific task and won’t do anything against a Linux image.
50SELinuxAutorelabel: This can stay intact. It only affects images that have SELinux enabled, where it requests a file system relabel at the next boot.
60EnforcePersonality: We could use the write_files key of the
cc_write_files module to create the files that need to be injected into the
image. The problem is that if the user makes use of this key in the user-data,
the original content will be overwritten and our files will be lost. It’s
better if we create a custom script under /var/lib/cloud/scripts/per-once
[4] to inject the files to their destination path at first boot and leave
the write_files key for the user.
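A possible shape for generating such a script is sketched below. The function name personality_script is hypothetical, file ownership and permissions are left out, and the sketch assumes the image provides a base64 utility that accepts the -d flag.

```python
import base64
import shlex


def personality_script(files):
    """Return a shell script that recreates `files` ({path: bytes})."""
    lines = ["#!/bin/sh", "set -e"]
    for path, content in sorted(files.items()):
        # Base64 keeps arbitrary bytes safe inside the shell script.
        encoded = base64.b64encode(content).decode("ascii")
        # Assumption: the image ships a base64 tool supporting "-d".
        lines.append("echo %s | base64 -d > %s" % (encoded, shlex.quote(path)))
    return "\n".join(lines) + "\n"
```

The task would write the returned text to an executable file under /var/lib/cloud/scripts/per-once inside the image.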
70RunCustomTask: This should be kept intact.
80UmountImage: This should be kept intact.
81FilesystemResizeAfterUmount: This should be prevented from running. The
cc_resizefs module will do all the needed resizes.
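The tasks that should not run in cloud-init mode can be summarized in a small lookup. This is only a sketch: the names SKIPPED_IN_CLOUD_INIT_MODE and should_run are hypothetical, and it assumes that the mere presence of the CLOUD_INIT image property enables the mode.

```python
# Tasks whose work is taken over by cloud-init (cc_growpart, cc_resizefs).
SKIPPED_IN_CLOUD_INIT_MODE = frozenset({
    "10FixPartitionTable",
    "20FilesystemResizeUnmounted",
    "40FilesystemResizeMounted",
    "81FilesystemResizeAfterUmount",
})


def should_run(task, image_properties):
    """Decide whether a configuration task runs for this image."""
    # Assumption: any value of CLOUD_INIT turns the mode on.
    cloud_init_mode = "CLOUD_INIT" in image_properties
    return not (cloud_init_mode and task in SKIPPED_IN_CLOUD_INIT_MODE)
```

Tasks that change behavior rather than being skipped, such as 50AddSwap or 50ChangePassword, would still run and branch internally.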
User-defined configuration
The user may provide extra configuration through a new cloud_userdata OS
parameter. The content of this parameter is base64 encoded. If this parameter
is set, the InitializeDatasource configuration task will decode and then
inject its content to the /var/lib/cloud/seed/nocloud-net/user-data file.
Cloud-init will treat this as user data and will handle the rest.
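The decoding step could be as simple as the following sketch. The function name decode_userdata is hypothetical, and rejecting malformed input with an error is an assumption about the desired behavior.

```python
import base64
import binascii


def decode_userdata(cloud_userdata):
    """Decode the base64 `cloud_userdata` OS parameter into raw bytes."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(cloud_userdata, validate=True)
    except binascii.Error as exc:
        raise ValueError("cloud_userdata is not valid base64") from exc
```

The returned bytes are written unmodified to the user-data file, so any user data format that cloud-init understands passes through untouched.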
Design Limitations
The users are statically defined in snf-image. The list of users whose password
will change is defined in the USERS image property, which describes the image
and not the instance. This is a problem because the user may provide user data
that changes the list of users that cloud-init will enable or create. By
providing a new system_info key, the user may even change the name of the
default user. snf-image has no way to determine that the USERS image
property became obsolete because of the provided user data. We could solve this
by introducing a new users OS parameter that overrides the USERS image property
when defined, and leave it to the user to make sure that the users snf-image is
aware of match the provided cloud-init configuration. This means that in order
to use it in Synnefo, we’ll need to add a custom extension to the OpenStack
API. OpenStack does not suffer from this problem because it does not maintain a
list of instance users whose login credentials it modifies. Cloud-init inserts
SSH authorized keys for root and the default user, if available. Additionally,
passwords are not auto-generated by the system. It is left to the user to
decide which instance users get a new password and which passwords to use.
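The proposed override could be sketched as follows. The function name effective_users is hypothetical, and both the image property and the OS parameter are assumed to hold a whitespace-separated list of user names.

```python
def effective_users(image_properties, os_parameters):
    """Return the users snf-image should act on for this instance."""
    users = os_parameters.get("users")
    if users is None:
        # No per-instance override: fall back to the image-wide list.
        users = image_properties.get("USERS", "")
    # Assumption: user names are separated by whitespace.
    return users.split()
```

An explicitly empty users parameter yields an empty list, which would let a user tell snf-image not to touch any account.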
Footnotes
[1] https://cloudinit.readthedocs.io/en/latest/topics/datasources.html#datasource-documentation
[2] https://cloudinit.readthedocs.io/en/latest/topics/vendordata.html
[3] https://cloudinit.readthedocs.io/en/latest/topics/network-config.html
[4] http://cloudinit.readthedocs.io/en/latest/topics/dir_layout.html