Spin up a Ceph OSD host VM in a hurry. Uses virt-install and ansible.

"Instant" OSD for Ceph. Create a VM and spin it up as an OSD easily.

If you read the Ceph documentation, you're likely to be presented with the legacy "manual" creation of a host and OSD. That still works, but it's really not the way to go.

Speaking of "not the way to go", ideally you don't run Ceph in a Virtual Machine, and your OSDs should be 1TB or more. But for the less-demanding user and for basic experimentation, the procedures provided here are a great start, and everything runs fine on a 4GB VM with 50GB of OSD storage.

This process is designed to work on a KVM host using libvirt, which is what the standard Virtualization Host meta-package provides. I expect that the Debian/Ubuntu equivalent can also be used, although I haven't tested it.

Some advanced features are needed, so a recent CentOS clone, RHEL host, or Fedora platform is expected. At the moment, the oldest CentOS clones under active support are based on RHEL 9, and I used AlmaLinux 9.

In addition to libvirt support, you need LVM installed on your machine, since that's where the OSD data storage will be located. At a minimum, you need an LVM volume group capable of holding the expected amount of disk space for the OSD, plus a few GB more for overhead.
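As a quick sanity check (the volume group name "vg_data" is an assumption here; substitute your own), you can confirm there's enough free space before going further:

```shell
# Show total and free space in the volume group that will back the OSD
# ("vg_data" is an example name; use your own VG)
sudo vgs vg_data -o vg_name,vg_size,vg_free

# For a 50GB OSD, the VG should show roughly 55GB or more free
```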

Also, install the "virt-install" utility. It's not part of the core Virtualization Host meta-package, but it's used by the VM creation script.

Finally, pull the cloud image that you want to serve as your base for the VM. As supplied, I'm using AlmaLinux-9-GenericCloud-9.4-20240507.x86_64.qcow2. This MUST be a cloud image supporting cloud-init in order to properly pre-configure the VM!
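For example, the AlmaLinux GenericCloud image can be fetched from the AlmaLinux mirrors (the URL below is illustrative; verify the current path and checksum on repo.almalinux.org before downloading):

```shell
cd /var/lib/libvirt/images
# Download the base cloud image (URL is an example; check the mirror
# for the current release and the matching CHECKSUM file)
curl -LO https://repo.almalinux.org/almalinux/9/cloud/x86_64/images/AlmaLinux-9-GenericCloud-9.4-20240507.x86_64.qcow2
```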

Stages

First, we prep the host environment. Second, we spin up the VM, using the provided script as a model. Then, we activate the new VM as a Ceph host using Ansible. Finally, we use "ceph orch host add" to add the host to the Ceph inventory. Ceph will see the LVM disk we provided and auto-magically turn it into an OSD, and at that point the new VM is a full-fledged Ceph node.

Host Environment Preparation

This is actually two tasks. One is general site prep. The supplied files are for my own particular environment. The second is done for each new OSD VM we want to create.

Site preparation

The Instant OSD archive has 2 main directories. One holds files that you will put in your VM host's /var/lib/libvirt/images directory. The other goes into your Ansible playbook directory and roles directory.

When you download my Instant OSD bundle, it's tailored for my local environment. So you need to tailor it to yours. The most important step is to unarchive the roles/files/ceph.cfg.tgz file and change the ceph config and keyfiles as needed. Or you could simply archive up a copy of your current /etc/ceph directory from your admin machine and replace mine with yours. If you edit the supplied archive, re-archive it and replace the original, as it will be copied into the OSD.
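The simpler route — archiving your live admin config — can be sketched like this (the destination path is based on the bundle layout described above; adjust to where you unpacked it):

```shell
# On your ceph admin machine: archive the live /etc/ceph directory
tar czf ceph.cfg.tgz -C /etc ceph

# Then copy it over the bundled archive so the role deploys YOUR config
# (destination path assumes the bundle's roles/files layout)
cp ceph.cfg.tgz /path/to/bundle/roles/files/ceph.cfg.tgz
```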

You'll also want to go to your ceph admin host console, grab its SSH public key, and put it in the "vars/main.yml" file of your ceph_osd_host role; Ansible will copy it to the new VM's authorized keys for its root account.

The other important thing to do is to customize the cloud-init.data file on the VM host. As supplied it has an SSH key. You may wish to replace the supplied key with the public key of your ceph admin host.

Now you're ready to start creating VMs.

VM preparation

Network

Ceph likes its hostnames nice and consistent, so I recommend adding your new hostname/IP address/MAC address as needed to your DHCP server, your DNS server (including reverse DNS!), and optionally /etc/hosts on machines that might be interested, especially the Ansible host.
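For the /etc/hosts option, an entry on the Ansible host might look like this (the IP address and domain are placeholders; "mynewosd" is whatever hostname you chose for the VM):

```shell
# Append a hosts entry for the new VM (values are placeholders)
echo '192.168.1.50  mynewosd.example.com  mynewosd' | sudo tee -a /etc/hosts
```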

Also don't forget to add the new VM's hostname to your ansible inventory file (default /etc/ansible/hosts)!

VM Customization

Now the fun begins. The cloud-init.data file contains information common to all VMs you'll create. There should properly be a meta-data file for the VM-specific stuff, but I haven't been able to get that to work, and thus I dynamically create a temporary composite cloud-init for the actual VM creation.

Clone the make_cephxx.sh file to make a custom VM. Edit the variables that define the hostname, MAC address and LVM Logical Volume that will hold the OSD data.
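The per-VM variables might look something like this (the variable names here are illustrative, not necessarily the ones in make_cephxx.sh; check the actual script):

```shell
# Per-VM settings (names are assumptions; match your make_cephxx.sh copy)
VM_NAME=mynewosd                       # hostname for the new VM
VM_MAC=52:54:00:12:34:56               # fixed MAC so DHCP gives a known IP
OSD_LV=/dev/vg_data/mynewosd_osd       # LVM Logical Volume for OSD data
```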

Note that the default MAC address for libvirt is randomly generated, so I manually supply my own to make DHCP assign a predictable IP address.

Use LVM's "lvcreate" command to create the Logical Volume you'll reference here, and edit the script to point to it. As presently configured, the VM will present the LVM logical volume as device "/dev/sda"; the OS lives on /dev/vda. The device ID will vary if you use a different VM bus type than "scsi", but since I don't know the optimal bus type for an OSD, that's what I picked.
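A minimal lvcreate invocation, assuming a 50GB OSD and a volume group named "vg_data" (both are example values):

```shell
# Create a 50GB Logical Volume to hold the OSD data
# (VG and LV names are examples; use your own)
sudo lvcreate -L 50G -n mynewosd_osd vg_data
```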

Once you've customized the script, just execute it. Assuming you've got everything right, it will create a new VM disk based on your cloud image and boot up the VM.

Of course, if you are as error-prone as I am, this may require a few tweaks. Fear not. The process is idempotent, so you can re-run it as often as you like. If you're extra paranoid, you can delete the VM disk and (if it got created) the VM itself.
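If you want to wipe the slate clean before a re-run, something along these lines works (VM name and disk path are examples matching the conventions above):

```shell
# Tear down a half-created VM so the creation script starts fresh
sudo virsh destroy mynewosd 2>/dev/null || true    # stop it if running
sudo virsh undefine mynewosd 2>/dev/null || true   # drop the VM definition
sudo rm -f /var/lib/libvirt/images/mynewosd.qcow2  # delete the VM disk
```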

Once everything is happy, the boot process will run and log to your command-line console. At its end, you'll be presented with a login prompt.

Caution: It's best to wait a minute or two, as some setup may still be running even after the login prompt comes up!

As supplied, the login is userid "almalinux" and password "redhat". These are defined in the cloud-init.data file and if you like, you can change them.

Now you're ready to run the Ansible stage. Use ctrl-] to return to your VM host's original shell (disconnect from the VM console). You don't need it anymore.

Ansible provisioning

The cloud-init process takes care of some of the most essential functions, but after a certain point, it's better to use something more flexible, and Ansible is the easiest option for that. So go to your Ansible console and do the following prep work:

  1. Ensure your hostname is in the Ansible inventory.
  2. Customize the cephxx.yml playbook to point to that host.
  3. Use "ssh-copy-id almalinux@mynewosd" to ensure that Ansible can run the playbook automatically. Remember that for the default account (almalinux), the password is "redhat". "mynewosd" is, of course, the hostname you gave to the new OSD VM.

Use the ansible-playbook command to run the ceph OSD playbook. This playbook provisions using the "ceph_osd_host" role you installed.
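The steps above boil down to something like this (the playbook name comes from the text; run it from your Ansible playbook directory):

```shell
# Push your key so Ansible can log in without a password prompt
ssh-copy-id almalinux@mynewosd    # default password: redhat

# Run the provisioning playbook against the new host
ansible-playbook cephxx.yml
```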

It does the following:

  1. Install the ceph repository into yum.repos.d.
  2. Install the cephadm utility from the ceph repository.
  3. Copy in the /etc/ceph configuration files from your master copy in the role's files directory.
  4. Do an initial run of cephadm to cause it to pull the container(s) needed to run cephadm and the ceph daemons.
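Roughly, the role's tasks correspond to these manual commands. This is only a sketch: the repo package name and release are assumptions (check docs.ceph.com for the install method matching your cluster's release):

```shell
# 1. Ceph repository (package/release name is an assumption)
sudo dnf install -y centos-release-ceph-reef
# 2. The cephadm utility from that repository
sudo dnf install -y cephadm
# 3. Your /etc/ceph config files (archive from the role's files directory)
sudo tar xzf ceph.cfg.tgz -C /etc
# 4. Pre-pull the container image used by cephadm and the ceph daemons
sudo cephadm pull
```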

Note that if you like, you can also install the "ceph-common" package and be able to run ceph commands without needing "cephadm shell" to run them.

Rejoice!

Congratulations! You have just created a new ceph host. You can confirm, if you like, by using ssh to log in to "almalinux@mynewosd", issuing the "sudo cephadm shell" command to enter the cephadm shell, and then typing "ceph orch ps" to list the running daemons in your system.
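As commands (hostname is a placeholder for whatever you named the VM):

```shell
ssh almalinux@mynewosd   # log in to the new VM
sudo cephadm shell       # enter the containerized ceph environment
ceph orch ps             # list the running ceph daemons
```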

Note that if the above fails, the most likely cause will be that your /etc/ceph config files are wrong. You did replace mine with your own in the ansible role file, didn't you?

Going live

The new VM is now a full-fledged ceph node, and you only need to issue the "ceph orch host add" command to add it to the Ceph host list. Ceph will automatically see the unused OSD data device (/dev/sda) and make an OSD out of it.
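From a cephadm shell on your admin host, for example (hostname and IP are placeholders):

```shell
# Register the new host with the orchestrator; Ceph will then discover
# the unused data device and create an OSD on it automatically
ceph orch host add mynewosd 192.168.1.50
```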

As a final note, the new OSD may be created with a low CRUSH weight so it won't be too eager to fill up with data. Use the "ceph osd tree" command to see how it relates to the other OSDs, and use the "ceph osd crush reweight" command to bump it if you need to.
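To check and adjust the weight (the OSD id and weight below are examples; CRUSH weights are conventionally the device size in TiB):

```shell
ceph osd tree                          # note the new OSD's id and CRUSH weight
ceph osd crush reweight osd.5 0.0488   # example: ~50GiB expressed in TiB
```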