# "Instant" OSD for Ceph. Create a VM and spin it up as an OSD easily.
If you read the Ceph documentation, you're likely to be presented with
the legacy "manual" creation of a host and OSD. That still works,
but it's really not the way to go.
Speaking of "not the way to go", ideally you don't run Ceph in a Virtual
Machine, and your OSD should be 1TB or more. But for the less-demanding user
and for basic experimentation, the procedures provided here are a great
start, and they run fine on a 4GB VM with 50GB of OSD storage.
This process is designed to work on a KVM host using libvirt, which is what
the standard Virtualization Host meta-package provides. I expect that the
Debian/Ubuntu equivalent can also be used, although I haven't tested it.
Some advanced features are needed, so a recent CentOS clone, RHEL host, or Fedora platform is expected. At the moment, the oldest CentOS clones under active support are based on RHEL 9, and I used AlmaLinux 9.
In addition to libvirt support, you need LVM installed on your machine, since that's where the OSD data storage will be located. At a minimum, you need an LVM volume group capable of holding the expected amount of disk space for the OSD, plus a few GB more for overhead.
Also, install the "virt-install" utility; it's not part of the core
Virtualization Host meta-package, but it's used by the VM creation script.
Finally, pull the cloud image that you want to serve as your base for the VM. As supplied, I'm using AlmaLinux-9-GenericCloud-9.4-20240507.x86_64.qcow2. This **MUST** be a cloud image supporting cloud-init in order to properly pre-configure the VM!
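Fetching the image looks something like the following. The URL is the usual AlmaLinux cloud-image location, but verify it against the current release before relying on it; the release/date in the filename will change over time.

```shell
# Fetch the base cloud image into libvirt's image directory.
cd /var/lib/libvirt/images
curl -LO https://repo.almalinux.org/almalinux/9/cloud/x86_64/images/AlmaLinux-9-GenericCloud-9.4-20240507.x86_64.qcow2
# Sanity-check that the download really is a qcow2 image:
qemu-img info AlmaLinux-9-GenericCloud-9.4-20240507.x86_64.qcow2
```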
## Stages
1. Prep the host environment.
2. Spin up the VM, using the provided script as a model.
3. Activate the new VM as a Ceph host using Ansible.
4. Use "ceph orch host add" to add the host to the Ceph inventory. Ceph will see the LVM disk we provided and auto-magically turn it into an OSD; at that point, the new VM is a full-fledged Ceph node.
# Host Environment Preparation
This is actually two tasks. One is general site prep. The supplied files are for my own particular environment. The second is done for each new OSD VM we want to create.
## Site preparation
The Instant OSD archive has 2 main directories. One holds files that you will put in your VM host's /var/lib/libvirt/images directory. The other goes into your Ansible playbook and roles directories.
When you download my Instant OSD bundle, it's tailored for my local environment,
so you need to tailor it to yours. The most important step is to unarchive the ``roles/files/ceph.cfg.tgz`` file and change the ceph config and key files as needed. Or you could simply archive up a copy of your current /etc/ceph directory from your admin machine and replace mine with yours. If you edit the supplied archive, re-archive it and replace the original, as it will be copied into the OSD.
You'll also want to go to your Ceph admin host console, obtain the cluster's public SSH key (for cephadm clusters, "ceph cephadm get-pub-key" prints it), and put it into the "vars/main.yml" file of your ceph_osd_host role; Ansible will copy it into the new VM's authorized keys for its root account.
The other important thing to do is to customize the cloud-init.data file on the VM host. As supplied, it has an SSH key; you may wish to replace it with the public key of your Ceph admin host.
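To make the customization concrete, here is a minimal sketch of what the relevant portion of a cloud-init user-data file can look like. The key, username, and password shown are illustrative stand-ins, not the bundle's actual contents.

```shell
# Write a minimal cloud-init user-data file (all values are illustrative).
cat > cloud-init.data <<'EOF'
#cloud-config
users:
  - name: almalinux
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      # Replace with the public key of your Ceph admin host:
      - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... ceph-admin@example
chpasswd:
  list: |
    almalinux:redhat
EOF
```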
Now you're ready to start creating VMs.
## VM preparation
### Network
Ceph likes its hostnames all nice and consistent, so I recommend adding your
new hostname/IP address/MAC address as needed to your DHCP server, your DNS server (including reverse DNS!), and optionally to /etc/hosts on machines that might be
interested, especially the Ansible host.
Also don't forget to add the new VM's hostname to your Ansible inventory file (default /etc/ansible/hosts)!
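The /etc/hosts and inventory additions can be as simple as the following. The IP address, domain, and hostname are examples; substitute your own, and note that your inventory may group hosts differently.

```shell
# Illustrative entries -- substitute your own IP, domain, and hostname.
echo '192.168.1.85  ceph05.example.com ceph05' | sudo tee -a /etc/hosts
echo 'ceph05.example.com' | sudo tee -a /etc/ansible/hosts
```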
### VM Customization
Now the fun begins. The ``cloud-init.data`` file contains information common to all
VMs you'll create. Properly, there should be a separate meta-data file for the VM-specific
stuff, but I haven't been able to get that to work, so I dynamically create
a temporary composite cloud-init file for the actual VM creation.
Clone the ``make_cephxx.sh`` file to make a custom VM. Edit the variables that
define the hostname, MAC address, and LVM Logical Volume that will hold the OSD data.
Note that by default libvirt generates a random MAC address, so I manually supply my own to make DHCP assign a predictable IP address.
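The edit amounts to setting a handful of variables at the top of your cloned script. The names and values here are illustrative, not the script's actual variable names:

```shell
# Per-VM settings to edit after cloning make_cephxx.sh (names illustrative).
VM_NAME=ceph05
VM_MAC=52:54:00:15:05:05        # 52:54:00 is the conventional KVM OUI; pick your own suffix
OSD_LV=/dev/vgdata/ceph05_osd   # LVM Logical Volume that will hold the OSD data
echo "Creating ${VM_NAME} with OSD volume ${OSD_LV}"
```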
Use LVM's "lvcreate" command to create the Logical Volume you'll reference here,
and edit the script to point to it. As presently configured, the VM will present the
LVM logical volume as device "/dev/sda"; the OS lives on /dev/vda. The device ID
will vary if you use a different VM bus type than "scsi", but since I don't know
the optimal bus type for an OSD, that's what I picked.
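The lvcreate step looks something like this; the volume group name "vgdata", the LV name, and the 50G size are examples for the small test setup described above:

```shell
# Create the backing Logical Volume for the OSD data (names/size are examples).
sudo lvcreate -L 50G -n ceph05_osd vgdata
sudo lvs vgdata   # confirm the new LV exists
```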
Once you've customized the script, just execute it. Assuming you've got everything
right, it will create a new VM disk based on your cloud image and boot up a VM.
Of course, if you are as error-prone as I am, this may require a few tweaks. Fear not. The process is idempotent, so you can re-run it as often as you like.
If you're extra paranoid, you can delete the VM disk and (if it got created) the VM itself.
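Tearing down a failed attempt before re-running the script can be sketched as follows; the VM name and disk path are illustrative:

```shell
# Clean up a failed attempt (VM name and disk path illustrative).
sudo virsh destroy ceph05 2>/dev/null || true   # stop the VM if it's running
sudo virsh undefine ceph05                      # remove the VM definition
sudo rm -f /var/lib/libvirt/images/ceph05.qcow2 # delete the cloned VM disk
```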
Once everything is happy, the boot process will run and log to your command-line
console. At the end, you'll be presented with a login prompt.
***Caution*** It's best to wait a minute or two, as some setup may still be running even after the login prompt comes up!
As supplied, the login is userid "almalinux" and password "redhat". These are defined in the cloud-init.data file and if you like, you can change them.
Now you're ready to run the Ansible stage. Use ctrl-] to return to your VM host's original shell (disconnect from the VM console). You don't need it anymore.
## Ansible provisioning
The cloud-init process takes care of some of the most essential functions, but
after a certain point, it's better to use something more flexible, and Ansible is the easiest option for that. So go to your Ansible console and do the following
prep work:
1. Ensure your hostname is in the Ansible inventory.
2. Customize the cephxx.yml playbook to point to that host.
3. Use "ssh-copy-id almalinux@mynewosd" to ensure that Ansible can run the playbook automatically. Remember that for the default account (almalinux), the
password is "redhat". "mynewosd" is, of course, the hostname you gave to the new
OSD VM.
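The prep and the playbook run boil down to two commands; the playbook name and hostname are the examples used above:

```shell
# One-time: install your key so Ansible can log in without a password prompt.
ssh-copy-id almalinux@mynewosd      # password is "redhat" by default
# Run the playbook that applies the ceph_osd_host role, limited to the new host.
ansible-playbook cephxx.yml -l mynewosd
```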
Use the ansible-playbook command to run the ceph OSD playbook. This playbook provisions
using the "ceph_osd_host" role you installed.
It does the following:
1. Install the ceph repository into ``yum.repos.d``.
1. Install the cephadm utility from the ceph repository.
1. Copy in the ``/etc/ceph`` configuration files from your master copy in the roles/files directory.
1. Do an initial run of cephadm to cause it to pull the container(s) needed to run cephadm and the ceph daemons.
Note that if you like, you can also install the "ceph-common" package and be able to run ceph commands without needing "cephadm shell" to run them.
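Installing the native client tools on the VM is a one-liner, assuming the ceph repository installed by the playbook is in place:

```shell
# Optional: native ceph CLI, so you don't need "cephadm shell" for every command.
sudo dnf install -y ceph-common
ceph -v   # confirm the client tools are installed
```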
## Rejoice!
Congratulations! You have just created a new Ceph host. You can confirm this, if you like, by using ssh to log in to "almalinux@mynewosd", issuing "sudo cephadm shell" to enter the cephadm shell, and then typing "ceph orch ps" to
list the running daemons in your system.
Note that if the above fails, the most likely cause will be that your /etc/ceph config files are wrong. You did replace mine with your own in the ansible role file, didn't you?
## Going live
The new VM is now a full-fledged Ceph node; you only need to issue the
"ceph orch host add" command to add it to the Ceph host list. Ceph will
automatically see the unused OSD data device (/dev/sda) and make an OSD out of it.
As a final note, the new OSD may be created with a low CRUSH weight so it won't be too eager to fill up with data. Use the "ceph osd tree" command to see how it relates to the other OSDs, and use the "ceph osd crush reweight" command to bump it, if you need to.
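The going-live steps can be sketched as follows, run from the admin node (inside "cephadm shell" if you didn't install ceph-common); the hostname, OSD id, and weight are illustrative:

```shell
# Add the new VM to the cluster inventory (hostname illustrative).
ceph orch host add mynewosd
# Find the new OSD and check its CRUSH weight relative to the others.
ceph osd tree
# Bump the weight if it came up low (OSD id and weight are examples).
ceph osd crush reweight osd.7 0.05
```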