Updatenode
How to update the client-side Emulab software on a node and make new images. This is still a bit ugly at the moment.
A. Things to understand up front
1. The disk image discussed here contains both a FreeBSD partition and a
   Linux partition, so you will need to update both before creating a
   new whole-disk image. By Utah convention, FreeBSD is in DOS partition
   #1 and Linux in DOS partition #2. For the record, partition #3 is a
   Linux swap partition and partition #4 is defined to contain the
   remaining space on the hard drive (courtesy of "growdisk", which is
   run after frisbee) and is available for users.

2. For booting between OSes (see next item), you will need access to the
   node console, so make sure you know where it is (VGA vs. serial line)
   and have access to it. You should have set this when you were
   initially customizing the generic image.

3. Since you will need to boot both the BSD and Linux partitions as well
   as the Emulab "admin MFS" (a scaled-down, PXE-loaded, memory
   filesystem based FreeBSD system), it is best to understand how to
   interact with the Emulab pxeboot program, as that is the easiest way
   to get between them. When pxeboot loads, it prompts the console for
   override input before contacting boss for its default behavior:

        Type a key for interactive mode (quick, quick!)

   So hit the space bar (quick, quick!) and you go into interactive mode
   where you can tell it to boot from the FreeBSD disk partition:

        part:1

   the Linux partition:

        part:2

   or to boot the admin MFS:

        loader:/tftpboot/freebsd

   (type "help" to get the complete list of commands). So below, when I
   speak of "rebooting into Linux" or "rebooting into the admin MFS",
   this is how I expect you to do it.

4. When running in the admin MFS (which is FreeBSD), use "ad0" if you
   have an IDE disk, "da0" if you have a SCSI disk, or "ad4" if you have
   SATA. All of the following examples specify "ad0"; replace as
   necessary.

5. To do an Emulab client software install, certain other software
   packages must be installed before you even try to configure the
   Emulab software.
   If you are updating a recent Emulab image, these should all be in
   place. But if you are updating a really old image, or you are
   installing the software for the first time, you will need additional
   software packages. You can install them as necessary when you boot
   the image into the appropriate OS. Here is what you need:

   - GNU make.
     On Linux it is the standard make. On FreeBSD, you must install the
     port: /usr/ports/devel/gmake.

   - Python.
     Either Python 2.4 or 2.5 is fine. For FreeBSD, go to
     /usr/ports/lang/python24 and do a "make install". For Linux, use
     the appropriate package tool to load it.

   - Utah's pubsub headers and libraries.
     You need to download the source tarball and build it. Which tarball
     you download depends on whether you have chosen to retain "Elvin
     compatibility" or not. If you don't know what I am talking about,
     then you didn't read the install/upgrade instructions! The
     important point here is that, if you have chosen ELVIN_COMPAT, you
     must always build your client pubsub and Emulab software with it
     enabled, even if this is a new image. This does not mean that you
     will be installing Elvin on the image or talking to Elvind on the
     servers; it only means that you are maintaining compatibility with
     the on-the-wire Elvin message format. Anyway, if you are using
     ELVIN_COMPAT grab:

        http://www.emulab.net/downloads/pubsub-elvincompat-0.8.tar.gz

     otherwise grab:

        http://www.emulab.net/downloads/pubsub-0.8.tar.gz

     unpack it and build it:

        gmake client
        sudo gmake install-client

   - Boost headers.
     Check for the boost directory in the include directory path
     (probably in /usr/local/include or even /usr/include). For FreeBSD
     you can just install the package or port (version >= 1.30). For
     Linux, you may have a harder time. The RedHat RPMs I have found
     only include the libraries, but you need just the headers
     (everything we use is implemented as a template, I think). I think
     I just copied over the installed headers from a BSD box.

   - Dhclient.
     On FreeBSD and RedHat > 7, this should be standard. (We used to use
     "pump" for RedHat 7, but we couldn't make it work efficiently for
     multiple interfaces. So, we switched to dhclient there as well.)
     For RedHat 7 you will need to grab an RPM from somewhere; I found
     one at pbone.net:

        http://rpm.pbone.net/index.php3/stat/4/idpl/1073819/com/dhclient-3.0pl2-1.norlug.i386.rpm.html

     When installing the RPM, you will need to use "--nodeps" to avoid
     its dependency on some initscripts RPM (those scripts presumably
     just provide the boot time rc boilerplate to call dhclient; we have
     our own and don't need it).

   - BPF devices.
     Under FreeBSD, the DHCP client uses /dev/bpf* devices. In FreeBSD
     4, there are only 4 devices by default, so if you have more than 4
     interfaces in the system, DHCP will fail. So you may need to go out
     to /dev and:

        sudo ./MAKEDEV bpf5 bpf6 ...

     For FreeBSD 5 and Linux, you should not have to do this.

   - Perl.
     On FreeBSD 5, perl is not installed by default. Make sure you have
     a version of perl5 installed.

   - Perl HiRes timer module.
     If debugging timestamps are enabled in the client scripts, you will
     need to install the HiRes module. Currently this only happens in
     the mkjail script, which is FreeBSD specific. To install on
     FreeBSD, install the devel/p5-Time-HiRes port.

   - Ethtool.
     On Linux, with certain NICs, you will need ethtool (instead of
     mii-tool) so that the Emulab software can change link speed/duplex.
     Just install an RPM.

   UPDATE: for a November 2007 install of Ubuntu 7.04 (minimal), I
   needed to do the following (in addition to pubsub):

        apt-get install gcc
        apt-get install libc6-dev
        apt-get install python-dev
        apt-get install make
        apt-get install g++
        apt-get install byacc
        apt-get install libssl-dev
        apt-get install flex
        apt-get install libboost-dev
        apt-get install libboost-graph-dev
        apt-get install libpcap-dev
        apt-get install ntp-simple
        apt-get install tcsh
        apt-get install rpm
        apt-get install perl-suid

6. Another "first time, one time" thing to do is to set up the serial
   console in the OSes (if you are planning on using serial consoles).
   We use 115200 baud, as the typical default of 9600 is too painful.

   For FreeBSD you need to build a new boot loader if you intend to use
   115200 baud (which is what our PXE boot loader and MFSes expect). To
   do that, first add the line:

        BOOT_COMCONSOLE_SPEED=115200

   to /usr/src/sys/boot/i386/Makefile.inc. Then:

        cd /usr/src/sys/boot
        sudo make obj
        sudo make
        sudo make install
        sudo disklabel -B ad0   # where "ad0" is your boot device

   and to /boot/loader.conf add:

        console="comconsole"

   Hang on, you're not done yet! One last thing: change the "console"
   line in /etc/ttys to look like:

        console "/usr/libexec/getty std.115200" unknown on secure

   Linux and lilo are a little simpler. In /etc/lilo.conf add:

        serial=0,115200n8

   at the top and, for each kernel listed, add:

        append="console=tty0 console=ttyS0,115200"

   then run /sbin/lilo to record the changes.
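Before trying the client build, it can save time to check which of the
command-line prerequisites from item 5 are already on the image. The
following is only a minimal sketch: "missing_prereqs" is a hypothetical
helper (it is not part of the Emulab software), and the command names
you pass it are your own choice for the OS at hand. It does not check
for headers or libraries such as pubsub or Boost.

```shell
#!/bin/sh
# Hypothetical helper, not part of Emulab: print the names of any
# commands given as arguments that cannot be found in PATH.
missing_prereqs() {
    missing=""
    for cmd in "$@"; do
        # "command -v" is the POSIX way to test whether a command exists.
        if ! command -v "$cmd" >/dev/null 2>&1; then
            missing="$missing $cmd"
        fi
    done
    # Strip the leading space; prints an empty line if nothing is missing.
    echo "${missing# }"
}
```

On FreeBSD one might run "missing_prereqs gmake python perl dhclient",
and on Linux "missing_prereqs make python perl dhclient ethtool"; an
empty line of output means every listed command was found.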
B. The update process
1. Make sure you have testbed source and build trees in a filesystem
   that is visible to a testbed node, either in your home directory or
   /proj/emulab-ops. You will need a build tree for both BSD and Linux
   (see ~hibler/obj for example). You should use the same "defs" file
   for the client that you used for your boss/ops build, so just copy
   your source tree from one of them.

2. Load up a node with the current image, set to boot either BSD or
   Linux; you'll need to boot both eventually.

3. Log in to the node, and fill in a little bit of missing source. We
   don't distribute the prototype password files for BSD/Linux, so
   you'll have to copy the "template" versions from the current node:

        # when running linux:
        sudo cp -p /etc/emulab/shadow <testbed-source-tree>/tmcd/linux/

        # when running BSD:
        sudo cp -p /etc/emulab/master.passwd <testbed-source-tree>/tmcd/freebsd/

   The only thing special about these (and the reason we don't
   distribute ours) is that they contain your site's node root password.

4. Make sure you have loaded any of the prerequisite packages mentioned
   in item 5 of section A above, otherwise the client build in the next
   step will fail. I will note again that you need to have the Emulab
   "pubsub" package correctly configured with respect to ELVIN_COMPAT,
   or you will get some client event agents that run, but won't
   communicate with the server.

5. Go to your build directory and install new client binaries. Paranoid
   guy that I am, I first back up the directories that will be affected,
   a la:

        sudo cp -pr /etc /Oetc
        sudo cp -pr /usr/local/etc/emulab /usr/local/etc/Oemulab

   then do the install. For FreeBSD 4, FreeBSD 5, RedHat 7 and RedHat 9
   systems you can just do:

        cd <build-tree-for-this-os>
        gmake client

   and it will build the necessary client-side binaries. If something
   doesn't build, most likely it is because of a missing software
   package; see #5 in section A above.
   After successfully building, install the binaries and scripts with:

        sudo gmake client-install

   If you did the backup, you can then compare the original to the new:

        sudo diff -r /Oetc /etc
        sudo diff -r /usr/local/etc/Oemulab /usr/local/etc/emulab

   The diffs can be significant, however, so they may not tell you much
   of value.

6. Make sure everything works. Reboot the node once and make sure it
   comes up OK with the new binaries/scripts.

7. Clean up the filesystem prior to making the image. Log in at the
   console and do a shutdown to go to single-user mode. In single-user
   mode do:

        # BSD paranoia: unmount all NFS filesystems, this will already be
        # done for RHL
        umount -h fs

        cd /usr/local/etc/emulab
        sudo ./prepare

8. Now you need to do the same (3-7) for the other OS on the disk. So
   reboot the machine and tell pxeboot (see #3 in section A above) to
   boot from the other partition:

        sync
        reboot
        # wait for pxeboot prompt
        part:N          # N==1 for BSD, 2 for Linux

   When it comes up in the OS, go do steps 3-7 again.

9. All done? OK, now you can make the new images. I don't use the form
   since you need to create three images: one for the whole disk and one
   each for the individual partitions. You usually don't need the
   partition images, but we'll make 'em anyway! First step is to get
   into the admin MFS via pxeboot:

        sync
        reboot
        # wait for pxeboot prompt
        loader:/tftpboot/freebsd

   This will boot into the MFS. Now you can ssh in as root from boss and
   make the images (replace "FBSD410" with whatever version of FreeBSD
   you are running, and "RHL90" with the version of Linux):

        cd /proj/emulab-ops/images
        imagezip -o /dev/ad0 FBSD410+RHL90-STD.ndz
        imagezip -o -s 1 /dev/ad0 FBSD410-STD.ndz
        imagezip -o -s 2 /dev/ad0 RHL90-STD.ndz

10. Move the new images into place. The only trick here is to make sure
    frisbeed isn't currently serving up the image. This is another hack.
    On boss:

        cd /proj/emulab-ops/images
        sudo cp -p FBSD410+RHL90-STD.ndz FBSD410-STD.ndz RHL90-STD.ndz /usr/testbed/images/N/
        cd /usr/testbed/images
        sudo mv FBSD410+RHL90-STD.ndz FBSD410-STD.ndz RHL90-STD.ndz O/
        sudo mv N/* .

    Now the images are in place. If there is no currently active
    frisbeed serving up that image, you are done. If there is an
    existing frisbeed, it will still be serving up the old image since
    it has it open. You need to kill that and start a new one. There are
    three processes (threads) per frisbeed instance. If you just kill
    the first one (it'll be the one that has used the most CPU), then
    you are done. The parent frisbeelauncher will see that it has died
    and start a new one, which will open the new image file.

    One catch: if the old frisbeed is actively sending out the image (as
    opposed to sitting around idle waiting for a client), you can really
    screw things up. The new frisbeed will happily take up where the old
    one left off, continuing to feed blocks to any active client.
    Unfortunately, it will be feeding blocks from a completely different
    image. There is currently no unique serial number in an image that
    would enable us to detect this scenario.

11. As long as you still have your node allocated, you might as well
    test the whole disk image. On boss just do:

        os_load -i FBSD410+RHL90-STD pc<XXX>

    and it will reload the node and bring it back up in whatever OS is
    the default. Make sure it comes up, and then use your pxeboot
    prowess to boot into the other OS and make sure it works.
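
The three imagezip commands in step 9 differ only in the disk device and
the OS version tags, so it can be handy to generate them rather than
edit them by hand. This is just a sketch under stated assumptions:
"image_cmds" is a hypothetical helper (not an Emulab tool), and it only
prints the commands, following the naming pattern shown in step 9, so
you can look them over before running anything in the admin MFS.

```shell
#!/bin/sh
# Hypothetical helper, not part of Emulab: print (do not run) the three
# imagezip commands from step 9.
#   $1 = disk device name (e.g. ad0, da0, ad4)
#   $2 = FreeBSD version tag (e.g. FBSD410)
#   $3 = Linux version tag (e.g. RHL90)
image_cmds() {
    disk=$1
    bsd=$2
    linux=$3
    # Whole-disk image, then one image per OS partition (1=BSD, 2=Linux).
    echo "imagezip -o /dev/${disk} ${bsd}+${linux}-STD.ndz"
    echo "imagezip -o -s 1 /dev/${disk} ${bsd}-STD.ndz"
    echo "imagezip -o -s 2 /dev/${disk} ${linux}-STD.ndz"
}
```

For example, "image_cmds ad0 FBSD410 RHL90" prints exactly the three
commands listed in step 9.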