FAQ

From BCCD 3.0

Frequently Asked Questions

This page lists some of the most frequently encountered problems in starting and maintaining a BCCD cluster. It is by no means an exhaustive list, so if you have an issue not listed here, email the bccd-dev mailing list and we'll do our best to help you work through the problem.

Users

Why are there two modes, Live and Liberated?

The BCCD was created to foster teaching parallel programming and cluster computing in environments and institutions where a full-scale computational resource (cluster, mainframe, etc.) isn't available. Live mode was the first step towards this end, allowing users to boot an entire Windows computer lab into the BCCD livecd, leaving the systems themselves untouched (and therefore not upsetting local sysadmins).

Liberation is the extension of this goal. The livecd is useful for temporary situations like a week-long workshop, but what about a semester-long class or a series of such classes? Enter LittleFe, a portable, affordable Beowulf cluster resource. This small resource allows educators to easily have a cluster at their fingertips wherever they need it to be. In order to make sure students are working in a familiar environment, we designed a way for the BCCD to run permanently on a system: install it on a hard drive and liberate it from the confines of a CD.

In short:

Live mode:
Running the BCCD directly from the CD, leaving the host systems untouched.
Liberated mode:
Install the BCCD on the hard drive to run a more permanent cluster.

How do I get in touch with you if I have a problem?

All of the developers and many of our users are subscribed to the mailing list. Send any of your questions there. It's a moderated list, but we're usually pretty good about accepting genuine messages quickly. The list archives are publicly readable too, so check there to see if we've addressed your issue before.

My network card isn't recognized

In this situation, the BCCD probably doesn't contain the kernel drivers or firmware for your card. The best thing to do here is to email the bccd-dev mailing list, giving us the vendor and model number of your network card as listed by lspci.
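To gather those details, a small filter over the lspci output may help. This is only a sketch: filter_nics is a hypothetical helper name, and the keyword list is an assumption about how network devices are usually labelled. The -nn flag asks lspci to print both the names and the numeric vendor/device IDs.

```shell
# filter_nics: keep only the lspci lines that describe network hardware.
# (Hypothetical helper; reads lspci output on standard input.)
filter_nics() {
    # "|| true" keeps the function from failing when nothing matches.
    grep -i -E 'ethernet|network|wireless' || true
}

# Typical use on the affected machine:
#   lspci -nn | filter_nics
```

Paste the resulting lines into your email.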

The network booted nodes are inaccessible

More than likely you're running into the NFS/AUFS deadlock issue (#457). This is a known issue that we haven't been able to fix yet. Simply restart the affected nodes until they boot all the way to a login prompt.

Why is my head node's hostname "no_dhcp"?

If your head node (or any other node for that matter) is marked as no_dhcp, there were issues getting the network properly configured during boot-up. You probably need to run /bin/bccd-reset-network as root. If this still doesn't work, you may have bigger problems, such as unsupported hardware.
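A quick way to spot the condition from a script is sketched below. The function name needs_network_reset is made up for this example; the check simply compares the node's hostname against the no_dhcp marker described above.

```shell
# needs_network_reset NAME: succeed when NAME is the "no_dhcp" marker.
# (Hypothetical helper name.)
needs_network_reset() {
    [ "$1" = "no_dhcp" ]
}

# uname -n prints the node's hostname.
if needs_network_reset "$(uname -n)"; then
    echo "Network setup failed at boot; run /bin/bccd-reset-network as root."
else
    echo "Hostname is $(uname -n), not no_dhcp."
fi
```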

PXE booting doesn't work in liberated mode

The most common reason PXE booting doesn't work in liberated mode is that /bin/bccd-reset-network has not been run on the head node after liberation. This is needed because PXE booting is inherently dangerous on shared networks and should only be run on a network dedicated to the BCCD.

Once I've created a bootable USB drive, is there a way to update it when you release a new version?

As of version 3.0.1, the only way to update a USB drive with a new ISO is to re-run the build_bootable_USB.sh script located in root's home directory of the BCCD or downloadable here.

I've liberated my system with version X, but you've gone and released version X+1. How can I update?

Currently (as of version 3.0.1), there isn't really an easy way to do this. The recommended path is to back up your data somewhere else (e.g., a remote server or an external disk), liberate again with the new version, and then restore your data. We're planning on implementing an easier way to do this, but for now, use the backup-liberate-restore method.
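One possible shape for the backup and restore steps is sketched here with tar. The function names and all paths are placeholders chosen for illustration; which directories hold your data depends on your setup.

```shell
# backup_dir SRC ARCHIVE: pack the directory SRC into a gzip-compressed tarball.
backup_dir() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# restore_dir ARCHIVE DEST: unpack the tarball underneath DEST.
restore_dir() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Example (placeholder paths):
#   backup_dir /home/bccd /mnt/external/bccd-home.tar.gz    # before liberating
#   restore_dir /mnt/external/bccd-home.tar.gz /home        # after liberating
```

The archive stores the directory under its own name, so restoring into the parent directory recreates it in place.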

How do I set up VMware Player to run the BCCD virtually?

VMware Player is one way to run the BCCD virtually.

This is a free product that will let you set up simple virtual machines. Basically, you create a VM and point it at the ISO image that you downloaded. Here are the steps to do that:

  1. Double-click on VMware Player
  2. Go to File->New
  3. Click on "Install OS Later"
  4. Select "Linux", then make sure "Ubuntu" is selected
  5. Set the name to "BCCD"
  6. Take the defaults for the storage

Why is boot hung on runaway loop modprobe binfmt-464c?

This means that you're trying to run a 32-bit user-space with a 64-bit kernel. Make sure that your live CD or USB stick is built properly.

How do I serve webpages and CGI scripts off BCCD?

  1. Start Apache: invoke-rc.d apache2 start
  2. Place static files in /var/www, and CGI scripts in /usr/lib/cgi-bin.
  3. Make sure the static files are readable by everyone: chmod -R a+r /var/www/html/.
  4. Make sure the CGI scripts are readable and executable by everyone: chmod -R a+rx /usr/lib/cgi-bin
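To check that CGI is working, a minimal script such as the following sketch can be dropped into /usr/lib/cgi-bin. The file name hello.cgi and the render function are hypothetical; the only fixed part is the CGI convention of printing a header, a blank line, and then the body.

```shell
#!/bin/sh
# hello.cgi: minimal CGI script (hypothetical example).

# render: print a complete CGI response: header, blank line, then the body.
render() {
    echo "Content-Type: text/plain"
    echo
    echo "Hello from $(uname -n)"
}

render
```

Assuming Apache maps /cgi-bin/ to that directory (the usual Debian arrangement, but check your configuration), the script would be reachable at /cgi-bin/hello.cgi on the node.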

How do I take a screenshot?

BCCD ships with the xwd utility, which can be used to take screenshots. When you run it, your mouse cursor turns into a target, and the next window you click on is captured. There are a few problems with xwd though:

  1. The output format is a bitmap specific to xwd. This makes it big and non-portable.
  2. It writes to standard output. This will dump binary data onto your terminal unless you redirect it somewhere.

The best way to use xwd is to pipe its output directly through convert to produce a PNG file:

  1. Run xwd | convert - ~/screenshot.png
  2. Click on the window you want to capture.
  3. Copy off the generated ~/screenshot.png image.

Why does liberation fail?

You might see liberation fail with this error:

    ok 0 - Unmounting all auto-mounted filesystems.
    ok 1 - Getting total system RAM.
    ok 2 - Getting libdev size.
    Too little disk space for /boot!
    at /root/liberate.pl line 137

The most likely cause is that the liberation device doesn't have enough space to accommodate an uncompressed BCCD (~8GB). If you're sure your hard drive has enough space, there's a good chance that the device you provided with --libdev isn't the one you thought it was. You can use fdisk -l and dmesg to see which disk devices are known to the system. For LittleFe users, the problem can be that the hard drive power cable is not fully seated, which will cause /dev/sda to be the USB device rather than the SATA drive.
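Before liberating, the size of the candidate device can be checked with a sketch like the one below. It assumes the ~8GB figure quoted above as the threshold; the function names are hypothetical, and blockdev (from util-linux) normally has to be run as root.

```shell
# Approximate space needed for an uncompressed BCCD, per the text above (~8 GB).
REQUIRED_BYTES=$((8 * 1024 * 1024 * 1024))

# big_enough BYTES: succeed if BYTES is at least REQUIRED_BYTES.
big_enough() {
    [ "$1" -ge "$REQUIRED_BYTES" ]
}

# check_libdev DEVICE: report the size of DEVICE and whether it looks big enough.
check_libdev() {
    size=$(blockdev --getsize64 "$1") || return 2
    if big_enough "$size"; then
        echo "$1: $size bytes, large enough"
    else
        echo "$1: $size bytes, too small"
        return 1
    fi
}

# Example: check_libdev /dev/sda
```

If the reported size matches your USB stick rather than your hard drive, you are pointing --libdev at the wrong device.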

Why does booting fail on a Mac?

See our instructions here.

Our LittleFe is not Internet-accessible - how can I access it remotely over SSH?

Let's imagine that you're at a conference with LittleFe, and you want to give users back at your home institution access to it. LittleFe is likely trapped behind some kind of firewall, with no in-bound traffic allowed from the Internet. The solution is to use a reverse SSH tunnel: LittleFe opens an SSH connection to an SSH login host at home, and that connection forwards a high-numbered port on the login host to LittleFe's SSH port. Run this command from node000:

  ssh -R 5000:localhost:22 -l your-user-name-here your-ssh-host-here

Users at home can then login to that SSH host, and run this command to get to LittleFe:

  ssh -p 5000 -l bccd localhost
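For a tunnel that has to stay up for a whole conference, the command run on node000 can be hardened a little. The sketch below only prints the command so it can be reviewed first. tunnel_command is a hypothetical helper, and the extra options are standard OpenSSH ones: -N (no remote shell), ServerAliveInterval (keepalives through the firewall), and ExitOnForwardFailure (fail instead of silently running without the forward).

```shell
# tunnel_command USER HOST [PORT]: print the ssh command for a reverse tunnel.
# PORT defaults to 5000, the port used in the example above.
tunnel_command() {
    port=${3:-5000}
    echo "ssh -N -o ServerAliveInterval=60 -o ExitOnForwardFailure=yes" \
         "-R ${port}:localhost:22 -l $1 $2"
}

# Print the command for review; run it by hand once it looks right.
tunnel_command your-user-name-here your-ssh-host-here
```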

Developers

Currently this is based on a sampling of closed tickets. I welcome any suggestions.

How to add a new distribution to reprepro?

  1. Create a new stanza in /var/spool/reprepro/conf/distributions (sample below)
    Origin: Debian
    Label: Debian
    Suite: squeeze
    Pull: squeeze
    Version: 6.0
    Codename: squeeze
    Architectures: i386 amd64 source
    Components: main contrib non-free 
    Description: Debian 6.0 squeeze security updates
    Update: - debian security
    Log: logfile
    
  2. On bigfe, run gpg --keyserver keys.gnupg.net --recv-key key-id. Then run gpg -a --export key-id | sudo apt-key add -. Alternatively, set the VerifyRelease lines in /var/spool/reprepro/conf/updates to blindtrust.
  3. Run reprepro update suite. Running it with -v, and inside a script session to capture the output, might help too.

What's the proper way to add a package to the BCCD apt database?

On bigfe, change to /var/spool/reprepro and run reprepro includedeb debian_release deb. For BCCDv3, the release currently is lenny. Make sure to include both AMD64 and i386 .deb packages when available. Example:

  $ cd /var/spool/reprepro
  $ reprepro includedeb lenny ~/src/condor_7.4.4-1lenny_amd64.deb
  $ reprepro includedeb lenny ~/src/condor_7.4.4-1lenny_i386.deb
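When several packages have to go in at once, the same command can be looped. The following is a sketch rather than an official tool: include_debs and the REPREPRO variable are names invented for this example, and setting REPREPRO=echo gives a dry run that only prints what would be executed.

```shell
# Command used to call reprepro; set REPREPRO=echo for a dry run.
REPREPRO=${REPREPRO:-reprepro}

# include_debs RELEASE FILE...: add each .deb to RELEASE, stopping at the first failure.
include_debs() {
    release=$1
    shift
    for deb in "$@"; do
        case "$deb" in
            *.deb) $REPREPRO includedeb "$release" "$deb" || return 1 ;;
            *)     echo "skipping $deb: not a .deb file" >&2 ;;
        esac
    done
}

# Example, run from /var/spool/reprepro:
#   include_debs lenny ~/src/condor_7.4.4-1lenny_amd64.deb ~/src/condor_7.4.4-1lenny_i386.deb
```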

What's the proper way to commit a change to the repository?

After testing a changeset in your working environment, you should first commit your change into your development branch. Every commit should reference a ticket in Trac, especially when you go to merge the changes into Trunk. Put the ticket number in parentheses somewhere in the commit message, so an example message might look like:

Installed Condor (#179)

This has a couple of purposes:

How do I make a file executable?

(based off #557 - make bccd-nics executable)

This depends on where the file is and how it's used. If the file is only in Subversion, you can manage this within Subversion. If the file is manipulated on disk and needs to be made executable after being checked out of Subversion, it will have to be done using a test in build_livecd.pl or liberate.pl.

Subversion

Simply run svn propset svn:executable ON file and commit your changes.

Test

You should add a test similar to this:

  $Bccd->run_test(
         "chmod", # Test name, from Bccd.pm
         "", # OK return code, defaults to something that makes sense
         "chmod 0700 file", # This is merely a comment
         0700, # Actual permission bit field
         "file", # Actual file name
  );

This can be placed in either build_livecd.pl or liberate.pl.

How can I add a 3rd-party kernel module?

(based on #567, https://cluster.earlham.edu/trac/bccd-ng/ticket/567)

Most kernel modules are directly from the kernel, which is beyond the scope of a FAQ to rebuild. Many kernel modules, however, can be built independently of the kernel source, using only the header files. The steps vary, but the basic layout is the same. First, boot up one BCCD instance for each supported architecture (currently i386 and amd64) and install the kernel headers package:

  # apt-get update # Download the latest package catalog
  # apt-get install linux-headers-$(uname -r) # Install the kernel headers

You should now have a directory called something like /usr/src/linux-headers-2.6.31.12-aufs. Note: $(uname -r) will run the command uname -r (print the operating system revision) and fill it in on the command line as if you had typed that text.

Next, unpack the source code for the kernel module and follow the steps in the README or INSTALL file. When you build the module, make sure to point the build process at the headers directory. Generally, this will be something like

  make KERNEL_SOURCE=/usr/src/linux-headers-$(uname -r)

What you do next will depend on whether your module is designed to be installed into the main modules hierarchy (in /lib/modules/$(uname -r)/kernel), which will allow it to be loaded easily with modprobe, or just goes into /lib/modules/$(uname -r)/misc, which will require a startup script to load it.

Main hierarchy

If your module ends up in the main hierarchy, you should run depmod -a after installing your module. Then add your module (the .ko file that got added to /lib/modules), along with /lib/modules/$(uname -r)/modules.*, to your Subversion branch at trees/arch/$(uname -m)/lib/modules/$(uname -r), preserving the same path structure as in the BCCD.

Misc

Modules ending up in misc are much simpler to deal with. Simply add the .ko file to trees/arch/$(uname -m)/lib/modules/$(uname -r)/misc in Subversion.

How do I change boot options?

(based on #570 - make "linux 5" the default)

Remember that BCCD has two boot modes - live CD and liberated. The live CD uses isolinux, while liberated mode uses grub. In the future it would be nice if we had one source for the boot options, but for now you'll just have to change both places separately. The isolinux configuration is stored in trees/KNOPPIX/boot/isolinux/isolinux.cfg, while the grub configuration is stored in packages/etc/grub/menu.lst.

How do I add packages to the install process?

(based on #517 - C development manpages absent)

Open up bin/build_livecd.pl in an editor. Find where the $PACKAGES, $EXTRA_PACKAGES, and $AMD64_PACKAGES variables are defined. If you always want the package installed, add it to the list for $PACKAGES. If the package should be installed only if there is space, add it to $EXTRA_PACKAGES (NB: this logic is not implemented yet, but it could be in the future). If it is specific to the AMD64 architecture, add it to $AMD64_PACKAGES.

How do I add a device to the minirt?

(based on #501 - Generate minirt on the fly)

The devices in the minirt are generated in the ISO build process based on the YAML file packages/mknod.y.

How do I add non-Debian software to the BCCD?

Keep in mind that there are two software repositories: a local one at trees/software/bccd/software, and a remote one at trees-remote (outside any branch). Also keep in mind that currently we provide only IA32 builds; there is no need to build custom software for AMD64 or other architectures. These instructions assume a GNU build process (aka configure && make).

Local

  1. Unpack your software.
  2. Load any dependent modules with module load software.
  3. Build and install it with --prefix set to /bccd/software/software-version.
  4. tar up the directory and unpack it in your Subversion checkout at trees/software/bccd/software.

Remote

The easiest way to make software available remotely is to copy it from trees/software/bccd/software using svn copy software-version trees-remote/software/bccd/software. You can then use this find command to replace all occurrences of the local software directory with the remote software directory.

  find . -type f ! -regex '.*\/\.svn\/.*' -print0|xargs -0 perl -wpli -e '! -B && s!/bccd/software!/mnt/ssh/software!g'

You can then commit your changes.

How do I capture kernel output in VirtualBox?

Sometimes you need to capture kernel output that scrolls by too quickly to read. Unfortunately, Linux, unlike FreeBSD, has no feature for paging back up after a kernel panic, which is often exactly when you want to read the output. To accomplish this in VirtualBox, set up a virtual serial port that is redirected into a file:

  1. Make sure your particular kernel has support for serial ports built-in, and console redirection enabled:
    CONFIG_SERIAL_8250=y
    CONFIG_SERIAL_8250_CONSOLE=y
    CONFIG_SERIAL_8250_PCI=y
    
  2. Configure ttyS0 (aka COM1): VBoxManage modifyvm vm-name --uart1 0x3F8 4
  3. Redirect ttyS0 to a file: VBoxManage modifyvm vm-name --uartmode1 file filename
  4. When you boot, enter this on the kernel boot line: linux console=ttyS0.
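Steps 2 and 3 above can be combined into one helper, sketched here. enable_serial_log and the VBOXMANAGE variable are names made up for this example; setting VBOXMANAGE=echo prints the commands instead of running them. The VM should be powered off when modifyvm is run.

```shell
# Command used to call VBoxManage; set VBOXMANAGE=echo for a dry run.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}

# enable_serial_log VM FILE: attach COM1 (I/O base 0x3F8, IRQ 4) and log it to FILE.
enable_serial_log() {
    $VBOXMANAGE modifyvm "$1" --uart1 0x3F8 4 &&
    $VBOXMANAGE modifyvm "$1" --uartmode1 file "$2"
}

# Example (the log path is a placeholder):
#   enable_serial_log vm-name /tmp/vm-name-console.log
```

After booting with console=ttyS0, the kernel messages should accumulate in the chosen file, where they can be read at leisure.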

How do I configure SVN commit emails?

Emails triggered by SVN commits are handled by /cluster/bccd-ng/hooks/post-commit, which calls /usr/local/sbin/mailer.py. The configuration is in /cluster/bccd-ng/conf/mailer.conf, and uses regexes to match paths in the commit.

How is X11 started?

The process starts in the kernel boot line, which is in isolinux.cfg (CD), syslinux.cfg (USB), or grub.cfg (liberated mode):

  BOOT_IMAGE=/vmlinuz-2.6.38bccd-01771-gc6b7697 root=/dev/mapper/bccd-slash ro lang=us 5 vga=791 nodhcp quiet

That 5 tells the system to boot into runlevel 5. /etc/inittab has a line which only gets invoked for runlevel 5:

  gdm:5:once:/bin/su bccd -l -c "/bin/bash --login -c startx > /tmp/bccd_x.log 2>&1"

startx, in turn, uses scripts in /etc/X11/xinit to start a window manager and programs that should be loaded on login.
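When X11 does not come up, a first check is whether the kernel command line actually asked for runlevel 5. The sketch below picks the runlevel out of a boot line; cmdline_runlevel is a hypothetical helper, and it assumes the runlevel appears as a standalone digit, as in the boot line above.

```shell
# cmdline_runlevel LINE: print the first standalone runlevel digit (0-6) in LINE.
# Returns non-zero if the line does not name a runlevel.
cmdline_runlevel() {
    for word in $1; do
        case "$word" in
            [0-6]) echo "$word"; return 0 ;;
        esac
    done
    return 1
}

# Check the running system; /proc/cmdline holds the current kernel boot line.
cmdline_runlevel "$(cat /proc/cmdline 2>/dev/null)" ||
    echo "no runlevel on the kernel command line"
```

Anything other than 5 means the gdm line in /etc/inittab is never invoked, so startx does not run.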

Why does my PXE node die?

There could be many reasons, including:

  1. You might have cruft in /diskless/clients/ip-address.
    1. Try cleaning up for just one client (if you know the IP address), for instance: sudo rm -fr /diskless/clients/192.168.3.11
    2. Or for all your clients (make sure all your diskless clients are off first): sudo rm -fr /diskless/clients/*
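Because rm -fr is unforgiving, the single-client cleanup can be wrapped in a guard. This is a sketch under assumptions: clean_client and the DISKLESS_CLIENTS variable are invented for this example, and the guard only checks that the argument looks like a dotted-quad IPv4 address.

```shell
# Root of the per-client directories; override for testing.
DISKLESS_CLIENTS=${DISKLESS_CLIENTS:-/diskless/clients}

# clean_client IP: remove the state directory for one diskless client.
# Refuses anything that is not a dotted-quad address, so an empty or
# mistyped argument can never expand to the whole clients directory.
clean_client() {
    if ! printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'; then
        echo "refusing: '$1' is not an IPv4 address" >&2
        return 1
    fi
    # ${VAR:?} aborts if the root directory variable is unset or empty.
    rm -fr "${DISKLESS_CLIENTS:?}/$1"
}

# Example (as root, with the client powered off): clean_client 192.168.3.11
```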

Why do I have oddly-named NICs?

Modern BIOSes have their own notion of what to call a NIC, which might not be what you want. You can disable this feature by adding net.ifnames=0 to your kernel boot command line.

See #930 for more information.
