Whew... just went through an ugly 24-hour outage on my dedicated Debian Etch box, caused by an update gone bad that hosed my GRUB config.
For the benefit of others (and myself, should this happen to me again), here's a method to gain remote access to a non-booting Debian box. Without this procedure, the only option support offered me was an OS reinstall and data restore.
The prerequisites: you must know your server's partitioning and software RAID config, if applicable. Also, while support might be able to dig it up on their own, having your network config ready for them will likely speed up the process.
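One way to have those details on hand is to capture them while the box is still healthy. The following is only a rough sketch, assuming a standard Debian file layout; the function name `collect_server_notes` and the output path are made up for illustration and are not part of any Debian tool.

```shell
#!/bin/sh
# Gather partitioning, RAID and network details into one text file.
# collect_server_notes and the default output path are hypothetical;
# adjust the file list to match your own box.
collect_server_notes() {
    out="$1"
    shift
    : > "$out"                      # start with an empty notes file
    for f in "$@"; do
        if [ -r "$f" ]; then        # silently skip files that don't exist
            printf '===== %s =====\n' "$f" >> "$out"
            cat "$f" >> "$out"
            printf '\n' >> "$out"
        fi
    done
}

collect_server_notes "${NOTES_FILE:-/tmp/server-notes.txt}" \
    /etc/fstab /proc/mdstat /etc/network/interfaces \
    /etc/resolv.conf /etc/hostname
```

Copy the resulting file somewhere off the server, since it is no use to you if it only lives on the box that won't boot.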
1. Have support download the Debian netinst installer CD. This is also a Debian Rescue CD. Current version is for Lenny - use it even if you are still on Etch, as the Etch CD doesn't have the necessary support for software RAID.
http://cdimage.debian.org/debian-cd/...86-netinst.iso
2. Have support boot off the CD, select Advanced Options --> Rescue Mode
3. Go through the next several screens accepting the defaults, and watch closely for the screen labeled "Detecting network hardware". Immediately after it, the installer attempts to fetch an address via DHCP. As soon as the DHCP screen appears, hit "Enter" to cancel the DHCP attempt, then choose "Configure network manually" to set a static address.
4. Configure the static address, hostname/domain, default route/subnet and nameserver as prompted. Again, having this info handy for support will likely speed up the process.
5. After setting the static info, continue accepting the defaults until you see "Device to use as root filesystem". This is where you need to know your partitioning and RAID setup, if applicable. In my case, the root filesystem was on a RAID1 partition, /dev/md2. Support was instructed to select it from the list.
6. On the next screen, select "Execute a shell in xxx", where "xxx" matches the root filesystem device chosen in the previous step.
7. At the shell prompt, have support check that network connectivity is working (ping, etc.)
8. Start up sshd manually.
sh-3.1# /etc/init.d/ssh start
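For the connectivity check in step 7, something along these lines could be typed at the rescue shell. It is a sketch only: `check_host` is a hypothetical helper name, and it assumes the `ping` available in the shell accepts `-c` for a packet count.

```shell
# check_host HOST: report whether HOST answers three pings.
# check_host is a made-up helper, not something shipped on the rescue CD.
check_host() {
    if ping -c 3 "$1" >/dev/null 2>&1; then
        echo "$1: reachable"
    else
        echo "$1: NOT reachable"
        return 1
    fi
}
```

Support would run it first against the default gateway and then against some outside host, for example `check_host 192.0.2.1` (a placeholder address, substitute your own gateway). If the gateway doesn't answer, the static config from step 4 is the first thing to recheck.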
You should now be able to SSH into the box, mount any additional necessary filesystems, and have a look around, with one caveat: given the very limited nature of the rescue CD boot environment, you need to connect as follows.
ssh user@host "/bin/bash -i"
At this point, hopefully you can fix whatever is wrong, then have support remove the rescue CD and reboot. If things don't come back up normally, go through the steps above again to regain remote access.
Oh...how I lusted for an IP KVM, but such a luxury doesn't appear to be something Jag is offering. The above is certainly better than nothing, and likely better than the OS reinstall the techs will quickly offer up when things get ugly.

