
I/O error, dev hdb, sector 92813943

Friday, September 28th, 2007 | hardware, linux

That message in your Linux server’s logs is a bad start to any day. Those of you who’ve read my previous posts on Linux and storage will know that I’m a big fan of RAID for storage. In our case, we’re in the middle of migrating our main server from using straight disks to a proper hardware RAID configuration. Of course, with Murphy (or possibly even O’Toole) being the patron saint of system administrators everywhere, it was inevitable that we’d suffer the kind of storage error I’ve been dreading just weeks before we move to our new configuration. We do have backups, but I’d prefer not to have to face a restore if I can avoid it.

We’ve been noticing periodic spikes in load on our main server – without any accompanying CPU activity as observed in top or htop (I’ve been using htop a little more lately; it has some nice subtle improvements over top, including a better default graphical summary of CPU and memory usage). Some further digging with vmstat revealed that the CPU was spending lots of time in I/O wait. This can be quite normal if your system is doing a lot of I/O, but an I/O wait of 70–90% for minutes at a time suggested something else was up, particularly given that the system, while acting as our main server, isn’t consistently that busy (unless all of our developers have decided to check out all of their CVS and Subversion trees simultaneously!).
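Tools like vmstat derive these figures from the counters on the "cpu" line of /proc/stat. As a rough sketch (the field order is the standard Linux one, but the sampling values below are made up for illustration), the I/O wait percentage between two samples can be computed like this:

```python
# Sketch: compute the iowait percentage between two /proc/stat "cpu" samples.
# Field order on the aggregate "cpu" line is:
#   user nice system idle iowait irq softirq [steal ...]

def iowait_percent(sample1, sample2):
    """Percentage of CPU time spent in iowait between two samples.

    Each sample is the list of counters from the "cpu" line of /proc/stat.
    """
    deltas = [b - a for a, b in zip(sample1, sample2)]
    total = sum(deltas)
    iowait = deltas[4]  # iowait is the fifth field
    return 100.0 * iowait / total if total else 0.0

# Made-up counters: nearly all of the elapsed time went to iowait,
# the sort of picture we were seeing during the load spikes.
before = [100, 0, 50, 1000, 200, 5, 5]
after  = [110, 0, 55, 1020, 1000, 6, 6]
print(round(iowait_percent(before, after)))  # → 96
```

On a real system you would take two readings of /proc/stat a second or so apart and feed them through the same arithmetic – which is essentially what vmstat does for you.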

The next step was to take a look at the system log files and see if there were any clues there. Unfortunately there were, in the form of this rather unwelcome message,

Aug 27 22:48:45 duck kernel: hdb: task_in_intr: error=0x40 { UncorrectableError }, LBAsect=92813943, high=5, low=8927863, sector=92813943
Aug 27 22:48:45 duck kernel: ide: failed opcode was: unknown
Aug 27 22:48:45 duck kernel: end_request: I/O error, dev hdb, sector 92813943
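The interesting number in that message is the failing sector. If you have a lot of these lines to sift through, the device and sector can be pulled out with a few lines of Python – a minimal sketch, assuming this exact "end_request" message format:

```python
import re

# Sketch: extract the failing device and sector from a kernel
# "end_request: I/O error" log line.
LOG_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")

line = ("Aug 27 22:48:45 duck kernel: end_request: "
        "I/O error, dev hdb, sector 92813943")

match = LOG_RE.search(line)
if match:
    dev, sector = match.group(1), int(match.group(2))
    print(dev, sector)  # → hdb 92813943
```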

A quick Google on “bad sectors” will tell you that a bad sector or two isn’t all that bad. In fact, the occasional bad sector happens naturally on a hard drive, and the electronics in the drive manage these in the background. If you’re seeing bad sectors at the operating system level, though, things may not be quite right with your hard drive, regardless of the job the electronics are doing at managing them. Anthony Ciani has a very nice description of what’s going on in the drive when you get a bad sector. From my perspective, any time I’ve seen bad sectors on a drive in the past, the drive hasn’t lasted long afterwards, so my initial reaction was to save what I could from the drive.

Thankfully, this is one of a number of disks in our main server (one of the newer ones, curiously enough), so I had room on one of the other disks to move off any important data. We did see a few more bad sector messages, but they referenced the same sectors, suggesting that some of the data we were copying off resided there. I didn’t want to do any extensive checking or any further writing to the drive until we had moved off any data – the less work you do on a failing drive, the better.

After recovering all of the data off the drive, it was time to move it to one of our test servers with a view to determining what files had been affected by the bad sectors, the extent of the damage, and whether the drive was heading towards becoming a paperweight or a piece of hard drive art.

I haven’t dug around filesystems at the level of mapping individual blocks to files since working with AdvFS on Tru64 systems. Daniel Alvarez’s blog pointed me to a document written by Bruce Allen, BadBlockHowTo.txt, which details how to identify the file associated with an unreadable disk sector. Using the notes from there, I prepared a basic Calc spreadsheet which quickly let me identify the filesystem block number of the failing sector. Identifying the file affected by the failing sector requires some intermediate steps, as described by Bruce.

The operating system logs an error identifying which disk sector is failing (the disk sector is a physical location on the drive). You must first map this back to the filesystem block number (the filesystem block is a logical location). Only at that point is it possible to map the filesystem block back to an actual file in the filesystem. Bruce uses the following formula,

b = (int)((L-S)*512/B)

where,

b = file system block number (what we want)
B = file system block size in bytes
L = LBA of the bad sector (what we have from /var/log/messages)
S = starting sector of the partition, as shown by fdisk -lu

and (int) denotes the integer part.
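The arithmetic is easy to sanity-check in a few lines of Python. The starting sector and block size below are assumptions for illustration (check fdisk -lu and tune2fs -l on your own system), but with these values the numbers from our incident work out:

```python
# Sketch: map a failing LBA sector to a filesystem block number,
# using b = (int)((L - S) * 512 / B).

def sector_to_fs_block(L, S, B, sector_size=512):
    """Return the filesystem block containing LBA sector L."""
    return (L - S) * sector_size // B

L = 92813943   # LBA of the bad sector, from /var/log/messages
S = 63         # starting sector of the partition (assumed; see fdisk -lu)
B = 4096       # filesystem block size in bytes (assumed; see tune2fs -l)

print(sector_to_fs_block(L, S, B))  # → 11601735
```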

The spreadsheet for doing this is available to download. You just fill in the values and it outputs the filesystem block number. Once you have that you can use debugfs to identify the file using the following steps,

  1. Start debugfs.

    star:~# debugfs
    debugfs 1.40-WIP (14-Nov-2006)

  2. Run debugfs against the partition with the bad sectors.

    debugfs: open /dev/hdb1

  3. Identify the inode from the filesystem block number.

    debugfs: icheck 11601735
    Block Inode number
    11601735 5783564

  4. Identify the file from the inode.

    debugfs: ncheck 5783564
    Inode Pathname
    5783564 /mobyrne/.spamassassin/auto-whitelist

In our case, the file turned out to be one automatically generated by SpamAssassin, so its loss was inconsequential.

After this piece of forensic work, I tried running fsck -c -c on the drive to identify any bad blocks and other errors. It showed up various errors, and after fixing them, another fsck showed more errors, suggesting the drive is slowly failing. I verified that there were errors on the drive using the manufacturer’s own disk-checking tool, and since it’s still under warranty, I’ll be sending it back for a replacement later today (after securely erasing it using the excellently named Darik’s Boot and Nuke tool).

Conclusions from this exercise?

  • Hard drives fail – plan for that eventuality.
  • RAID 1 is a good idea – it will at least protect you from this kind of failure.
  • Backups are a very good idea – make sure you perform them regularly, and make sure you test them regularly.