Break in file services (fs) at 2010-09-11 02:30 - 2010-09-13 01:15

Description: 

Schedule:

2010-09-11 02:30 - 2010-09-13 01:15

Duration:

1d 22:45 h
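
The duration follows directly from the schedule above. As a minimal sketch, it can be checked in Python, assuming both timestamps are in the same local time zone with no daylight-saving change in between (the function name break_duration is hypothetical, chosen only for this illustration):

```python
from datetime import datetime

# Timestamp layout used in the schedule, e.g. "2010-09-11 02:30".
FMT = "%Y-%m-%d %H:%M"


def break_duration(start: str, end: str) -> str:
    """Return the elapsed time between two timestamps as 'Nd HH:MM h'."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    total_minutes = int(delta.total_seconds()) // 60
    # Split the total into whole days, then hours and minutes.
    days, rem = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rem, 60)
    return f"{days}d {hours:02d}:{minutes:02d} h"


print(break_duration("2010-09-11 02:30", "2010-09-13 01:15"))  # prints "1d 22:45 h"
```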

Affected services:

Group directories on the file service (frodo, fs) and some websites.

Reason:

The data transfer from one disk array system (Rivendell) to another (Mithlond) was interrupted when a nasty bug in logical volume management (LVM) manifested itself after 5 TB (approximately 50% of the data) had been transferred. The bug corrupted at least the logical volume containing the group directories.

The following websites were also briefly (for a few minutes) affected because they depend on the file service:

  • www.futureinternet.fi
  • betelgeuse.hiit.fi
  • cgi.hiit.fi
  • cosco.hiit.fi
  • packages.hiit.fi
  • pgm2010.hiit.fi
  • www.mdl-research.org

Update at 2010-09-11 12:46: Data in the group directories is being restored from tape. File system checks are being run on other volumes (e.g. home directories).

Update at 2010-09-11 13:39: Other volumes, including the volume containing home directories, checked out fine.

Update at 2010-09-13 01:15: Group directories have been restored and are in use again. Samba connectivity has been restored. The break is over.


Last updated on 13 Sep 2010 by Pekka Tonteri - Page created on 11 Sep 2010 by Pekka Tonteri