Service break at 2012-11-02 18:32 - 18:46

2012-11-02 18:32 to 18:46
14 min
Affected services:
Services running on the Adaptive group's test server

On Universe, the Apache process and possibly the kernel's multipath layer had become stuck. To resolve this, the server was rebooted. All pending updates were installed at the same time.

Update at 18:48: Universe was up and running again at 18:46. The cause was that multipathd did not fail one path during maintenance on the disk array system at 2012-10-23T13:45, even though it had marked the devices behind that path as failed:

mithlond-lun-14 (3600601603c0027009a1ae2bc8a12e011) dm-2 ,
size=2.0T features='0' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- #:#:#:# -   #:#   active faulty running
| `- #:#:#:# -   #:#   active faulty running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 1:0:4:0 sda 8:0   active ready running
  `- 0:0:6:0 sdc 8:32  active ready running
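
A state like the one above could in principle be spotted automatically. The following is only a minimal sketch, assuming the output layout is exactly as in the listing above (the checker state is the second-to-last word on each path line). The names find_stuck_groups and SAMPLE are hypothetical and not part of any multipath tooling.

```python
# SAMPLE repeats the listing from this report; nothing else is assumed about the host.
SAMPLE = """\
mithlond-lun-14 (3600601603c0027009a1ae2bc8a12e011) dm-2 ,
size=2.0T features='0' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- #:#:#:# -   #:#   active faulty running
| `- #:#:#:# -   #:#   active faulty running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 1:0:4:0 sda 8:0   active ready running
  `- 0:0:6:0 sdc 8:32  active ready running
"""


def find_stuck_groups(text):
    """Return (map_name, path_count) for every path group that is
    status=active although none of its paths is in the 'ready' state."""
    stuck = []
    map_name = None
    group = None   # header line of the path group currently being read
    states = []    # checker states of the paths seen in that group

    def flush():
        # Record the group just finished if it is active with no ready path.
        if group and "status=active" in group and states and "ready" not in states:
            stuck.append((map_name, len(states)))

    for line in text.splitlines():
        if not line.strip() or line.startswith("size="):
            continue
        if "policy=" in line:
            # A new path group starts here.
            flush()
            group, states = line, []
        elif line[0] not in "|` ":
            # An unindented line without tree characters starts a new map.
            flush()
            map_name, group, states = line.split()[0], None, []
        else:
            # Path line: the second-to-last word is the checker state.
            words = line.split()
            if len(words) >= 3:
                states.append(words[-2])
    flush()
    return stuck


print(find_stuck_groups(SAMPLE))
```

Run against the listing from this incident, the sketch reports the first path group of mithlond-lun-14, which is the group that stayed active with two faulty paths.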

This caused disk I/O to fail, which made Apache generate load and leave one zombie process:

top - 18:26:31 up 150 days,  6:23, 10 users,  load average: 93.99, 93.97, 93.64
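
A header line like this one could be checked against a limit to raise an alert earlier. This is a rough sketch only: parse_load_average, HEADER and LOAD_LIMIT are hypothetical names, and the limit of 20.0 is an arbitrary illustration, not a value used on Universe.

```python
import re

# HEADER repeats the top line quoted in this report.
HEADER = "top - 18:26:31 up 150 days,  6:23, 10 users,  load average: 93.99, 93.97, 93.64"

# Hypothetical alert threshold, chosen only for illustration.
LOAD_LIMIT = 20.0


def parse_load_average(header):
    """Extract the 1, 5 and 15 minute load averages from a top header line."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", header)
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) for value in match.groups())


one, five, fifteen = parse_load_average(HEADER)
# Treat the host as overloaded only if all three averages exceed the limit,
# so that a short spike alone does not trigger.
overloaded = min(one, five, fifteen) > LOAD_LIMIT
print(one, five, fifteen, overloaded)
```

Requiring all three averages to exceed the limit is a design choice in this sketch; it favors sustained load, such as the one seen here, over brief peaks.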


