2009-01-08 14:53:44

by Dean Nelson

[permalink] [raw]
Subject: [PATCH] sgi-xp: eliminate false detection of no heartbeat

After XPC has been up and running on multiple partitions for any length of
time, if XPC on one of the partitions is stopped and restarted (either by a
rmmod/insmod or a system restart), it is possible for the XPCs running on
the other partitions to falsely detect a lack of heartbeat from the XPC that
was just restarted. This false detection will occur if the restarted XPC comes
up within the five-seconds preceding one of the other XPC's heartbeat check
(which occurs once every twenty seconds).

The detection of no heartbeat results in the detecting XPC deactivating from
the just restarted XPC. The only remedy is to restart one of the XPCs and
hope that one doesn't hit this five-second window on any of the other
partitions.

Signed-off-by: Dean Nelson <[email protected]>
Signed-off-by: Robin Holt <[email protected]>
Cc: stable <[email protected]>

---

drivers/misc/sgi-xp/xpc_sn2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/misc/sgi-xp/xpc_sn2.c
===================================================================
--- linux.orig/drivers/misc/sgi-xp/xpc_sn2.c 2008-12-30 12:01:18.000000000 -0600
+++ linux/drivers/misc/sgi-xp/xpc_sn2.c 2009-01-08 08:12:01.000000000 -0600
@@ -899,7 +899,7 @@ xpc_update_partition_info_sn2(struct xpc
dev_dbg(xpc_part, " remote_vars_pa = 0x%016lx\n",
part_sn2->remote_vars_pa);

- part->last_heartbeat = remote_vars->heartbeat;
+ part->last_heartbeat = remote_vars->heartbeat - 1;
dev_dbg(xpc_part, " last_heartbeat = 0x%016lx\n",
part->last_heartbeat);