2008-06-28 10:33:48

by NeilBrown

Subject: RE: Sm-notify

On Saturday June 28, Dirk.Laurenz-/ixSogHR0HOS/[email protected] wrote:
> Hi, the time out is from node a to b seconds, but from node b to a 15 minutes.
> what I saw was, that the handle changed in /var/lib/nfs/rmtab from 0x000002 to 0x000003 although
> the fsid is the same on both nodes.
> how can the 15minute timeout occur?

The number in rmtab (e.g. 0x000002) is like a reference count. The
fact that you sometimes see '2' and sometimes '3' is of no real
significance.

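To illustrate the reference-count point, here is a minimal sketch. It
assumes each rmtab line has the form client:path:0xCOUNT. The host and
path names are made up for the example, and parse_rmtab_line is a
hypothetical helper, not part of nfs-utils.

```python
def parse_rmtab_line(line):
    """Split one rmtab-style line into (client, path, count).

    Assumes the layout client:path:0xCOUNT; the count is the last
    colon-separated field, written in hex.
    """
    client, rest = line.strip().split(":", 1)
    path, count_hex = rest.rsplit(":", 1)
    return client, path, int(count_hex, 16)


# Hypothetical entries as they might look before and after a failover.
before = parse_rmtab_line("clienta:/export/data:0x00000002")
after = parse_rmtab_line("clienta:/export/data:0x00000003")

# Same client and same export path; only the count differs.
same_export = before[:2] == after[:2]
print(before, after, same_export)
```

The only field that differs between the two entries is the count, so a
change from 2 to 3 by itself says nothing about the export or the
filehandle.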
Am I correct in interpreting what you say as:

When you fail over from 'a' to 'b', it does so quite quickly, but
when you then fail back from 'b' to 'a', you get a 15 minute timeout?

If not, please explain in more detail.
If so:
Why exactly are you doing this? It doesn't seem to make sense.
Did 'b' get rebooted in the interim?
It could be that you are hitting the same issue discussed here:


and in the surrounding thread.

However, this has nothing to do with sm-notify.