Return-Path: linux-nfs-owner@vger.kernel.org
Received: from mx2.netapp.com ([216.240.18.37]:35956 "EHLO mx2.netapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751584Ab1LYJsC convert rfc822-to-8bit (ORCPT ); Sun, 25 Dec 2011 04:48:02 -0500
Message-ID: <1324806463.2740.6.camel@lade.trondhjem.org>
Subject: Re: Session timeout on RHEL6.2
From: Trond Myklebust
To: Benny Halevy
Cc: tigran.mkrtchyan@desy.de, linux-nfs
Date: Sun, 25 Dec 2011 10:47:43 +0100
In-Reply-To: <4EF6A898.2010207@tonian.com>
References: <1324475851.7709.12.camel@lade.trondhjem.org> <4EF6A898.2010207@tonian.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Sun, 2011-12-25 at 06:37 +0200, Benny Halevy wrote:
> On 2011-12-21 22:11, Tigran Mkrtchyan wrote:
> > On Wed, Dec 21, 2011 at 2:57 PM, Trond Myklebust wrote:
> >> On Wed, 2011-12-21 at 10:24 +0100, Tigran Mkrtchyan wrote:
> >>> Dear friends,
> >>>
> >>> We are observing strange behavior with RHEL 6.2:
> >>>
> >>> Our server lease time is 90 seconds. I can see that the client
> >>> sends SEQUENCE every 60 sec. And this is for some hours ( ~8 ).
> >>> At some point the client sends SEQUENCE after 127 seconds and
> >>> gets, as expected, EXPIRED.
> >>
> >> Why shouldn't the client be allowed to let the lease expire if nothing
> >> is using that filesystem?
> >>
> >>> At this point I have to blame myself.
> >>> The client comes with EXCHANGE_ID using the same clientid.
> >>> We did not garbage collect the clientid internally as this happens after
> >>> 2*LEASE_TIME
> >>> and return EXPIRE. This ping-pong never ends.
> >>>
> >>> This is probably mostly a bug on my side. Nevertheless we never observed late
> >>> SEQUENCE with kernel > 2.6.39. A short packet dump attached.
> >>>
> >>> I can open a bug at RHEL if required.
> >>
> >> I wouldn't consider that a bug.
> >
> > As I said, there is a bug in exchange_id processing ( case 3 ) on my
> > side. But to me it sounds strange that the client, after more than 8
> > hours of sending only sequence, decided to send one of them later than
> > lease time. Especially, that we did not have it with other kernels.
>
> I'm inclined to agree. The client can let the lease expire for sure
> and that's not a bug but the fact that the client sent the SEQUENCE operation
> after the lease had expired indicates it might not be aware of that fact
> and that seems to be a client bug.
>
> That said, I don't think that letting the lease expire when the client is idle
> is the most polite thing to do. Why let the server clean up after the client
> and revert to possibly un-optimized recovery paths rather than orderly
> destruction of the state by the client?

There are plenty of cases where the client can be idle for hours or even
_days_. What's the point of pinging the server all the time after
working hours?

If someone wants to code up a DESTROY_SESSION and DESTROY_CLIENTID in
order to make it formal, then fine; however, note that we don't even do
that on a full unmount today.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com