Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S944189AbcJaPaQ (ORCPT ); Mon, 31 Oct 2016 11:30:16 -0400 Received: from us-smtp-delivery-194.mimecast.com ([63.128.21.194]:20336 "EHLO us-smtp-delivery-194.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S944174AbcJaPaM (ORCPT ); Mon, 31 Oct 2016 11:30:12 -0400 From: Trond Myklebust To: Sebastian Andrzej Siewior CC: Schumaker Anna , List Linux NFS Mailing , List Linux Kernel Mailing , "tglx@linutronix.de" Subject: Re: [PATCH v3] NFSv4: replace seqcount_t with a rw_semaphore Thread-Topic: [PATCH v3] NFSv4: replace seqcount_t with a rw_semaphore Thread-Index: AQHSM3mEHI2fh+m5skS58vgJ3aR4g6DCsFUA Date: Mon, 31 Oct 2016 15:30:02 +0000 Message-ID: <51E3F753-2D1F-4370-BFEB-BD8356D622A1@primarydata.com> References: <20161021164727.24485-1-bigeasy@linutronix.de> <20161028210511.k7hrsfxovti7gqtu@linutronix.de> <747E2CCB-3D83-40FD-8B31-46AB5A5E8592@primarydata.com> <20161031131958.mrrez5t7sow75p6v@linutronix.de> In-Reply-To: <20161031131958.mrrez5t7sow75p6v@linutronix.de> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-messagesentrepresentingtype: 1 x-originating-ip: [50.124.59.220] x-ms-office365-filtering-correlation-id: 3bb8473d-c2b8-4523-4e5c-08d401a2c881 x-microsoft-exchange-diagnostics: 1;BN6PR11MB1572;7:zQsu8wLfLQLTLgM1KHqa2kOV3XdyPKTDplZWQG0+D6ekUKdBExtUXNwo0w+Cwt8LZucCIIqpdsIa2h6Yn2K27GBLAIGlgNmpCB4rg0Uy2/hv/SfzYbNhFrddBhcJpRnd4VHM3nwSZ/ANNnr9XbrDeoqwxf0sWi70KaM2pUf5g1ZK9CSu6/iN9LZb3mWauVg7TzWgy8QJbe0K9FF2xcIm64dKZjrsRVkYsmkVI6kWhM5yuY5+fXRjujxwjN0muO8HmR9wOfhRJrZSRthoTt8435I1xoNjBf70cJWPSrI7P29VI9M9cOTj+b9Js5NKgUszrL7TyxP1d/9RPj6nA9ZLcjIazJKCt2tSZ46FD6OaiBc=;20:5fDH+y+b/Y0FBwzzKeCiz4KpWRgHk749SNOQCu6GE/YWjhKP1Sf/IxATK82NCQtM5azRTFtPx5rU/i1QGeYcIF4DE+Ke9AhkcFn8bDH+AQ0knn2/iDdOX1xt1AeiYa0goTlPmBN6b+Wx9ahBHBkocFqlhSytOLJ5DtDbPSLwlmw= x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:BN6PR11MB1572; x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(158342451672863)(788757137089); x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(6040176)(601004)(2401047)(5005006)(8121501046)(3002001)(10201501046)(6043046)(6042046);SRVR:BN6PR11MB1572;BCL:0;PCL:0;RULEID:;SRVR:BN6PR11MB1572; x-forefront-prvs: 01128BA907 x-forefront-antispam-report: SFV:NSPM;SFS:(10019020)(6009001)(7916002)(189002)(24454002)(199003)(189998001)(86362001)(4326007)(3280700002)(97736004)(5002640100001)(105586002)(50986999)(3660700001)(99286002)(54356999)(76176999)(106116001)(11100500001)(68736007)(305945005)(106356001)(93886004)(19580395003)(36756003)(19580405001)(101416001)(7736002)(7846002)(8676002)(81156014)(81166006)(2950100002)(77096005)(122556002)(6916009)(2900100001)(5660300001)(33656002)(10400500002)(92566002)(110136003)(2906002)(66066001)(87936001)(83716003)(586003)(102836003)(6116002)(82746002)(3846002)(8936002)(104396002);DIR:OUT;SFP:1102;SCL:1;SRVR:BN6PR11MB1572;H:BN6PR11MB1570.namprd11.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1;LANG:en; spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-ID: <437952EC8F93994BAE5576739B19C5E4@namprd11.prod.outlook.com> MIME-Version: 1.0 X-OriginatorOrg: primarydata.com X-MS-Exchange-CrossTenant-originalarrivaltime: 31 Oct 2016 15:30:02.2306 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 03193ed6-8726-4bb3-a832-18ab0d28adb7 X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN6PR11MB1572 X-MC-Unique: Bm4XHYVbMkuIjgQdH6Kp_w-1 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id u9VFUMcb029455 Content-Length: 2297 Lines: 43 > On Oct 31, 2016, at 09:19, Sebastian Andrzej Siewior wrote: > > The raw_write_seqcount_begin() in nfs4_reclaim_open_state() bugs me > because it maps to preempt_disable() in -RT which I can't have at this > point. So I took a look at the code. > It the lockdep part was removed in commit abbec2da13f0 ("NFS: Use > raw_write_seqcount_begin/end int nfs4_reclaim_open_state") because > lockdep complained. The whole seqcount thing was introduced in commit > c137afabe330 ("NFSv4: Allow the state manager to mark an open_owner as > being recovered"). > Trond Myklebust confirms that there i only one writer thread at > nfs4_reclaim_open_state() > > The list_for_each_entry() in nfs4_reclaim_open_state: > It seems that this lock protects the ->so_states list among other > atomic_t & flags members. So at the begin of the loop we inc ->count > ensuring that this field is not removed while we use it. So we drop the > ->so_lock loc during the loop it seems. And after nfs4_reclaim_locks() > invocation we nfs4_put_open_state() and grab the ->so_lock again. So if > we were the last user of this struct and we remove it, then the > following list_next_entry() invocation is a use-after-free. Even if we > use list_for_each_entry_safe() there is no guarantee that the following > member is still valid because it might have been removed by someone > invoking nfs4_put_open_state() on it, right? > So there is this. > > However to address my initial problem I have here a patch :) So it uses > a rw_semaphore which ensures that there is only one writer at a time or > multiple reader. So it should be basically what is happening now plus a > tiny tiny tiny lock plus lockdep coverage. I tried to this myself but I > don't manage to get into this code path at all so I might be doing > something wrong. > > Could you please check if this patch is working for you and whether my > list_for_each_entry() observation is correct or not? > > v2…v3: replace the seqlock with a RW semaphore. > NACK. That will deadlock. The reason why we use a seqlock there is precisely because we cannot allow ordinary RPC calls to lock out the recovery thread. If the server reboots, then ordinary RPC calls will fail until the recovery thread has had a chance to re-establish the state.