From: Ben Hutchings
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
CC: akpm@linux-foundation.org, "Kiran Kumar Modukuri", "David Howells"
Date: Sun, 11 Nov 2018 19:49:05 +0000
Subject: [PATCH 3.16 291/366] cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag

3.16.61-rc1 review patch.
If anyone has any objections, please let me know.

------------------

From: Kiran Kumar Modukuri

commit 5ce83d4bb7d8e11e8c1c687d09f4b5ae67ef3ce3 upstream.

In cachefiles_mark_object_active(), the new object is marked active and
then we try to add it to the active object tree.  If a conflicting object
is already present, we want to wait for that to go away.  After the wait,
we go round again and try to re-mark the object as being active - but it's
already marked active from the first time we went through and a BUG is
issued.

Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.

Analysis from Kiran Kumar Modukuri:

[Impact]
Oops during heavy NFS + FSCache + Cachefiles

  CacheFiles: Error: Overlong wait for old active object to go away.
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
  CacheFiles: Error: Object already active
  kernel BUG at fs/cachefiles/namei.c:163!

[Cause]
In a heavily loaded system with big files being read and truncated, an
fscache object for a cookie is being dropped and a new object is being
looked up.  The new object being looked up has to wait for the old object
to go away before the new object is moved to active state.

[Fix]
Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
retrying the object lookup.

[Testcase]
Have run ~100 hours of NFS stress tests and have not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri
Signed-off-by: David Howells
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings
---
 fs/cachefiles/namei.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -189,6 +189,7 @@ try_again:
 	/* an old object from a previous incarnation is hogging the slot - we
 	 * need to wait for it to be destroyed */
 wait_for_old_object:
+	clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
 	if (fscache_object_is_live(&object->fscache)) {
 		pr_err("\n");
 		pr_err("Error: Unexpected object collision\n");
@@ -250,7 +251,6 @@ wait_for_old_object:
 	goto try_again;
 
 requeue:
-	clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
 	cache->cache.ops->put_object(&xobject->fscache);
 	_leave(" = -ETIMEDOUT");
 	return -ETIMEDOUT;