Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp3326745imu; Sun, 11 Nov 2018 12:26:52 -0800 (PST) X-Google-Smtp-Source: AJdET5e6FXTcr8tgSQ7fLWqkyB/rxxtbARmR1gS6zkrgwKYoNVkV/R22I/9EkMe5NuifGN+BlN8D X-Received: by 2002:a17:902:650a:: with SMTP id b10-v6mr17105524plk.36.1541968012405; Sun, 11 Nov 2018 12:26:52 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1541968012; cv=none; d=google.com; s=arc-20160816; b=hRI7F2XwmqyjMSUN57rSqlrZeoW92Dy8J0tguZ6YZD9cvimR/2pRVXKzKU+yVVQUWq GtZkib4DEWeVMHS+IvK/EVqV5k7EbniygIWRBWIsg2ZdZp8d/IGAmOTg4xuU+QmH5627 9wWkHn5WSJ1lI40ZAw8KgDB+b63SNojII0GnpbhAGsTh6ciYP8jcqjJHWJdA9X5YMhDM oiirUcdmuE9nZMMr0hVJVxqHiKvniuHOG58XeKrTk/NZcbrJ8UqVfZ4ttnAIiILlzv8O 4MRNos4nCl0sYz75NuGecD1ZYImqM813RzP828sm8Icvh2w1gCnBbJIqJfLn9WbKj3Vl kOmA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:subject:message-id:date:cc:to :from:mime-version:content-transfer-encoding:content-disposition; bh=+HMYVKcrxLzWwLrHqa5Zripk6wX1Th91GcHi5HxIoQc=; b=BYf8/Kq1tLlwuNP8Zci0SFaXULSUNkdqNGCkvDQX6pIfYNBfyBsfevCXX922E5Ywhr jq9lvD73svVqsSzfzSQzR0vnmvj+YU0KRZX2cbInOQd1MXjsNHit/aICKukPQHCYN6tz GrYonXdNYT4KeuKqKK8nhEPzEuFgo67ab1DnqB3OYNOMHoiYKlAqLdsqG8R+vj++1t8N ECI8SimZO3hKsbgiVhwMQDLU4Dlm84o51jBuUOrrHMwFrcxmthRpeMLmDkdJg5F4QOag xT68YLeHfZFDaqnmqfr8/wVURVs4QosrRVaFhyp8N4UOhxcvZ6BUUa3n+XLF0zNKrOOM UumQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q64si14677535pga.280.2018.11.11.12.26.37; Sun, 11 Nov 2018 12:26:52 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730669AbeKLFs2 (ORCPT + 99 others); Mon, 12 Nov 2018 00:48:28 -0500 Received: from shadbolt.e.decadent.org.uk ([88.96.1.126]:50848 "EHLO shadbolt.e.decadent.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730579AbeKLFs1 (ORCPT ); Mon, 12 Nov 2018 00:48:27 -0500 Received: from [192.168.4.242] (helo=deadeye) by shadbolt.decadent.org.uk with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1gLvsl-0000l8-Pl; Sun, 11 Nov 2018 19:58:56 +0000 Received: from ben by deadeye with local (Exim 4.91) (envelope-from ) id 1gLvsY-0001mG-5j; Sun, 11 Nov 2018 19:58:42 +0000 Content-Type: text/plain; charset="UTF-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit MIME-Version: 1.0 From: Ben Hutchings To: linux-kernel@vger.kernel.org, stable@vger.kernel.org CC: akpm@linux-foundation.org, "Kiran Kumar Modukuri" , "David Howells" Date: Sun, 11 Nov 2018 19:49:05 +0000 Message-ID: X-Mailer: LinuxStableQueue (scripts by bwh) Subject: [PATCH 3.16 290/366] fscache: Fix reference overput in fscache_attach_object() error handling In-Reply-To: X-SA-Exim-Connect-IP: 192.168.4.242 X-SA-Exim-Mail-From: ben@decadent.org.uk X-SA-Exim-Scanned: No (on shadbolt.decadent.org.uk); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.16.61-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Kiran Kumar Modukuri commit f29507ce66701084c39aeb1b0ae71690cbff3554 upstream. When a cookie is allocated that causes fscache_object structs to be allocated, those objects are initialised with the cookie pointer, but aren't blessed with a ref on that cookie unless the attachment is successfully completed in fscache_attach_object(). If attachment fails because the parent object was dying or there was a collision, fscache_attach_object() returns without incrementing the cookie counter - but upon failure of this function, the object is released which then puts the cookie, whether or not a ref was taken on the cookie. Fix this by taking a ref on the cookie when it is assigned in fscache_object_init(), even when we're creating a root object. Analysis from Kiran Kumar: This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277 fscache cookie ref count updated incorrectly during fscache object allocation resulting in following Oops. kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321! kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639! [Cause] Two threads are trying to do operate on a cookie and two objects. (1) One thread tries to unmount the filesystem and in process goes over a huge list of objects marking them dead and deleting the objects. cookie->usage is also decremented in following path: nfs_fscache_release_super_cookie -> __fscache_relinquish_cookie ->__fscache_cookie_put ->BUG_ON(atomic_read(&cookie->usage) <= 0); (2) A second thread tries to lookup an object for reading data in following path: fscache_alloc_object 1) cachefiles_alloc_object -> fscache_object_init -> assign cookie, but usage not bumped. 2) fscache_attach_object -> fails in cant_attach_object because the cookie's backing object or cookie's->parent object are going away 3) fscache_put_object -> cachefiles_put_object ->fscache_object_destroy ->fscache_cookie_put ->BUG_ON(atomic_read(&cookie->usage) <= 0); [NOTE from dhowells] It's unclear as to the circumstances in which (2) can take place, given that thread (1) is in nfs_kill_super(), however a conflicting NFS mount with slightly different parameters that creates a different superblock would do it. A backtrace from Kiran seems to show that this is a possibility: kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639! ... RIP: __fscache_cookie_put+0x3a/0x40 [fscache] Call Trace: __fscache_relinquish_cookie+0x87/0x120 [fscache] nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs] nfs_kill_super+0x29/0x40 [nfs] deactivate_locked_super+0x48/0x80 deactivate_super+0x5c/0x60 cleanup_mnt+0x3f/0x90 __cleanup_mnt+0x12/0x20 task_work_run+0x86/0xb0 exit_to_usermode_loop+0xc2/0xd0 syscall_return_slowpath+0x4e/0x60 int_ret_from_sys_call+0x25/0x9f [Fix] Bump up the cookie usage in fscache_object_init, when it is first being assigned a cookie atomically such that the cookie is added and bumped up if its refcount is not zero. Remove the assignment in fscache_attach_object(). [Testcase] I have run ~100 hours of NFS stress tests and not seen this bug recur. [Regression Potential] - Limited to fscache/cachefiles. Fixes: ccc4fc3d11e9 ("FS-Cache: Implement the cookie management part of the netfs API") Signed-off-by: Kiran Kumar Modukuri Signed-off-by: David Howells [bwh: Backported to 3.16: Keep using atomic_inc() instead of fscache_cookie_get()] Signed-off-by: Ben Hutchings --- fs/cachefiles/bind.c | 3 ++- fs/fscache/cache.c | 2 +- fs/fscache/cookie.c | 7 ++++--- fs/fscache/object.c | 1 + 4 files changed, 8 insertions(+), 5 deletions(-) --- a/fs/cachefiles/bind.c +++ b/fs/cachefiles/bind.c @@ -218,7 +218,8 @@ static int cachefiles_daemon_add_cache(s "%s", fsdef->dentry->d_sb->s_id); - fscache_object_init(&fsdef->fscache, NULL, &cache->cache); + fscache_object_init(&fsdef->fscache, &fscache_fsdef_index, + &cache->cache); ret = fscache_add_cache(&cache->cache, &fsdef->fscache, cache->tag); if (ret < 0) --- a/fs/fscache/cache.c +++ b/fs/fscache/cache.c @@ -220,6 +220,7 @@ int fscache_add_cache(struct fscache_cac { struct fscache_cache_tag *tag; + ASSERTCMP(ifsdef->cookie, ==, &fscache_fsdef_index); BUG_ON(!cache->ops); BUG_ON(!ifsdef); @@ -248,7 +249,6 @@ int fscache_add_cache(struct fscache_cac if (!cache->kobj) goto error; - ifsdef->cookie = &fscache_fsdef_index; ifsdef->cache = cache; cache->fsdef = ifsdef; --- a/fs/fscache/cookie.c +++ b/fs/fscache/cookie.c @@ -302,6 +302,7 @@ static int fscache_alloc_object(struct f goto error; } + ASSERTCMP(object->cookie, ==, cookie); fscache_stat(&fscache_n_object_alloc); object->debug_id = atomic_inc_return(&fscache_object_debug_id); @@ -356,6 +357,8 @@ static int fscache_attach_object(struct _enter("{%s},{OBJ%x}", cookie->def->name, object->debug_id); + ASSERTCMP(object->cookie, ==, cookie); + spin_lock(&cookie->lock); /* there may be multiple initial creations of this object, but we only @@ -395,9 +398,7 @@ static int fscache_attach_object(struct spin_unlock(&cache->object_list_lock); } - /* attach to the cookie */ - object->cookie = cookie; - atomic_inc(&cookie->usage); + /* Attach to the cookie. The object already has a ref on it. */ hlist_add_head(&object->cookie_link, &cookie->backing_objects); fscache_objlist_add(object); --- a/fs/fscache/object.c +++ b/fs/fscache/object.c @@ -313,6 +313,7 @@ void fscache_object_init(struct fscache_ object->store_limit_l = 0; object->cache = cache; object->cookie = cookie; + atomic_inc(&cookie->usage); object->parent = NULL; #ifdef CONFIG_FSCACHE_OBJECT_LIST RB_CLEAR_NODE(&object->objlist_link);