From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg
 Kroah-Hartman, stable@vger.kernel.org, Gang Li, Muchun Song, "Kirill A. Shutemov", Hugh Dickins, Andrew Morton, Linus Torvalds
Subject: [PATCH 5.10 040/563] shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode
Date: Mon, 24 Jan 2022 19:36:45 +0100
Message-Id: <20220124184025.818939676@linuxfoundation.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220124184024.407936072@linuxfoundation.org>
References: <20220124184024.407936072@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Gang Li

commit 62c9827cbb996c2c04f615ecd783ce28bcea894b upstream.

Fix a data race in commit 779750d20b93 ("shmem: split huge pages beyond
i_size under memory pressure").

Here are the call traces that cause the race:

   Call Trace 1:
     shmem_unused_huge_shrink+0x3ae/0x410
     ? __list_lru_walk_one.isra.5+0x33/0x160
     super_cache_scan+0x17c/0x190
     shrink_slab.part.55+0x1ef/0x3f0
     shrink_node+0x10e/0x330
     kswapd+0x380/0x740
     kthread+0xfc/0x130
     ? mem_cgroup_shrink_node+0x170/0x170
     ? kthread_create_on_node+0x70/0x70
     ret_from_fork+0x1f/0x30

   Call Trace 2:
     shmem_evict_inode+0xd8/0x190
     evict+0xbe/0x1c0
     do_unlinkat+0x137/0x330
     do_syscall_64+0x76/0x120
     entry_SYSCALL_64_after_hwframe+0x3d/0xa2

A simple explanation:

Imagine there are 3 items in the local list (@list).  In the first
traversal, A is not deleted from @list.

  1)    A->B->C
        ^
        |
        pos (leave)

In the second traversal, B is deleted from @list.  Concurrently, A is
deleted from @list through shmem_evict_inode() because the last reference
to the inode is dropped by another thread.  The local @list is not
protected by any lock, so the two deletions race and @list is corrupted.

  2)    A->B->C
        ^  ^
        |  |
     evict pos (drop)

We should make sure the inode is either on the global list or deleted from
any local list before iput().

Fix this by moving inodes back to the global list before we put them.

[akpm@linux-foundation.org: coding style fixes]

Link: https://lkml.kernel.org/r/20211125064502.99983-1-ligang.bdlg@bytedance.com
Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Gang Li
Reviewed-by: Muchun Song
Acked-by: Kirill A. Shutemov
Cc: Hugh Dickins
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 mm/shmem.c |   37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -527,7 +527,7 @@ static unsigned long shmem_unused_huge_s
 	struct shmem_inode_info *info;
 	struct page *page;
 	unsigned long batch = sc ? sc->nr_to_scan : 128;
-	int removed = 0, split = 0;
+	int split = 0;
 
 	if (list_empty(&sbinfo->shrinklist))
 		return SHRINK_STOP;
@@ -542,7 +542,6 @@ static unsigned long shmem_unused_huge_s
 		/* inode is about to be evicted */
 		if (!inode) {
 			list_del_init(&info->shrinklist);
-			removed++;
 			goto next;
 		}
 
@@ -550,12 +549,12 @@ static unsigned long shmem_unused_huge_s
 		if (round_up(inode->i_size, PAGE_SIZE) ==
 				round_up(inode->i_size, HPAGE_PMD_SIZE)) {
 			list_move(&info->shrinklist, &to_remove);
-			removed++;
 			goto next;
 		}
 
 		list_move(&info->shrinklist, &list);
 next:
+		sbinfo->shrinklist_len--;
 		if (!--batch)
 			break;
 	}
@@ -575,7 +574,7 @@ next:
 		inode = &info->vfs_inode;
 
 		if (nr_to_split && split >= nr_to_split)
-			goto leave;
+			goto move_back;
 
 		page = find_get_page(inode->i_mapping,
 				(inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT);
@@ -589,38 +588,44 @@ next:
 		}
 
 		/*
-		 * Leave the inode on the list if we failed to lock
-		 * the page at this time.
+		 * Move the inode on the list back to shrinklist if we failed
+		 * to lock the page at this time.
 		 *
 		 * Waiting for the lock may lead to deadlock in the
 		 * reclaim path.
 		 */
 		if (!trylock_page(page)) {
 			put_page(page);
-			goto leave;
+			goto move_back;
 		}
 
 		ret = split_huge_page(page);
 		unlock_page(page);
 		put_page(page);
 
-		/* If split failed leave the inode on the list */
+		/* If split failed move the inode on the list back to shrinklist */
 		if (ret)
-			goto leave;
+			goto move_back;
 
 		split++;
 drop:
 		list_del_init(&info->shrinklist);
-		removed++;
-leave:
+		goto put;
+move_back:
+		/*
+		 * Make sure the inode is either on the global list or deleted
+		 * from any local list before iput() since it could be deleted
+		 * in another thread once we put the inode (then the local list
+		 * is corrupted).
+		 */
+		spin_lock(&sbinfo->shrinklist_lock);
+		list_move(&info->shrinklist, &sbinfo->shrinklist);
+		sbinfo->shrinklist_len++;
+		spin_unlock(&sbinfo->shrinklist_lock);
+put:
 		iput(inode);
 	}
 
-	spin_lock(&sbinfo->shrinklist_lock);
-	list_splice_tail(&list, &sbinfo->shrinklist);
-	sbinfo->shrinklist_len -= removed;
-	spin_unlock(&sbinfo->shrinklist_lock);
-
 	return split;
 }
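
The rule stated in the move_back comment (an entry must be on the lock-protected global list, or on no list at all, before the scanner drops its reference) can be modelled in user space. The sketch below is not kernel code and makes several assumptions: `struct registry`, `struct item`, `evict()`, `put_item()` and `scan()` are hypothetical stand-ins for sbinfo, shmem_inode_info, shmem_evict_inode(), iput() and shmem_unused_huge_shrink(); a pthread mutex stands in for shrinklist_lock; the list helpers are simplified re-implementations; and the reference count is a plain int because the model is exercised from a single thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Circular doubly linked list, modelled loosely on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_init(e);
}

static void list_move_tail(struct list_head *e, struct list_head *h)
{
	list_del_init(e);
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

/* Hypothetical stand-ins: registry ~ sbinfo, item ~ shmem_inode_info. */
struct registry { pthread_mutex_t lock; struct list_head items; long len; };
struct item { struct list_head node; int keep; int refs; };

#define item_of(p) ((struct item *)((char *)(p) - offsetof(struct item, node)))

static void registry_init(struct registry *r)
{
	pthread_mutex_init(&r->lock, NULL);
	list_init(&r->items);
	r->len = 0;
}

static void registry_add(struct registry *r, struct item *it)
{
	list_init(&it->node);
	pthread_mutex_lock(&r->lock);
	list_move_tail(&it->node, &r->items);
	r->len++;
	pthread_mutex_unlock(&r->lock);
}

/* Plays the role of shmem_evict_inode(): unlink under the global lock. */
static void evict(struct registry *r, struct item *it)
{
	pthread_mutex_lock(&r->lock);
	if (!list_empty(&it->node)) {
		list_del_init(&it->node);
		r->len--;
	}
	pthread_mutex_unlock(&r->lock);
}

/* Plays the role of iput(): the final put evicts the item. */
static void put_item(struct registry *r, struct item *it)
{
	if (--it->refs == 0)
		evict(r, it);
}

/* Returns the number of items the scan itself dropped from the lists. */
static int scan(struct registry *r)
{
	struct list_head local, *pos, *next;
	int dropped = 0;

	list_init(&local);

	/* Phase 1: pin every item and detach it to a private list. */
	pthread_mutex_lock(&r->lock);
	for (pos = r->items.next; pos != &r->items; pos = next) {
		next = pos->next;
		item_of(pos)->refs++;
		list_move_tail(pos, &local);
		r->len--;
	}
	pthread_mutex_unlock(&r->lock);

	/* Phase 2: walk the private list without holding the lock. */
	for (pos = local.next; pos != &local; pos = next) {
		struct item *it = item_of(pos);

		next = pos->next;
		if (it->keep) {
			/* Back on the global list before the reference goes. */
			pthread_mutex_lock(&r->lock);
			list_move_tail(pos, &r->items);
			r->len++;
			pthread_mutex_unlock(&r->lock);
		} else {
			list_del_init(pos);
			dropped++;
		}
		/* From here on evict() may run; it never sees the private list. */
		put_item(r, it);
	}
	return dropped;
}
```

In this model, an item whose last reference is dropped by `put_item()` is unlinked by `evict()` from the global list under the lock, so the private list that `scan()` is still walking is never modified behind its back. Leaving a kept item on the private list until after `put_item()`, as the pre-patch code did, is what would let `evict()` touch a list that no lock protects.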