Received: by 2002:a89:2c3:0:b0:1ed:23cc:44d1 with SMTP id d3csp807024lqs; Tue, 5 Mar 2024 18:20:09 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCWYyADwehDNYFJT59ebRwL+x2Sv0PVCLomGZGRBBUPrFNOGgPrzeyS9CyNj0a1U2rrjJH+6A9f4sg8iCrlFdmOW8msiXUaOHmQGPq+1xA== X-Google-Smtp-Source: AGHT+IEW0Vz9dwmm+6HmcKOP1iP+iUEwL1BWu1BrNbodm/zT3A6iFLQYsVTogX3vdw93shxrQJtC X-Received: by 2002:a17:906:c49:b0:a3f:173a:224c with SMTP id t9-20020a1709060c4900b00a3f173a224cmr8402003ejf.51.1709691609056; Tue, 05 Mar 2024 18:20:09 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709691609; cv=pass; d=google.com; s=arc-20160816; b=CZO+4186pkx7BsGeBKzAnn4AMVP8aQDQwFEhDmd3X24ko5vcGskff0mxagkjx3Sto8 7IExGKSTFRSW99hHnrm3lv3jt4WJR0dm9LIEfi71x0jUUUpPORHZqP1wmzioNMmFdQm0 RqzPiG0ldb88If78aOWzlhYkR8ViT/2AhUQH9IWHvWmTk1nAhn9yb65rUhJg41/YyTH+ JSIz4EFRFb6mPOb+5E1aqbDQ16IVpM4gPYNv5Ij7Ac6K+hEfOuFXW02A4M0U818Os71W p7aZIYgOr6bLcn54gyqjV4fP+/AHyFY3JEyj3OUJ+/kaZkv1z8l6xsh39W16ql8hUO3a oKIA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:date:dkim-signature; bh=acf+30rZOU0mvAbyQhYXFdJzkgjEIoHhA9P8NXsNW7E=; fh=qBCAJq87bJKCz8326VRTATOBa6+1XeVMnKxWgkpKVBM=; b=FC0oAGqfCHdHjUJe4YaQ4h1DAIYabH0OtK10++jBK8MyaumT8SFUVqlP46PamILKal JCHIbjVIrhuzEu9KC4vA51iqrwo5y7TgmqFn9BQcRaphgF9TI+J7//IDppukSf5pfniC 76Sh+0z+Gp6J9Ao0/PhEzHy7ZWaa3bjX6TLbJNwXt/6qhfulip5JFFGqRcULdenuRQog 6rMYJd2Vvlu0/VVahv/2OBWE8Kbi9KcO7vS5heVGWGmW5qphGD36KVRQFPjXUjy+3PS2 dEgFpeTWUoq6hV5mujKK7jv7ANWW9wWvAJKOAH7ii0pHWVJOFHXRbVTHvaJ4V4HkW9U5 L14w==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@cmpxchg-org.20230601.gappssmtp.com header.s=20230601 header.b=PpUTOqRZ; arc=pass (i=1 spf=pass spfdomain=cmpxchg.org dkim=pass dkdomain=cmpxchg-org.20230601.gappssmtp.com dmarc=pass fromdomain=cmpxchg.org); spf=pass (google.com: domain of linux-kernel+bounces-93236-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-93236-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id sh14-20020a1709076e8e00b00a45bb775e45si111731ejc.352.2024.03.05.18.20.08 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Mar 2024 18:20:09 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-93236-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@cmpxchg-org.20230601.gappssmtp.com header.s=20230601 header.b=PpUTOqRZ; arc=pass (i=1 spf=pass spfdomain=cmpxchg.org dkim=pass dkdomain=cmpxchg-org.20230601.gappssmtp.com dmarc=pass fromdomain=cmpxchg.org); spf=pass (google.com: domain of linux-kernel+bounces-93236-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-93236-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 962721F23F94 for ; Wed, 6 Mar 2024 02:20:08 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 98802D52E; Wed, 6 Mar 2024 02:20:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cmpxchg-org.20230601.gappssmtp.com header.i=@cmpxchg-org.20230601.gappssmtp.com header.b="PpUTOqRZ" Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE11479E4 for ; Wed, 6 Mar 2024 02:19:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.51 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709691599; cv=none; b=bCVHxstHZhrjMwpiU5jJVcuwp5weGnx+LjEEnlYi6+0DCTp6ZQpYykbRRFL90un7rHJ0Cc8x1ZHk/vfBBrdQsf52siROPeVR4DAECR1HMxpe0bWbVXGS90fksJedZrXiNUXACq5oO4k4it+SnnFHmCnWo/qg6oz3UlWtrSkgvko= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709691599; c=relaxed/simple; bh=onWesjGOBT5yTGQmpOG1LxJdI89DNGUvK1WI3F/z19k=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=mISzBK1XpjrvC97dySQmNBIh19fJSex+5CdehfEmGPPC1USWcRND7F5QEmAGfNmlAitBZzcjEJ3HfMjCLVRjMzhLx18+FzWKCEgWUmT5TO8aEyDtIquZc8rUdDQjmQEvkFbbn13DvQNvliPdqN7L7ssNxjyw7/LbspDl85z8Kaw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=cmpxchg.org; spf=pass smtp.mailfrom=cmpxchg.org; dkim=pass (2048-bit key) header.d=cmpxchg-org.20230601.gappssmtp.com header.i=@cmpxchg-org.20230601.gappssmtp.com header.b=PpUTOqRZ; arc=none smtp.client-ip=209.85.219.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=cmpxchg.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cmpxchg.org Received: by mail-qv1-f51.google.com with SMTP id 6a1803df08f44-68ee2c0a237so3308196d6.1 for ; Tue, 05 Mar 2024 18:19:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1709691595; x=1710296395; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=acf+30rZOU0mvAbyQhYXFdJzkgjEIoHhA9P8NXsNW7E=; b=PpUTOqRZ0wRh0l7sEW5J/XK5emyGAu0rv9q4q6oV9molp3oDkdbmvw5yqpeg4AaoXY BcJEBWRb92PJvVoZeUTEn1Qf4eJvtQG+Hn0I2PIv2yt2N4Q0gdZriuEfbJHOeohBvz/N 9l6oSwegZs6oOrn2Hittxpn8zrdRtXKQEPa7L4Cvvqun30CLp7meYhpMwwgSSe8NGSfJ 6ztyVWJBENTKzqUXmtiZZu545UMModMSqNLP/qZ1biac+tJE8Cbf3fNCOYbxiNxELh9r iu+tKM+g4LD85WGoTJERUihZy5YuQKeqSueg4fpPVZ4quH1kFoC0H2HTzd6EmlRJHC8e LdtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709691595; x=1710296395; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=acf+30rZOU0mvAbyQhYXFdJzkgjEIoHhA9P8NXsNW7E=; b=v12JOnRsHa34i0e0cGSN9YmUeq8S//QxdJTGbP93odeUV2TB2jMtfbRNpadFH97lhg HuwsKVlbF9ABBCUmYvhnkXJ0EfsGRpoOHp28x+ElbaZhjgtuGzXp+Wm/+elxfOC6zc5E woY/sdEeNVDqgHZGXJR+c4tU3rMhHLP7A9M69E/tZ+VKfJv3o6Mi+UktmPDsetxhvfF0 3Rf2vob+B6wvAyvlKt0ifGuErMa8Jb5tFJCIt/VXjfd5eMcCynhTeppPoHk3JiEggjTx iANESUQ0RH4hp4cywy89dNY1NQhISPLNTflEHDJgdFuVnWa/W4yD5rXUhtCidvz+DmEg L/CA== X-Forwarded-Encrypted: i=1; AJvYcCWXGsSzGfvf3/p1C4eh5wMwrlqlQuw+X6DLB04Dymn5EiO0P+O+WQI0vzys+Vjf0zeY6Q7sRhRB3VwpWBYe5Os5sj5n0h0hr4yPXjyN X-Gm-Message-State: AOJu0YzWYNKLvC/1lcklS+Ph7EUfGJerigG7/NXKIQ9GiXGelXb6QhUO lSmauckmnxf4MygpmXSKOS6m559OWgE/wkIinKIyY4EuQt17Sq8bivU0njk+d5M= X-Received: by 2002:a0c:dc04:0:b0:690:68a:a141 with SMTP id s4-20020a0cdc04000000b00690068aa141mr6220950qvk.24.1709691595613; Tue, 05 Mar 2024 18:19:55 -0800 (PST) Received: from localhost (2603-7000-0c01-2716-da5e-d3ff-fee7-26e7.res6.spectrum.com. [2603:7000:c01:2716:da5e:d3ff:fee7:26e7]) by smtp.gmail.com with ESMTPSA id pd6-20020a056214490600b006907801a000sm2629696qvb.26.2024.03.05.18.19.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Mar 2024 18:19:54 -0800 (PST) Date: Tue, 5 Mar 2024 21:19:50 -0500 From: Johannes Weiner To: Ryan Roberts Cc: Andrew Morton , David Hildenbrand , "Huang, Ying" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: Re: [PATCH v1] mm: swap: Fix race between free_swap_and_cache() and swapoff() Message-ID: <20240306021950.GA801254@cmpxchg.org> References: <20240305151349.3781428-1-ryan.roberts@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240305151349.3781428-1-ryan.roberts@arm.com> Hi Ryan, On Tue, Mar 05, 2024 at 03:13:49PM +0000, Ryan Roberts wrote: > There was previously a theoretical window where swapoff() could run and > teardown a swap_info_struct while a call to free_swap_and_cache() was > running in another thread. This could cause, amongst other bad > possibilities, swap_page_trans_huge_swapped() (called by > free_swap_and_cache()) to access the freed memory for swap_map. > > This is a theoretical problem and I haven't been able to provoke it from > a test case. But there has been agreement based on code review that this > is possible (see link below). > > Fix it by using get_swap_device()/put_swap_device(), which will stall > swapoff(). There was an extra check in _swap_info_get() to confirm that > the swap entry was valid. This wasn't present in get_swap_device() so > I've added it. I couldn't find any existing get_swap_device() call sites > where this extra check would cause any false alarms. Unfortunately, I found one, testing current mm-everything: [ 189.420777] get_swap_device: Unused swap offset entry 000641ae [ 189.426648] ------------[ cut here ]------------ [ 189.431290] WARNING: CPU: 3 PID: 369 at mm/swapfile.c:1301 get_swap_device+0x2da/0x3f0 [ 189.439242] CPU: 3 PID: 369 Comm: cachish Not tainted 6.8.0-rc5-00527-g19d98776f227-dirty #30 [ 189.447791] Hardware name: Micro-Star International Co., Ltd. MS-7B98/Z390-A PRO (MS-7B98), BIOS 1.80 12/25/2019 [ 189.457998] RIP: 0010:get_swap_device+0x2da/0x3f0 [ 189.462721] Code: a8 03 75 2a 65 48 ff 08 e9 36 ff ff ff 4c 89 e9 48 c7 c2 40 fd 91 83 48 c7 c6 c0 f9 91 83 48 c7 c7 60 ee 91 83 e8 26 2f af ff <0f> 0b eb af 4c 8d 6b 08 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 [ 189.481497] RSP: 0000:ffffc90000cff8a8 EFLAGS: 00010282 [ 189.486760] RAX: 0000000000000032 RBX: ffff8881262eee00 RCX: 0000000000000000 [ 189.493909] RDX: 0000000000000001 RSI: ffffffff83a1e620 RDI: 0000000000000001 [ 189.501054] RBP: 1ffff9200019ff15 R08: 0000000000000001 R09: fffff5200019fee1 [ 189.508202] R10: ffffc90000cff70f R11: 0000000000000001 R12: ffffc900018d51ae [ 189.515346] R13: 00000000000641ae R14: 0000000000000000 R15: 00000000000641af [ 189.522494] FS: 00007f7120263680(0000) GS:ffff88841c380000(0000) knlGS:0000000000000000 [ 189.530591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 189.536373] CR2: 00007f6e659a2ea3 CR3: 0000000046860004 CR4: 00000000003706f0 [ 189.543516] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 189.550661] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 189.557811] Call Trace: [ 189.560276] [ 189.562393] ? __warn+0xc4/0x250 [ 189.565647] ? get_swap_device+0x2da/0x3f0 [ 189.569761] ? report_bug+0x348/0x440 [ 189.573444] ? handle_bug+0x6d/0x90 [ 189.576951] ? exc_invalid_op+0x13/0x40 [ 189.580810] ? asm_exc_invalid_op+0x16/0x20 [ 189.585019] ? get_swap_device+0x2da/0x3f0 [ 189.589142] ? get_swap_device+0x2da/0x3f0 [ 189.593255] ? __pfx_get_swap_device+0x10/0x10 [ 189.597717] __read_swap_cache_async+0x9f/0x630 [ 189.602281] ? __pfx___read_swap_cache_async+0x10/0x10 [ 189.607439] ? __mod_memcg_lruvec_state+0x238/0x4f0 [ 189.612344] ? __pfx_swp_swap_info+0x10/0x10 [ 189.616652] swap_cluster_readahead+0x2cd/0x510 [ 189.621206] ? __pfx_swap_cluster_readahead+0x10/0x10 [ 189.626279] ? swap_cache_get_folio+0xcd/0x360 [ 189.630760] ? __count_memcg_events+0x10a/0x370 [ 189.635318] shmem_swapin_folio+0x2f2/0xc60 [ 189.639525] ? __pfx__raw_spin_lock+0x10/0x10 [ 189.643908] ? __pte_offset_map+0x19/0x1d0 [ 189.648024] shmem_get_folio_gfp+0x307/0xe30 [ 189.652323] ? __schedule+0x9f0/0x1fe0 [ 189.656096] ? __pfx_shmem_get_folio_gfp+0x10/0x10 [ 189.660923] ? filemap_map_pages+0x999/0xe60 [ 189.665211] shmem_fault+0x1d9/0x810 [ 189.668834] ? __pfx_shmem_fault+0x10/0x10 [ 189.672954] ? __pfx_filemap_map_pages+0x10/0x10 [ 189.677590] __do_fault+0xed/0x390 [ 189.681012] __handle_mm_fault+0x1ba1/0x2e80 [ 189.685297] ? __pfx___handle_mm_fault+0x10/0x10 [ 189.689933] ? __pfx_down_read_trylock+0x10/0x10 [ 189.694570] ? __pfx_hrtimer_nanosleep+0x10/0x10 [ 189.699215] handle_mm_fault+0xe0/0x560 [ 189.703074] ? __pfx_restore_fpregs_from_fpstate+0x10/0x10 [ 189.708620] do_user_addr_fault+0x2ba/0x9d0 [ 189.712828] exc_page_fault+0x54/0x90 [ 189.716508] asm_exc_page_fault+0x22/0x30 [ 189.720535] RIP: 0033:0x5640dc2d72b5 [ 189.724131] Code: 98 48 ba 00 00 00 00 03 00 00 00 48 89 d6 48 2b 75 a0 ba 00 00 00 00 48 f7 f6 48 89 d1 48 89 ca 48 8b 45 a0 48 01 d0 48 01 d8 <0f> b6 00 bf e8 03 00 00 e8 1e fe ff ff 48 83 45 a8 01 48 8d 45 d0 [ 189.742922] RSP: 002b:00007ffc227e3f60 EFLAGS: 00010206 [ 189.748165] RAX: 00007f6e659a2ea3 RBX: 00007f6e2007e000 RCX: 0000000045924ea3 [ 189.755311] RDX: 0000000045924ea3 RSI: 0000000300000000 RDI: 00007f71202586a0 [ 189.762483] RBP: 00007ffc227e3fe0 R08: 00007f7120258074 R09: 00007f71202580a0 [ 189.769633] R10: 0000000000019458 R11: 00000000008aa400 R12: 0000000000000000 [ 189.776781] R13: 00007ffc227e4128 R14: 00007f712029d000 R15: 00005640dc2d9dd8 [ 189.783928] [ 189.786126] ---[ end trace 0000000000000000 ]--- [ 285.827888] get_swap_device: Unused swap offset entry 0018403f [ 320.699306] get_swap_device: Unused swap offset entry 000b001b [ 354.031339] get_swap_device: Unused swap offset entry 000681a9 [ 364.958435] get_swap_device: Unused swap offset entry 001f4055 [ 364.976235] get_swap_device: Unused swap offset entry 001f4057 [ 365.530174] get_swap_device: Unused swap offset entry 000d415c [ 394.223792] get_swap_device: Unused swap offset entry 001540d0 [ 394.317299] get_swap_device: Unused swap offset entry 000341d9 [ 394.341727] get_swap_device: Unused swap offset entry 0006c07e [ 396.062365] get_swap_device: Unused swap offset entry 000541a4 [ 396.068262] get_swap_device: Unused swap offset entry 000541a7 [ 402.629551] get_swap_device: Unused swap offset entry 00294021 [ 436.740622] get_swap_device: Unused swap offset entry 00334155 [ 436.758527] get_swap_device: Unused swap offset entry 001b417c swap_cluster_readahead() calls __read_swap_cache_async() on a range of made-up swap entries around the faulting slot. The device and the range (si->max) are valid, but the specific entry might not be in use. __read_swap_cache_async() instead relies on swapcache_prepare() returning -ENOENT to catch this and skip gracefully. Confirmed that reverting the patch makes the warnings go away.