Date: Tue, 11 Jun 2024 02:45:16 +0000
In-Reply-To: <20240611024516.1375191-1-yosryahmed@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20240611024516.1375191-1-yosryahmed@google.com>
X-Mailer: git-send-email 2.45.2.505.gda0bf45e8d-goog
Message-ID: <20240611024516.1375191-3-yosryahmed@google.com>
Subject: [PATCH v3 3/3] mm: zswap: handle incorrect attempts to load large folios
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Nhat Pham, Chengming Zhou, Barry Song <21cnbao@gmail.com>, Chris Li, David Hildenbrand, Matthew Wilcox, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed
Content-Type: text/plain; charset="UTF-8"

Zswap does not support storing or loading large folios. Until proper
support is added, attempts to load large folios from zswap are a bug.

For example, if a swapin fault observes that contiguous PTEs are
pointing to contiguous swap entries and tries to swap them in as a large
folio, swap_read_folio() will pass in a large folio to zswap_load(), but
zswap_load() will only effectively load the first page in the folio. If
the first page is not in zswap, the folio will be read from disk, even
though other pages may be in zswap.

In both cases, this will lead to silent data corruption. Proper support
needs to be added before large folio swapins and zswap can work
together.

Looking at the callers of swap_read_folio(), the folios they pass in
seem to be allocated either in __read_swap_cache_async() or in
do_swap_page() in the SWP_SYNCHRONOUS_IO path. Both allocate order-0
folios, so everything is fine for now.

However, there is ongoing work to add support for large folio swapins
[1]. To make sure new development does not break zswap (or get broken by
zswap), add minimal handling of incorrect loads of large folios to
zswap.

First, move the call to folio_mark_uptodate() inside zswap_load().

If a large folio load is attempted, and zswap was ever enabled on the
system, return 'true' without calling folio_mark_uptodate(). This will
prevent the folio from being read from disk, and will emit an IO error
because the folio is not uptodate (e.g. do_swap_page() will return
VM_FAULT_SIGBUS). It may not be reliable recovery in all cases, but it
is better than nothing.

This was tested by hacking the allocation in __read_swap_cache_async()
to use order 2 and __GFP_COMP.

In the future, to handle this correctly, the swapin code should:
(a) Fall back to order-0 swapins if zswap was ever used on the machine,
    because compressed pages remain in zswap after it is disabled.
(b) Add proper support to swap in large folios from zswap (fully or
    partially).

Probably start with (a) then follow up with (b).

[1]https://lore.kernel.org/linux-mm/20240304081348.197341-6-21cnbao@gmail.com/

Signed-off-by: Yosry Ahmed
---
 mm/page_io.c |  1 -
 mm/zswap.c   | 12 ++++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index f1a9cfab6e748..8f441dd8e109f 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -517,7 +517,6 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 	delayacct_swapin_start();
 
 	if (zswap_load(folio)) {
-		folio_mark_uptodate(folio);
 		folio_unlock(folio);
 	} else if (data_race(sis->flags & SWP_FS_OPS)) {
 		swap_read_folio_fs(folio, plug);
diff --git a/mm/zswap.c b/mm/zswap.c
index 7fcd751e847d6..505f4b9812891 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1566,6 +1566,17 @@ bool zswap_load(struct folio *folio)
 	if (zswap_never_enabled())
 		return false;
 
+	/*
+	 * Large folios should not be swapped in while zswap is being used, as
+	 * they are not properly handled. Zswap does not properly load large
+	 * folios, and a large folio may only be partially in zswap.
+	 *
+	 * Return true without marking the folio uptodate so that an IO error is
+	 * emitted (e.g. do_swap_page() will sigbus).
+	 */
+	if (WARN_ON_ONCE(folio_test_large(folio)))
+		return true;
+
 	/*
 	 * When reading into the swapcache, invalidate our entry. The
 	 * swapcache can be the authoritative owner of the page and
@@ -1600,6 +1611,7 @@ bool zswap_load(struct folio *folio)
 		folio_mark_dirty(folio);
 	}
 
+	folio_mark_uptodate(folio);
 	return true;
 }
 
-- 
2.45.2.505.gda0bf45e8d-goog
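
For readers who want to poke at the return contract described in the
commit message, here is a minimal userspace sketch. It is not kernel
code: struct toy_folio, the toy_* functions and the TOY_* results are
hypothetical stand-ins invented for illustration, and the disk read is
assumed to always succeed. It only models the control flow (a large
folio makes the load return true while leaving the folio not uptodate,
which the caller then treats as an IO error) plus the order-0 fallback
suggested in (a).

```c
/* Userspace model only; every name below is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for struct folio: just the state this patch cares about. */
struct toy_folio {
	unsigned int order;	/* 0 = single page, > 0 = large folio */
	bool uptodate;		/* models the folio uptodate flag */
	bool in_zswap;		/* models "zswap holds this swap entry" */
};

enum toy_result { TOY_OK, TOY_SIGBUS };

bool toy_zswap_ever_enabled = true;	/* models !zswap_never_enabled() */
int toy_disk_reads;			/* counts fallbacks to the backing device */

/* Simplified mirror of the patched control flow in zswap_load(). */
bool toy_zswap_load(struct toy_folio *folio)
{
	assert(folio);

	if (!toy_zswap_ever_enabled)
		return false;		/* caller reads from the backing device */

	if (folio->order > 0)
		return true;		/* claim the load, leave it !uptodate */

	if (!folio->in_zswap)
		return false;

	folio->uptodate = true;		/* the load itself now sets uptodate */
	return true;
}

/* Models the caller side: the read path followed by the uptodate check. */
enum toy_result toy_swapin(struct toy_folio *folio)
{
	if (!toy_zswap_load(folio)) {
		toy_disk_reads++;
		folio->uptodate = true;	/* assumption: the disk read succeeds */
	}
	return folio->uptodate ? TOY_OK : TOY_SIGBUS;
}

/* Option (a): only use a large order if zswap was never used. */
unsigned int toy_swapin_order(unsigned int wanted_order)
{
	return toy_zswap_ever_enabled ? 0 : wanted_order;
}
```

Under these assumptions, an order-0 folio that is in zswap comes back
TOY_OK, while an order-2 folio comes back TOY_SIGBUS without a disk read
being issued. That is the intended trade: a visible error instead of a
folio that is silently only partially loaded.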