From: Zhongkun He
Date: Mon, 25 Mar 2024 11:01:52 +0800
Subject: Re: [External] [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
X-Mailing-List: linux-kernel@vger.kernel.org
In-Reply-To: <20240324210447.956973-1-hannes@cmpxchg.org>
To: Johannes Weiner
Cc:
 Andrew Morton, Chengming Zhou, Yosry Ahmed, Barry Song <21cnbao@gmail.com>,
 Chris Li, Nhat Pham, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Mon, Mar 25, 2024 at 5:05 AM Johannes Weiner wrote:
>
> Zhongkun He reports data corruption when combining zswap with zram.
>
> The issue is the exclusive loads we're doing in zswap. They assume
> that all reads are going into the swapcache, which can assume
> authoritative ownership of the data and so the zswap copy can go.
>
> However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try
> to bypass the swapcache. This results in an optimistic read of the
> swap data into a page that will be dismissed if the fault fails due to
> races. In this case, zswap mustn't drop its authoritative copy.
>
> Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
> Reported-by: Zhongkun He
> Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> Cc: stable@vger.kernel.org [6.5+]
> Signed-off-by: Johannes Weiner
> Tested-by: Zhongkun He
> ---
>  mm/zswap.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 535c907345e0..41a1170f7cfe 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1622,6 +1622,7 @@ bool zswap_load(struct folio *folio)
>  	swp_entry_t swp = folio->swap;
>  	pgoff_t offset = swp_offset(swp);
>  	struct page *page = &folio->page;
> +	bool swapcache = folio_test_swapcache(folio);
>  	struct zswap_tree *tree = swap_zswap_tree(swp);
>  	struct zswap_entry *entry;
>  	u8 *dst;
> @@ -1634,7 +1635,20 @@ bool zswap_load(struct folio *folio)
>  		spin_unlock(&tree->lock);
>  		return false;
>  	}
> -	zswap_rb_erase(&tree->rbroot, entry);
> +	/*
> +	 * When reading into the swapcache, invalidate our entry. The
> +	 * swapcache can be the authoritative owner of the page and
> +	 * its mappings, and the pressure that results from having two
> +	 * in-memory copies outweighs any benefits of caching the
> +	 * compression work.
> +	 *
> +	 * (Most swapins go through the swapcache. The notable
> +	 * exception is the singleton fault on SWP_SYNCHRONOUS_IO
> +	 * files, which reads into a private page and may free it if
> +	 * the fault fails. We remain the primary owner of the entry.)
> +	 */
> +	if (swapcache)
> +		zswap_rb_erase(&tree->rbroot, entry);
>  	spin_unlock(&tree->lock);
>
>  	if (entry->length)
> @@ -1649,9 +1663,10 @@ bool zswap_load(struct folio *folio)
>  	if (entry->objcg)
>  		count_objcg_event(entry->objcg, ZSWPIN);
>
> -	zswap_entry_free(entry);
> -
> -	folio_mark_dirty(folio);
> +	if (swapcache) {
> +		zswap_entry_free(entry);
> +		folio_mark_dirty(folio);
> +	}
>
>  	return true;
>  }
> --
> 2.44.0
>

Good solution and makes great sense to me. Thanks.
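
For anyone following the thread, the ownership rule in the patch can be
modelled in a few lines of userspace C. This is only a rough sketch, not
kernel code: struct toy_entry, struct toy_folio and toy_zswap_load() are
hypothetical stand-ins invented for illustration, and the locking,
rbtree and decompression steps are left out. It shows just the one
decision the patch adds: the compressed copy is dropped only when the
destination folio is in the swapcache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a zswap entry; not the kernel's struct. */
struct toy_entry {
	bool present;		/* still tracked by the toy "tree" */
	char data[16];		/* pretend compressed payload */
};

/* Hypothetical stand-in for a folio; not the kernel's struct. */
struct toy_folio {
	bool in_swapcache;	/* models folio_test_swapcache() */
	char data[16];
};

/*
 * Models the patched control flow: always copy the data out, but
 * invalidate the entry only when the swapcache takes ownership.
 * A private page (the swapcache-bypass case) leaves the entry intact.
 */
static bool toy_zswap_load(struct toy_entry *entry, struct toy_folio *folio)
{
	assert(entry && folio);

	if (!entry->present)
		return false;

	memcpy(folio->data, entry->data, sizeof(folio->data));

	if (folio->in_swapcache)
		entry->present = false;	/* swapcache is now authoritative */

	return true;
}
```

Loading into a toy_folio with in_swapcache = false leaves the entry
present, so a fault that fails and discards its private page can be
retried without losing data. Loading into one with in_swapcache = true
invalidates the entry, and a further load then returns false.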