From: Yosry Ahmed
Date: Mon, 20 Nov 2023 17:15:15 -0800
Subject: Re: [PATCH] mm:zswap: fix zswap entry reclamation failure in two scenarios
To: "Huang, Ying"
Cc: Chris Li, Zhongkun He, Andrew Morton, Johannes Weiner, Nhat Pham, Seth Jennings, Dan Streetman, Vitaly Wool, linux-mm, LKML
In-Reply-To: <87edgkapsz.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <20231113130601.3350915-1-hezhongkun.hzk@bytedance.com> <8734x1cdtr.fsf@yhuang6-desk2.ccr.corp.intel.com> <87edgkapsz.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Mon, Nov 20, 2023 at 4:57 PM Huang, Ying wrote:
>
> Yosry Ahmed writes:
>
> > On Sun, Nov 19, 2023 at 7:20 PM Huang, Ying wrote:
> >>
> >> Chris Li writes:
> >>
> >> > On Thu, Nov 16, 2023 at 12:19 PM Yosry Ahmed wrote:
> >> >>
> >> >> Not bypassing the swap slot cache, just make the callbacks to
> >> >> invalidate the zswap entry, do memcg uncharging, etc when the slot is
> >> >> no longer used and is entering the swap slot cache (i.e. when
> >> >> free_swap_slot() is called), instead of when draining the swap slot
> >> >> cache (i.e. when swap_range_free() is called). For all parts of MM
> >> >> outside of swap, the swap entry is freed when free_swap_slot() is
> >> >> called. We don't free it immediately because of caching, but this
> >> >> should be transparent to other parts of MM (e.g. zswap, memcg, etc).
> >> >
> >> > That will cancel the batching effect on the swap slot free, making the
> >> > common case for swapping faults take longer to complete, right?
> >> > If I recall correctly, the uncharge is the expensive part of the swap
> >> > slot free operation.
> >> > I just want to figure out what we are trading off against. This is not
> >> > one side wins all situations.
> >>
> >> Per my understanding, we don't batch memcg uncharging in
> >> swap_entry_free() now. Although it's possible and may improve
> >> performance.
> >
> > Yes. It actually causes a long tail in swapin fault latency as Chris
> > discovered in our prod. I am wondering if doing the memcg uncharging
> > outside the slots cache will actually amortize the cost instead.
> >
> > Regardless of memcg charging, which is more complicated, I think we
> > should at least move the call to zswap_invalidate() before the slots
> > cache. I would prefer that we move everything non-swapfile specific
> > outside the slots cache layer (zswap_invalidate(),
> > arch_swap_invalidate_page(), clear_shadow_from_swap_cache(),
> > mem_cgroup_uncharge_swap(), ..). However, if some of those are
> > controversial, we can move some of them for now.
>
> That makes sense to me.
>
> > When draining free swap slots from the cache, swap_range_free() is
> > called with nr_entries == 1 anyway, so I can't see how any batching is
> > going on. If anything it should help amortize the cost.
>
> In swapcache_free_entries(), the sis->lock will be held to free multiple
> swap slots via swap_info_get_cont() if possible. This can reduce
> sis->lock contention.

Ah yes, that's a good point. Since most of these callbacks don't
actually access sis, but use the swap entry value itself, I am guessing
the reason we need to hold the lock for all these callbacks is to
prevent swapoff and swapon from reusing the same swap entry on a
different swap device, right?
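
To make the lock-batching point concrete, here is a rough user-space model of holding the per-device lock across consecutive entries on the same device. This is only a sketch, not the kernel code: struct model_dev, struct model_entry, free_entries_batched() and free_entries_one_by_one() are made-up names, the "lock" is just a flag with a counter, and sorting by device is an assumption of the model so that same-device entries end up adjacent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a swap device. lock_acquisitions counts how
 * often the modelled per-device lock was taken. */
struct model_dev {
	int locked;
	unsigned long lock_acquisitions;
	unsigned long freed;
};

/* Hypothetical stand-in for a swap entry: device index plus offset. */
struct model_entry {
	unsigned int dev;
	unsigned long offset;
};

static void model_lock(struct model_dev *d)
{
	assert(!d->locked);
	d->locked = 1;
	d->lock_acquisitions++;
}

static void model_unlock(struct model_dev *d)
{
	assert(d->locked);
	d->locked = 0;
}

/* Order entries by device so that same-device entries are adjacent. */
static int entry_cmp(const void *a, const void *b)
{
	const struct model_entry *ea = a, *eb = b;

	return (ea->dev > eb->dev) - (ea->dev < eb->dev);
}

/* Batched free: keep the lock while the next entry is on the same device,
 * and only drop/retake it when the device changes. */
void free_entries_batched(struct model_dev *devs, struct model_entry *e,
			  size_t n)
{
	struct model_dev *prev = NULL;
	size_t i;

	if (n == 0)
		return;

	qsort(e, n, sizeof(e[0]), entry_cmp);
	for (i = 0; i < n; i++) {
		struct model_dev *cur = &devs[e[i].dev];

		if (cur != prev) {
			if (prev)
				model_unlock(prev);
			model_lock(cur);
			prev = cur;
		}
		cur->freed++;	/* the per-entry free work goes here */
	}
	model_unlock(prev);
}

/* Unbatched free: take and drop the lock around every single entry. */
void free_entries_one_by_one(struct model_dev *devs,
			     const struct model_entry *e, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_dev *cur = &devs[e[i].dev];

		model_lock(cur);
		cur->freed++;
		model_unlock(cur);
	}
}
```

With five entries spread over two devices, the batched path takes each device lock once, while the per-entry path takes a lock once per entry. Any callback moved out from under this loop no longer runs with the lock held, which is exactly why the swapoff/swapon question above matters.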