From: Brian Geffon
Date: Fri, 18 Mar 2022 12:51:14 -0400
Subject: Re: [PATCH] zram: Add a huge_idle writeback mode
To: Minchan Kim
Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, LKML, linux-doc@vger.kernel.org, linux-block@vger.kernel.org
References: <20220315172221.9522-1-bgeffon@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 18, 2022 at 12:41 PM Minchan Kim wrote:
>
> On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> > Today it's only possible to write back as a page, idle, or huge.
> > A user might want to writeback pages which are huge and idle first
> > as these idle pages do not require decompression and make a good
> > first pass for writeback.
>
> Hi Brian,
>
> I am not sure how much the decompression overhead matter for idle pages
> writeback since it's already *very slow* path in zram but I agree that
> it would be a good first pass since the memory saving for huge writing
> would be cost efficient.
>
> Just out of curiosity. Do you have real usecase?

Hi Minchan,
Thank you for taking a look. When we think about writeback we try to
be very sensitive to our devices' storage endurance; for this reason
we will have a fairly conservative writeback limit. Given that, we
want to make sure we're maximizing what lands on disk while still
minimizing the refault time. We could take the approach where we
always writeback huge pages, but that may result in very quick
refaults, which would be a huge waste of time. So idle writeback is a
must for us, and being able to writeback the pages which have maximum
value (huge) would be very useful.

Brian

>
> >
> > Signed-off-by: Brian Geffon
> > ---
> >  Documentation/admin-guide/blockdev/zram.rst |  6 ++++++
> >  drivers/block/zram/zram_drv.c               | 10 ++++++----
> >  2 files changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
> > index 3e11926a4df9..af1123bfaf92 100644
> > --- a/Documentation/admin-guide/blockdev/zram.rst
> > +++ b/Documentation/admin-guide/blockdev/zram.rst
> > @@ -343,6 +343,12 @@ Admin can request writeback of those idle pages at right timing via::
> >
> >  With the command, zram writeback idle pages from memory to the storage.
> >
> > +Additionally, if a user choose to writeback only huge and idle pages
> > +this can be accomplished with::
> > +
> > +        echo huge_idle > /sys/block/zramX/writeback
> > +
> > +
> >  If admin want to write a specific page in zram device to backing device,
> >  they could write a page index into the interface.
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index cb253d80d72b..f196902ae554 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -643,8 +643,8 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
> >  #define PAGE_WB_SIG "page_index="
> >
> >  #define PAGE_WRITEBACK 0
> > -#define HUGE_WRITEBACK 1
> > -#define IDLE_WRITEBACK 2
> > +#define HUGE_WRITEBACK (1<<0)
> > +#define IDLE_WRITEBACK (1<<1)
> >
> >
> >  static ssize_t writeback_store(struct device *dev,
> > @@ -664,6 +664,8 @@ static ssize_t writeback_store(struct device *dev,
> >                 mode = IDLE_WRITEBACK;
> >         else if (sysfs_streq(buf, "huge"))
> >                 mode = HUGE_WRITEBACK;
> > +       else if (sysfs_streq(buf, "huge_idle"))
> > +               mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
> >         else {
> >                 if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
> >                         return -EINVAL;
> > @@ -725,10 +727,10 @@ static ssize_t writeback_store(struct device *dev,
> >                             zram_test_flag(zram, index, ZRAM_UNDER_WB))
> >                         goto next;
> >
> > -               if (mode == IDLE_WRITEBACK &&
> > +               if (mode & IDLE_WRITEBACK &&
> >                           !zram_test_flag(zram, index, ZRAM_IDLE))
> >                         goto next;
> > -               if (mode == HUGE_WRITEBACK &&
> > +               if (mode & HUGE_WRITEBACK &&
> >                           !zram_test_flag(zram, index, ZRAM_HUGE))
> >                         goto next;
> >                 /*
> > --
> > 2.35.1.723.g4982287a31-goog
> >
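
For anyone skimming the thread, here is a minimal userspace sketch of
how the flag checks in the quoted patch combine. It is not the driver
code: slot_selected() and check_writeback_modes() are hypothetical
helpers, and the two bool arguments stand in for
zram_test_flag(zram, index, ZRAM_IDLE) and ZRAM_HUGE. Only the three
mode constants are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode values as defined in the quoted patch. */
#define PAGE_WRITEBACK 0
#define HUGE_WRITEBACK (1 << 0)
#define IDLE_WRITEBACK (1 << 1)

/*
 * Hypothetical helper mirroring the two "goto next" checks:
 * returns true when a slot survives the mode filter.
 */
static bool slot_selected(int mode, bool is_idle, bool is_huge)
{
	/* An idle filter is requested and the slot is not idle: skip. */
	if ((mode & IDLE_WRITEBACK) && !is_idle)
		return false;
	/* A huge filter is requested and the slot is not huge: skip. */
	if ((mode & HUGE_WRITEBACK) && !is_huge)
		return false;
	return true;
}

/* Hypothetical self-check of the selections each mode produces. */
void check_writeback_modes(void)
{
	int huge_idle = IDLE_WRITEBACK | HUGE_WRITEBACK;

	/* "huge_idle" keeps only slots that are both huge and idle. */
	assert(slot_selected(huge_idle, true, true));
	assert(!slot_selected(huge_idle, true, false));
	assert(!slot_selected(huge_idle, false, true));

	/* Mode 0 applies neither filter in this sketch. */
	assert(slot_selected(PAGE_WRITEBACK, false, false));
}
```

Because each mode is a separate bit, the combined mode ANDs the two
filters together, which is the "huge and idle first" pass described in
the commit message.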