Date: Wed, 13 Sep 2023 14:07:50 -0700
From: Andrew Morton
To: Stefan Roesch
Cc:
    kernel-team@fb.com, david@redhat.com, hannes@cmpxchg.org,
    riel@surriel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 1/4] mm/ksm: add "smart" page scanning mode
Message-Id: <20230913140750.616d3d87fe986a74d870b71f@linux-foundation.org>
In-Reply-To: <20230912175228.952039-2-shr@devkernel.io>
References: <20230912175228.952039-1-shr@devkernel.io>
    <20230912175228.952039-2-shr@devkernel.io>

On Tue, 12 Sep 2023 10:52:25 -0700 Stefan Roesch wrote:

> This change adds a "smart" page scanning mode for KSM. So far all the
> candidate pages are continuously scanned to find candidates for
> de-duplication. There are a considerably number of pages that cannot be
> de-duplicated. This is costly in terms of CPU. By using smart scanning
> considerable CPU savings can be achieved.
>
> This change takes the history of scanning pages into account and skips
> the page scanning of certain pages for a while if de-deduplication for
> this page has not been successful in the past.
>
> To do this it introduces two new fields in the ksm_rmap_item structure:
> age and skip_age. age, is the KSM age and skip_page is the age for how

s/skip_page/skip_age/

> long page scanning of this page is skipped. The age field is incremented
> each time the page is scanned and the page cannot be de-duplicated.
>
> How often a page is skipped is dependent how often de-duplication has
> been tried so far and the number of skips is currently limited to 8.
> This value has shown to be effective with different workloads.
>
> The feature is currently disable by default and can be enabled with the
> new smart_scan knob.
>
> The feature has shown to be very effective: upt to 25% of the page scans
> can be eliminated; the pages_to_scan rate can be reduced by 40 - 50% and
> a similar de-duplication rate can be maintained.
>

All seems nice.  I'll sit out v1, see what people have to say.

Some nits:

> --- a/mm/ksm.c
> +++ b/mm/ksm.c
>
> ...
>
> @@ -2305,6 +2314,45 @@ static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot,
>  	return rmap_item;
>  }
>
> +static unsigned int inc_skip_age(rmap_age_t age)
> +{
> +	if (age <= 3)
> +		return 1;
> +	if (age <= 5)
> +		return 2;
> +	if (age <= 8)
> +		return 4;
> +
> +	return 8;
> +}

"inc_skip_age" sounds like it increments something.  Can we give it a
better name?  And a nice comment explaining its role in life.

> +static bool skip_rmap_item(struct page *page, struct ksm_rmap_item *rmap_item)
> +{
> +	rmap_age_t age;
> +
> +	if (!ksm_smart_scan)
> +		return false;
> +
> +	if (PageKsm(page))
> +		return false;
> +
> +	age = rmap_item->age++;
> +	if (age < 3)
> +		return false;
> +
> +	if (rmap_item->skip_age == age) {
> +		rmap_item->skip_age = 0;
> +		return false;
> +	}
> +
> +	if (rmap_item->skip_age == 0) {
> +		rmap_item->skip_age = age + inc_skip_age(age);
> +		remove_rmap_item_from_tree(rmap_item);
> +	}
> +
> +	return true;
> +}

Would a better name be should_skip_rmap_item()?  But even that name
implies that the function is idempotent (has no side-effects).

Again, an explanatory comment would be good.  And simple comments over
each non-obvious `if' statement.

>
> ...
>
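
To make the nits concrete, here is a rough userspace sketch of how the two
helpers might read with comments added.  It is only an illustration, not a
proposed replacement: skip_len_for_age(), should_skip_rmap_item_sim() and
struct rmap_item_sim are hypothetical names, rmap_age_t is assumed to be a
small unsigned type (its definition is not in the quoted hunk), and the
ksm_smart_scan and PageKsm() checks are passed in as plain booleans so the
logic can be compiled outside the kernel.

```c
#include <stdbool.h>

/* Assumption: a small unsigned counter; the quoted hunk does not define it. */
typedef unsigned char rmap_age_t;

/* Hypothetical stand-in for the two fields added to struct ksm_rmap_item. */
struct rmap_item_sim {
	rmap_age_t age;      /* number of times this item has been visited */
	rmap_age_t skip_age; /* age at which scanning resumes; 0 = not skipping */
};

/*
 * Hypothetical rename of inc_skip_age(): return how many visits to skip
 * for an item of the given age.  Older items are skipped for longer,
 * capped at 8.
 */
static unsigned int skip_len_for_age(rmap_age_t age)
{
	if (age <= 3)
		return 1;
	if (age <= 5)
		return 2;
	if (age <= 8)
		return 4;

	return 8;
}

/*
 * Decide whether this visit of the item should be skipped.
 * Not a pure predicate: it advances item->age and may arm or clear
 * item->skip_age.
 */
static bool should_skip_rmap_item_sim(struct rmap_item_sim *item,
				      bool smart_scan, bool page_is_ksm)
{
	rmap_age_t age;

	/* Feature disabled: never skip, and leave the item untouched. */
	if (!smart_scan)
		return false;

	/* Already-merged pages are never skipped. */
	if (page_is_ksm)
		return false;

	/* Every remaining visit ages the item. */
	age = item->age++;

	/* Young items are always scanned. */
	if (age < 3)
		return false;

	/* The skip window has ended: scan once, then allow re-arming. */
	if (item->skip_age == age) {
		item->skip_age = 0;
		return false;
	}

	/* No window armed yet: start one sized by the item's age. */
	if (item->skip_age == 0) {
		item->skip_age = age + skip_len_for_age(age);
		/* The patch also calls remove_rmap_item_from_tree() here. */
	}

	return true;
}
```

Calling should_skip_rmap_item_sim() nine times on a zeroed item with
smart_scan set gives scan, scan, scan, skip, scan, skip, skip, scan, skip,
which shows the skip windows growing as the item ages.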