Date: Thu, 13 Oct 2022 19:11:06 -0600
From: Jonathan Derrick
To: Song Liu
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
    jonathan.derrick@solidigm.com, jonathanx.sk.derrick@intel.com,
    Mariusz Tkaczyk
Subject: Re: [PATCH v2 1/3] md/bitmap: Add chunk-threshold unplugging
In-Reply-To: <20221013224151.300-2-jonathan.derrick@linux.dev>
References: <20221013224151.300-1-jonathan.derrick@linux.dev>
    <20221013224151.300-2-jonathan.derrick@linux.dev>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/13/2022 4:41 PM, Jonathan Derrick wrote:
> Add a mechanism to allow bitmap unplugging and flushing to wait until it
> has surpassed a defined threshold of dirty chunks. This allows certain
> high I/O write workloads to make good forward progress between bitmap
> updates or provide reliable bitmap consistency. The default behavior is
> previous behavior of always unplugging when called.
>
> Signed-off-by: Jonathan Derrick
> ---
>  drivers/md/md-bitmap.c | 35 +++++++++++++++++++++++++++++++----
>  drivers/md/md-bitmap.h |  1 +
>  drivers/md/md.h        |  1 +
>  3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index bf6dffadbe6f..c5c77f8371a8 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -1004,7 +1004,7 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block)
>  /* this gets called when the md device is ready to unplug its underlying
>   * (slave) device queues -- before we let any writes go down, we need to
>   * sync the dirty pages of the bitmap file to disk */
> -void md_bitmap_unplug(struct bitmap *bitmap)
> +static void __md_bitmap_unplug(struct bitmap *bitmap)
>  {
>  	unsigned long i;
>  	int dirty, need_write;
> @@ -1038,6 +1038,33 @@ void md_bitmap_unplug(struct bitmap *bitmap)
>  	if (test_bit(BITMAP_WRITE_ERROR, &bitmap->flags))
>  		md_bitmap_file_kick(bitmap);
>  }
> +
> +/*
> + * Conditional unplug based on user-defined parameter
> + * Defaults to unconditional behavior
> + */
> +void md_bitmap_unplug(struct bitmap *bitmap)
> +{
> +	unsigned int flush_threshold = bitmap->mddev->bitmap_info.flush_threshold;
> +
> +	if (!flush_threshold) {
> +		__md_bitmap_unplug(bitmap);
> +	} else {
> +		struct bitmap_page *bp = bitmap->counts.bp;
> +		unsigned long pages = bitmap->counts.pages;
> +		unsigned long k, count = 0;
> +
> +		for (k = 0; k < pages; k++)
> +			if (bp[k].map && !bp[k].hijacked)
> +				count += bp[k].count;
> +
> +		if (count - bitmap->unplugged_count > flush_threshold) {
> +			bitmap->unplugged_count = count;
> +			md_bitmap_daemon_work(&bitmap->mddev->daemon_timer);

I just noticed I call daemon_timer before adding it in 3/3
I'll fix that in v3

> +			__md_bitmap_unplug(bitmap);
> +		}
> +	}
> +}
>  EXPORT_SYMBOL(md_bitmap_unplug);
>
>  static void md_bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int needed);
> @@ -2012,9 +2039,9 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int slot,
>  		for (i = 0; i < bitmap->storage.file_pages; i++)
>  			if (test_page_attr(bitmap, i, BITMAP_PAGE_PENDING))
>  				set_page_attr(bitmap, i, BITMAP_PAGE_NEEDWRITE);
> -		md_bitmap_unplug(bitmap);
> +		__md_bitmap_unplug(bitmap);
>  	}
> -	md_bitmap_unplug(mddev->bitmap);
> +	__md_bitmap_unplug(mddev->bitmap);
>  	*low = lo;
>  	*high = hi;
>  	md_bitmap_free(bitmap);
> @@ -2246,7 +2273,7 @@ int md_bitmap_resize(struct bitmap *bitmap, sector_t blocks,
>  	spin_unlock_irq(&bitmap->counts.lock);
>
>  	if (!init) {
> -		md_bitmap_unplug(bitmap);
> +		__md_bitmap_unplug(bitmap);
>  		bitmap->mddev->pers->quiesce(bitmap->mddev, 0);
>  	}
>  	ret = 0;
> diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
> index cfd7395de8fd..49a93d8ff307 100644
> --- a/drivers/md/md-bitmap.h
> +++ b/drivers/md/md-bitmap.h
> @@ -223,6 +223,7 @@ struct bitmap {
>  	unsigned long daemon_lastrun; /* jiffies of last run */
>  	unsigned long last_end_sync; /* when we lasted called end_sync to
>  				      * update bitmap with resync progress */
> +	unsigned long unplugged_count; /* last dirty count from md_bitmap_unplug */
>
>  	atomic_t pending_writes; /* pending writes to the bitmap file */
>  	wait_queue_head_t write_wait;
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index b4e2d8b87b61..1a558cb18bd4 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -501,6 +501,7 @@ struct mddev {
>  		int external;
>  		int nodes; /* Maximum number of nodes in the cluster */
>  		char cluster_name[64]; /* Name of the cluster */
> +		unsigned int flush_threshold; /* how many dirty chunks between updates */
>  	} bitmap_info;
>
>  	atomic_t max_corr_read_errors; /* max read retries */
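
For anyone following the thread, the threshold decision in the quoted
md_bitmap_unplug() can be modelled in user space. This is only a sketch, not
kernel code: struct page_model, struct bitmap_model, count_dirty() and
unplug_would_flush() are hypothetical names made up for illustration, and the
model keeps only the fields the quoted loop reads. It returns whether the
wrapper would go on to call __md_bitmap_unplug(), and leaves out the
md_bitmap_daemon_work() call entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct bitmap_page. */
struct page_model {
	bool mapped;         /* models bp[k].map being non-NULL */
	bool hijacked;       /* models bp[k].hijacked */
	unsigned int count;  /* models bp[k].count */
};

/* Hypothetical stand-in for the pieces of struct bitmap and
 * mddev->bitmap_info that the patch touches. */
struct bitmap_model {
	const struct page_model *bp;
	unsigned long pages;
	unsigned long unplugged_count;  /* count recorded at the last flush */
	unsigned int flush_threshold;   /* 0 means always flush */
};

/* Sum the per-page counts the same way the quoted loop does:
 * only pages that are mapped and not hijacked contribute. */
unsigned long count_dirty(const struct bitmap_model *b)
{
	unsigned long k, count = 0;

	for (k = 0; k < b->pages; k++)
		if (b->bp[k].mapped && !b->bp[k].hijacked)
			count += b->bp[k].count;
	return count;
}

/* Return true when the model would flush. On a threshold-triggered
 * flush the baseline is updated, as in the patch. */
bool unplug_would_flush(struct bitmap_model *b)
{
	unsigned long count;

	if (!b->flush_threshold)
		return true;  /* default: unconditional unplug */

	count = count_dirty(b);
	/* Unsigned subtraction: if count has dropped below the recorded
	 * baseline, the difference wraps to a very large value. */
	if (count - b->unplugged_count > b->flush_threshold) {
		b->unplugged_count = count;
		return true;
	}
	return false;
}
```

In this model the comparison is strict, so a difference exactly equal to the
threshold does not flush. Because the operands are unsigned, a count that falls
below the recorded baseline also passes the check and resets the baseline. That
follows from C arithmetic alone; whether it matters in practice depends on how
the counts move in the real driver, which the sketch does not try to capture.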