Received: by 2002:a5d:9c59:0:0:0:0:0 with SMTP id 25csp80357iof; Sun, 5 Jun 2022 21:44:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz9rvRZg2v6va6gPPQshlG3SjSYHizp25TPK0mUvaX0A9uy6iKemUrvVo65wB9wHGwbpzFl X-Received: by 2002:a05:6a00:a16:b0:518:ffe0:7229 with SMTP id p22-20020a056a000a1600b00518ffe07229mr57765748pfh.49.1654490695905; Sun, 05 Jun 2022 21:44:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1654490695; cv=none; d=google.com; s=arc-20160816; b=j0l99dMjh25vpjXvefu0sWkNG6Sno4kHL5fgP1qdawXYRDs0xJH8mzTZthFfJEwyAp Nm2sVujuS6rAqLMPoGVB3f7vRCGYaOGiZsYiIFvfSIN2uwrFx/kY22zFnJojo01SHzWz H4d5Mvf7HCoNdqJ0B74TE6spqCANUA5QHUtBDqJVfgBhB8eMyhNy6hQF99/CF0+L4xO9 tQwUYV0Pc7MsWLg5BYqxFTl87/5bs7Pa3I2RfAg31ACUNDybPmHqf6oIPIp9a7C7FkZn oVEmDVjnGDgHT7uHAa5AajAiNEzcLSvLWGJnFnlq3iGOZn7XXuXqKNOTRcMuRxqr+j9Q 6QDQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=Bvyf6sQjmkfCDLNK49ZrMJaEbj0blM3ELQjytG3RjZU=; b=osQG+erRgkpWpfQxrJvfuLOJ9RCPwb9JL0fDi0Y6LMxXKk9LRuRt7KzNnIsAfAuLVS qCjxaGQqaJ4FgqmwDoNFIfkddHTo/5MXaC/c1P7UwRzplFQw4dtGfBvu6GpU81aHvUdv nNt+4yt15zGk5L/lPoz3gPgOTBVE+K318fRqmBzZfAwS87MvJRowhyHOMuNPnbllCuC6 Um2wtV2j4WblcB84xNR04A6l7PdP7jQTid+0mLJqRdxdOdRZ1Fle//UBDnBKOg+Tkzy8 V72tNZtxv1WY+J2dtvoL9fgTSR5Rs5bs73hnO9b5I1T4kWocu9gRyieQlxmOXbb8gYmf rxwg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XfMZtmy7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id 2-20020a631942000000b003fbf9fadea6si20319483pgz.139.2022.06.05.21.44.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Jun 2022 21:44:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XfMZtmy7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id CF1AF13FA4; Sun, 5 Jun 2022 21:03:06 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241408AbiFFB77 (ORCPT + 99 others); Sun, 5 Jun 2022 21:59:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57080 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240339AbiFFB75 (ORCPT ); Sun, 5 Jun 2022 21:59:57 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 482124F471 for ; Sun, 5 Jun 2022 18:59:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1654480794; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Bvyf6sQjmkfCDLNK49ZrMJaEbj0blM3ELQjytG3RjZU=; b=XfMZtmy7e3DJdnqhO46Vzp6/nFzir4UFGYgfpwl3I+xWAImZoy0TP5bJviRAaRc/dx5PVc wpc100RF19C+abKcLUn3ctnoY+KoA6M74/jCSCEDFXe+YFhfOwpnAGnMspG6IZhTd5yGea ehOE/cKYN61UhmClcmetQOdQ+5aiGUs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-505-ctW413vmOo2SsU0iXf4QNg-1; Sun, 05 Jun 2022 21:59:50 -0400 X-MC-Unique: ctW413vmOo2SsU0iXf4QNg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7EC52801228; Mon, 6 Jun 2022 01:59:50 +0000 (UTC) Received: from [10.22.32.31] (unknown [10.22.32.31]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3FA4140EC001; Mon, 6 Jun 2022 01:59:50 +0000 (UTC) Message-ID: Date: Sun, 5 Jun 2022 21:59:50 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush() Content-Language: en-US To: Ming Lei Cc: Tejun Heo , Jens Axboe , cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org References: <20220602192020.166940-1-longman@redhat.com> <20220602192020.166940-4-longman@redhat.com> From: Waiman Long In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.84 on 10.11.54.2 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,NICE_REPLY_A,RDNS_NONE,SPF_HELO_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 6/5/22 21:39, Ming Lei wrote: > On Sun, Jun 05, 2022 at 07:15:27PM -0400, Waiman Long wrote: >> On 6/3/22 23:58, Ming Lei wrote: >> >>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >>> index abec50f31fe6..8c4f204dbf5b 100644 >>> --- a/mm/memcontrol.c >>> +++ b/mm/memcontrol.c >>> @@ -622,7 +622,7 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val) >>> { >>> unsigned int x; >>> - cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id()); >>> + cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id(), NULL); >>> x = __this_cpu_add_return(stats_updates, abs(val)); >>> if (x > MEMCG_CHARGE_BATCH) { >> I think the rstat set of functions are doing that already. So flush will >> only call CPUs that have called cgroup_rstat_updated() before. However, one > Yeah, I guess the detail is in cgroup_rstat_cpu_pop_updated(), but the > percpu lock(raw_spin_lock_irqsave) is still required, and cgroup_rstat_cpu_pop_updated() > is still called even through there isn't any update on this CPU. Yes, I think we may need to add a bitmask of what controllers have updates in cgroup_rstat_cpu structure. > >> deficiency that I am aware of is that there is no bitmap of which controller >> have update. The problem that I saw in cgroup v2 is that in a cgroup with >> both memory controller and block controller enabled, a >> cgroup_rstat_updated() call from memory cgroup later causes the rstat >> function to call into block cgroup flush method even though there is no >> update in the block controller. This is an area that needs improvement. >> >> Your code does allow the block controller to be aware of that and avoid >> further action, but I think it has to be done in the rstat code to be >> applicable to all controllers instead of just specific to block controller. > I guess it can be done by adding one percpu variable to 'struct cgroup'. > >> There is another problem that this approach. Suppose the system have 20 >> block devices and one of them has an IO operation. Now the flush method >> still needs to iterate all the 20 blkg's to do an update. The block >> controller is kind of special that the number of per-cgroup IO stats depends >> on the number of block devices present. Other controllers just have one set >> of stats per cgroup. > Yeah, and this one is really blkio specific issue, and your patch does > cover this one. Maybe you can add one callback to > cgroup_rstat_updated(), so the "blkg_iostat_set" instance is added into > percpu list under percpu lock of cgroup_rstat_cpu_lock, then the lockless > list isn't needed. The rstat API is generic. It may not be a good idea to put controller specific information into it. Yes, cgroup_rstat_cpu_lock is taken at the read side (flush). It may not taken on the write side (update). So it may not be easy to rely on this lock for synchronization between the read and write side. Cheers, Longman