Subject: Re: [RFC PATCH net-next v6 00/15] Device Memory TCP
To: Mina Almasry
CC: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
 Thomas Bogendoerfer, "James E.J. Bottomley", Helge Deller,
 Andreas Larsson, Jesper Dangaard Brouer, Ilias Apalodimas,
 Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Arnd Bergmann,
 Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
 Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh,
 Stanislav Fomichev, Hao Luo, Jiri Olsa, David Ahern, Willem de Bruijn,
 Shuah Khan, Sumit Semwal, Christian König, Pavel Begunkov, David Wei,
 Jason Gunthorpe, Shailend Chand, Harshitha Ramamurthy, Shakeel Butt,
 Jeroen de Borst, Praveen Kaligineedi
References: <20240305020153.2787423-1-almasrymina@google.com>
 <6208950d-6453-e797-7fc3-1dcf15b49dbe@huawei.com>
From: Yunsheng Lin
Date: Wed, 6 Mar 2024 20:37:35 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2024/3/6 3:38, Mina Almasry wrote:
> On Tue, Mar 5, 2024 at 4:54 AM Yunsheng Lin wrote:
>>
>> On 2024/3/5 10:01, Mina Almasry wrote:
>>
>> ...
>>
>>>
>>> Perf - page-pool benchmark:
>>> ---------------------------
>>>
>>> bench_page_pool_simple.ko tests with and without these changes:
>>> https://pastebin.com/raw/ncHDwAbn
>>>
>>> AFAIK the number that really matters in the perf tests is the
>>> 'tasklet_page_pool01_fast_path Per elem'. This one measures at about 8
>>> cycles without the changes but there is some 1 cycle noise in some
>>> results.
>>>
>>> With the patches this regresses to 9 cycles with the changes but there
>>> is 1 cycle noise occasionally running this test repeatedly.
>>>
>>> Lastly I tried disabling the static_branch_unlikely() in
>>> netmem_is_net_iov() check. To my surprise disabling the
>>> static_branch_unlikely() check reduces the fast path back to 8 cycles,
>>> but the 1 cycle noise remains.
>>>
>>
>> The last sentence seems to be suggesting the above 1 ns regression is caused
>> by the static_branch_unlikely() checking?
>
> Note it's not a 1ns regression, it looks like maybe a 1 cycle
> regression (slightly less than 1ns if I'm reading the output of the
> test correctly):
>
> # clean net-next
> time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc) 2.993 ns (step:0)
>
> # with patches
> time_bench: Type:tasklet_page_pool01_fast_path Per elem: 9 cycles(tsc) 3.679 ns (step:0)
>
> # with patches and with diff that disables static branching:
> time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc) 3.248 ns (step:0)
>
> I do see noise in the test results between run and run, and any
> regression (if any) is slightly obfuscated by the noise, so it's a bit
> hard to make confident statements. So far it looks like a ~0.25ns
> regression without static branch and about ~0.65ns with static branch.
>
> Honestly when I saw all 3 results were within some noise I did not
> investigate more, but if this looks concerning to you I can dig
> further. I likely need to gather a few test runs to filter out the
> noise and maybe investigate the assembly my compiler is generating to
> maybe narrow down what changes there.

Yes, that is confusing enough that it needs more investigation.

>
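For reference, the per-element deltas can be recomputed from the three
time_bench lines quoted above. The following is only a sketch: the labels,
the regular expression and the helper names (parse, deltas) are
illustrative and are not part of the benchmark module, and each
configuration is represented by a single run, so the differences sit
inside the noise discussed above.

```python
import re

# The three "Per elem" lines quoted in this thread, one per configuration.
# The dictionary keys are illustrative labels, not benchmark output.
RESULTS = {
    "clean net-next":
        "time_bench: Type:tasklet_page_pool01_fast_path Per elem: "
        "8 cycles(tsc) 2.993 ns (step:0)",
    "with patches":
        "time_bench: Type:tasklet_page_pool01_fast_path Per elem: "
        "9 cycles(tsc) 3.679 ns (step:0)",
    "with patches, static branch disabled":
        "time_bench: Type:tasklet_page_pool01_fast_path Per elem: "
        "8 cycles(tsc) 3.248 ns (step:0)",
}

# Assumed line shape, taken only from the quoted output above.
LINE_RE = re.compile(r"Per elem:\s+(\d+)\s+cycles\(tsc\)\s+([\d.]+)\s+ns")


def parse(line):
    """Return (cycles, nanoseconds) from one time_bench result line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised time_bench line: {line!r}")
    return int(match.group(1)), float(match.group(2))


def deltas(results, baseline="clean net-next"):
    """Return {label: (cycle delta, ns delta)} relative to the baseline."""
    base_cycles, base_ns = parse(results[baseline])
    out = {}
    for label, line in results.items():
        if label == baseline:
            continue
        cycles, ns = parse(line)
        out[label] = (cycles - base_cycles, round(ns - base_ns, 3))
    return out


if __name__ == "__main__":
    for label, (d_cycles, d_ns) in deltas(RESULTS).items():
        print(f"{label}: {d_cycles:+d} cycles, {d_ns:+.3f} ns")
```

On these three lines the result is +1 cycle / +0.686 ns with the patches
and +0 cycles / +0.255 ns with the static branch disabled. Feeding the
same helper several runs per configuration and comparing medians would be
one way to filter out the run-to-run noise.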