Message-ID: <409fc8d0-119a-3358-0fc5-99a786a9564a@huawei.com>
Date: Fri, 24 Jun 2022 21:16:05 +0800
Subject: Re: Perf regression from scheduler load_balance rework in 5.5?
From: Zhang Qiao
To: Vincent Guittot, David Chen
Cc: linux-kernel@vger.kernel.org, Ingo Molnar
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 2022/6/24 16:22, Vincent Guittot wrote:
> On Thu, 23 Jun 2022 at 21:50, David Chen wrote:
>>
>> Hi,
>>
>> I'm working on upgrading our kernel from 4.14 to 5.10.
>> However, I'm seeing a performance regression when doing random reads from a Windows
>> client through smbd with a well-cached file.
>>
>> One thing I noticed is that on the new kernel, the smbd thread doing socket I/O tends
>> to stay on the same cpu core as the net_rx softirq, whereas on the old kernel it tends
>> to be moved around more randomly. And when they are on the same cpu, it tends to
>> saturate the cpu more and causes performance to drop.
>>
>> For example, here's the duration (ns) the thread spent on each cpu, captured using bpftrace.
>>
>> On 4.14:
>> @cputime[7]: 20741458382
>> @cputime[0]: 25219285005
>> @cputime[6]: 30892418441
>> @cputime[5]: 31032404613
>> @cputime[3]: 33511324691
>> @cputime[1]: 35564174562
>> @cputime[4]: 39313421965
>> @cputime[2]: 55779811909 (net_rx cpu)
>>
>> On 5.10:
>> @cputime[3]: 2150554823
>> @cputime[5]: 3294276626
>> @cputime[7]: 4277890448
>> @cputime[4]: 5094586003
>> @cputime[1]: 6058168291
>> @cputime[0]: 14688093441
>> @cputime[6]: 17578229533
>> @cputime[2]: 223473400411 (net_rx cpu)
>>
>> I also tried setting the cpu affinity of the smbd thread away from the net_rx cpu, and
>> indeed that seems to bring the performance on par with the old kernel. I have observed
>> the same problem for the past two weeks.
>>
>> I noticed that there was a scheduler load_balance rework in 5.5, so I ran the test on
>> 5.4 and 5.5, and it did show that the behavior changed between 5.4 and 5.5.
>
> Have you tested v5.18? Several improvements have happened since v5.5.
>
>> Anyone know how to work around this?
>
> Have you enabled IRQ_TIME_ACCOUNTING?

CONFIG_IRQ_TIME_ACCOUNTING=y.

> When the time spent under interrupt becomes significant, the scheduler
> migrates the task to another cpu.

My board has two cpus, and I used iperf3 to test upload bandwidth; I then saw the same
situation: the iperf3 thread runs on the same cpu as the NET_RX softirq.

After debugging in find_busiest_group(), I noticed that when an idle cpu (env->idle is
CPU_IDLE or CPU_NEWLY_IDLE) tries to pull a task, busiest->group_type == group_fully_busy,
busiest->sum_h_nr_running == 1 and local->group_type == group_has_spare, so the load
balance fails in find_busiest_group(), as follows:

find_busiest_group():
	...
	if (busiest->group_type != group_overloaded) {
		...
		if (busiest->sum_h_nr_running == 1)
			goto out_balanced;	----> load balance returns here
		...

Thanks,
Qiao

> Vincent
>
>> Thanks,
>> David
> .
>