From: Phil Elwell
Date: Mon, 23 May 2022 12:22:18 +0100
Subject: Re: vchiq: Performance regression since 5.18-rc1
To: Stefan Wahren, paulmck@kernel.org
Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne, Borislav Petkov,
    Minchan Kim, Mel Gorman, Juri Lelli, Thomas Gleixner,
    Sebastian Andrzej Siewior, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Linux ARM, regressions@lists.linux.dev,
    riel@surriel.com, viro@zeniv.linux.org.uk

On 23/05/2022 12:15, Stefan Wahren wrote:
> Hi Phil,
>
> On 23.05.22 at 13:01, Phil Elwell wrote:
>> Hi Stefan,
>>
>> On 23/05/2022 11:48, Stefan Wahren wrote:
>>> Hi Phil,
>>>
>>> On 23.05.22 at 11:29, Phil Elwell wrote:
>>>> Hi Stefan,
>>>>
>>>> On 23/05/2022 07:19, Stefan Wahren wrote:
>>>>> Hi Paul,
>>>>>
>>>>> On 23.05.22 at 06:48, Paul E. McKenney wrote:
>>>>>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> On 22.05.22 at 01:46, Paul E. McKenney wrote:
>>>>>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>>>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) I noticed a huge performance
>>>>>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>>>>>
>>>>>>>>> Usually I run "vchiq_test -f 1" to check that the driver is still working [1].
>>>>>>>>>
>>>>>>>>> Before commit:
>>>>>>>>>
>>>>>>>>> real    0m1,500s
>>>>>>>>> user    0m0,068s
>>>>>>>>> sys     0m0,846s
>>>>>>>>>
>>>>>>>>> After commit:
>>>>>>>>>
>>>>>>>>> real    7m11,449s
>>>>>>>>> user    0m2,049s
>>>>>>>>> sys     0m0,023s
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>> [1] - https://github.com/raspberrypi/userland
>>>>>>>> Please feel free to try the patch shown below.  Or the pair of patches
>>>>>>>> from Rik here:
>>>>>>>>
>>>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
>>>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/
>>>>>>> I tried your patch and Rik's patches, but in both cases vchiq_test runs for 7
>>>>>>> minutes instead of ~ 1 second.
>>>>>> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
>>>>> No, not explicitly.
>>>>>>    That would
>>>>>> nullify my patch, but I would expect that Rik's patch would still provide
>>>>>> increased performance even in that case.
>>>>> I will retest with a fresh SD card image.
>>>>>>
>>>>>> Could you please characterize where the slowdown is occurring?
>>>>>
>>>>> Unfortunately I don't have deep insight into the driver and the vchiq_test
>>>>> tool. Just a user's view.
>>>>>
>>>>> Do you think an strace would be a good starting point?
>>>>>
>>>>> @Phil Any advice on analysing this issue?
>>>>
>>>> Sending many small control packets:
>>>>
>>>>    vchiq_test -c 1 10000
>>>>
>>>> essentially tests interrupt latency. Using a small number of large bulk
>>>> transfers:
>>>>
>>>>    vchiq_test -b 10000 1
>>>>
>>>> becomes a test of how long it takes to lock down pages. It also tests DMA
>>>> transfer speeds, but since the DMA is run by the firmware (which you aren't
>>>> changing), I think you can rule that out.
>>> Thanks, I will try.
>>>>
>>>> You may also find it helpful to include "force_turbo=1" in config.txt for
>>>> more predictable results.
>>>>
>>>> By the way, running our 5.18-rc7-based branch on a 3B+ I'm not seeing any
>>>> performance problems:
>>> I assume you are using arm/bcm2709_defconfig and not arm/multi_v7_defconfig
>>> as I am?
>>
>> That's correct. Simply switching to multi_v7_defconfig breaks vchiq
>> completely, presumably because it doesn't define CONFIG_BCM2835_VCHIQ.
> Sorry, I forgot to mention that I enable VCHIQ as a module on top of
> multi_v7_defconfig.

Downstream tree with multi_v7_defconfig + CONFIG_BCM2835_VCHIQ:

pi@raspberrypi:~$ time vchiq_test -f 1
Functional test - iters:1
======== iteration 1 ========
Testing bulk transfer for alignment.
Testing bulk transfer at PAGE_SIZE.

real    0m0.566s
user    0m0.037s
sys     0m0.166s

Phil
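
For reference, the "real" timings quoted in this thread can be compared with a short script. This is only a sketch: the helper names parse_time and slowdown are hypothetical and not part of vchiq_test or any tool mentioned above, and it assumes the NmS,FFFs / NmS.FFFs format that the shell's time builtin prints (a decimal comma in some locales, a decimal point in others).

```python
import re


def parse_time(text: str) -> float:
    """Convert a shell `time` value such as '7m11,449s' or '0m0.566s' to seconds.

    Hypothetical helper: accepts either a decimal comma or a decimal point.
    """
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


def slowdown(before: str, after: str) -> float:
    """Return how many times longer `after` took compared with `before`."""
    return parse_time(after) / parse_time(before)


if __name__ == "__main__":
    # "real" values reported for "vchiq_test -f 1" before and after the commit.
    before, after = "0m1,500s", "7m11,449s"
    print(f"before:   {parse_time(before):.3f} s")
    print(f"after:    {parse_time(after):.3f} s")
    print(f"slowdown: {slowdown(before, after):.0f}x")
```

With the values reported at the start of the thread, the run after the commit takes roughly 288 times as long as the run before it.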