Subject: Re: [PATCH 1/4] drivers core: Introduce CPU type sysfs interface
From: Brice Goglin
To: Greg Kroah-Hartman
Cc: Ricardo Neri, x86@kernel.org, Borislav Petkov, Ingo Molnar,
 Thomas Gleixner, "Rafael J. Wysocki", Tony Luck, Len Brown,
 "Ravi V. Shankar", linux-kernel@vger.kernel.org, Andi Kleen,
 Dave Hansen, "Gautham R. Shenoy", Kan Liang, Srinivas Pandruvada
Date: Wed, 18 Nov 2020 11:45:46 +0100
In-Reply-To: <33efde37-562f-4c6a-72ba-2277533e3781@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 17/11/2020 at 16:55, Brice Goglin wrote:
> On 12/11/2020 at 11:49, Greg Kroah-Hartman wrote:
>> On Thu, Nov 12, 2020 at 10:10:57AM +0100, Brice Goglin wrote:
>>> On 12/11/2020 at 07:42, Greg Kroah-Hartman wrote:
>>>> On Thu, Nov 12, 2020 at 07:19:48AM +0100, Brice Goglin wrote:
>>>>> Hello
>>>>>
>>>>> Sorry for the late reply. As the first userspace consumer of this
>>>>> interface [1], I can confirm that reading a single file to get the mask
>>>>> would be better, at least for performance reasons. On large platforms, we
>>>>> already have to read thousands of sysfs files to get CPU topology and
>>>>> cache information, I'd be happy not to read one more file per cpu.
>>>>>
>>>>> Reading these sysfs files is slow, and it does not scale well when
>>>>> multiple processes read them in parallel.
>>>> Really? Where is the slowdown? Would something like readfile() work
>>>> better for you for that?
>>>> https://lore.kernel.org/linux-api/20200704140250.423345-1-gregkh@linuxfoundation.org/
>>> I guess readfile would improve the sequential case by avoiding syscalls
>>> but it would not improve the parallel case since syscalls shouldn't have
>>> any parallel issue?
>> syscalls should not have parallel issues at all.
>>
>>> We've been watching the status of readfile() since it was posted on LKML
>>> 6 months ago, but we were actually wondering if it would end up being
>>> included at some point.
>> It needs a solid reason to be merged. My "test" benchmarks are fun to
>> run, but I have yet to find a real need for it anywhere as the
>> open/read/close syscall overhead seems to be lost in the noise on any
>> real application workload that I can find.
>>
>> If you have a real need, and it reduces overhead and cpu usage, I'm more
>> than willing to update the patchset and resubmit it.
>>
>>
> Hello
>
> I updated hwloc to use readfile instead of open+read+close on all those
> small sysfs/procfs files. Unfortunately the improvement is very small,
> only a couple of percent.
> On a 40-core server, our library starts in 38ms
> instead of 39ms. I can't deploy your patches on larger machines, but I
> tested our code on a copy of their sysfs files saved on a local disk:
> For a 256-thread KNL, we go from 15ms to 14ms. For an 896-core SGI
> machine, from 73ms to 71ms.

Sorry, I forgot to update some codepaths to properly use readfile
yesterday :/ Here are updated and more precise numbers that show a
non-negligible improvement. Again, we're measuring the entire hwloc
topology discovery, which includes reading many sysfs files (improved
thanks to readfile) and then building a hierarchy of objects describing
the machine (not modified).

Server sysfs files (dual-socket x 20 cores x SMT-2)
default  43.48ms +/-4.48
readfile 42.15ms +/-4.58 => 3.1% better
1971 readfile calls => 674ns improvement per call

Knights Landing sysfs stored on local hard drive (64 cores x SMT-4)
default  14.60ms +/-0.91
readfile 13.63ms +/-1.05 => 6.6% better
2940 readfile calls => 329ns improvement per call

SGI Altix UV sysfs stored on local hard drive (56 sockets x 8 cores x SMT-2)
default  69.12ms +/-1.40
readfile 66.03ms +/-1.35 => 4.5% better
14525 readfile calls => 212ns improvement per call

I don't know why the first case (real sysfs files) gets a much higher
standard deviation and higher improvement per readfile call. The other
two cases match what microbenchmarks say (about 200ns improvement per
readfile call).

Brice

>
> I see 200ns improvement for readfile (2300) vs open+read+close (2500) on
> my server when reading a single cpu topology file. With several
> thousands of sysfs files to read in the above large hwloc tests, it
> confirms an overall improvement in the order of 1ms.
>
> So, just like you said, the overhead seems to be pretty much lost in the
> noise of hwloc doing its own stuff after reading hundreds of sysfs files :/
>
> Brice
>
>
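
For anyone who wants to re-check the percentages and per-call figures
in this message, the arithmetic is just the time saved divided by the
default time, and the time saved divided by the number of readfile
calls. Below is a minimal sketch in Python that uses only the numbers
posted here. The names RUNS and improvement are illustrative, not part
of hwloc, and truncating to whole nanoseconds is an assumption about
how the posted per-call values were rounded.

```python
# Measurements copied from the message:
# (label, default_ms, readfile_ms, number_of_readfile_calls)
RUNS = [
    ("Server (dual-socket x 20 cores x SMT-2)", 43.48, 42.15, 1971),
    ("Knights Landing (64 cores x SMT-4)", 14.60, 13.63, 2940),
    ("SGI Altix UV (56 sockets x 8 cores x SMT-2)", 69.12, 66.03, 14525),
]


def improvement(default_ms, readfile_ms, calls):
    """Return (percent faster, nanoseconds saved per readfile call)."""
    saved_ms = default_ms - readfile_ms
    percent = 100.0 * saved_ms / default_ms
    # 1 ms = 1e6 ns; spread the total saving over all readfile calls.
    ns_per_call = saved_ms * 1e6 / calls
    return percent, ns_per_call


for label, default_ms, readfile_ms, calls in RUNS:
    percent, ns_per_call = improvement(default_ms, readfile_ms, calls)
    # int() truncates, which is assumed here to match the posted figures.
    print(f"{label}: {percent:.1f}% better, "
          f"{int(ns_per_call)}ns improvement per call")
```

Note that the total saving (roughly 1ms to 3ms) is smaller than the
standard deviation of the first case, so the per-call figure for the
real sysfs run should probably be read with more caution than the
other two.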