Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp332654ybz; Tue, 21 Apr 2020 09:51:39 -0700 (PDT) X-Google-Smtp-Source: APiQypKmTEncRjMez2AV7zok/TnBaJ6bw7xSlmnfXNwboOGvKOa0LYC3ZhBzwaTtoS6WeY2vwjQM X-Received: by 2002:a17:906:e210:: with SMTP id gf16mr22453874ejb.214.1587487899531; Tue, 21 Apr 2020 09:51:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587487899; cv=none; d=google.com; s=arc-20160816; b=AK1iuEUpudWgyZRWJI0jzdsJ9AsHlmPYM1BD1A/dwLwpnUljorLelzEpYrbO5xxDON 0OkHQEJbKxehq4UE/aYTfzMfUsyjci4m3f6Y9Oq67NP9uqSp6vGQKolThWLGRGldGwyB 74yHmIRmOL6m1qgORp0rB5KuuyjN5z3bm13tKSoo0AId3fzVwcoPalK4aQZQvgQaxfjo SLyTsC6OiWirPjPeDW//lrvFmMP7g1NUS4sEERETEz4YPh/pcri8T8wlI5DIw331VHxs /4NMXYzjL5/e7lBTFGQMPHxAi0K+3Vso6nVqw8IzdaYUI/lZ6X5YrLzI264PaRdJDQSO q8Jg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=T410BgQnQuXKYxEscOwjOVqkA5eP3x59KmpAyvDkg4s=; b=p6LJ2/j+zgdR7cZZCknGWbx3FieDvn5L49ANH4GYuy/TDxU6X8HkqUaJWfIjo4grrQ hgZnGYOmjB3ykF9+KrUDe2viE8doTNENq8SaVA8c/iVBkVUu6617PYHE2xz0yp/W5P8K BA3F0za+u9u/U1rvJa6IHJmJomUPb58zaivuKkHAWN7hOAEI4Igv1E2ym/UpWCVaA9Lb C9QJ2kBVGyUZ6ZB7kavPpz+2Y3Tin+5e93XdO2DxMlRjegmStxMtfSYSsAEkHrRykAMB NtD4+bFnIf5Rc9HE6AXP3vpTVKN+L4HL4YRw78b3AJABddPqWBu8VuUwZw+a9MqOZrSH 6zlQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="l0Bm/tax"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b12si1850596ejq.312.2020.04.21.09.51.15; Tue, 21 Apr 2020 09:51:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="l0Bm/tax"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728400AbgDUQt3 (ORCPT + 99 others); Tue, 21 Apr 2020 12:49:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1725963AbgDUQt2 (ORCPT ); Tue, 21 Apr 2020 12:49:28 -0400 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10A01C061A41; Tue, 21 Apr 2020 09:49:28 -0700 (PDT) Received: by mail-wm1-x344.google.com with SMTP id h2so4317119wmb.4; Tue, 21 Apr 2020 09:49:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=T410BgQnQuXKYxEscOwjOVqkA5eP3x59KmpAyvDkg4s=; b=l0Bm/taxusuNTxZw1FbXGtXDpfSusKDst3J4oiI9sMkyBNO92zGgBZEvRDjytEhR6X sB/4KTOtNLlSxcjR6/4guGlb5P5XX8fKUqAAPtLWS5GFAfm106bRNkpHF8/mZoJTe04u dtqfbHV4XrVi1WAtZ7jce6ZnnbjqSAvv0uB1+VwHUMFZ8YhkKTPvPQMbdIudkIjGQzIx oAwj6DIDHJhxi8hmuNAZsSlYkXuinZzdmnU0ZwjzHUrCgyFEE1BgIMc4ng71EE9W3edY 7440FDFYLkQpT0FVi6xrpzsUsvqqDL+iDRBuX9UTajdAA3Kqfe+n6oHBe0o21cyL4Yri xYJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=T410BgQnQuXKYxEscOwjOVqkA5eP3x59KmpAyvDkg4s=; b=pOcuCpSNqY5DZ5O2tRyPU0bq3XmdECqjo0nKTJftjG6Vzq24tYzyZay4O2WJs0T0x/ k4+qe0RNeT9xZNExyI2v2+1wiJlLHV/UQDIyMcPtdyYGv6UTiA2r2Dq+6X7ofph9P0Hb NMWMDpp7KQloXmTmfTBqnl1jYik4eAvIHOhzeuhDLwYSLgkqZtcWikMoOfx/jnABdlo2 GCggvVlYaDuAedpvHHdh0KNGfKiWqRG311MbhlJcK2owG/K5EM0i+AMxIaMUICPDhobh e2KNB/Dtt/jSFxafDHlsn3t8euVF5Tc2UdgIkPDBCyYvRz0nrV2NNnjpjATMDPfZoepU QY/g== X-Gm-Message-State: AGi0PubDin/liPWfcYf9dAwkL8sQK5y2+TQ0poDKDsqueWTyzUQoOwEw x83B9awDSsEqiwwH5c0loQ== X-Received: by 2002:a1c:544c:: with SMTP id p12mr5730030wmi.88.1587487766805; Tue, 21 Apr 2020 09:49:26 -0700 (PDT) Received: from avx2 ([46.53.252.84]) by smtp.gmail.com with ESMTPSA id h3sm4336863wrm.73.2020.04.21.09.49.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Apr 2020 09:49:26 -0700 (PDT) Date: Tue, 21 Apr 2020 19:49:24 +0300 From: Alexey Dobriyan To: Steven Rostedt Cc: Andy Shevchenko , akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, pmladek@suse.com, sergey.senozhatsky@gmail.com, linux@rasmusvillemoes.dk Subject: Re: [PATCH 03/15] print_integer: new and improved way of printing integers Message-ID: <20200421164924.GB8735@avx2> References: <20200420205743.19964-1-adobriyan@gmail.com> <20200420205743.19964-3-adobriyan@gmail.com> <20200420211911.GC185537@smile.fi.intel.com> <20200420212723.GE185537@smile.fi.intel.com> <20200420215417.6e2753ee@oasis.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20200420215417.6e2753ee@oasis.local.home> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Apr 20, 2020 at 09:54:17PM -0400, Steven Rostedt wrote: > On Tue, 21 Apr 2020 00:27:23 +0300 > Andy Shevchenko wrote: > > > > > > > > TODO > > > > benchmark with mainline because nouveau is broken for me -( > > > > vsnprintf() changes make the code slower > > > > > > Exactly main point of this exercise. I don't believe that algos in vsprintf.c > > > are too dumb to use division per digit (yes, division by constant which is not > > > power of two is a heavy operation). > > > > > > > And second point here, why not to use existing algos from vsprintf.c? > > Exactly. The code in _print_integer_u32() doesn't look as fast as the > code in vsprintf() that happens to use lookup tables and converts > without any loops. > > Hint, loops are bad, they cause the CPU to slow down. Oh, come on! Loops make code fit into icache and μop decode cache. > Anyway, this patch series would require a pretty good improvement, as > the code replacing the sprintf() usages is pretty ugly compared to a > simple sprintf() call. No! Fast code must look ugly. Or in other words if you try to optimise integer printing to death you'll probably end with something like _print_integer(). When the very first patch changed /proc/stat to seq_put_decimal_ull() the speed up was 66% (or 33%). That's how slow printing was back then. It can be made slightly faster even now. > Randomly picking patch 6: > > static int loadavg_proc_show(struct seq_file *m, void *v) > { > unsigned long avnrun[3]; > > get_avenrun(avnrun, FIXED_1/200, 0); > > seq_printf(m, "%lu.%02lu %lu.%02lu %lu.%02lu %u/%d %d\n", > LOAD_INT(avnrun[0]), LOAD_FRAC(avnrun[0]), > LOAD_INT(avnrun[1]), LOAD_FRAC(avnrun[1]), > LOAD_INT(avnrun[2]), LOAD_FRAC(avnrun[2]), > nr_running(), nr_threads, > idr_get_cursor(&task_active_pid_ns(current)->idr) - 1); > return 0; > } > > *vs* > > static int loadavg_proc_show(struct seq_file *m, void *v) > { > unsigned long avnrun[3]; > char buf[3 * (LEN_UL + 1 + 2 + 1) + 10 + 1 + 10 + 1 + 10 + 1]; > char *p = buf + sizeof(buf); > int i; > > *--p = '\n'; > p = _print_integer_u32(p, idr_get_cursor(&task_active_pid_ns(current)->idr) - 1); > *--p = ' '; > p = _print_integer_u32(p, nr_threads); > *--p = '/'; > p = _print_integer_u32(p, nr_running()); > > get_avenrun(avnrun, FIXED_1/200, 0); > for (i = 2; i >= 0; i--) { > *--p = ' '; > --p; /* overwritten */ > *--p = '0'; /* conditionally overwritten */ > (void)_print_integer_u32(p + 2, LOAD_FRAC(avnrun[i])); > *--p = '.'; > p = _print_integer_ul(p, LOAD_INT(avnrun[i])); > } > > seq_write(m, p, buf + sizeof(buf) - p); > return 0; > } > > > I much rather keep the first version. I did the benchmarks (without stack protector though), everything except /proc/cpuinfo and /proc/meminfo became faster. This requires investigation and I can drop vsprintf() changes until then. Now given that /proc/uptime format cast in stone, code may look a bit ugly and unusual but it won't require maintainance