Date: Tue, 21 Apr 2020 00:27:23 +0300
From: Andy Shevchenko
To: Alexey Dobriyan
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, pmladek@suse.com, rostedt@goodmis.org, sergey.senozhatsky@gmail.com, linux@rasmusvillemoes.dk
Subject: Re: [PATCH 03/15] print_integer: new and improved way of printing integers
Message-ID: <20200420212723.GE185537@smile.fi.intel.com>
References: <20200420205743.19964-1-adobriyan@gmail.com> <20200420205743.19964-3-adobriyan@gmail.com> <20200420211911.GC185537@smile.fi.intel.com>
In-Reply-To: <20200420211911.GC185537@smile.fi.intel.com>
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 21, 2020 at 12:19:11AM +0300, Andy Shevchenko wrote:
> On Mon, Apr 20, 2020 at 11:57:31PM +0300, Alexey Dobriyan wrote:
> > Time honored way to print integers via vsnprintf() or equivalent has
> > unavoidable slowdown of parsing format string. This can't be fixed in C,
> > without introducing external preprocessor.
> >
> > seq_put_decimal_ull() partially saves the day, but there are a lot of
> > branches inside and overcopying still.
> >
> > _print_integer_*() family of functions is meant to make printing
> > integers as fast as possible by deleting format string parsing and doing
> > as little work as possible.
> >
> > It is based on the following observations:
> >
> > 1) memcpy is done in forward direction
> >    it can be done backwards but nobody does that,
> >
> > 2) digits can be extracted in a very simple loop which costs only
> >    1 multiplication and shift (division by constant is not division)
> >
> > All the above asks for the following signature, semantics and pattern of
> > printing out beloved /proc files:
> >
> >     /* seq_printf(seq, "%u %llu\n", A, B); */
> >
> >     char buf[10 + 1 + 20 + 1];
> >     char *p = buf + sizeof(buf);
> >
> >     *--p = '\n';
> >     p = _print_integer_u64(p, B);
> >     *--p = ' ';
> >     p = _print_integer_u32(p, A);
> >
> >     seq_write(seq, p, buf + sizeof(buf) - p);
> >
> > 1) stack buffer capable of holding the biggest string is allocated.
> >
> > 2) "p" is pointer to start of the string.
> >    Initially it points past
> >    the end of the buffer WHICH IS NOT NUL-TERMINATED!
> >
> > 3) _print_integer_*() actually prints an integer from right to left
> >    and returns new start of the string.
> >
> >     <--------|
> >           123
> >           ^
> >           |
> >           +-- p
> >
> > 4) 1 character is printed with
> >
> >     *--p = 'x';
> >
> >    It generates very efficient code as multiple writes can be
> >    merged.
> >
> > 5) fixed string is printed with
> >
> >     p = memcpy(p - 3, "foo", 3);
> >
> >    Compilers know what memcpy() does and write-combine it.
> >    4/8-byte writes become 1 instruction and are very efficient.
> >
> > 6) Once everything is printed, the result is written to seq_file buffer.
> >    It does only one overflow check and 1 copy.
> >
> > This generates very efficient code (and small!).
> >
> > In regular seq_printf() calls, first argument and format string are
> > constantly reloaded. Format string will most likely be addressed with
> > [rip+...] which is quite verbose.
> >
> > seq_put_decimal_ull() will do branches (and even more branches
> > with "width" argument)
> >
>
> > TODO
> >     benchmark with mainline because nouveau is broken for me -(
> >     vsnprintf() changes make the code slower
>
> That is exactly the main point of this exercise. I don't believe that the
> algos in vsprintf.c are so dumb as to use a division per digit (yes, division
> by a constant which is not a power of two is a heavy operation).
>

And a second point here: why not use the existing algos from vsprintf.c?

-- 
With Best Regards,
Andy Shevchenko
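
For illustration only, the right-to-left pattern from the quoted patch description can be sketched in userspace C roughly as follows. The names print_u64_backward() and format_pair() are hypothetical and are not the patch's _print_integer_*() implementation; the sketch uses plain / and % and makes no claim about what the patch or vsprintf.c does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the _print_integer_u64() described above:
 * write the decimal digits of x so that they end just before p and
 * return the new start of the string. No NUL terminator is written.
 */
static char *print_u64_backward(char *p, uint64_t x)
{
	do {
		*--p = (char)('0' + x % 10);	/* least significant digit first */
		x /= 10;
	} while (x != 0);
	return p;
}

/*
 * Build "<a> <b>\n" right-aligned in buf, mirroring the quoted example.
 * Returns the start of the text and stores its length in *len.
 * buf must hold at least 10 + 1 + 20 + 1 bytes (max u32, space, max u64, newline).
 */
static char *format_pair(char *buf, size_t size, uint32_t a, uint64_t b,
			 size_t *len)
{
	char *end = buf + size;
	char *p = end;			/* points past the end, nothing written yet */

	assert(size >= 10 + 1 + 20 + 1);
	*--p = '\n';
	p = print_u64_backward(p, b);
	*--p = ' ';
	p = print_u64_backward(p, a);
	*len = (size_t)(end - p);
	return p;
}
```

A caller would pass the returned pointer and length to a single write (seq_write() in the quoted example), which is where the one overflow check and the one copy happen.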