Date: Wed, 21 Jul 2021 22:10:29 +0200
From: David Sterba
To: Linus Torvalds
Cc: Nikolay Borisov, Linux Kernel Mailing List, Nick Desaulniers, linux-fsdevel, Dave Chinner
Subject: Re: [PATCH] lib/string: Bring optimized memcmp from glibc
Message-ID: <20210721201029.GQ19710@twin.jikos.cz>
Reply-To: dsterba@suse.cz
References: <20210721135926.602840-1-nborisov@suse.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 21, 2021 at 11:00:59AM -0700, Linus Torvalds wrote:
> On Wed, Jul 21, 2021 at 6:59 AM Nikolay Borisov wrote:
> >
> > This is glibc's memcmp version. The upside is that for architectures
> > which don't have an optimized version the kernel can provide some
> > solace in the form of a generic, word-sized optimized memcmp. I tested
> > this with a heavy IOCTL_FIDEDUPERANGE(2) workload and here are the
> > results I got:
>
> Hmm. I suspect the usual kernel use of memcmp() is _very_ skewed to
> very small memcmp calls, and I don't think I've ever seen that
> (horribly bad) byte-wise default memcmp in most profiles.
>
> I suspect that FIDEDUPERANGE thing is most likely a very special case.
>
> So I don't think you're wrong to look at this, but I think you've gone
> from our old "spend no effort at all" to "look at one special case".
The memcmp in question is in fs/remap_range.c:vfs_dedupe_file_range_compare

	src_addr = kmap_atomic(src_page);
	dest_addr = kmap_atomic(dest_page);
	...
	if (memcmp(src_addr + src_poff, dest_addr + dest_poff, cmp_len))
		same = false;

	kunmap_atomic(dest_addr);
	kunmap_atomic(src_addr);

so adding a memcmp_large that compares by native words or u64 could be
the best option. The starting offset and length may not be aligned, but
those cases can be special-cased and fall back to the standard memcmp.
The dedupe ioctl is typically called on ranges spanning many pages, so
the overhead of the non-paged portions should be insignificant.
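To make the suggestion concrete, here is a rough userspace sketch of what such a helper might look like. It is only an illustration under stated assumptions: memcmp_large is the hypothetical name used above, not an existing kernel function, and this version only reports equal or not equal (which is all the dedupe path checks) rather than the ordering that memcmp() returns. Unaligned pointers or lengths take the fallback path described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not an existing kernel function. Returns 0 if the
 * two buffers are equal and nonzero otherwise; unlike memcmp() it makes
 * no promise about which buffer sorts first.
 */
static int memcmp_large(const void *a, const void *b, size_t len)
{
	const unsigned char *pa = a;
	const unsigned char *pb = b;
	const size_t word = sizeof(uint64_t);

	/* Special case: an unaligned start or length falls back to memcmp(). */
	if (((uintptr_t)pa | (uintptr_t)pb | len) % word)
		return memcmp(a, b, len) != 0;

	/* Everything is a multiple of the word size: compare 8 bytes at a time. */
	for (; len; len -= word, pa += word, pb += word) {
		uint64_t wa, wb;

		/* memcpy() keeps the loads well-defined in portable C. */
		memcpy(&wa, pa, word);
		memcpy(&wb, pb, word);
		if (wa != wb)
			return 1;
	}
	return 0;
}
```

In the dedupe loop the call would take the place of the memcmp() shown above, with the same three arguments. A real implementation would more likely handle an unaligned head and tail separately and keep the word loop for the middle, instead of falling back for the whole range.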