Date: Mon, 19 Mar 2018 13:30:21 -0700
From: Eric Biggers
To: Andiry Xu
Cc: Nikolay Borisov, Linux FS Devel, Linux Kernel Mailing List,
	"linux-nvdimm@lists.01.org", Dan Williams, "Rudoff, Andy",
	coughlan@redhat.com, Steven Swanson, Dave Chinner, Jan Kara,
	swhiteho@redhat.com, miklos@szeredi.hu, Jian Xu, Andiry Xu, Herbert Xu
Subject: Re: [RFC v2 05/83] Add NOVA filesystem definitions and useful helper routines.
Message-ID: <20180319203021.GA59118@gmail.com>
References: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
	<1520705944-6723-6-git-send-email-jix024@eng.ucsd.edu>
	<0924a2b3-6f21-4aaf-224d-2f5accc21d10@gmail.com>
	<20180311192256.GA630@zzz.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 19, 2018 at 12:39:55PM -0700, Andiry Xu wrote:
> On Sun, Mar 11, 2018 at 12:22 PM, Eric Biggers wrote:
> > On Sun, Mar 11, 2018 at 02:00:13PM +0200, Nikolay Borisov wrote:
> >> [Adding Herbert Xu to CC since he is the maintainer of the crypto subsys
> >> maintainer]
> >>
> >> On 10.03.2018 20:17, Andiry Xu wrote:
> >>
> >>
> >> > +static inline u32 nova_crc32c(u32 crc, const u8 *data, size_t len)
> >> > +{
> >> > +	u8 *ptr = (u8 *) data;
> >> > +	u64 acc = crc; /* accumulator, crc32c value in lower 32b */
> >> > +	u32 csum;
> >> > +
> >> > +	/* x86 instruction crc32 is part of SSE-4.2 */
> >> > +	if (static_cpu_has(X86_FEATURE_XMM4_2)) {
> >> > +		/* This inline assembly implementation should be equivalent
> >> > +		 * to the kernel's crc32c_intel_le_hw() function used by
> >> > +		 * crc32c(), but this performs better on test machines.
> >> > +		 */
> >> > +		while (len > 8) {
> >> > +			asm volatile(/* 64b quad words */
> >> > +				"crc32q (%1), %0"
> >> > +				: "=r" (acc)
> >> > +				: "r" (ptr), "0" (acc)
> >> > +			);
> >> > +			ptr += 8;
> >> > +			len -= 8;
> >> > +		}
> >> > +
> >> > +		while (len > 0) {
> >> > +			asm volatile(/* trailing bytes */
> >> > +				"crc32b (%1), %0"
> >> > +				: "=r" (acc)
> >> > +				: "r" (ptr), "0" (acc)
> >> > +			);
> >> > +			ptr++;
> >> > +			len--;
> >> > +		}
> >> > +
> >> > +		csum = (u32) acc;
> >> > +	} else {
> >> > +		/* The kernel's crc32c() function should also detect and use the
> >> > +		 * crc32 instruction of SSE-4.2. But calling in to this function
> >> > +		 * is about 3x to 5x slower than the inline assembly version on
> >> > +		 * some test machines.
> >>
> >> That is really odd. Did you try to characterize why this is the case? Is
> >> it purely the overhead of dispatching to the correct backend function?
> >> That's a rather big performance hit.
> >>
> >> > +		 */
> >> > +		csum = crc32c(crc, data, len);
> >> > +	}
> >> > +
> >> > +	return csum;
> >> > +}
> >> > +
> >
> > Are you sure that CONFIG_CRYPTO_CRC32C_INTEL was enabled during your tests and
> > that the accelerated version was being called? Or, perhaps CRC32C_PCL_BREAKEVEN
> > (defined in arch/x86/crypto/crc32c-intel_glue.c) needs to be adjusted. Please
> > don't hack around performance problems like this; if they exist, they need to be
> > fixed for everyone.
> >
>
> I have performed the crc32c test on a Xeon X5647 at 2.93GHz, 14G DDR3
> memory at 1066MHz platform.
> You are right that enabling CONFIG_CRYPTO_CRC32C_INTEL improves the
> performance significantly. nova_crc32c() is still slightly faster than
> crc32c() with the flag enabled.
>
> Result numbers are follows: data size in bytes, latency in ns, column
> 3 is crc32c() with CONFIG_CRYPTO_CRC32C_INTEL enabled and column 4
> disabled.
>
> data size (bytes)  nova_crc32c()  crc32c() -enabled  crc32c() -disabled
> 64                 19             21                 56
> 128                28             29                 99
> 256                46             43                 182
> 512                82             149                354
> 1024               157            232                728
> 2048               305            415                1440
> 4096               603            725                2869
>

Probably CRC32C_PCL_BREAKEVEN needs to be adjusted for that CPU, as I suggested
may be the case; notice that your measured speeds are about the same before 512
(CRC32C_PCL_BREAKEVEN) bytes, but the crypto API version is slower at >= 512
bytes. It would be possible to set the breakeven point in
crc32c_intel_mod_init() depending on the CPU.

Again, if the performance is not good enough you need to fix it for everyone,
not hack around it.

Thanks,

Eric
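
For illustration only, here is a minimal userspace sketch of the idea of a
length-based dispatch with an adjustable breakeven point. It is not the code
in arch/x86/crypto/crc32c-intel_glue.c. Every name in it (crc32c_breakeven,
crc32c_set_breakeven(), crc32c_dispatch(), crc32c_bitwise(),
crc32c_tabular()) is hypothetical, and a table-driven software routine stands
in for the accelerated long-buffer path. The point is only that the threshold
can be a variable chosen once at init time rather than a compile-time
constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC32C (Castagnoli) polynomial in reflected form. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Hypothetical tunable: buffers of at least this many bytes take the
 * long-buffer path. 512 is the breakeven value discussed above. */
static size_t crc32c_breakeven = 512;

static uint32_t crc32c_table[256];
static int crc32c_table_ready;

/* Short-buffer path: bit-at-a-time update with no setup cost. */
static uint32_t crc32c_bitwise(uint32_t crc, const uint8_t *data, size_t len)
{
	while (len--) {
		crc ^= *data++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & -(crc & 1u));
	}
	return crc;
}

/* Long-buffer path: byte-at-a-time table lookup. This is an assumption made
 * for the sketch; it pays a one-time setup cost, which is what makes a
 * breakeven point meaningful. */
static uint32_t crc32c_tabular(uint32_t crc, const uint8_t *data, size_t len)
{
	if (!crc32c_table_ready) {
		for (uint32_t i = 0; i < 256; i++) {
			uint8_t byte = (uint8_t)i;

			crc32c_table[i] = crc32c_bitwise(0, &byte, 1);
		}
		crc32c_table_ready = 1;
	}
	while (len--)
		crc = crc32c_table[(crc ^ *data++) & 0xffu] ^ (crc >> 8);
	return crc;
}

/* Hypothetical init-time hook: pick the threshold for the detected CPU. */
static void crc32c_set_breakeven(size_t bytes)
{
	crc32c_breakeven = bytes;
}

/* Raw CRC update (no pre/post inversion), dispatched on buffer length. */
static uint32_t crc32c_dispatch(uint32_t crc, const uint8_t *data, size_t len)
{
	assert(data != NULL || len == 0);
	if (len >= crc32c_breakeven)
		return crc32c_tabular(crc, data, len);
	return crc32c_bitwise(crc, data, len);
}
```

Both paths compute the same raw CRC update, so moving the threshold with
crc32c_set_breakeven() changes only which routine runs, never the result. A
caller that wants the conventional CRC32C value passes ~0 as the seed and
inverts the return value.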