From: Nick Desaulniers
Date: Tue, 10 Nov 2020 14:39:59 -0800
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
To: Arvind Sankar
Cc: Adrian Ratiu, Nathan Chancellor, Arnd Bergmann, Linux ARM, clang-built-linux, Russell King, LKML, Collabora Kernel ML, Ard Biesheuvel
References: <20201106051436.2384842-1-adrian.ratiu@collabora.com>
 <20201106051436.2384842-3-adrian.ratiu@collabora.com>
 <20201106101419.GB3811063@ubuntu-m3-large-x86>
 <87wnyyvh56.fsf@collabora.com>
 <871rh2i9xg.fsf@iwork.i-did-not-set--mail-host-address--so-tickle-me>
 <20201110221511.GA1373528@rani.riverdale.lan>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 10, 2020 at 2:39 PM Nick Desaulniers wrote:
>
> On Tue, Nov 10, 2020 at 2:36 PM Nick Desaulniers wrote:
> >
> > On Tue, Nov 10, 2020 at 2:15 PM Arvind Sankar wrote:
> > >
> > > On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote:
> > > > On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu wrote:
> > > > >
> > > > > On Fri, 06 Nov 2020, Nick Desaulniers wrote:
> > > > > > +#pragma clang loop vectorize(enable)
> > > > > >          do {
> > > > > >                  p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; p1[1] ^=
> > > > > >                  p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
> > > > > > ``` seems to generate the vectorized code.
> > > > > >
> > > > > > Why don't we find a way to make those pragmas more toolchain
> > > > > > portable, rather than open coding them like I have above rather
> > > > > > than this series?
> > > > >
> > > > > Hi again Nick,
> > > > >
> > > > > How did you verify the above pragmas generate correct vectorized
> > > > > code? Have you tested this specific use case?
> > > >
> > > > I read the disassembly before and after my suggested use of pragmas;
> > > > look for vld/vstr. You can also add -Rpass-missed=loop-vectorize to
> > > > CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild
> > > > arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled.
> > > >
> > >
> > > https://godbolt.org/z/1oo9M6
> > >
> > > With the __restrict__ keywords added, clang seems to vectorize the loop,
> > > but still reports that vectorization wasn't beneficial -- any idea
> > > what's going on?
>
> Anyways, it's not safe to make that change in the kernel unless you
> can guarantee that callers of these routines do not alias or overlap.

s/callers/parameters passed by callers/

> > >
> > I suspect that loop-vectorize is a higher level pass that relies on
> > slp-vectorizer for the transform.
> >
> > $ clang -O2 --target=arm-linux-gnueabi -S -o - foo.c -mfpu=neon -mllvm
> > -print-after-all
> > ...
> > *** IR Dump After SLP Vectorizer ***
> > (bunch of <4 x i32> types)
> >
> > If you add -Rpass-missed=slp-vectorizer, observe that the existing
> > warnings from -Rpass-missed=loop-vectorize disappear; I suspect
> > loop-vectorize will print a "remark" if passes it calls did not, but
> > returned some form of error code.
> >
> > -Rpass=slp-vectorizer shows that it vectorizes two sequences of the
> > loop, and warns that some third portion (that's
> > non-immediately-obvious to me) was not beneficial.
> >
> > --
> > Thanks,
> > ~Nick Desaulniers
>
>
> --
> Thanks,
> ~Nick Desaulniers

--
Thanks,
~Nick Desaulniers
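
For illustration only, here is a minimal C sketch of why the aliasing
guarantee discussed in this message matters. It is not the kernel's xor-neon
code: xor_scalar, xor_blocked and same_result are hypothetical names, and
xor_blocked merely emulates what a 4-wide vectorized loop does (load a block,
XOR, store the block). When the destination partially overlaps the source, the
scalar loop carries a dependency from one iteration to the next that the
blocked version does not honor, so the two can disagree.

```c
#include <stddef.h>
#include <string.h>

#define N 8 /* words processed per call; a multiple of 4 */

/* Scalar reference loop: every store is visible to later loads. */
static void xor_scalar(unsigned long *p1, const unsigned long *p2, size_t n)
{
	for (size_t i = 0; i < n; i++)
		p1[i] ^= p2[i];
}

/*
 * Emulates a 4-wide vector loop (assumption: n is a multiple of 4):
 * load four words from each input, XOR them, then store four words.
 */
static void xor_blocked(unsigned long *p1, const unsigned long *p2, size_t n)
{
	for (size_t i = 0; i < n; i += 4) {
		unsigned long a[4], b[4];

		memcpy(a, p1 + i, sizeof(a));
		memcpy(b, p2 + i, sizeof(b));
		for (int k = 0; k < 4; k++)
			a[k] ^= b[k];
		memcpy(p1 + i, a, sizeof(a));
	}
}

/*
 * Runs both loops with the destination starting `shift` words after the
 * source inside one buffer; returns 1 if the results match, 0 otherwise.
 */
static int same_result(size_t shift)
{
	unsigned long x[2 * N], y[2 * N];

	for (size_t i = 0; i < 2 * N; i++)
		x[i] = y[i] = i + 1;

	xor_scalar(x + shift, x, N);
	xor_blocked(y + shift, y, N);
	return memcmp(x, y, sizeof(x)) == 0;
}
```

With shift equal to N the two ranges are disjoint and both loops agree; with
shift equal to 1 the ranges overlap and the results differ. That difference is
the kind of behavior change a restrict qualifier would license the compiler to
introduce, which is why the parameters passed by callers would need to be
known not to alias or overlap.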