From: Nick Desaulniers
Date: Tue, 10 Nov 2020 14:39:18 -0800
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
To: Arvind Sankar
Cc: Adrian Ratiu, Nathan Chancellor, Arnd Bergmann, Linux ARM, clang-built-linux, Russell King, LKML, Collabora Kernel ML, Ard Biesheuvel
References: <20201106051436.2384842-1-adrian.ratiu@collabora.com>
 <20201106051436.2384842-3-adrian.ratiu@collabora.com>
 <20201106101419.GB3811063@ubuntu-m3-large-x86>
 <87wnyyvh56.fsf@collabora.com>
 <871rh2i9xg.fsf@iwork.i-did-not-set--mail-host-address--so-tickle-me>
 <20201110221511.GA1373528@rani.riverdale.lan>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 10, 2020 at 2:36 PM Nick Desaulniers wrote:
>
> On Tue, Nov 10, 2020 at 2:15 PM Arvind Sankar wrote:
> >
> > On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote:
> > > On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu wrote:
> > > >
> > > > On Fri, 06 Nov 2020, Nick Desaulniers wrote:
> > > > > +#pragma clang loop vectorize(enable)
> > > > >          do {
> > > > >                  p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0];
> > > > >                  p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
> > > > > ``` seems to generate the
> > > > > vectorized code.
> > > > >
> > > > > Why don't we find a way to make those pragmas more toolchain
> > > > > portable, rather than open coding them like I have above, instead
> > > > > of this series?
> > > >
> > > > Hi again Nick,
> > > >
> > > > How did you verify the above pragmas generate correct vectorized
> > > > code? Have you tested this specific use case?
> > >
> > > I read the disassembly before and after my suggested use of pragmas;
> > > look for vld/vstr. You can also add -Rpass-missed=loop-vectorize to
> > > CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild
> > > arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled.
> > >
> >
> > https://godbolt.org/z/1oo9M6
> >
> > With the __restrict__ keywords added, clang seems to vectorize the loop,
> > but still reports that vectorization wasn't beneficial -- any idea
> > what's going on?

Anyways, it's not safe to make that change in the kernel unless you
can guarantee that the buffers passed by callers of these routines do
not alias or overlap.

>
> I suspect that loop-vectorize is a higher level pass that relies on
> slp-vectorizer for the transform.
>
> $ clang -O2 --target=arm-linux-gnueabi -S -o - foo.c -mfpu=neon -mllvm
> -print-after-all
> ...
> *** IR Dump After SLP Vectorizer ***
> (bunch of <4 x i32> types)
>
> If you add -Rpass-missed=slp-vectorizer, observe that the existing
> warnings from -Rpass-missed=loop-vectorize disappear; I suspect
> loop-vectorize will print a "remark" if passes it calls did not, but
> returned some form of error code.
>
> -Rpass=slp-vectorizer shows that it vectorizes two sequences of the
> loop, and warns that some third portion (that's
> non-immediately-obvious to me) was not beneficial.
>
> --
> Thanks,
> ~Nick Desaulniers

--
Thanks,
~Nick Desaulniers
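
To make the aliasing caveat concrete, here is a minimal sketch, assuming a
simplified five-source XOR loop. The names xor_5_scalar and xor_5_restrict
are hypothetical and are not the routines in arch/arm/lib/xor-neon.c; the
sketch only illustrates what the __restrict__ qualifier promises to the
compiler. With __restrict__, the caller guarantees that none of the buffers
overlap, which is what lets the vectorizer reorder loads and stores. If a
caller breaks that promise, the behavior is undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical baseline: no aliasing promise, so the compiler must assume
 * p1 may overlap any of the source buffers. */
void xor_5_scalar(size_t n, unsigned long *p1,
                  const unsigned long *p2, const unsigned long *p3,
                  const unsigned long *p4, const unsigned long *p5)
{
    for (size_t i = 0; i < n; i++)
        p1[i] ^= p2[i] ^ p3[i] ^ p4[i] ^ p5[i];
}

/* Hypothetical variant: __restrict__ tells the compiler the five buffers
 * never overlap. This is only valid if every caller can guarantee it. */
void xor_5_restrict(size_t n, unsigned long *__restrict__ p1,
                    const unsigned long *__restrict__ p2,
                    const unsigned long *__restrict__ p3,
                    const unsigned long *__restrict__ p4,
                    const unsigned long *__restrict__ p5)
{
#ifdef __clang__
/* Same loop hint as quoted in the thread; ignored by other compilers. */
#pragma clang loop vectorize(enable)
#endif
    for (size_t i = 0; i < n; i++)
        p1[i] ^= p2[i] ^ p3[i] ^ p4[i] ^ p5[i];
}
```

For non-overlapping buffers both functions should produce the same result.
Building the file with the remark flags already mentioned in this thread
(-Rpass=loop-vectorize, -Rpass-missed=loop-vectorize, -Rpass=slp-vectorizer)
is one way to see which pass, if any, reports vectorizing each loop.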