References: <9e4d6378-5032-8521-13a9-d9d9519d07de@amd.com>
From: Nick Desaulniers
Date: Tue, 15 Oct 2019 11:05:56 -0700
Subject: Re: AMDGPU and 16B stack alignment
To: Arnd Bergmann, "S, Shirish"
Cc: "Wentland, Harry", "Deucher, Alexander", "yshuiv7@gmail.com",
    "andrew.cooper3@citrix.com", clang-built-linux, Matthias Kaehlcke,
    "S, Shirish", "Zhou, David(ChunMing)", "Koenig, Christian",
    amd-gfx list, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 15, 2019 at 12:19 AM Arnd Bergmann wrote:
>
> On Tue, Oct 15, 2019 at 9:08 AM S, Shirish wrote:
> > On 10/15/2019 3:52 AM, Nick Desaulniers wrote:
>
> > My gcc build fails with below errors:
> >
> > dcn_calcs.c:1:0: error: -mpreferred-stack-boundary=3 is not between 4 and 12
> >
> > dcn_calc_math.c:1:0: error: -mpreferred-stack-boundary=3 is not between 4 and 12

I was able to reproduce this failure on pre-7.1 versions of GCC. It
seems that when:
1. code is using doubles
2. setting -mpreferred-stack-boundary=3 -mno-sse2, i.e. 8B stack alignment
then GCC produces that error:
https://godbolt.org/z/7T8nbH

That's already a tall order of constraints, so it's understandable
that the compiler would just error out, likely during instruction
selection, but it was eventually taught how to solve such constraints.

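To make the numbers in that error concrete: -mpreferred-stack-boundary=N
requests a stack aligned to 2^N bytes, so 3 means 8B and the accepted range
of 4 to 12 corresponds to 16B through 4096B. Below is a minimal Python sketch
of that arithmetic and of the range check as the error message describes it.
The function and constant names are hypothetical, and the 4..12 range is
taken from the error text above rather than from GCC's source, so treat it
as an illustration only.

```python
from typing import Optional

# Range quoted in the error message above. This is an assumption about the
# behaviour of the failing GCC versions, not something read from GCC's source.
MIN_LOG2 = 4
MAX_LOG2 = 12


def boundary_bytes(log2_align: int) -> int:
    """Alignment in bytes requested by -mpreferred-stack-boundary=<log2_align>."""
    return 1 << log2_align


def check_boundary(log2_align: int) -> Optional[str]:
    """Return the diagnostic text for an out-of-range value, else None."""
    if MIN_LOG2 <= log2_align <= MAX_LOG2:
        return None
    return (f"-mpreferred-stack-boundary={log2_align} is not between "
            f"{MIN_LOG2} and {MAX_LOG2}")


if __name__ == "__main__":
    # Print alignment in bytes and any diagnostic for a few sample values.
    for n in (3, 4, 12):
        print(n, boundary_bytes(n), check_boundary(n))
```

With these assumptions, a value of 3 (8B) is rejected while 4 (16B) is
accepted, which matches the failure reported for dcn_calcs.c and
dcn_calc_math.c.
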
> >
> > While GPF observed on clang builds seem to be fixed.

Thanks for the report. Your testing these patches is invaluable, Shirish!

>
> Ok, so it seems that gcc insists on having at least 2^4 bytes stack
> alignment when SSE is enabled on x86-64, but does not actually rely
> on that for correct operation unless it's using sse2. So -msse always
> has to be paired with -mpreferred-stack-boundary=3.

Seemingly only for older versions of GCC, pre 7.1.

>
> For clang, it sounds like the opposite is true: when passing 16 byte
> stack alignment and having sse/sse2 enabled, it requires the incoming
> stack to be 16 byte aligned,

I don't think it requires the incoming stack to be 16B aligned for
sse2; I think it requires the incoming and current stack alignment to
match. Today they do not match, which is why we observe GPFs.

> but passing 8 byte alignment makes it do the right thing.
>
> So, should we just always pass $(call cc-option, -mpreferred-stack-boundary=4)
> to get the desired outcome on both?

Hmmm...I would have liked to remove it outright, as it is an ABI
mismatch that is likely to result in instability and
non-fun-to-debug runtime issues in the future. I suspect my patch
does work for GCC 7.1+. The question is: Do we want to either:
1. mark AMDGPU broken for GCC < 7.1, or
2. continue supporting it via stack alignment mismatch?

2 is brittle, and may break at any point in the future, but if it's
working for someone it does make me feel bad to outright disable it.
What I'd imagine 2 looks like is (pseudocode in a Makefile):

if CC_IS_GCC && GCC_VERSION < 7.1:
  set stack alignment to 16B and hope for the best

So my diff would be amended to keep the stack alignment flags, but
only to support GCC < 7.1. And that assumes my change compiles with
GCC 7.1+. (Looks like it does for me locally with GCC 8.3, but I would
feel even more confident if someone with hardware to test on and GCC
7.1+ could boot test it.)

--
Thanks,
~Nick Desaulniers