References: <20190729211014.39333-1-ndesaulniers@google.com>
From: Nick Desaulniers
Date: Wed, 7 Aug 2019 14:12:15 -0700
Subject: Re: [PATCH] mips: avoid explicit UB in assignment of mips_io_port_base
To: "Maciej W. Rozycki"
Cc: Ralf Baechle, Paul Burton, James Hogan, Nathan Chancellor,
    Eli Friedman, Hassan Naveed, Stephen Kitt, Serge Semin,
    Mike Rapoport, Andrew Morton, Michal Hocko,
    linux-mips@vger.kernel.org, LKML, clang-built-linux,
    regehr@cs.utah.edu, Philip Reames, Alexander Potapenko
X-Mailing-List: linux-kernel@vger.kernel.org

Sorry for the delayed response, literally sent the patch then went on
vacation.

On Mon, Jul 29, 2019 at 3:16 PM Maciej W. Rozycki wrote:
>
> On Mon, 29 Jul 2019, Nick Desaulniers wrote:
>
> > The code in question is modifying a variable declared const through
> > pointer manipulation. Such code is explicitly undefined behavior, and
> > is the lone issue preventing malta_defconfig from booting when built
> > with Clang:
> >
> > If an attempt is made to modify an object defined with a const-qualified
> > type through use of an lvalue with non-const-qualified type, the
> > behavior is undefined.
> >
> > LLVM is removing such assignments. A simple fix is to not declare
> > variables const that you plan on modifying. Limiting the scope would be
> > a better method of preventing unwanted writes to such a variable.

This is now documented in the LLVM release notes for Clang-9:
https://github.com/llvm/llvm-project/commit/e39e79358fcdd5d8ad809defaa821f0bbfa809a5

> >
> > Further, the code in question mentions "compiler bugs" without any links
> > to bug reports, so it is difficult to know if the issue is resolved in
> > GCC. The patch was authored in 2006, which would have been GCC 4.0.3 or
> > 4.1.1. The minimal supported version of GCC in the Linux kernel is
> > currently 4.6.
>
> It's somewhat older than that. My investigation points to:
>
> commit c94e57dcd61d661749d53ee876ab265883b0a103
> Author: Ralf Baechle
> Date: Sun Nov 25 09:25:53 2001 +0000
>
> Cleanup of include/asm-mips/io.h. Now looks neat and harmless.

Oh indeed, great find! So it looks to me like the order of events is:

1. https://github.com/jaaron/linux-mips-ip30/commit/c94e57dcd61d661749d53ee876ab265883b0a103
   in 2001 first introduces the UB. mips_io_port_base is defined
   non-const in arch/mips/kernel/setup.c, but then declared extern const
   (and modified via UB) in include/asm-mips/io.h. A setter is created,
   but not a getter (I'll revisit this below). This appears to work (due
   to luck) for a few years until:
2. https://github.com/mpe/linux-fullhistory/commit/966f4406d903a4214fdc74bec54710c6232a95b8
   in 2006 adds a compiler barrier (reload all variables) and this
   appears to work. The commit message mentions that reads after
   modification of the const variable were buggy (likely GCC started
   taking advantage of the explicit UB around this time as well). This
   isn't a fix for UB (more thoughts below), but appears to work.
3. https://github.com/llvm/llvm-project/commit/b45631090220b732e614b5530bbd1d230eb9d38e
   in 2019 removes writes to const variables in LLVM as that's explicit
   UB. We observe the boot failure in mips and narrow it down to this
   instance.

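
To make the chronology above concrete, here is a minimal sketch of the
two shapes of setter being discussed. All names (demo_io_base,
demo_set_io_base, demo_get_io_base) are hypothetical stand-ins and this
is not the kernel's actual code. The problematic variant is shown only
inside a comment, because executing it is undefined behavior.

```c
#include <assert.h>

/*
 * Problematic shape (kept as a comment; running it would be UB):
 *
 *   const unsigned long demo_io_base_const = 0;
 *
 *   static void demo_set_io_base_ub(unsigned long base)
 *   {
 *       // Write to an object defined const, through a cast: UB.
 *       *(unsigned long *)&demo_io_base_const = base;
 *       // A compiler barrier only forces later reloads; it does not
 *       // make the store above defined.
 *       __asm__ __volatile__("" ::: "memory");
 *   }
 */

/* Well-defined shape: the object that gets written is simply not const. */
static unsigned long demo_io_base;

static void demo_set_io_base(unsigned long base)
{
    demo_io_base = base;            /* plain store to a non-const object */
}

static unsigned long demo_get_io_base(void)
{
    return demo_io_base;            /* later reads observe the store */
}

/* Small self-check for the well-defined variant. */
static void demo_self_check(void)
{
    demo_set_io_base(0x1000UL);
    assert(demo_get_io_base() == 0x1000UL);
}
```

With the non-const definition the store is ordinary C, so a compiler has
no license to drop it, and later reads do not depend on a barrier to see
the new value.
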
I can see how throwing a compiler barrier in there made subsequent reads
after UB writes appear to work, but that was due more to luck and
implementation details of GCC (which could change in future versions)
than to addressing the heart of the issue, i.e. not writing code that is
explicitly undefined behavior. Stated another way, the fix for explicit
UB is not hacks, but avoiding the UB by rewriting the problematic code.

> However the purpose of the arrangement does not appear to me to be
> particularly specific to a compiler version.
>
> > For what its worth, there was UB before the commit in question, it just
> > added a barrier and got lucky IRT codegen. I don't think there's any
> > actual compiler bugs related, just runtime bugs due to UB.
>
> Does your solution preserves the original purpose of the hack though as
> documented in the comment you propose to be removed?

The function modified simply writes to a global variable. It's not
clear to me why the value about to be modified would EVER be loaded
before modification.

> Clearly it was defined enough to work for almost 18 years, so it would be
> good to keep the optimisation functionally by using different means that
> do not rely on UB.

"Defined enough" ???
https://youtu.be/Aq_1l316ow8?t=17

> This variable is assigned at most once throughout the
> life of the kernel and then early on, so considering it r/w with all the
> consequences for all accesses does not appear to me to be a good use of
> it.

Note: it's not possible to express the semantics of a "write once
variable" in C short of static initialization (AFAIK, at least not
without explicitly invoking UB, but Cunningham's Law may apply).
(set_io_port_base is called in ~20 places)

Thinking more about this while I was away, I think what this code has
needed since 2001 is proper encapsulation. If you want a variable that
is written from one place only, but readable throughout, then the
pattern I'd use is:

1. declare a getter in a .h file.
2. define/qualify `mips_io_port_base` as `static` and non-const in a .c
   file where it's modified.
3. define the getter and setter in the above .c file.

That would rely on linkage to limit the visibility of the symbol for
modification. But we'd then need to export the getter rather than the
symbol itself. There are also on the order of 20 call sites that would
need to be changed to invoke the getter rather than read the raw
variable. Also, it's unlikely the getter gets inlined across translation
units (short of LTO, which the mainline kernel doesn't support today).

I think my patch here (https://lkml.org/lkml/2019/7/29/1636) is minimal
and much less invasive.

> Maybe a piece of inline asm to hide the initialisation or suchlike then?

I think that would still be UB as the definition would not be changed;
you'd still be modifying a variable declared const.

--
Thanks,
~Nick Desaulniers
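
For reference, the getter/setter pattern from the three numbered steps
above could look roughly like the sketch below. It is collapsed into a
single file for brevity, and the names (demo_port_base,
demo_get_port_base, demo_set_port_base) are hypothetical. This is an
illustration of the encapsulation idea, not a proposed kernel patch.

```c
#include <assert.h>

/* Step 1: what a header would carry (hypothetical demo_io.h). */
unsigned long demo_get_port_base(void);
void demo_set_port_base(unsigned long base);

/*
 * Step 2: in exactly one .c file, the variable is static and non-const.
 * Internal linkage keeps other translation units from writing it.
 */
static unsigned long demo_port_base;

/* Step 3: getter and setter are defined next to the variable. */
unsigned long demo_get_port_base(void)
{
    return demo_port_base;
}

void demo_set_port_base(unsigned long base)
{
    demo_port_base = base;
}

/* Small self-check: readers go through the getter, never the symbol. */
static void demo_encapsulation_check(void)
{
    demo_set_port_base(0xb4000000UL);
    assert(demo_get_port_base() == 0xb4000000UL);
}
```

The trade-off is the one already noted: every reader pays a function
call unless the getter can be inlined, which across translation units
generally needs LTO.
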