Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp2855349ybi; Mon, 17 Jun 2019 11:30:36 -0700 (PDT) X-Google-Smtp-Source: APXvYqxUBSaksLQIQNiiLX3cAk9kHmOOlfNT1RPFKXHF+be78/xJzahrlZzhTam6yCT9vrdb6AAJ X-Received: by 2002:a63:dd53:: with SMTP id g19mr39678442pgj.3.1560792410832; Mon, 17 Jun 2019 10:26:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560792410; cv=none; d=google.com; s=arc-20160816; b=1GwJ01ihlnXq+b5oLFHe0Z8yKNquhJfwmBmlHoIIC56H2c4WCPRBG8J+x1p4KSvJFo dAqNyBUVMsal7cwf+DBDJuJY29/1cq8phz2/DM1p5G8SOxqXmmmRr6MpQ2C77Xt8Dy8i w4d1WOf2M2TOBt4hFs2BMFmDY5lj/jPcMumw4DCE8chRyP0/afUXPnqKmbaJDncrKqR3 IEm4BHgXRknw0n+NXCpEReJOaIk4J5MKX8SUw6Rz5huEDXwuJQC08a82IwpgUcS8qxWh o4FOUfbJP+mDShZorz7ltfRXtRjbK+EX1CDN6Ghabep4JFqXyhD7z077UfKCq8WK636Y 6/IA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=7VdbgjzwUK3rYxLrjxOj4k1lOI204uHk7xZBbLf6wbg=; b=jvLg/UF80mc/DAcB4MYBFqkOBYCA8TTOTJ4g42rrYrNMWXoeEOHV7JDNaMhumhvDSU lfTJRTpKbGuwveOjglaWNN7utdY6zbXD6DKiJ185LZeD/Zj27mcfnvvLNs4ih8l3sYvI 9ImUg7ZV7KIQty4v1DKoUFhU6Hu9CVjNdnQBqm7qACP4B0x3O4r8/0f8yl2KfDeSkNAS Zbo6smQdy26JO8+pm5FNOm1sQEjEDJUjtBWv6DcN9DLZYme+eeUj3Oil7mazIyjy31ai 02y8SQI2Tjbe3WFcqZrverJyGNK7V9QqnU7kaQ/aGjEcGJNeZhkXEvfY8zBbEtHaHGBO mfwQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id w4si10624228pgr.150.2019.06.17.10.26.35; Mon, 17 Jun 2019 10:26:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728385AbfFQR0Z (ORCPT + 99 others); Mon, 17 Jun 2019 13:26:25 -0400 Received: from foss.arm.com ([217.140.110.172]:57290 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726004AbfFQR0Y (ORCPT ); Mon, 17 Jun 2019 13:26:24 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D841628; Mon, 17 Jun 2019 10:26:23 -0700 (PDT) Received: from fuggles.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id B94A93F246; Mon, 17 Jun 2019 10:26:22 -0700 (PDT) Date: Mon, 17 Jun 2019 18:26:20 +0100 From: Will Deacon To: Ard Biesheuvel Cc: Kees Cook , Jayachandran Chandrasekharan Nair , "catalin.marinas@arm.com" , Jan Glauber , Linus Torvalds , "linux-kernel@vger.kernel.org" , "linux-arm-kernel@lists.infradead.org" Subject: Re: [RFC] Disable lockref on arm64 Message-ID: <20190617172620.GK30800@fuggles.cambridge.arm.com> References: <20190614070914.GA21961@dc5-eodlnx05.marvell.com> <20190614095846.GC10506@fuggles.cambridge.arm.com> <20190614103850.GG10659@fuggles.cambridge.arm.com> <201906142026.1BC27EDB1E@keescook> <201906150654.FF4400F7C8@keescook> <201906161429.BCE1083@keescook> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.11.1+86 (6f28e57d73f2) () Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jun 17, 2019 at 01:33:19PM +0200, Ard Biesheuvel wrote: > On Sun, 16 Jun 2019 at 23:31, Kees Cook wrote: > > On Sat, Jun 15, 2019 at 04:18:21PM +0200, Ard Biesheuvel wrote: > > > Yes, I am using the same saturation point as x86. In this example, I > > > am not entirely sure I understand why it matters, though: the atomics > > > guarantee that the write by CPU2 fails if CPU1 changed the value in > > > the mean time, regardless of which value it wrote. > > > > > > I think the concern is more related to the likelihood of another CPU > > > doing something nasty between the moment that the refcount overflows > > > and the moment that the handler pins it at INT_MIN/2, e.g., > > > > > > > CPU 1 CPU 2 > > > > inc() > > > > load INT_MAX > > > > about to overflow? > > > > yes > > > > > > > > set to 0 > > > > > > > > set to INT_MIN/2 > > > > Ah, gotcha, but the "set to 0" is really "set to INT_MAX+1" (not zero) > > if you're using the same saturation. > > > > Of course. So there is no issue here: whatever manipulations are > racing with the overflow handler can never result in the counter to > unsaturate. > > And actually, moving the checks before the stores is not as trivial as > I thought, E.g., for the LSE refcount_add case, we have > > " ldadd %w[i], w30, %[cval]\n" \ > " adds %w[i], %w[i], w30\n" \ > REFCOUNT_PRE_CHECK_ ## pre (w30)) \ > REFCOUNT_POST_CHECK_ ## post \ > > and changing this into load/test/store defeats the purpose of using > the LSE atomics in the first place. > > On my single core TX2, the comparative performance is as follows > > Baseline: REFCOUNT_TIMING test using REFCOUNT_FULL (LSE cmpxchg) > 191057942484 cycles # 2.207 GHz > 148447589402 instructions # 0.78 insn per > cycle > > 86.568269904 seconds time elapsed > > Upper bound: ATOMIC_TIMING > 116252672661 cycles # 2.207 GHz > 28089216452 instructions # 0.24 insn per > cycle > > 52.689793525 seconds time elapsed > > REFCOUNT_TIMING test using LSE atomics > 127060259162 cycles # 2.207 GHz Ok, so assuming JC's complaint is valid, then these numbers are compelling. In particular, my understanding of this thread is that your optimised implementation doesn't actually sacrifice any precision; it just changes the saturation behaviour in a way that has no material impact. Kees, is that right? If so, I'm not against having this for arm64, with the premise that we can hide the REFCOUNT_FULL option entirely given that it would only serve to confuse if exposed. Will