Date: Fri, 2 Nov 2018 16:08:45 +0000
From: Mark Rutland
To: Mathieu Desnoyers
Cc: Richard Henderson, Will Deacon, linux-kernel, libc-alpha,
 Carlos O'Donell, Florian Weimer, Joseph Myers, Szabolcs Nagy,
 Thomas Gleixner, Ben Maurer, Peter Zijlstra, "Paul E. McKenney",
 Boqun Feng, Dave Watson, Paul Turner, linux-api
Subject: Re: Supporting core-specific instruction sets (e.g.
 big.LITTLE) with restartable sequences
Message-ID: <20181102160844.ihge4najccn63cxi@lakrids.cambridge.arm.com>
References: <313542172.8.1541171544337.JavaMail.zimbra@efficios.com>
In-Reply-To: <313542172.8.1541171544337.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Mathieu, Richard,

On Fri, Nov 02, 2018 at 11:12:24AM -0400, Mathieu Desnoyers wrote:
> Hi Richard,
>
> I stumbled on these articles:
>
> - https://medium.com/@jadr2ddude/a-big-little-problem-a-tale-of-big-little-gone-wrong-e7778ce744bb
> - https://www.mono-project.com/news/2016/09/12/arm64-icache/
>
> and discussed them with Will Deacon. He told me you were looking into
> gcc atomics and it might be worthwhile to discuss the possible use of
> the new rseq system call that has been added in Linux 4.18 for those
> use-cases.
>
> Basically, the use-cases targeted are those where some cores on the
> system support a larger instruction set than others. So for instance,
> some cores could use a faster atomic add instruction than others,
> which should rely on a slower fallback. This is also the same story
> for reading the performance monitoring unit counters from user-space:
> it depends on the feature-set supported by the CPU on which the
> instruction is issued. Same applies to cores having different
> cache-line sizes.

Please note that upstream arm64 Linux does not expose mismatched ISA
features to userspace. We go to great pains to expose a uniform set of
supported features.

The two issues referenced above are both handled by the kernel, and no
userspace changes are required to handle them.

We do not intend or expect to expose mismatched features to userspace.
Correctly-written userspace should not use optional instructions unless
the kernel has advertised their presence via a hwcap (or via ID register
emulation).

> The main problem is that the kernel can migrate a thread at any point
> between user-space reading the current cpu number and issuing the
> instruction. This is where rseq can help.
>
> The core idea to solve the instruction set issue is to set a mask of
> cpus supporting the new instruction in a library constructor, and then
> load cpu_id, use it with the mask, and branch to either the new or old
> instruction, all with a rseq critical section. If the kernel needs to
> abort due to preemption or signal delivery, the abort behavior would
> be to issue the fallback (slow) atomic operation, which guarantees
> progress even if single-stepping.
>
> As long as the load, test and branch is faster than the performance
> delta between the old and new atomic instruction, it would be worth
> it.

Specifically w.r.t. the atomics, the kernel will only expose the
presence of the ARMv8.1 atomic instructions when supported by all CPUs
in the system.

> In the case of PMU read from user-space, using rseq to figure out how
> to issue the PMU read enables a use-case which is not otherwise
> possible to do on big.LITTLE. On rseq abort, it would fallback to a
> system call to read the PMU counter. This abort behavior guarantees
> forward progress.

We do not currently expose any PMU registers to userspace. If we were to
expose them for big.LITTLE, rseq may be of use, but no-one has done the
groundwork to investigate this.

> The second article is about cache line size discrepancy between CPUs.
> Here again, doing the cacheline flushing in a rseq critical section
> could allow tuning it to characteristics of the actual core it is
> running on.
> The fast-path would use a stride fitting the current core
> characteristics, and if rseq needs to abort, the slow-path would
> fall-back to a conservative value which would fit all cores (smaller
> cache line size on the overall system).

This is already handled by the kernel, and the proposed rseq approach is
not correct -- cache maintenance must *always* use the system-wide
minimum cacheline size, or stale entries will be left on some CPUs,
which will result in later failures.

Thanks,
Mark.
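
The stride argument in the last reply can be illustrated with a small
model. The sketch below is not from the thread. It assumes, as a
simplification, that a by-address cache maintenance operation affects on
every CPU the single line containing that address, at that CPU's own
line size. The 128-byte and 64-byte line sizes, the CPU names, and all
function names are hypothetical and chosen only for illustration.

```python
# Simplified model (an assumption, not kernel code): a maintenance op at
# address A reaches every CPU, and each CPU acts on the one line that
# contains A, using its own line size.

def lines_touched(start, end, stride, line_size):
    """Line base addresses hit on a CPU with `line_size`-byte lines when
    ops are issued every `stride` bytes over [start, end)."""
    first = start - (start % stride)  # align down, as a maintenance loop would
    return {addr - (addr % line_size) for addr in range(first, end, stride)}

def lines_required(start, end, line_size):
    """Line base addresses that overlap [start, end) on that CPU."""
    first = start - (start % line_size)
    return set(range(first, end, line_size))

def stale_lines(start, end, stride, line_size):
    """Lines overlapping the range that no maintenance op reached."""
    missed = lines_required(start, end, line_size) - lines_touched(
        start, end, stride, line_size)
    return sorted(missed)

# Hypothetical big.LITTLE system with mismatched line sizes.
LINE_SIZES = {"big": 128, "little": 64}
SYSTEM_MIN = min(LINE_SIZES.values())

for stride in (LINE_SIZES["big"], SYSTEM_MIN):
    for cpu, size in LINE_SIZES.items():
        print(f"stride={stride} cpu={cpu} stale={stale_lines(0, 256, stride, size)}")
```

In this model, a 128-byte stride over a 256-byte range leaves the lines
at offsets 64 and 192 untouched on the CPU with 64-byte lines, whereas a
stride equal to the system-wide minimum reaches every line on both CPUs.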