Date: Fri, 2 Nov 2018 11:12:24 -0400 (EDT)
From: Mathieu Desnoyers
To: Richard Henderson
Cc: Will Deacon, linux-kernel, libc-alpha, Carlos O'Donell, Florian Weimer, Joseph Myers, Szabolcs Nagy, Thomas Gleixner, Ben Maurer, Peter Zijlstra, "Paul E. McKenney", Boqun Feng, Ben Maurer, Dave Watson, Paul Turner, linux-api
Message-ID: <313542172.8.1541171544337.JavaMail.zimbra@efficios.com>
Subject: Supporting core-specific instruction sets (e.g. big.LITTLE) with restartable sequences
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Richard,

I stumbled on these articles:

- https://medium.com/@jadr2ddude/a-big-little-problem-a-tale-of-big-little-gone-wrong-e7778ce744bb
- https://www.mono-project.com/news/2016/09/12/arm64-icache/

and discussed them with Will Deacon. He told me you were looking into gcc atomics and it might be worthwhile to discuss the possible use of the new rseq system call that has been added in Linux 4.18 for those use-cases.

Basically, the use-cases targeted are those where some cores on the system support a larger instruction set than others.
So for instance, some cores could use a faster atomic add instruction than others, which would have to rely on a slower fallback. The same goes for reading the performance monitoring unit counters from user-space: it depends on the feature-set supported by the CPU on which the instruction is issued. The same applies to cores having different cache-line sizes.

The main problem is that the kernel can migrate a thread at any point between user-space reading the current cpu number and issuing the instruction. This is where rseq can help.

The core idea to solve the instruction set issue is to set a mask of cpus supporting the new instruction in a library constructor, and then load cpu_id, test it against the mask, and branch to either the new or old instruction, all within an rseq critical section. If the kernel needs to abort due to preemption or signal delivery, the abort behavior would be to issue the fallback (slow) atomic operation, which guarantees progress even if single-stepping. As long as the load, test and branch cost less than the performance delta between the old and new atomic instruction, it would be worth it.

In the case of PMU read from user-space, using rseq to figure out how to issue the PMU read enables a use-case which is not otherwise possible on big.LITTLE. On rseq abort, it would fall back to a system call to read the PMU counter. This abort behavior guarantees forward progress.

The second article is about cache line size discrepancy between CPUs. Here again, doing the cacheline flushing in an rseq critical section could allow tuning it to the characteristics of the actual core it is running on. The fast-path would use a stride fitting the current core characteristics, and if rseq needs to abort, the slow-path would fall back to a conservative value which would fit all cores (the smallest cache line size on the overall system). Once again, this abort behavior guarantees forward progress.
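
To make the stride selection concrete, here is a rough sketch in plain C of the decision only: a per-CPU table of cache line sizes, a fast path that uses the current core's stride, and a conservative fallback that uses the smallest line size on the system. The names (line_size, set_line_sizes, pick_stride, flush_range), the MAX_CPUS bound and the flush_line callback are all made up for illustration. Plain C cannot express the rseq critical section itself, so the migration race is still present here. In the scheme described above, the cpu_id load, the table lookup and the flush loop would sit inside one rseq critical section written in assembly, and the abort handler would redo the range with the conservative stride.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 8  /* hypothetical upper bound, for this sketch only */

/* Hypothetical per-CPU cache line sizes in bytes (0 = unknown), as a
 * library constructor might record them after probing each core. */
static size_t line_size[MAX_CPUS];
static size_t min_line_size;  /* conservative stride, valid on every core */

/* Record the line sizes of the first ncpus cores and their minimum. */
void set_line_sizes(const size_t *sizes, int ncpus)
{
    min_line_size = 0;
    for (int i = 0; i < MAX_CPUS; i++) {
        line_size[i] = (i < ncpus) ? sizes[i] : 0;
        if (line_size[i] && (!min_line_size || line_size[i] < min_line_size))
            min_line_size = line_size[i];
    }
}

/* Fast path: stride of the core we believe we are running on.
 * Fallback (unknown cpu, or what an rseq abort handler would use):
 * the smallest line size on the system. */
size_t pick_stride(int cpu)
{
    if (cpu >= 0 && cpu < MAX_CPUS && line_size[cpu])
        return line_size[cpu];
    return min_line_size;
}

/* Visit every cache line overlapping [start, start + len), calling
 * flush_line (a stand-in for the architecture's cache maintenance
 * instruction) on each one. stride must be a power of two.
 * Returns the number of lines visited. */
size_t flush_range(uintptr_t start, size_t len, size_t stride,
                   void (*flush_line)(uintptr_t))
{
    size_t visited = 0;

    if (stride == 0 || len == 0)
        return 0;
    uintptr_t end = start + len;
    for (uintptr_t p = start & ~(uintptr_t)(stride - 1); p < end; p += stride) {
        if (flush_line)
            flush_line(p);
        visited++;
    }
    return visited;
}
```

A caller would pass the cpu number it obtained (for instance from the cpu_id field of the rseq area) to pick_stride() and hand the result to flush_range(). The atomic-instruction case has the same shape, with a bitmask test in place of the table lookup.
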

This would only work, of course, if cacheline invalidation done on a big core ends up being propagated to other cores in a way that clears all the cache lines corresponding to the one targeted on the big core.

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com