Date: Mon, 16 Apr 2018 14:35:08 -0400 (EDT)
From: Mathieu Desnoyers
To: Andy Lutomirski
Cc: Linus Torvalds, Peter Zijlstra, "Paul E. McKenney", Boqun Feng,
    Dave Watson, linux-kernel, linux-api, Paul Turner, Andrew Morton,
    Russell King, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
    Andrew Hunter, Andi Kleen, Chris Lameter, Ben Maurer, rostedt,
    Josh Triplett, Catalin Marinas, Will Deacon, Michael Kerrisk
Message-ID: <542721578.11358.1523903708510.JavaMail.zimbra@efficios.com>
References: <20180412192800.15708-1-mathieu.desnoyers@efficios.com>
    <20180412192800.15708-13-mathieu.desnoyers@efficios.com>
Subject: Re: [RFC PATCH for 4.18 12/23] cpu_opv: Provide cpu_opv system call (v7)
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Apr 14, 2018, at 6:44 PM, Andy Lutomirski luto@amacapital.net wrote:

> On Thu, Apr 12, 2018 at 12:43 PM, Linus Torvalds wrote:
>> On Thu, Apr 12, 2018 at 12:27 PM, Mathieu Desnoyers wrote:
>>> The cpu_opv system call executes a vector of operations on behalf of
>>> user-space on a specific CPU with preemption disabled. It is inspired
>>> by readv() and writev() system calls which take a "struct iovec"
>>> array as argument.
>>
>> Do we really want the page pinning?
>>
>> This whole cpu_opv thing is the most questionable part of the series,
>> and the page pinning is the most questionable part of cpu_opv for me.
>>
>> Can we plan on merging just the plain rseq parts *without* this all
>> first, and then see the cpu_opv thing as a "maybe future expansion"
>> part.
>>
>> I think that would make Andy happier too.
>>
>
> It only makes me happier if the userspace code involved is actually
> going to work when single-stepped, which might actually be the case
> (fingers crossed).

Specifically for single-stepping, the __rseq_table section introduced
at user-level will allow newer debuggers and tools which do line and
instruction-level single-stepping to skip over rseq critical sections.
However, this breaks existing debuggers and tools.

For a userspace tracer tool such as LTTng-UST, requiring upgrade to
newer debugger versions would limit its adoption in the field. So if
using rseq breaks current debugger tools, lttng-ust won't use rseq
until single-stepping can be done in a non-breaking way, or will have
to wait until most end-user deployments (distributions used in the
field) include debugger versions that skip over the code identified by
the __rseq_table section, which will take many years.

> That being said, I'm not really convinced that
> cpu_opv() makes much difference here, since I'm not entirely convinced
> that user code will actually use it or that user code will actually be
> that well tested. C'est la vie.

For the use-case of cpu_opv invoked as single-stepping fall-back, this
path will indeed not be executed often enough to be well-tested. I'm
considering the following approach to allow user-space to test cpu_opv
more thoroughly: we can introduce an environment variable, e.g.:

- RSEQ_DISABLE=1: Disable rseq thread registration,
- RSEQ_DISABLE=random: Randomly disable rseq thread registration
  (some threads use rseq, other threads end up using the cpu_opv
  fallback)

which would disable the rseq fast-path for all or some threads, and
thus allow thorough testing of cpu_opv used as single-stepping
fallback.

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
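
To make the RSEQ_DISABLE idea above a bit more concrete, here is a
rough user-space sketch of how a library could interpret the variable
and decide, per thread, whether to attempt rseq registration. This is
only an illustration under assumptions: the names rseq_policy,
rseq_policy_parse() and rseq_should_register() are hypothetical and do
not come from the patch set, treating an unset or unrecognized value as
"rseq enabled" is an assumption, and the small xorshift generator is
just a stand-in for whatever randomness source an implementation would
actually pick.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical policy values; none of these names come from the patch set. */
enum rseq_policy {
	RSEQ_POLICY_ENABLED,	/* unset or unrecognized value: always register */
	RSEQ_POLICY_DISABLED,	/* RSEQ_DISABLE=1: never register */
	RSEQ_POLICY_RANDOM,	/* RSEQ_DISABLE=random: decide per thread */
};

/* Map the value of RSEQ_DISABLE (NULL when unset) to a policy. */
static enum rseq_policy rseq_policy_parse(const char *value)
{
	if (!value)
		return RSEQ_POLICY_ENABLED;
	if (!strcmp(value, "1"))
		return RSEQ_POLICY_DISABLED;
	if (!strcmp(value, "random"))
		return RSEQ_POLICY_RANDOM;
	/* Assumption: anything else leaves rseq enabled. */
	return RSEQ_POLICY_ENABLED;
}

/* Tiny xorshift32 step; a stand-in randomness source for this sketch. */
static unsigned int rseq_next_random(unsigned int *state)
{
	unsigned int x = *state ? *state : 1u;	/* xorshift needs nonzero state */

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Decide whether the calling thread should attempt rseq registration.
 * Returns 1 to register, 0 to skip registration so that the thread ends
 * up on the cpu_opv fallback. "seed" is caller-provided per-thread state.
 */
static int rseq_should_register(enum rseq_policy policy, unsigned int *seed)
{
	switch (policy) {
	case RSEQ_POLICY_DISABLED:
		return 0;
	case RSEQ_POLICY_RANDOM:
		return (rseq_next_random(seed) >> 16) & 1;
	default:
		return 1;
	}
}
```

A library would presumably call rseq_policy_parse(getenv("RSEQ_DISABLE"))
once at startup and rseq_should_register() from each thread's
initialization path, seeding each thread differently so that
RSEQ_DISABLE=random yields a mix of rseq and cpu_opv-fallback threads
within one process.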