Date: Thu, 22 Nov 2018 10:47:00 -0500 (EST)
From: Mathieu Desnoyers
To: Rich Felker
Cc: carlos, Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha,
    Thomas Gleixner, Ben Maurer, Peter Zijlstra, "Paul E. McKenney",
    Boqun Feng, Will Deacon, Dave Watson, Paul Turner, linux-kernel,
    linux-api
Message-ID: <686626451.10113.1542901620250.JavaMail.zimbra@efficios.com>
In-Reply-To: <20181122151444.GE23599@brightrain.aerifal.cx>
References: <20181121183936.8176-1-mathieu.desnoyers@efficios.com>
    <20181122143603.GD23599@brightrain.aerifal.cx>
    <782067422.9852.1542899056778.JavaMail.zimbra@efficios.com>
    <20181122151444.GE23599@brightrain.aerifal.cx>
Subject: Re: [RFC PATCH v4 1/5] glibc: Perform rseq(2) registration at nptl init and thread creation

----- On Nov 22, 2018, at 10:14 AM, Rich Felker dalias@libc.org wrote:

> On Thu, Nov 22, 2018 at 10:04:16AM -0500, Mathieu Desnoyers wrote:
>> ----- On Nov 22, 2018, at 9:36 AM, Rich Felker dalias@libc.org wrote:
>>
>> > On Wed, Nov 21, 2018 at 01:39:32PM -0500, Mathieu Desnoyers wrote:
>> >> Register
>> >> rseq(2) TLS for each thread (including main), and unregister
>> >> for each thread (excluding main). "rseq" stands for Restartable
>> >> Sequences.
>> >
>> > Maybe I'm missing something obvious, but "unregister" does not seem to
>> > be a meaningful operation. Can you clarify what it's for?
>>
>> There are really two ways rseq TLS can end up being unregistered: either
>> through an explicit call to the rseq "unregister", or when the OS frees the
>> thread's task struct.
>>
>> You bring an interesting point here: do we need to explicitly unregister
>> rseq at thread exit, or can we leave that to the OS?
>>
>> The key thing to look for here is whether it's valid to access the
>> TLS area of the thread from preemption or signal delivery happening
>> at the very end of START_THREAD_DEFN. If it's OK to access it until
>> the very end of the thread lifetime, then we could do without an
>> explicit unregistration. However, if at any given point of the late
>> thread lifetime we end up in a situation where reading or writing to
>> that TLS area can cause corruption, then we need to carefully
>> unregister it before that memory is reclaimed/reused.
>
> The thread memory cannot be reused until after kernel task exit,
> reported via the set_tid_address futex. Also, assuming signals are
> blocked (which is absolutely necessary for other reasons) nothing in
> userspace can touch the rseq state after this point anyway.

As discussed in the other leg of the email thread, disabling signals is
not enough to prevent the kernel from accessing the rseq TLS area on
preemption.

> I was more confused about the need for reference counting, though.
> Where would anything be able to observe a state other than "refcnt>0"?
> -- in which case tracking it makes no sense.
> If the goal is to make an
> ABI that supports environments where libc doesn't have rseq support,
> and a third-party library is providing a compatible ABI, it seems all
> that would be needed is a boolean thread-local "is_initialized" flag.
> There does not seem to be any safe way such a library could be
> dynamically unloaded (which would require unregistration in all
> threads) and thus no need for a count.

Here is one scenario: we have 2 early adopter libraries using rseq which
are deployed in an environment with an older glibc (which does not
support rseq).

Of course, none of those libraries can be dlclose'd unless they somehow
track all registered threads. But let's focus on how exactly those
libraries can handle lazily registering rseq. They can use pthread_key,
and pthread_setspecific on first use by the thread, to set up a
destructor function to be invoked at thread exit. But each early adopter
library is unaware of the other, so if we just use an "is_initialized"
flag, the first destructor to run will unregister rseq while the second
library may still be using it.

The same problem arises if we have an early adopter application which
explicitly deals with rseq, together with an early adopter library. The
issue is similar, except that the application will explicitly want to
unregister rseq before exiting the thread, which leaves a race window
where rseq is unregistered, but the library may still need to use it.

The reference counter solves this: only the last rseq user for a thread
performs unregistration.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
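
To make the reference-counting scheme described above concrete, here is a
minimal sketch in C. It is an illustration only, not the code from the
patch series: every name in it (rseq_user_get, rseq_user_put,
sketch_rseq_register, sketch_rseq_unregister, rseq_refcount) is
hypothetical, and the register/unregister functions are stand-ins that
only record state rather than calling rseq(2). It also ignores nesting
with signal handlers, which a real implementation would have to consider.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real rseq(2) register/unregister calls.
 * They only record state, so the sketch runs on any system. */
static _Thread_local int kernel_registered;  /* 1 while "registered" */
static _Thread_local int unregister_calls;   /* number of unregistrations */

void sketch_rseq_register(void)
{
    kernel_registered = 1;
}

void sketch_rseq_unregister(void)
{
    kernel_registered = 0;
    unregister_calls++;
}

/* Per-thread count of rseq users (libraries or the application). */
static _Thread_local unsigned int rseq_refcount;

/* Called by each user on its first use of rseq in the current thread.
 * Only the first user actually registers. */
void rseq_user_get(void)
{
    if (rseq_refcount++ == 0)
        sketch_rseq_register();
}

/* Called by each user when it is done with rseq in the current thread,
 * for example from a pthread_key destructor. Only the last user
 * actually unregisters. */
void rseq_user_put(void)
{
    assert(rseq_refcount > 0);
    if (--rseq_refcount == 0)
        sketch_rseq_unregister();
}
```

With two independent users in one thread, each calls rseq_user_get() once.
The first rseq_user_put() (whichever destructor runs first) leaves rseq
registered, and only the second one unregisters it. A plain boolean flag
cannot tell these two cases apart.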