Date: Fri, 23 Nov 2018 16:09:07 -0500 (EST)
From: Mathieu Desnoyers
To: Rich Felker
Cc: Florian Weimer, carlos, Joseph Myers, Szabolcs Nagy, libc-alpha,
 Thomas Gleixner, Ben Maurer, Peter Zijlstra, "Paul E. McKenney",
 Boqun Feng, Will Deacon, Dave Watson, Paul Turner, linux-kernel, linux-api
Message-ID: <1758017676.12041.1543007347347.JavaMail.zimbra@efficios.com>
In-Reply-To: <20181123183558.GM23599@brightrain.aerifal.cx>
References: <20181121183936.8176-1-mathieu.desnoyers@efficios.com>
 <20181122171010.GH23599@brightrain.aerifal.cx>
 <871s7cvt1l.fsf@oldenburg.str.redhat.com>
 <20181123142843.GJ23599@brightrain.aerifal.cx>
 <1150466925.11664.1542992720871.JavaMail.zimbra@efficios.com>
 <20181123173019.GK23599@brightrain.aerifal.cx>
 <865273158.11687.1542995541389.JavaMail.zimbra@efficios.com>
 <20181123183558.GM23599@brightrain.aerifal.cx>
Subject: Re: [RFC PATCH v4 1/5] glibc: Perform rseq(2) registration at nptl init and thread creation
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Nov 23, 2018, at 1:35 PM, Rich Felker dalias@libc.org wrote:

> On Fri, Nov 23, 2018 at
> 12:52:21PM -0500, Mathieu Desnoyers wrote:
>> ----- On Nov 23, 2018, at 12:30 PM, Rich Felker dalias@libc.org wrote:
>>
>> > On Fri, Nov 23, 2018 at 12:05:20PM -0500, Mathieu Desnoyers wrote:
>> >> ----- On Nov 23, 2018, at 9:28 AM, Rich Felker dalias@libc.org wrote:
>> >> [...]
>> >> >
>> >> > Absolutely. As long as it's in libc, implicit destruction will happen.
>> >> > Actually I think the glibc code should unconditionally unregister the
>> >> > rseq address at exit (after blocking signals, so no application code
>> >> > can run) in case a third-party rseq library was linked and failed to
>> >> > do so before thread exit (e.g. due to mismatched ref counts) rather
>> >> > than respecting the reference count, since it knows it's the last
>> >> > user. This would make potentially-buggy code safer.
>> >>
>> >> OK, let me go ahead with a few ideas/questions along that path.
>> >                                                 ^^^^^^^^^^^^^^^
>> >>
>> >> Let's say our stated goal is to let the "exit" system call from the
>> >> glibc thread exit path perform rseq unregistration (without explicit
>> >> unregistration beforehand). Let's look at what we need.
>> >
>> > This is not "along that path". The above-quoted text is not about
>> > assuming it's safe to make SYS_exit without unregistering the rseq
>> > object, but rather about glibc being able to perform the
>> > rseq-unregister syscall without caring about reference counts, since
>> > it knows no other code that might depend on rseq can run after it.
>>
>> When saying "along that path", what I mean is: if we go in that direction,
>> then we should look into going all the way there, and rely on thread
>> exit to implicitly unregister the TLS area.
>>
>> Do you see any reason for doing an explicit unregistration at thread
>> exit rather than simply rely on the exit system call ?
>
> Whether this is needed is an implementation detail of glibc that
> should be permitted to vary between versions.
> Unless glibc wants to
> promise that it would become a public guarantee, it's not part of the
> discussion around the API/ABI. Only part of the discussion around
> implementation internals of the glibc rseq stuff.
>
> Of course I may be biased thinking application code should not assume
> this since it's not true on musl -- for detached threads, the thread
> frees its own stack before exiting (and thus has to unregister
> set_tid_address and set_robust_list before exiting).

OK, so on glibc, the implementation could rely on the side effect of the
exit system call to implicitly unregister rseq. On musl, based on the
scenario you describe, the library should unregister rseq explicitly
before stack reclaim.

Am I understanding the situation correctly ?

>
>> >> First, we need the TLS area to be valid until the exit system call
>> >> is invoked by the thread. If glibc defines __rseq_abi as a weak symbol,
>> >> I'm not entirely sure we can guarantee the IE model if another library
>> >> gets its own global-dynamic weak symbol elected at execution time. Would
>> >> it be better to switch to a "strong" symbol for the glibc __rseq_abi
>> >> rather than weak ?
>> >
>> > This doesn't help; still whichever comes first in link order would
>> > override. Either way __rseq_abi would be in static TLS, though,
>> > because any dynamically-loaded library is necessarily loaded after
>> > libc, which is loaded at initial exec time.
>>
>> OK, AFAIU so you argue for leaving the __rseq_abi symbol "weak". Just making
>> sure I correctly understand your position.
>
> I don't think it matters, and I don't think making it weak is
> meaningful or useful (weak in a shared library is largely meaningless)
> but maybe I'm missing something here.

Using a "weak" symbol in early adopter libraries is important, so they
can be loaded together into the same process without causing loader
errors due to multiple definitions of the same strong symbol.
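
As a minimal sketch of that early-adopter case, assuming GCC or Clang on
an ELF target: the struct layout and every demo_ name below are
hypothetical stand-ins (not the kernel's struct rseq and not the real
__rseq_abi), and the choice of the initial-exec TLS model is an assumption
for illustration rather than something settled in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct rseq; only the one
 * field needed for this illustration is declared. */
struct demo_rseq {
	uint32_t cpu_id;
};

/* Hypothetical marker meaning "this thread has not registered yet". */
#define DEMO_CPU_ID_UNINITIALIZED	((uint32_t)-1)

/*
 * Weak, initial-exec TLS definition. When each early-adopter library
 * carries this same definition, the symbol may be defined more than
 * once without a multiple-definition error, and one definition is
 * elected for the whole process.
 */
__attribute__((weak, tls_model("initial-exec")))
__thread struct demo_rseq demo_rseq_abi = {
	.cpu_id = DEMO_CPU_ID_UNINITIALIZED,
};

/* Return the calling thread's instance of the shared area. */
struct demo_rseq *demo_rseq_area(void)
{
	return &demo_rseq_abi;
}

/* Non-zero once the calling thread has recorded a CPU number. */
int demo_rseq_is_registered(void)
{
	return demo_rseq_abi.cpu_id != DEMO_CPU_ID_UNINITIALIZED;
}

/* Record a CPU number, standing in for a successful registration. */
void demo_rseq_mark_registered(uint32_t cpu)
{
	assert(cpu != DEMO_CPU_ID_UNINITIALIZED);
	demo_rseq_abi.cpu_id = cpu;
}
```

Two objects that both contain this weak definition can be linked into
the same program, whereas two strong definitions of the same name would
be rejected by the static linker as duplicates.
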
Using "weak" in a C library is something I'm not sure is a characteristic
we want or need, because I doubt we would ever want to load two libcs at
the same time in a given process.

The only reason I see for using "weak" for the __rseq_abi symbol in the libc
is if we want to allow early adopter applications to define __rseq_abi as a
strong symbol, which would make some sense.

>
>> Something can be technically correct based on the current implementation,
>> but fragile with respect to future changes. We need to carefully distinguish
>> between the two when exposing ABIs.
>
> Yes.
>
>> >> There have been presumptions about signals being blocked when the thread
>> >> exits throughout this email thread. Out of curiosity, what code is
>> >> responsible for disabling signals in this situation ?
>>
>> This question is still open.
>
> I can't find it -- maybe it's not done in glibc. It is in musl, and I
> assumed glibc would also do it, because otherwise it's possible to see
> some inconsistent states from signal handlers. Maybe these are all
> undefined due to AS-unsafety of pthread_exit, but I think you can
> construct examples where something could be observably wrong without
> breaking any rules.

Good to know for the musl case.

>
>> > Related to this,
>> >> is it valid to access an IE model TLS variable from a signal handler at
>> >> _any_ point where the signal handler nests over thread's execution ?
>> >> This includes early start and just before invoking the exit system call.
>> >
>> > It should be valid to access *any* TLS object like this, but the
>> > standards don't cover it well. Right now access to dynamic TLS from
>> > signal handlers is unsafe in glibc, but static is safe.
>>
>> Which is a shame for the lttng-ust tracer, which needs global-dynamic
>> TLS variables so it can be dlopen'd, but aims at allowing tracing from
>> signal handlers.
>> It looks like due to limitations of global-dynamic
>> TLS, tracing from instrumented signal handlers with lttng-ust tracepoints
>> could crash the process if the signal handler nests early at thread start
>> or late before thread exit. One way out of this would be to ensure signals
>> are blocked at thread start/exit, but I can't find the code responsible for
>> doing this within glibc.
>
> Just blocking at start/exit won't solve the problem because
> global-dynamic TLS in glibc involves dynamic allocation, which is hard
> to make AS-safe and of course can fail, leaving no way to make forward
> progress.

How hard would it be to create an async-signal-safe memory pool, which
would always be accessed with signals blocked, so we could fix those
corner cases for good ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com