Date: Thu, 22 Nov 2018 14:01:43 -0500
From: Rich Felker
To: Mathieu Desnoyers
Cc: Szabolcs Nagy, Florian Weimer, nd, carlos, Joseph Myers, libc-alpha,
    Thomas Gleixner, Ben Maurer, Peter Zijlstra, "Paul E. McKenney",
    Boqun Feng, Will Deacon, Dave Watson, Paul Turner, linux-kernel,
    linux-api
Subject: Re: [RFC PATCH v4 1/5] glibc: Perform rseq(2) registration at nptl init and thread creation
Message-ID: <20181122190143.GI23599@brightrain.aerifal.cx>
In-Reply-To: <1602745030.10426.1542911744479.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 22, 2018 at 01:35:44PM -0500, Mathieu Desnoyers wrote:
> ----- On Nov 22, 2018, at 11:24 AM, Szabolcs Nagy Szabolcs.Nagy@arm.com wrote:
>
> > On 22/11/18 15:33, Mathieu Desnoyers wrote:
> >> ----- On Nov 22, 2018, at 10:21 AM, Florian Weimer fweimer@redhat.com wrote:
> >>> Right, but in case of user-supplied stacks, we actually free TLS memory
> >>> at this point, so signals need to be blocked because the TCB is
> >>> (partially) gone after that.
> >>
> >> Unfortunately, disabling signals is not enough.
> >>
> >> With rseq registered, the kernel accesses the rseq TLS area when returning to
> >> user-space after _preemption_ of user-space, which can be triggered at any
> >> point by an interrupt or a fault, even if signals are blocked.
> >>
> >> So if there are cases where the TLS memory is freed while the thread is still
> >> running, we _need_ to explicitly unregister rseq beforehand.
> >
> > i think the man page should point this out.
>
> Yes, I should add this to the proposed rseq(2) man page.
> >
> > the memory of a registered rseq object must not be freed
> > before thread exit. (either unregister it or free later)
> >
> > and ideally also point out that c language thread storage
> > duration does not provide this guarantee: it may be freed
> > by the implementation before thread exit (which is currently
> > not observable, but with the rseq syscall it is).
>
> How about the following wording ?
>
>   Memory of a registered rseq object must not be freed before the
>   thread exits. Reclaim of rseq object's memory must only be
>   done after either an explicit rseq unregistration is performed
>   or after the thread exit. Keep in mind that the implementation
>   of the Thread-Local Storage (C language __thread) lifetime does
>   not guarantee existence of the TLS area up until the thread exits.

This is all really ugly for application/library code to have to deal
with. Maybe if the man page is considered as documenting the syscall
only, and not something you can use, it's okay, but "until the thread
exits" is not well-defined in the sense you want it here. It's more
like "until the kernel task for the thread exits", and the whole point
is that there is some interval in time between the abstract thread
exit and the kernel task exit that is not observable without rseq but
is observable if the rseq is wrongly left installed.

Rich
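
A minimal sketch of the "unregister before reclaim" ordering discussed
above, assuming a Linux 4.18 or later kernel with <linux/rseq.h>
available. The helper names (demo_rseq_*) and the signature value are
hypothetical and chosen only for illustration; nothing here is taken
from the glibc patch under review. The area lives on the heap rather
than in TLS so that its lifetime is fully under the caller's control.
If the kernel refuses the registration (for instance because the libc
has already registered its own rseq area for this thread, or the
syscall is unavailable), nothing is installed and the memory can be
freed directly.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for syscall() under strict -std= modes */
#endif
#include <assert.h>
#include <linux/rseq.h>        /* struct rseq, RSEQ_FLAG_UNREGISTER */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>       /* __NR_rseq */
#include <unistd.h>

/* Hypothetical signature value; the same value must be passed at
 * registration and at unregistration. */
#define DEMO_RSEQ_SIG 0x53053053u

/* Thin wrapper around the raw rseq(2) syscall. */
static int demo_sys_rseq(struct rseq *area, uint32_t len, int flags,
                         uint32_t sig)
{
    return (int)syscall(__NR_rseq, area, len, flags, sig);
}

/* Allocate an rseq area and try to register it for the calling thread.
 * Returns the area (or NULL if allocation failed). *registered is set
 * to 1 if the kernel accepted the registration and to 0 otherwise. */
struct rseq *demo_rseq_alloc_and_register(int *registered)
{
    struct rseq *area =
        aligned_alloc(_Alignof(struct rseq), sizeof(struct rseq));

    *registered = 0;
    if (area == NULL)
        return NULL;
    assert((uintptr_t)area % _Alignof(struct rseq) == 0);

    memset(area, 0, sizeof(*area));
    area->cpu_id = (uint32_t)RSEQ_CPU_ID_UNINITIALIZED;

    if (demo_sys_rseq(area, sizeof(*area), 0, DEMO_RSEQ_SIG) == 0)
        *registered = 1;
    return area;
}

/* Release the area. If it is registered, unregister it first; only
 * free the memory once the kernel no longer refers to it. Returns 0
 * on success. On unregistration failure returns -1 and deliberately
 * leaks the area, since freeing memory the kernel may still write to
 * is the bug being avoided. */
int demo_rseq_release(struct rseq *area, int registered)
{
    if (registered &&
        demo_sys_rseq(area, sizeof(*area), RSEQ_FLAG_UNREGISTER,
                      DEMO_RSEQ_SIG) != 0)
        return -1;
    free(area);
    return 0;
}
```

A caller would pair the two helpers on the same thread: call
demo_rseq_alloc_and_register(&registered) early, and
demo_rseq_release(area, registered) before the thread returns. This
covers only the explicit-unregistration half of the proposed wording.
The other half ("after the thread exit") cannot be checked from the
exiting thread itself, which is the gap between abstract thread exit
and kernel task exit described above.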