Date: Tue, 12 Jun 2018 12:31:24 -0400 (EDT)
From: Mathieu Desnoyers
To: Florian Weimer
Cc: carlos, Peter Zijlstra, "Paul E. McKenney", Boqun Feng,
    Thomas Gleixner, linux-kernel, libc-alpha
Message-ID: <417742741.11550.1528821084084.JavaMail.zimbra@efficios.com>
In-Reply-To: <091061df-3482-8762-30e4-feaf3417be11@redhat.com>
Subject: Re: Restartable Sequences system call merged into Linux
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Jun 12, 2018, at 9:11 AM, Florian Weimer fweimer@redhat.com wrote:

> On 06/11/2018 10:04 PM, Mathieu Desnoyers wrote:
>> ----- On Jun 11, 2018, at 3:55 PM, Florian Weimer fweimer@redhat.com wrote:
>>
>>> On 06/11/2018 09:49 PM, Mathieu Desnoyers wrote:
>>>> It should be noted that there can be only one rseq TLS area registered per
>>>> thread, which can then be used by
>>>> many libraries and by the executable, so this is a
>>>> process-wide (per-thread) resource that we need to manage carefully.
>>>
>>> Is it possible to resize the area after thread creation, perhaps even
>>> from other threads?
>>
>> I'm not sure why we would want to resize it. The per-thread area is fixed-size.
>> Its layout is here: include/uapi/linux/rseq.h: struct rseq
>
> Looks like I was mistaken and this is very similar to the robust mutex list.
>
> Should we treat it the same way?  Always allocate it for each new thread
> and register it with the kernel?

That would be an efficient way to do it, indeed. There is very little
performance overhead in having rseq registered for all threads, whether
or not they intend to run rseq critical sections.

>
>> The ABI is designed so that all users (program and libraries) can interact
>> through this per-thread TLS area.
>
> Then the user code needs just the address of the structure.

Yes.

>
> How much coordination is needed between different users of this
> interface?  Looking at the section hacks, I don't think we want to
> put this into glibc at this stage.  It looks more like something for
> which we traditionally require compiler support.

I really don't mind maintaining a separate project containing librseq
along with the headers needed to facilitate declaration of rseq critical
sections. This specifically does not need much coordination between
users of the interface.

The part which really requires coordination between users is registration
with the kernel (and ownership) of the rseq TLS area.

I have a few possible approaches in mind (feel free to suggest other
options):

A) glibc exposes a strong __rseq_abi TLS symbol:

- should ideally *not* be global-dynamic for performance reasons,
- registration with the kernel can either be handled explicitly by
  requiring applications or libraries to call an API, or implicitly at
  thread creation,
- requires all rseq users to upgrade to newer glibc.
  Early rseq users (libs and applications) registering their own rseq
  TLS will conflict with newer glibc.

B) librseq.so exposes a strong __rseq_abi symbol:

- should ideally *not* be global-dynamic for performance reasons, but
  testing shows that using initial-exec causes issues in situations
  where librseq.so ends up being dlopen'd (e.g. Java virtual machine
  dlopening the lttng-ust tracer linked against librseq.so),
- registration/unregistration of the area with the kernel can either be
  performed lazily on first use, with destruction done using
  pthread_key, or require an explicit API call from the application,
- a per-thread refcount in a TLS could allow many users to call the
  registration/unregistration API, and lazy registration,
- an early-user application which also exposes a __rseq_abi strong
  symbol would conflict with librseq.so.

C) __rseq_abi symbol declared weak within each user (application,
   librseq, other libraries, glibc):

- should ideally *not* be global-dynamic for performance reasons,
- however, initial-exec causes issues when librseq or early user
  libraries are dlopen'd (e.g. Java runtime dlopening lttng-ust),
- a weak symbol allows combining early user libs/apps with
  glibc/librseq exposing the same symbol,
- considering that glibc is AFAIK never dlopen'd, this does not cause
  exhaustion of initial-exec TLS entries in cases where librseq.so
  or early adopter libs are dlopen'd,
- if glibc implicitly registers the rseq area, *and* librseq.so also
  wants to register it, *and* early adopters also want to register it,
  we should come up with a refcount scheme in the TLS ensuring that
  registration and unregistration are only done when the first/last
  user comes/goes away.

Thoughts?

Thanks!

Mathieu

>
> Thanks,
> Florian

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
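
To illustrate the per-thread refcount idea raised under options B and C,
here is a minimal sketch in C. It is only an illustration under stated
assumptions, not the librseq or glibc implementation: the names
sketch_rseq_register(), sketch_rseq_unregister(), kernel_rseq_register()
and kernel_rseq_unregister() are hypothetical, and the two kernel_*
helpers are stubs that merely count calls where a real implementation
would issue the rseq system call on the thread's __rseq_abi area.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel registration/unregistration
 * of the rseq TLS area. They only count invocations so the refcount
 * logic can be exercised without touching the kernel. */
static __thread int kernel_register_calls;
static __thread int kernel_unregister_calls;

static int kernel_rseq_register(void)
{
	kernel_register_calls++;
	return 0;
}

static int kernel_rseq_unregister(void)
{
	kernel_unregister_calls++;
	return 0;
}

/* Per-thread count of users (e.g. glibc, librseq, early adopters)
 * that have asked for the rseq area to be registered on this thread. */
static __thread unsigned int rseq_refcount;

/* Each user calls this; only the first caller on a thread reaches
 * the kernel. Returns 0 on success, nonzero on failure. */
int sketch_rseq_register(void)
{
	if (rseq_refcount == 0) {
		int ret = kernel_rseq_register();

		if (ret)
			return ret;	/* leave the count untouched on failure */
	}
	rseq_refcount++;
	return 0;
}

/* Each user calls this; only the last caller on a thread reaches
 * the kernel. Returns 0 on success, nonzero on failure. */
int sketch_rseq_unregister(void)
{
	assert(rseq_refcount > 0);	/* unbalanced unregister is a caller bug */
	if (rseq_refcount == 1) {
		int ret = kernel_rseq_unregister();

		if (ret)
			return ret;
	}
	rseq_refcount--;
	return 0;
}
```

With two users on the same thread, each calling sketch_rseq_register()
once and later sketch_rseq_unregister() once, the stubs are each reached
a single time. The sketch ignores signal safety and assumes all users
share the same refcount variable, which is precisely the coordination
problem the symbol-ownership options above are about.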