Subject: Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID
From: Carlos O'Donell
Organization: Red Hat
To: Mathieu Desnoyers, Florian Weimer
Cc: Thomas Gleixner, linux-kernel, Peter Zijlstra, paulmck, Boqun Feng,
    "H. Peter Anvin", Paul Turner, linux-api, Dmitry Vyukov, Neel Natu
Date: Tue, 7 Jul 2020 14:53:41 -0400
In-Reply-To: <378862525.1039.1594123580789.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/7/20 8:06 AM, Mathieu Desnoyers wrote:
> ----- On Jul 7, 2020, at 7:32 AM, Florian Weimer fw@deneb.enyo.de wrote:
>
>> * Mathieu Desnoyers:
>>
>>> Those are very good points. One possibility we have would be to let
>>> glibc do the rseq registration without the RSEQ_FLAG_RELIABLE_CPU_ID
>>> flag. On kernels with the bug present, the cpu_id field is still good
>>> enough for typical uses of sched_getcpu() which does not appear to
>>> have a very strict correctness requirement on returning the right
>>> cpu number.
>>>
>>> Then libraries and applications which require a reliable cpu_id
>>> field could check this on their own by calling rseq with the
>>> RSEQ_FLAG_RELIABLE_CPU_ID flag. This would not make the state more
>>> complex in __rseq_abi, and let each rseq user decide about its own
>>> fate: whether it uses rseq or keeps using an rseq-free fallback.
>>>
>>> I am still tempted to allow combining RSEQ_FLAG_REGISTER |
>>> RSEQ_FLAG_RELIABLE_CPU_ID for applications which would not be using
>>> glibc, and want to check this flag on thread registration.
>>
>> Well, you could add a bug fix level field to the __rseq_abi variable.
>
> Even though I initially planned to make the struct rseq_abi extensible,
> the __rseq_abi variable ends up being fix-sized, so we need to be very
> careful in choosing what we place in the remaining 12 bytes of padding.
> I suspect we'd want to keep 8 bytes to express a pointer to an
> "extended" structure.
>
> I wonder if a bug fix level "version" is the right approach. We could
> instead have a bitmask of fixes, which the application could independently
> check. For instance, some applications may care about cpu_id field
> reliability, and others not.

I agree with Florian.

Developers are not interested in a bitmask of fixes. They want a version
they can check, and at a given version they expect everything to work.

In reality today this is the kernel version.

So rseq is broken from a developer perspective until they can get a new
kernel or an agreement from their downstream vendor that revision Z of
the kernel they are using has the fix you've just committed.

Essentially this problem solves itself at a level higher than the
interfaces we're talking about today.

Encoding this as a bitmask of fixes is an overengineered solution for a
problem that the downstream communities already know how to solve.

I would strongly suggest a "version" or nothing.

>> Then applications could check if the kernel has the appropriate level
>> of non-buggyness. But the same thing could be useful for many other
>> kernel interfaces, and I haven't seen such a fix level value for them.
>> What makes rseq so special?
>
> I guess my only answer is because I care as a user of the system call, and
> what is a system call without users ? I have real applications which work
> today with end users deploying them on various kernels, old and new, and I
> want to take advantage of this system call to speed them up. However, if I
> have to choose between speed and correctness (in other words, not crashing
> a critical application), I will choose correctness. So if I cannot detect
> that I can safely use the system call, it becomes pretty much useless even
> for my own use-cases.

Yes.

In the case of RHEL we have *tons* of users in the same predicament.

They just wait until the RHEL kernel team releases a fixed kernel
version, check for that version, and deploy with it.

Likewise every other user of a kernel. They solve it by asking their
kernel provider, internal or external, to verify the fix is applied to
the deployment kernel. If they are an ISV they have to test and deploy
similar strategies for multiple kernel versions.

By trying to automate this you are encoding into the API some level of
package management concepts, e.g. patch level, revision level, etc. This
is difficult to do without a more generalized API. Why do it just for
rseq? Why do it with the few bits you have?

It's not a great fit IMO. Just let the kernel version be the arbiter of
correctness.

>> It won't help with the present bug, but maybe we should add an rseq
>> API sub-call that blocks future rseq registration for the thread.
>> Then we can add a glibc tunable that flips off rseq reliably if people
>> do not want to use it for some reason (and switch to the non-rseq
>> fallback code instead). But that's going to help with future bugs
>> only.
>
> I don't think it's needed. All I really need is to have _some_ way to
> let lttng-ust or liburcu query whether the cpu_id field is reliable. This
> state does not have to be made quickly accessible to other libraries,
> nor does it have to be shared between libraries. It would allow each
> user library or application to make its own mind on whether it can use
> rseq or not.

That check is "kernel version > x.y.z" :-) or "My kernel vendor told me
it's fixed."

--
Cheers,
Carlos.
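
For illustration only, the "kernel version > x.y.z" check mentioned above
could be sketched in userspace C roughly as follows. This is a minimal
sketch, not code from the thread: the function names release_at_least()
and running_kernel_at_least() are hypothetical, and the comparison looks
only at the upstream major.minor.patch prefix of the uname(2) release
string. It therefore cannot see vendor backports, which is the "my kernel
vendor told me it's fixed" half of the check.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Parse the leading "major.minor[.patch]" of a release string such as
 * "5.8.0-rc4+". Returns 0 on success, -1 if no such prefix is found. */
int parse_release(const char *rel, int v[3])
{
    assert(rel != NULL);
    v[0] = v[1] = v[2] = 0;
    /* major and minor are required; a missing patch level stays 0. */
    return sscanf(rel, "%d.%d.%d", &v[0], &v[1], &v[2]) >= 2 ? 0 : -1;
}

/* Returns 1 if rel >= major.minor.patch, 0 if it is older,
 * and -1 if rel could not be parsed. */
int release_at_least(const char *rel, int major, int minor, int patch)
{
    int v[3];

    if (parse_release(rel, v) != 0)
        return -1;
    if (v[0] != major)
        return v[0] > major;
    if (v[1] != minor)
        return v[1] > minor;
    return v[2] >= patch;
}

/* Same comparison against the running kernel, as reported by uname(2).
 * Returns -1 if uname() fails or the release string is unparseable. */
int running_kernel_at_least(int major, int minor, int patch)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return release_at_least(u.release, major, minor, patch);
}
```

A caller would treat anything other than 1 as "do not rely on rseq" and
stay on its rseq-free fallback, e.g.
`if (running_kernel_at_least(5, 8, 0) == 1) { ... }`. The 5.8.0 threshold
here is only a placeholder for x.y.z, not a statement about which release
carries the fix. On a distribution kernel the release string alone does
not say whether a fix was backported, so the vendor's word still decides.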