To: Daniel Vetter
Cc: Will Drewry, Kees Cook, Jann Horn, intel-gfx, Linux Kernel Mailing List, dri-devel, Andy Lutomirski, Andrew Morton, Chris Wilson
References: <20210205163752.11932-1-chris@chris-wilson.co.uk> <202102051030.1AF01772D@keescook> <5a940e13-8996-e9e5-251e-a9af294a39ff@daenzer.net>
From: Michel Dänzer
Subject: Re: [PATCH] kernel: Expose SYS_kcmp by default
Message-ID: <36274836-1968-e712-fb15-f3e15eeb7741@daenzer.net>
Date: Mon, 8 Feb 2021 14:49:51 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021-02-08 2:34 p.m., Daniel Vetter wrote:
> On Mon, Feb 8, 2021 at 12:49 PM Michel Dänzer wrote:
>>
>> On 2021-02-05 9:53 p.m., Daniel Vetter wrote:
>>> On Fri, Feb 5, 2021 at 7:37 PM Kees Cook wrote:
>>>>
>>>> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
>>>>> Userspace has discovered the functionality offered by SYS_kcmp and has
>>>>> started to depend upon it. In particular, Mesa uses SYS_kcmp for
>>>>> os_same_file_description() in order to identify when two fd (e.g. device
>>>>> or dmabuf) point to the same struct file. Since they depend on it for
>>>>> core functionality, lift SYS_kcmp out of the non-default
>>>>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
>>>>>
>>>>> Signed-off-by: Chris Wilson
>>>>> Cc: Kees Cook
>>>>> Cc: Andy Lutomirski
>>>>> Cc: Will Drewry
>>>>> Cc: Andrew Morton
>>>>> Cc: Dave Airlie
>>>>> Cc: Daniel Vetter
>>>>> Cc: Lucas Stach
>>>>> ---
>>>>>  init/Kconfig                                  | 11 +++++++++++
>>>>>  kernel/Makefile                               |  2 +-
>>>>>  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
>>>>>  3 files changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/init/Kconfig b/init/Kconfig
>>>>> index b77c60f8b963..f62fca13ac5b 100644
>>>>> --- a/init/Kconfig
>>>>> +++ b/init/Kconfig
>>>>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>>>>>  config CHECKPOINT_RESTORE
>>>>>       bool "Checkpoint/restore support"
>>>>>       select PROC_CHILDREN
>>>>> +     select KCMP
>>>>>       default n
>>>>>       help
>>>>>         Enables additional kernel features in a sake of checkpoint/restore.
>>>>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>>  config ARCH_HAS_MEMBARRIER_SYNC_CORE
>>>>>       bool
>>>>>
>>>>> +config KCMP
>>>>> +     bool "Enable kcmp() system call" if EXPERT
>>>>> +     default y
>>>>
>>>> I would expect this to be not default-y, especially if
>>>> CHECKPOINT_RESTORE does a "select" on it.
>>>>
>>>> This is a really powerful syscall, but it is bounded by ptrace access
>>>> controls, and uses pointer address obfuscation, so it may be okay to
>>>> expose this. As it is, at least Ubuntu already has
>>>> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
>>>> difference on exposure.
>>>>
>>>> So, if you drop the "default y", I'm fine with this.
>>>
>>> It was maybe stupid, but our userspace started relying on fd
>>> comparison through sys_kcmp. So for better or worse, if you want to
>>> run the mesa3d gl/vk stacks, you need this.
>>
>> That's overstating things somewhat. The vast majority of applications
>> will work fine regardless (as they did before Mesa started using this
>> functionality). Only some special ones will run into issues, because the
>> user-space drivers incorrectly assume two file descriptors reference
>> different descriptions.
>>
>>
>>> Was maybe not the brightest idea, but since enough distros had this
>>> enabled by default,
>>
>> Right, that (and the above) is why I considered it fair game to use.
>> What should I have done instead? (TBH I was surprised that this
>> functionality isn't generally available)
>
> Yeah that one is fine, but I thought we've discussed (irc or
> something) more uses for de-duping dma-buf and stuff like that. But
> quick grep says that hasn't landed yet, so I got a bit confused (or
> just dreamt). Looking at this again I'm kinda surprised the drmfd
> de-duping blows up on normal linux distros, but I guess it can all
> happen.

One example: GEM handle name-spaces are per file description.
If user-space incorrectly assumes two DRM fds are independent, when they
actually reference the same file description, closing a GEM handle with
one file descriptor will make it unusable with the other file descriptor
as well.

>>> Ofc we can leave the default n, but the select if CONFIG_DRM is
>>> unfortunately needed I think.
>>
>> Per above, not sure this is really true.
>
> We seem to be going boom on linux distros now, maybe userspace got
> more creative in abusing stuff?

I don't know what you're referring to. I've only seen maybe two or three
reports from people who didn't enable CHECKPOINT_RESTORE in their
self-built kernels.

> The entire thing is small enough that imo we don't really have to care,
> e.g. we also unconditionally select dma-buf, despite that on most
> systems there's only 1 gpu, and you're never going to end up with a
> buffer sharing case that needs any of that code (aside from the
> "here's an fd" part).
>
> But I guess we can limit to just KCMP_FILE like you suggest in another
> reply. Just feels a bit like overkill.

Making KCMP_FILE gated by DRM makes as little sense to me as by
CHECKPOINT_RESTORE.

-- 
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer
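
For reference, the kind of check this thread is about can be sketched as
follows. This is only an illustration, not Mesa's actual
os_same_file_description(); the function name same_file_description() and
its three-way return convention are assumptions made for this example. It
asks the kernel, via kcmp(2) with KCMP_FILE, whether two descriptors of the
calling process refer to the same open file description.

```c
#define _GNU_SOURCE
#include <linux/kcmp.h>   /* KCMP_FILE */
#include <sys/syscall.h>  /* SYS_kcmp */
#include <sys/types.h>
#include <unistd.h>       /* syscall(), getpid() */

/*
 * Hypothetical helper (the name is an assumption for this sketch).
 *
 * Returns  1 if fd1 and fd2 share one open file description,
 *          0 if they reference different descriptions,
 *         -1 if the kernel could not answer (errno is set, e.g. ENOSYS
 *            when kcmp() is not built into the running kernel).
 */
int same_file_description(int fd1, int fd2)
{
    pid_t pid = getpid();
    long ret;

    /* The same descriptor number trivially shares its description. */
    if (fd1 == fd2)
        return 1;

    /* kcmp() returns 0 when both indices resolve to the same struct file. */
    ret = syscall(SYS_kcmp, pid, pid, KCMP_FILE,
                  (unsigned long)fd1, (unsigned long)fd2);
    if (ret < 0)
        return -1;

    return ret == 0;
}
```

The -1 case is the one that matters here: if kcmp() is unavailable, a caller
cannot distinguish "different" from "unknown", and treating "unknown" as
"different" is what leads to the GEM handle problem described above. A
descriptor and its dup() are expected to compare as the same description,
while the two ends of a pipe() are not.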