Date: Mon, 6 May 2024 11:26:57 +0200
From: Christian Brauner
To: Linus Torvalds
Cc: Al Viro, keescook@chromium.org, axboe@kernel.dk,
    christian.koenig@amd.com, dri-devel@lists.freedesktop.org,
    io-uring@vger.kernel.org, jack@suse.cz, laura@labbott.name,
    linaro-mm-sig@lists.linaro.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
    minhquangbui99@gmail.com, sumit.semwal@linaro.org,
    syzbot+045b454ab35fd82a35fb@syzkaller.appspotmail.com,
    syzkaller-bugs@googlegroups.com
Subject: Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes
Message-ID: <20240506-zweibeinig-mahnen-daa579a233db@brauner>
References: <202405031110.6F47982593@keescook>
    <20240503211129.679762-2-torvalds@linux-foundation.org>
    <20240503212428.GY2118490@ZenIV>
    <20240504-wohngebiet-restwert-6c3c94fddbdd@brauner>
    <20240505-gelehnt-anfahren-8250b487da2c@brauner>
    <20240506-injizieren-administration-f5900157566a@brauner>
In-Reply-To: <20240506-injizieren-administration-f5900157566a@brauner>

On Mon, May 06, 2024 at 10:45:35AM +0200, Christian Brauner wrote:
> > The fact is, it's not dma-buf that is violating any rules. It's epoll.
>
> I agree that epoll() not taking a reference on the file is at least
> unexpected and contradicts the usual code patterns for the sake of
> performance and that it very likely is the case that most callers of
> f_op->poll() don't know this.
>
> Note, I clearly wrote upthread that I'm ok to do it like you suggested
> but raised two concerns a) there's currently only one instance of
> prolonged @file lifetime in f_op->poll() afaict and b) that there's
> possibly going to be some performance impact on epoll().
>
> So it's at least worth discussing what's more important because epoll()
> is very widely used and it's not that we haven't favored performance
> before.
>
> But you've already said that you aren't concerned with performance on
> epoll() upthread. So afaict then there's really not a lot more to
> discuss other than take the patch and see whether we get any complaints.

Two closing thoughts:

(1) I wonder if this won't cause userspace regressions for the semantics
    of epoll, because dying files are now silently ignored whereas
    before they would have generated events.

(2) The other part is that it seems to me that epoll() will now
    temporarily pin filesystems, opening up the possibility of spurious
    EBUSY errors.

    If you register a file descriptor in an epoll instance, then close
    it and umount the filesystem, but epoll managed to do an fget() on
    that fd before the close() call, then epoll will pin that
    filesystem.

    If the f_op->poll() method does something that can take a while
    (blocks on a shared mutex of that subsystem), that umount is very
    likely going to suddenly return EBUSY. Afaict, this wouldn't have
    been an issue at all before, and it is likely more serious than the
    performance concern.

(One option would be to only do epi_fget() for stuff like dma-buf that's
never unmounted. That'll cover nearly every driver out there. Only
"real" filesystems would have to contend with @file count going to zero,
but honestly they also deal with dentry lookup under RCU, which is way
more adventurous than this.)

Maybe I'm barking up the wrong tree though.
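
To make the two cases above concrete, here is a rough userspace model of
the "take a reference unless the count already hit zero" pattern that
epi_fget() relies on. This is only a sketch, not the kernel code: every
model_* name is made up for illustration, C11 atomics stand in for the
kernel's atomic helpers, and there is no real struct file, epitem, or
mount involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file: only the refcount matters here. */
struct model_file {
	atomic_long f_count;
};

/* Hypothetical stand-in for struct epitem: a pointer, but no reference. */
struct model_epitem {
	struct model_file *file;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool model_get_unless_zero(struct model_file *f)
{
	long old = atomic_load(&f->f_count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; returns true if it was the last one. */
static bool model_put(struct model_file *f)
{
	long old = atomic_fetch_sub(&f->f_count, 1);

	assert(old > 0);
	return old == 1;
}

/*
 * Model of the epi_fget() idea: pin the file for the duration of the
 * poll call, or return NULL if the file is already dying.
 */
static struct model_file *model_epi_fget(const struct model_epitem *epi)
{
	struct model_file *f = epi->file;

	return model_get_unless_zero(f) ? f : NULL;
}
```

Starting from a count of 1 (the fd), model_epi_fget() raises it to 2. A
close() modelled as model_put() then leaves the count at 1, so the file
stays pinned until the poller does its own model_put(); that window is
where the EBUSY concern in (2) comes from. Once the count has reached
zero, model_epi_fget() returns NULL, which corresponds to the "dying
files are now silently ignored" case in (1).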