Date: Sun, 31 Oct 2021 07:39:23 +0000
From: Eric Wong
To: Sargun Dhillon
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org,
    willy@infradead.org, arnd@kernel.org, Willem de Bruijn
Subject: Re: epoll may leak events on dup
Message-ID: <20211031073923.M174137@dcvr>
In-Reply-To: <20211030100319.GA11526@ircssh-3.c.rugged-nimbus-611.internal>

Sargun Dhillon wrote:
> I discovered an interesting behaviour in epoll today. If I register the same
> file twice, under two different file descriptor numbers, and then I close one of
> the two file descriptors, epoll "leaks" the first event.
> This is fine, because
> one would think I could just go ahead and remove the event, but alas, that isn't
> the case. Some example python code follows to show the issue at hand.
>
> I'm not sure if this is really considered a "bug" or just "interesting epoll
> behaviour", but in my opinion this is kind of a bug, especially because leaks
> may happen by accident -- especially if files are not immediately freed.

"Interesting epoll behavior" combined with a quirk of the Python
wrapper for epoll. It passes the FD as epoll_event.data (.data
could also be any void *ptr, a u64, or a u32).

Not knowing Python myself (but knowing Ruby and Perl5 well), I
assume Python developers chose the safest route in passing an
integer FD for .data. Passing a pointer to an arbitrary
Perl/Ruby object would cause tricky lifetime issues with the
automatic memory management of those languages; I expect Python
would have the same problem.

> I'm also not sure why epoll events are registered by file, and not just fd.
> Is the expectation that you can share a single epoll amongst multiple
> "users" and register different files that have the same file descriptor

No, the other way around. Different FDs for the same file.

Having registration keyed by [file+fd] allows users to pass
different pointers for different events to the same file, which
could have its uses.

Registering by FD alone isn't enough, since the epoll FD itself
can be shared across fork (which is of limited usefulness[1]).
The original iterations of epoll were keyed only by the file, with
the FD being added later.

> number (at least for purposes other than CRIU). Maybe someone can shed
> light on the behaviour.

CRIU? Checkpoint/Restore In Userspace?

[1] In contrast, kqueue has a unique close-on-fork behavior
    which greatly simplifies usage from C code (but less so for
    high-level runtimes which auto-close FDs).
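
For readers without the original reproducer at hand, the behavior discussed above can be sketched roughly as follows. This is not the code from the original report; the function name stale_fds_after_close and the use of a pipe are assumptions made for illustration. It relies only on Python's select.epoll wrapper, which stores the integer FD in epoll_event.data, and it is Linux-only. Depending on the Python version, unregister() on a closed FD either raises OSError (EBADF) or silently ignores the error, so the sketch tolerates both.

```python
import os
import select


def stale_fds_after_close():
    """Register two FDs for one file, close one, and report what epoll returns.

    Returns (closed_fd, live_fd, reported), where reported is the set of
    FD numbers that epoll still hands back after closed_fd was closed.
    """
    r, w = os.pipe()
    r_dup = os.dup(r)  # second FD number for the same open file
    ep = select.epoll()
    try:
        # Two registrations, keyed by [file+fd]; each carries its own FD as .data
        ep.register(r, select.EPOLLIN)
        ep.register(r_dup, select.EPOLLIN)

        os.write(w, b"x")  # make the read side readable
        os.close(r)        # the file itself stays open through r_dup

        try:
            # EPOLL_CTL_DEL on a closed FD number cannot find the entry.
            # Newer Python versions raise OSError here; older ones ignore EBADF.
            ep.unregister(r)
        except OSError:
            pass

        # Level-triggered poll: the entry registered under r is still reported.
        reported = {fd for fd, _events in ep.poll(0)}
        return r, r_dup, reported
    finally:
        ep.close()
        os.close(r_dup)
        os.close(w)


if __name__ == "__main__":
    closed_fd, live_fd, reported = stale_fds_after_close()
    print("closed:", closed_fd, "live:", live_fd, "reported:", sorted(reported))
```

In this sketch the stale entry only goes away once the last FD referring to the file is closed, or when the epoll FD itself is closed. Unregistering an FD before closing it avoids the situation.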