Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp1682663pxb; Wed, 9 Feb 2022 02:04:59 -0800 (PST) X-Google-Smtp-Source: ABdhPJy8P3H6Cbv8Ogai2TW+fy6Tk0AcO+fkXxdHdfFPtzif5VKpRzBvhzl5GNUe1DvP8VDiIJgs X-Received: by 2002:a05:6a00:c94:: with SMTP id a20mr1534345pfv.41.1644401099796; Wed, 09 Feb 2022 02:04:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644401099; cv=none; d=google.com; s=arc-20160816; b=ykuP9DZvhryUzk3HYMFPrkr5fuDRm/QoPwc392N7lMTqi+9ir8uSKVs9vQFeoPbe2W pOWQXyiwnKDhxRcG5xXGfMEgu9kL26wTGdB09wvgJpZVxDF9MRNEOzntbufoB82XuQpy KZW5cvb1vd3nNqtbCfnGWyq9qdojh0khbaKe5AwrDDRuaVPGR5UNXFe1r4CJXIBlZlB9 1l74JxkfAaNPUF1H9IREU2oFOnPYv5nPBHoQg/SQXbYUVsPgt1d1OaOZEtX68d1S4tUs O9TzUl3SauhxFTfa4pphvluS3wkejlo2Av/Nu7AunoYXF/jdad5nttvlWuGP71f2nlCv 9Opw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:mime-version:user-agent:message-id :in-reply-to:date:references:cc:to:from; bh=7Asgh0PIBrJYgoDlpSiKWpEDeH8LscSdFmglRQ+wboo=; b=i5ezXF7zMMAWjUGEiTiccyfvfh5c+Jmtpj88yOgSBKplPi+1JLWoR8Gnxp0gPV4ga1 2+k7ZTmS7Wm055a1Ox7pzfdrY7AApLronw4/PkvOxsIJPTYMzWHhLPX0reO7brVuwV3v qJmq/C7lkfkgUwJHjJ2C5jeVziilv09EMfG4lI+jFuw0yNkAtKenwq2B19bLHcx54GjZ eo0FOn6b1lkNUIiYmdemPWK1j23wUIjNstKMFV+ynW2l11tXV92gOmMo7WN5KrOP4RLR 8Dt2S0tdGEuiDgyafzLp7F90kAXMLVtqh+gCmMiAmEFylu9S94ipTtC78iXLBUS3zB31 JziA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=xmission.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id k71si15829775pga.357.2022.02.09.02.04.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Feb 2022 02:04:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=xmission.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 7FCAFE040ABB; Wed, 9 Feb 2022 01:13:35 -0800 (PST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387475AbiBHWYR (ORCPT + 99 others); Tue, 8 Feb 2022 17:24:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47688 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387124AbiBHV7a (ORCPT ); Tue, 8 Feb 2022 16:59:30 -0500 Received: from out02.mta.xmission.com (out02.mta.xmission.com [166.70.13.232]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE4DAC0612B8 for ; Tue, 8 Feb 2022 13:59:29 -0800 (PST) Received: from in01.mta.xmission.com ([166.70.13.51]:35296) by out02.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nHYWF-00D3Sx-RE; Tue, 08 Feb 2022 14:59:27 -0700 Received: from ip68-227-174-4.om.om.cox.net ([68.227.174.4]:40330 helo=email.froward.int.ebiederm.org.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1nHYWE-002oBP-Rc; Tue, 08 Feb 2022 14:59:27 -0700 From: "Eric W. Biederman" To: Al Viro Cc: Waiman Long , Christian Brauner , Andrew Morton , Jens Axboe , Alexey Gladkov , David Hildenbrand , Jann Horn , linux-kernel@vger.kernel.org References: <20220208163912.1084752-1-longman@redhat.com> Date: Tue, 08 Feb 2022 15:59:06 -0600 In-Reply-To: (Al Viro's message of "Tue, 8 Feb 2022 19:07:41 +0000") Message-ID: <874k59f2ad.fsf@email.froward.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1nHYWE-002oBP-Rc;;;mid=<874k59f2ad.fsf@email.froward.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=68.227.174.4;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19ah3ujpA0uSRXc7hWxYjEDlJ9CtDrmagE= X-SA-Exim-Connect-IP: 68.227.174.4 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RDNS_NONE, SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-DCC: XMission; sa06 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: *;Al Viro X-Spam-Relay-Country: X-Spam-Timing: total 421 ms - load_scoreonly_sql: 0.04 (0.0%), signal_user_changed: 11 (2.5%), b_tie_ro: 9 (2.2%), parse: 0.93 (0.2%), extract_message_metadata: 12 (2.9%), get_uri_detail_list: 1.71 (0.4%), tests_pri_-1000: 7 (1.8%), tests_pri_-950: 1.25 (0.3%), tests_pri_-900: 1.01 (0.2%), tests_pri_-90: 84 (20.0%), check_bayes: 82 (19.6%), b_tokenize: 7 (1.8%), b_tok_get_all: 7 (1.7%), b_comp_prob: 2.5 (0.6%), b_tok_touch_all: 61 (14.6%), b_finish: 0.97 (0.2%), tests_pri_0: 289 (68.6%), check_dkim_signature: 0.62 (0.1%), check_dkim_adsp: 2.7 (0.6%), poll_dns_idle: 0.39 (0.1%), tests_pri_10: 2.2 (0.5%), tests_pri_500: 9 (2.2%), rewrite_mail: 0.00 (0.0%) Subject: Re: [PATCH] copy_process(): Move fd_install() out of sighand->siglock critical section X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Al Viro writes: > On Tue, Feb 08, 2022 at 01:51:35PM -0500, Waiman Long wrote: >> On 2/8/22 13:16, Al Viro wrote: >> > On Tue, Feb 08, 2022 at 11:39:12AM -0500, Waiman Long wrote: >> > >> > > One way to solve this problem is to move the fd_install() call out of >> > > the sighand->siglock critical section. >> > > >> > > Before commit 6fd2fe494b17 ("copy_process(): don't use ksys_close() >> > > on cleanups"), the pidfd installation was done without holding both >> > > the task_list lock and the sighand->siglock. Obviously, holding these >> > > two locks are not really needed to protect the fd_install() call. >> > > So move the fd_install() call down to after the releases of both locks. >> > Umm... That assumes we can delay it that far. IOW, that nothing >> > relies upon having pidfd observable in /proc/*/fd as soon as the child >> > becomes visible there in the first place. >> > >> > What warranties are expected from CLONE_PIDFD wrt observation of >> > child's descriptor table? >> > >> I think the fd_install() call can be moved after the release of >> sighand->siglock but before the release the tasklist_lock. Will that be good >> enough? > > Looks like it should, but I'd rather hear from the CLONE_PIDFD authors first... > Christian, could you comment on that? The tasklist_lock and the siglock provide no protection against being looked up in proc. The proc filesystem looks up process information with things only protected by the rcu_read_lock(). Which means that the process will be visible through proc after "attach_pid(p, PIDTYPE_PID". The fd is being installed in the fdtable of the parent process, and the siglock and tasklist_lock are held to protect the child. Further fd_install is exposing the fd to userspace where it can be used by the process_madvise and the process_mrelease system calls, from anything that shares the fdtable of the parent thread. Which means it needs to be guaranteed that kernel_clone will call wake_up_process before it is safe to call fd_install. So it appears to me that moving fd_install earlier fundamentally unsafe, and the locks are meaningless from an fd_install perspective. Which means it should be perfectly fine to move the fd_install outside of the tasklist_lock and the outside siglock. I don't see how we could support the fd appearing in the fdtable sooner which seems to make the question moot as to weather userspace in some odd corner case expects the fd to appear in the fdtable sooner. So I say move fd_install down with proc_fork_connector and friends. Eric