Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp1054227rwb; Tue, 29 Nov 2022 08:25:32 -0800 (PST) X-Google-Smtp-Source: AA0mqf7KP70qJbYCE8lZGm+xklkuq0Wm0IhTD+hvuUKrR0FiE9fR2a9zdgA5RGDoJo0dLyiMM8mA X-Received: by 2002:a63:f406:0:b0:46e:b96d:192d with SMTP id g6-20020a63f406000000b0046eb96d192dmr33008123pgi.418.1669739132261; Tue, 29 Nov 2022 08:25:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669739132; cv=none; d=google.com; s=arc-20160816; b=LrPFcIVKH0BF7WhIm9evOCDQQ94FhI5EUL47WKJVAFrAkj3ynvYLDCGEl8/BZg1g1h u2UAHFR2eGfPoANka2IlFp5/7oQGjH+SWCf1ggiUtcJv4maY3A85FWlbvoshIzYY32Ig iJ05Rrz6P1I2BnNpY8ANHshrPqCop9KCIbRg2/QJC9wnholy5si8/CXrA5MVNvOEhuHl zKPHmFU6NmL1jyAETUsgLy3guhBmmSOqa+vA22hFphx4rCFCiny40clONXGiP3ba2vqG VEzNQB7UKYqQImPl3DyIuJP2AI5jUs+T8t9+UDJx3Ct1rO4L0lA8aSrPF01cR14Ib/OE Wyjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=0HWfe1+VjMGFyrMHQg7oU2YqCOUnzuCCKI9ZSqYRpPw=; b=KVyDVfLSfq6rSOXVQf6bohEv4Va8LLBfOD3Hjt5y0IF2QZkgAVJ692+Ji8TH6ej20b PPUn/R4i/Y811LJlLrlbT5M2jaVFJioU30/OmEfjh59Op4xckyr+sBQXU9I2eSZjTcUT 6ylxWxgH7Ik/X9kVN5dShiZZiMQfLRHyTlJzHqhzIUFTDeFwrPg/dOAbaUFXTqHnNMAD ZgCBUaErd5y3SojRlKiHJwkzS54s5c+hWQdMImgAYa+Ve86VFYGK2IRnXwQweWT7/uaR qsjewHgY0IeIB9kbwGbD3wegAPoN6DEUKSmBMC0ueL2ldVBBgKzxWjdhvsSrVtENNUZp ps9w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=HC6K4vRP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z15-20020a63c04f000000b0043c1cb75c22si15091027pgi.333.2022.11.29.08.25.19; Tue, 29 Nov 2022 08:25:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=HC6K4vRP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235153AbiK2P7d (ORCPT + 84 others); Tue, 29 Nov 2022 10:59:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52866 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235173AbiK2P7E (ORCPT ); Tue, 29 Nov 2022 10:59:04 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F23E69DD3 for ; Tue, 29 Nov 2022 07:58:06 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 79B62617A3 for ; Tue, 29 Nov 2022 15:58:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 480A9C4347C; Tue, 29 Nov 2022 15:58:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669737484; bh=zrEzumqsT84sAsMe/WdazNyakaN/EBtxrup9XhwiJ4Y=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=HC6K4vRPijG2/QsB4kYC85xkx1mTzkG9RkL0BQLiuotFt2sKX0H4H5zXCVYLRUKbn uWQvugQj4i5qxzQ8hY3jdzBj690iJDLfZrS60qEbWPqWpaMdANDmcwv0uzM90Mezcz VunLEvGuVXvpEB1Ujxcz/dCY0nmmsI2eLYu39hruALAvZ2q48qAeOdp1ZQ/2jc6Fd2 33shhUUSqaaagf2jJTqjTWQI2b8Kt2SXuvkz0cGJJNNWYpY7NkqH75SuBUp1oR/FZr t2+nXbe0jPRFqyHmURxBRbLhYHVx67zPgYdAbnxhXMnzBxszvp+yLsmVb9E5Ym0iez FPKVX6/4YknqA== Date: Tue, 29 Nov 2022 15:57:58 +0000 From: Will Deacon To: Waiman Long Cc: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Daniel Bristot de Oliveira , Phil Auld , Wenjie Li , David Wang =?utf-8?B?546L5qCH?= , linux-kernel@vger.kernel.org Subject: Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr() Message-ID: <20221129155757.GC26561@willie-the-truck> References: <20221125023943.1118603-1-longman@redhat.com> <92b99a5e-1588-4e08-a652-72e9c51421cf@redhat.com> <20221128120008.GA25090@willie-the-truck> <20221129140759.GA26437@willie-the-truck> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote: > On 11/29/22 09:07, Will Deacon wrote: > > On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote: > > > On 11/28/22 07:00, Will Deacon wrote: > > > > On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote: > > > > > On 11/24/22 21:39, Waiman Long wrote: > > > > > > In general, a non-null user_cpus_ptr will remain set until the task dies. > > > > > > A possible exception to this is the fact that do_set_cpus_allowed() > > > > > > will clear a non-null user_cpus_ptr. To allow this possible racing > > > > > > condition, we need to check for NULL user_cpus_ptr under the pi_lock > > > > > > before duping the user mask. > > > > > > > > > > > > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()") > > > > > > Signed-off-by: Waiman Long > > > > > This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8 > > > > > ("sched: Allow task CPU affinity to be restricted on asymmetric systems"). > > > > > So it needs to be fixed in the stable release as well. Will resend the patch > > > > > with an additional fixes tag and updated commit log. > > > > Please can you elaborate on the use-after-free here? Looking at > > > > 07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount > > > > has dropped to zero and I can't see how that can race with fork(). > > > > > > > > What am I missing? > > > I missed that at first. The current task cloning process copies the content > > > of the task structure over to the newly cloned/forked task. IOW, if > > > user_cpus_ptr had been set up previously, it will be copied over to the > > > cloned task. Now if user_cpus_ptr of the source task is cleared right after > > > that and before dup_user_cpus_ptr() is called. The obsolete user_cpus_ptr > > > value in the cloned task will remain and get used even if it has been freed. > > > That is what I call as use-after-free and double-free. > > If the parent task can be modified concurrently with dup_task_struct() then > > surely we'd have bigger issues because that's not going to be atomic? At the > > very least we'd have a data race, but it also feels like we could end up > > with inconsistent task state in the child. In fact, couldn't the normal > > 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()? > > > > Or am I still failing to understand the race? > > > A major difference between cpus_mask and user_cpus_ptr is that for > cpus_mask, the bitmap is embedded into task_struct whereas user_cpus_ptr is > a pointer to an external bitmap. So there is no issue of use-after-free wrt > cpus_mask. That is not the case where the memory of the user_cpus_ptr of the > parent task is freed, but then a reference to that memory is still available > in the child's task struct and may be used. Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned that we could corrupt the data and end up with an affinity mask that doesn't correspond to anything meaningful. Do you agree that's possible? > Note that the problematic concurrence is not between the copying of task > struct and changing of the task struct. It is what will happen after the > task struct copying has already been done with an extra reference present in > the child's task struct. Well, sort of, but the child only has the extra reference _because_ the parent pointer was concurrently cleared to NULL, otherwise dup_user_cpus_ptr() would have allocated a new copy and we'd be ok, no? Overall, I'm just very wary that we seem to be saying that copy_process() can run concurrently with changes to the parent. Maybe it's all been written with that in mindi (including all the arch callbacks), but I'd be astonished if this is the only problem in there. Will