Received: by 2002:a05:6358:7058:b0:131:369:b2a3 with SMTP id 24csp2260709rwp; Fri, 14 Jul 2023 03:31:53 -0700 (PDT) X-Google-Smtp-Source: APBJJlFQjUq2T+Hl73067mujc+9owCLveAG8f/Nx6igL0gbr7Dx6yr+OXuTKVL4US+f7SWqSknP/ X-Received: by 2002:a05:6402:b26:b0:51d:92bf:e6ae with SMTP id bo6-20020a0564020b2600b0051d92bfe6aemr4174293edb.18.1689330713007; Fri, 14 Jul 2023 03:31:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689330712; cv=none; d=google.com; s=arc-20160816; b=cnUrjxNcq02h8kcy+Ppy7Fyoe1/3FWXxf9/ERiiHAvjfxQZ+4bAVpKA5BxsxcH5g2W KJOgMH939A8LcAjn6Q7lVgqPPjqOWQPnCCf4J9IUCCTzpFwLHvFJdsOxO2qV7KM5RDQ6 3Kn89u9gLxbyT70N6UdYtJKZ3VGfwqm1g1wNfL63gVyloDq43UgWCFXuWB+dzVvOCmCl 2Arq3xZJIAJ4/q11e9V8X83OeorW9nmrAGt0w+ty19OVEg/Focm/1JnAdUHWJtPA/5sL Q7X5XjVNT3ZhhM/+3vALic/243Oa8F2jRj1XVeCxro+J8e+72VFxWX+83TWj1L4epT4D RqcQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=jlHTtlE3PXHV6UamgZ7fuHVnxiBrjo7fMBC4CmEY7uc=; fh=55RZIZzLuhRrvB39jWdpKSgMfNLAAz4RKVVa+gvdeE8=; b=d+0tOz2fpWfkZ+aArP0p8RP/wkt/vzfyWNvfptR2B+B85DaOKyaQnk9IP6JS2ikt9P /aj+Ybz/sFYy5A7+EAxr/E5LO1ghbuZRsdIrE3O8ex7pZngsOHvxbFWU0GIzurjkND8R uQsQnWqN7RkIToeWaIiDga1ltWdc6fQ54UYAoqsaxlpKurADMMLBZvZiDoYiyY+9ZN5I YL5ZmO7jGgndAOnLbK9zrwbaCpladLjZydp3eazAGg17TV+0BKuYYFC7KBZWOPtGwl4O U9LkvOa19WsIRq5uNdJ0Mb0cUsdP2KZPER7GUHF6d2b6CFvpcCUYKNF9nKlgeFRuUsQU iRxA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@asahilina.net header.s=default header.b=DcwMypG1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=asahilina.net Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t20-20020a05640203d400b0051d9df5dd25si376927edw.97.2023.07.14.03.31.28; Fri, 14 Jul 2023 03:31:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@asahilina.net header.s=default header.b=DcwMypG1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=asahilina.net Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235996AbjGNKGj (ORCPT + 99 others); Fri, 14 Jul 2023 06:06:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53152 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235958AbjGNKGi (ORCPT ); Fri, 14 Jul 2023 06:06:38 -0400 Received: from mail.marcansoft.com (marcansoft.com [212.63.210.85]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 04B502D75; Fri, 14 Jul 2023 03:06:31 -0700 (PDT) Received: from [127.0.0.1] (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) (Authenticated sender: lina@asahilina.net) by mail.marcansoft.com (Postfix) with ESMTPSA id 117714258C; Fri, 14 Jul 2023 10:06:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=asahilina.net; s=default; t=1689329190; bh=9QWl/AwKkqJIVfZsIvRKT61QtERPUmn6O/Rt/bWA6J8=; h=Date:Subject:To:Cc:References:From:In-Reply-To; b=DcwMypG1rCCpo1DjBZULFEiSjm2jugs4+T/om73xJh6DCpNnIBpPnruD4ZMoNlsrk +wBm7aNUIfmDN6W2QzN7gUI3bWehBstiJBVqk+St4UBZhafAaLSx+t8ASGKadHueB2 GENvsra5bFsLmWn3ZoWp7xUvTNObvlgyewP7T9SKQw8ySL6jYiU2NYceUxhIfPbhB8 QtWbVuQQ9nlnK5WWnBDSYFmkhJST1z4z6rP0ESs26Fx5HtDDeWlaeSDCqvd1jH5mqy 553ACgMOQU1nAaP4mvYNKGfYVuvkM0LFuMetqRCkoz7I6k9gGEgD6mhBVCXUynr9q7 Aj+UYoO0hiMVA== Message-ID: <6b473196-9f87-d6c8-b289-18f80de78f0a@asahilina.net> Date: Fri, 14 Jul 2023 19:06:25 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0 Subject: Re: [PATCH 2/3] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name Content-Language: en-US To: =?UTF-8?Q?Christian_K=c3=b6nig?= , Luben Tuikov , David Airlie , Daniel Vetter , Sumit Semwal Cc: Faith Ekstrand , Alyssa Rosenzweig , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, asahi@lists.linux.dev References: <20230714-drm-sched-fixes-v1-0-c567249709f7@asahilina.net> <20230714-drm-sched-fixes-v1-2-c567249709f7@asahilina.net> From: Asahi Lina In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 14/07/2023 18.57, Christian König wrote: > Am 14.07.23 um 11:49 schrieb Asahi Lina: >> On 14/07/2023 17.43, Christian König wrote: >>> Am 14.07.23 um 10:21 schrieb Asahi Lina: >>>> A signaled scheduler fence can outlive its scheduler, since fences are >>>> independencly reference counted. Therefore, we can't reference the >>>> scheduler in the get_timeline_name() implementation. >>>> >>>> Fixes oopses on `cat /sys/kernel/debug/dma_buf/bufinfo` when shared >>>> dma-bufs reference fences from GPU schedulers that no longer exist. >>>> >>>> Signed-off-by: Asahi Lina >>>> --- >>>>    drivers/gpu/drm/scheduler/sched_entity.c | 7 ++++++- >>>>    drivers/gpu/drm/scheduler/sched_fence.c  | 4 +++- >>>>    include/drm/gpu_scheduler.h              | 5 +++++ >>>>    3 files changed, 14 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c >>>> b/drivers/gpu/drm/scheduler/sched_entity.c >>>> index b2bbc8a68b30..17f35b0b005a 100644 >>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c >>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c >>>> @@ -389,7 +389,12 @@ static bool >>>> drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity) >>>>               /* >>>>             * Fence is from the same scheduler, only need to wait for >>>> -         * it to be scheduled >>>> +         * it to be scheduled. >>>> +         * >>>> +         * Note: s_fence->sched could have been freed and reallocated >>>> +         * as another scheduler. This false positive case is okay, >>>> as if >>>> +         * the old scheduler was freed all of its jobs must have >>>> +         * signaled their completion fences. >>> >>> This is outright nonsense. As long as an entity for a scheduler exists >>> it is not allowed to free up this scheduler. >>> >>> So this function can't be called like this. >> >> As I already explained, the fences can outlive their scheduler. That >> means *this* entity certainly exists for *this* scheduler, but the >> *dependency* fence might have come from a past scheduler that was >> already destroyed along with all of its entities, and its address reused. > > Well this is function is not about fences, this function is a callback > for the entity. That deals with dependency fences, which could have come from any arbitrary source, including another entity and another scheduler. >> >> Christian, I'm really getting tired of your tone. I don't appreciate >> being told my comments are "outright nonsense" when you don't even >> take the time to understand what the issue is and what I'm trying to >> do/document. If you aren't interested in working with me, I'm just >> going to give up on drm_sched, wait until Rust gets workqueue support, >> and reimplement it in Rust. You can keep your broken fence lifetime >> semantics and I'll do my own thing. > > I'm certainly trying to help here, but you seem to have unrealistic > expectations. I don't think expecting not to be told my changes are "outright nonsense" is an unrealistic expectation. If you think it is, maybe that's yet another indicator of the culture problems the kernel community has... > I perfectly understand what you are trying to do, but you don't seem to > understand that this functionality here isn't made for your use case. I do, that's why I'm trying to change things. Right now, this functionality isn't even properly documented, which is why I thought it could be used for my use case, and slowly discovered otherwise. Daniel suggested documenting it, then fixing the mismatches between documentation and reality, which is what I'm doing here. > We can adjust the functionality to better match your requirements, but > you can't say it is broken because it doesn't work when you use it not > in the way it is intended to be used. I'm saying the idea that a random dma-buf holds onto a chain of references that prevents unloading a driver module that wrote into it (and keeps a bunch of random unrelated objects alive) is a broken state of affairs. It may or may not trickle down to actual problems for users (I would bet it does in some cases but I don't know for sure), but it's a situation that doesn't make any sense. I know I'm triggering actual breakage with my new use case due to this, which is why I'm trying to improve things. But the current state of affairs just doesn't make any sense even if it isn't causing kernel oopses today with other drivers. > You can go ahead and try to re-implement the functionality in Rust, but > then I would reject that pointing out that this should probably be an > extension to the existing code. You keep rejecting my attempts at extending the existing code... ~~ Lina