Message-ID: <8f932ab0-bb72-8fea-4078-dc59e9164bd4@collabora.com>
Date: Thu, 28 Apr 2022 21:31:00 +0300
Subject: Re: [PATCH v4 10/15] drm/shmem-helper: Take reservation lock instead of drm_gem_shmem locks
From: Dmitry Osipenko
To: Daniel Stone, Thomas Zimmermann, David Airlie, Gerd Hoffmann,
 Gurchetan Singh, Chia-I Wu, Daniel Almeida, Gert Wollny,
 Gustavo Padovan, Tomeu Vizoso, Maarten Lankhorst, Maxime Ripard,
 Rob Herring, Steven Price, Alyssa Rosenzweig, Rob Clark,
 Emil Velikov, Robin Murphy, Dmitry Osipenko,
 linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
 virtualization@lists.linux-foundation.org
References: <20220417223707.157113-1-dmitry.osipenko@collabora.com>
 <20220417223707.157113-11-dmitry.osipenko@collabora.com>
 <248083d2-b8f2-a4d7-099d-70a7e7859c11@suse.de>

Hello Daniel,

On 27.04.2022 17:50, Daniel Vetter wrote:
> On Mon, Apr 18, 2022 at 10:18:54PM +0300, Dmitry Osipenko wrote:
>> Hello,
>>
>> On 4/18/22 21:38, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> On 18.04.22 at 00:37, Dmitry Osipenko wrote:
>>>> Replace drm_gem_shmem locks with the reservation lock to make GEM
>>>> locking more consistent.
>>>>
>>>> Previously drm_gem_shmem_vmap() and drm_gem_shmem_get_pages() were
>>>> protected by separate locks; now it's the same lock, but that makes
>>>> no difference for the current GEM SHMEM users. Only the Panfrost
>>>> and Lima drivers use vmap(), and they do it in slow code paths,
>>>> hence there was no practical justification for a separate lock
>>>> in vmap().
>>>>
>>>> Suggested-by: Daniel Vetter
>>>> Signed-off-by: Dmitry Osipenko
>>>> ---
>> ...
>>>> @@ -310,7 +306,7 @@ static int drm_gem_shmem_vmap_locked(struct
>>>> drm_gem_shmem_object *shmem,
>>>>      } else {
>>>>          pgprot_t prot = PAGE_KERNEL;
>>>>
>>>> -        ret = drm_gem_shmem_get_pages(shmem);
>>>> +        ret = drm_gem_shmem_get_pages_locked(shmem);
>>>>          if (ret)
>>>>              goto err_zero_use;
>>>>
>>>> @@ -360,11 +356,11 @@ int drm_gem_shmem_vmap(struct
>>>> drm_gem_shmem_object *shmem,
>>>> {
>>>>      int ret;
>>>>
>>>> -    ret = mutex_lock_interruptible(&shmem->vmap_lock);
>>>> +    ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
>>>>      if (ret)
>>>>          return ret;
>>>>      ret = drm_gem_shmem_vmap_locked(shmem, map);
>>>
>>> Within drm_gem_shmem_vmap_locked(), there's a call to dma_buf_vmap() for
>>> imported pages. If the exporter side also holds/acquires the same
>>> reservation lock as our object, the whole thing can deadlock. We cannot
>>> move dma_buf_vmap() out of the CS, because we still need to increment
>>> the reference counter. I honestly don't know how to easily fix this
>>> problem. There's a TODO item about replacing these locks at [1]. As
>>> Daniel suggested this patch, we should talk to him about the issue.
>>>
>>> Best regards
>>> Thomas
>>>
>>> [1] https://www.kernel.org/doc/html/latest/gpu/todo.html#move-buffer-object-locking-to-dma-resv-lock
>>
>> Indeed, good catch! Perhaps we could simply use a separate lock for the
>> vmapping of the *imported* GEMs?
>> The vmap_use_count is used only by vmap/vunmap, so it doesn't matter
>> which lock these functions use in the case of imported GEMs, since we
>> only need to protect vmap_use_count.
>
> Apologies for the late reply, I'm flooded.
>
> I discussed this with Daniel Stone last week in a chat; roughly, what we
> need to do is:
>
> 1. Pick a function from the shmem helpers.
>
> 2. Go through all drivers that call it, and make sure that we acquire
> dma_resv_lock in the top-level driver entry point for this.
>
> 3. Once all driver code paths are converted, add a dma_resv_assert_held()
> call to that function to make sure you have it all correct.
>
> 4. Repeat 1-3 until all shmem helper functions are converted over.

Somehow I didn't notice the existence of dma_resv_assert_held(), thank
you for the suggestion :)

> 5. Ditch the 3 different shmem helper locks.
>
> The trouble is that I forgot that vmap is a thing, so that needs more
> work. I think there are two approaches here:
> - Do the vmap at import time. This is the trick we used to untangle the
>   dma_resv_lock issues around dma_buf_attachment_map()
> - Change the dma_buf_vmap rules so that callers must hold the
>   dma_resv_lock.

I'll consider this option for v6, thank you. I see now that you actually
want to define the new rules for dma-bufs in general, not only in the
context of the DRM code; this makes much more sense to me now.

> - Maybe also do what you suggest and keep a separate lock for this, but
>   the fundamental issue is that this doesn't really work - if you share
>   buffers both ways between two drivers using shmem helpers, then the
>   ordering of this vmap_count_mutex vs dma_resv_lock is inconsistent and
>   you can get some nice deadlocks. So not a great approach (and also the
>   reason why we really need to get everyone to move towards dma_resv_lock
>   as _the_ buffer object lock, since otherwise we'll never get a
>   consistent lock nesting hierarchy).
The separate locks should work okay, because it will always be the
exporter that takes the dma_resv_lock. But I agree that it's less ideal
than defining new rules for dma-bufs, since sometimes you would take the
resv lock and sometimes not, potentially hiding locking bugs.

> The trouble here is that trying to be clever and doing the conversion
> just in the shmem helpers won't work, because there are a lot of cases
> where the drivers are all kinds of inconsistent with their locking.
>
> Adding Daniel S, also maybe for questions it'd be fastest to chat on irc?

My nickname is digetx on the #dri-devel channel, feel free to ping me
if needed. Right now your suggestions are clear to me, hence no extra
questions.

Thank you for the review.
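
P.S. To double-check that I understand the conversion pattern from your
steps 2-3, here is a rough sketch of what I expect the converted vmap
pair to end up looking like (not compile-tested, and the locked helper's
body is abbreviated - this is only an illustration of the pattern, not
the final code):

```c
/* Unlocked entry point: top-level callers come through here, and it is
 * the one place that takes the reservation lock, as in this patch. */
int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,
		       struct iosys_map *map)
{
	int ret;

	ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
	if (ret)
		return ret;

	ret = drm_gem_shmem_vmap_locked(shmem, map);
	dma_resv_unlock(shmem->base.resv);

	return ret;
}

/* _locked() variant: once every driver code path is converted, it no
 * longer locks anything itself and only asserts that the caller already
 * holds the resv lock (your step 3). */
static int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
				     struct iosys_map *map)
{
	dma_resv_assert_held(shmem->base.resv);

	/* ... actual vmap work, including the dma_buf_vmap() path for
	 * imported objects that Thomas pointed out ... */
	return 0;
}
```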