From: Christian König <Christian.Koenig@amd.com>
Date: Wed, 23 Jul 2014 11:27:44 +0200
To: Daniel Vetter, Christian König
CC: Maarten Lankhorst, Thomas Hellstrom, nouveau, LKML, dri-devel,
    Ben Skeggs, "Deucher, Alexander"
Subject: Re: [Nouveau] [PATCH 09/17] drm/radeon: use common fence implementation for fences

On 23.07.2014 10:54, Daniel Vetter wrote:
> On Wed, Jul 23, 2014 at 10:46 AM, Christian König wrote:
>> On 23.07.2014 10:42, Daniel Vetter wrote:
>>> On Wed, Jul 23, 2014 at 10:25 AM, Maarten Lankhorst wrote:
>>>> In this case, if the sync was to i915, the i915 lockup procedure
>>>> would take care of itself. It wouldn't fix radeon, but it would at
>>>> least unblock your Intel card again. I haven't specifically added a
>>>> special case to attempt to unblock external fences, but I've
>>>> considered it. :-)
>>> Actually the i915 reset stuff relies crucially on being able to kick
>>> all waiters holding driver locks.
>>> Since the current fence code only exposes an opaque wait function
>>> without exposing the underlying wait queue, we won't be able to
>>> sleep on both the fence queue and the reset queue. So this would
>>> pose a problem if we add fence_wait calls to our driver.
>>
>> And apart from that, I really think that I misunderstood Maarten. But
>> his explanation sounds like i915 would do a reset because radeon is
>> locked up, right?
>>
>> Well, if that's really the case then I would question the interface
>> even more, because that is really nonsense.
> I disagree - the entire point of fences is that we can do multi-GPU
> work asynchronously. So by the time we notice that radeon is dead, we
> have already accepted the batch from userspace. The only way to get
> rid of it again is through our reset machinery, which also tells
> userspace that we couldn't execute the batch. Whether we actually need
> to do a hw reset depends on whether we've already committed the batch
> to the hw. At the moment that's always the case, but the scheduler
> will change that. So I have no issues with Intel doing a reset when
> other drivers don't signal fences.

You submit a job to the hardware and then block the job to wait for
radeon to finish? Well, then this would indeed require a hardware
reset, but wouldn't that make the whole problem even worse?

I mean, currently we block one userspace process to wait for other
hardware to finish with a buffer, but what you are describing here
blocks the whole hardware to wait for other hardware, which in the end
blocks all userspace processes accessing the hardware.

Talking about alternative approaches, wouldn't it be simpler to just
offload the waiting to a separate kernel or userspace thread? A rough
sketch of that idea follows at the end of this mail.

Christian.

> Also this isn't a problem with the interface really, but with the
> current implementation for radeon. And getting cross-driver reset
> notifications right will require more work either way.
> -Daniel
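
PS: To make the offloading idea a bit more concrete, here is a rough,
untested sketch of how a worker could do the cross-device wait instead
of the submitting process. It assumes the fence interface from this
patch series (fence_wait_timeout/fence_put from linux/fence.h); the
struct, the worker and the resume_job callback are made-up names just
for illustration, not existing radeon code:

#include <linux/kernel.h>
#include <linux/fence.h>
#include <linux/workqueue.h>
#include <linux/slab.h>
#include <linux/jiffies.h>

/* One deferred wait on a foreign fence; only this worker sleeps, the
 * submitting process and the rest of the hardware keep running. */
struct fence_offload_work {
	struct work_struct work;
	struct fence *fence;            /* foreign fence we depend on */
	void (*resume_job)(void *data); /* kick the deferred submission */
	void *data;
};

static void fence_offload_worker(struct work_struct *work)
{
	struct fence_offload_work *fow =
		container_of(work, struct fence_offload_work, work);

	/* Bounded wait so a dead driver can't block us forever; after
	 * the timeout we resume anyway and let the usual error handling
	 * deal with the stalled dependency. */
	fence_wait_timeout(fow->fence, false, msecs_to_jiffies(10000));
	fence_put(fow->fence);

	fow->resume_job(fow->data);
	kfree(fow);
}

The submission path would then do INIT_WORK(&fow->work,
fence_offload_worker) and schedule_work(&fow->work) instead of calling
fence_wait directly, so only the one job that actually depends on the
foreign fence is held back.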