Date: Fri, 24 Aug 2018 13:52:26 +0200
From: Michal Hocko
To: christian.koenig@amd.com
Cc: Tetsuo Handa, kvm@vger.kernel.org, Radim Krčmář, David Airlie,
    Joonas Lahtinen, Sudeep Dutt, dri-devel@lists.freedesktop.org,
    linux-mm@kvack.org, Andrea Arcangeli, "David (ChunMing) Zhou",
    Dimitri Sivanich, linux-rdma@vger.kernel.org,
    amd-gfx@lists.freedesktop.org, Jason Gunthorpe, Doug Ledford,
    David Rientjes, xen-devel@lists.xenproject.org,
    intel-gfx@lists.freedesktop.org, Jani Nikula, Leon Romanovsky,
    Jérôme Glisse, Rodrigo Vivi, Boris Ostrovsky, Juergen Gross,
    Mike Marciniszyn, Dennis Dalessandro, LKML, Ashutosh Dixit,
    Alex Deucher, Paolo Bonzini, Andrew Morton, Felix Kuehling
Subject: Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Message-ID: <20180824115226.GK29735@dhcp22.suse.cz>
References:
    <20180716115058.5559-1-mhocko@kernel.org>
    <8cbfb09f-0c5a-8d43-1f5e-f3ff7612e289@I-love.SAKURA.ne.jp>
    <20180824113248.GH29735@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 24-08-18 13:43:16, Christian König wrote:
> Am 24.08.2018 um 13:32 schrieb Michal Hocko:
> > On Fri 24-08-18 19:54:19, Tetsuo Handa wrote:
> > > Two more worries for this patch.
> > >
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> > > > @@ -178,12 +178,18 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
> > > >   *
> > > >   * @amn: our notifier
> > > >   */
> > > > -static void amdgpu_mn_read_lock(struct amdgpu_mn *amn)
> > > > +static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
> > > >  {
> > > > -	mutex_lock(&amn->read_lock);
> > > > +	if (blockable)
> > > > +		mutex_lock(&amn->read_lock);
> > > > +	else if (!mutex_trylock(&amn->read_lock))
> > > > +		return -EAGAIN;
> > > > +
> > > >  	if (atomic_inc_return(&amn->recursion) == 1)
> > > >  		down_read_non_owner(&amn->lock);
> > > Why don't we need to use trylock here if blockable == false ?
> > > Want comment why it is safe to use blocking lock here.
> > Hmm, I am pretty sure I have checked the code but it was quite confusing
> > so I might have missed something. Double checking now, it seems that
> > this read_lock is not used anywhere else and it is not _the_ lock we are
> > interested about. It is the amn->lock (amdgpu_mn_lock) which matters as
> > it is taken in exclusive mode for expensive operations.
>
> The write side of the lock is only taken in the command submission IOCTL.
>
> So you actually don't need to change anything here (even the proposed
> changes are overkill) since we can't tear down the struct_mm while an IOCTL
> is still using.

I am not so sure. We are not in the mm destruction phase yet. This is
mostly about the oom context, which might fire right during the IOCTL. If
any of the paths holding the write lock blocks for an unbounded amount of
time or, even worse, allocates memory, then we are screwed. So we need to
back off when blockable == false.

> > Is that correct Christian? If this is correct then we need to update the
> > locking here. I am struggling to grasp the ref counting part. Why cannot
> > all readers simply take the lock rather than rely on somebody else to
> > take it? 1ed3d2567c800 didn't really help me to understand the locking
> > scheme here so any help would be appreciated.
>
> That won't work like this there might be multiple
> invalidate_range_start()/invalidate_range_end() pairs open at the same time.
> E.g. the lock might be taken recursively and that is illegal for a
> rw_semaphore.

I am not sure I follow. Are you saying that one invalidate_range might
trigger another one from the same path?
-- 
Michal Hocko
SUSE Labs