Subject: Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
To: Christian König, Michal Hocko
Cc: kvm@vger.kernel.org, Radim Krčmář, Sudeep Dutt, dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Andrea Arcangeli, Dimitri Sivanich, Jason Gunthorpe, linux-rdma@vger.kernel.org,
 amd-gfx@lists.freedesktop.org, David Airlie, Doug Ledford, David Rientjes, xen-devel@lists.xenproject.org, intel-gfx@lists.freedesktop.org, Leon Romanovsky, Jérôme Glisse, Rodrigo Vivi, Boris Ostrovsky, Juergen Gross, Mike Marciniszyn, Dennis Dalessandro, LKML, Ashutosh Dixit, Alex Deucher, Paolo Bonzini, Andrew Morton, Felix Kuehling
From: Tetsuo Handa
Message-ID: <841ae1fb-bb5a-8b1e-6383-ca2e70b6e759@i-love.sakura.ne.jp>
Date: Fri, 7 Sep 2018 07:46:09 +0900
In-Reply-To: <0e80c531-4e91-fb1d-e7eb-46a7aecc4c9d@amd.com>

On 2018/08/27 16:41, Christian König wrote:
> On 26.08.2018 at 10:40, Tetsuo Handa wrote:
>> I'm not following. Why don't we need to do like below (given that
>> nobody except amdgpu_mn_read_lock() holds ->read_lock) because e.g.
>> drm_sched_fence_create() from drm_sched_job_init() from amdgpu_cs_submit()
>> is doing GFP_KERNEL memory allocation with ->lock held for write?
>
> That's a bug which needs to be fixed separately.
>
> Allocating memory with GFP_KERNEL while holding a lock which is also taken in the reclaim code path is illegal not matter what you do.
>
> Patches to fix this are already on the appropriate mailing list and will be pushed upstream today.
>
> Regards,
> Christian.

Commit 4a2de54dc1d7668f ("drm/amdgpu: fix holding mn_lock while allocating
memory") seems to call amdgpu_mn_unlock() without a matching amdgpu_mn_lock()
when drm_sched_job_init() fails...

Michal, you are asking me to fix all bugs (including out-of-tree code) and to
prevent future bugs, just because you want to avoid using a timeout as the way
to avoid OOM lockups
( https://marc.info/?i=55a3fb37-3246-73d7-0f45-5835a3f4831c@i-love.sakura.ne.jp ).
That is too much to ask; it is impossible even for you. The more we count on
the OOM reaper, the more we exponentially complicate the dependencies, and the
more likely we are to stumble over unreviewed/untested code...
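
The unbalanced-unlock concern above can be illustrated with a minimal userspace sketch. This is not the amdgpu code: `mn_lock()`, `mn_unlock()`, `job_init()`, `submit_buggy()` and `submit_fixed()` are hypothetical stand-ins, and a plain counter replaces the real lock so that an imbalance becomes visible. It only assumes the general shape described above: the allocation is moved in front of the lock, but the failure path still jumps to a label that unlocks.

```c
#include <assert.h>   /* for the usage checks described below */
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the notifier lock: a counter exposes imbalance. */
static int lock_depth;

static void mn_lock(void)   { lock_depth++; }
static void mn_unlock(void) { lock_depth--; }

/* Hypothetical job setup that allocates memory and may fail. */
static void *job_init(bool fail)
{
    return fail ? NULL : malloc(64);
}

/* Assumed buggy shape: the error label unlocks a lock that was never taken. */
static int submit_buggy(bool fail_init)
{
    int ret = 0;
    void *job = job_init(fail_init);   /* allocation happens before locking */

    if (!job) {
        ret = -ENOMEM;
        goto error_unlock;             /* wrong: lock is not held here */
    }
    mn_lock();
    /* ... publish the job while holding the lock ... */
error_unlock:
    mn_unlock();
    free(job);                         /* free(NULL) is a no-op */
    return ret;
}

/* Balanced shape: a failure before mn_lock() returns without unlocking. */
static int submit_fixed(bool fail_init)
{
    void *job = job_init(fail_init);

    if (!job)
        return -ENOMEM;                /* nothing to unlock yet */
    mn_lock();
    /* ... publish the job while holding the lock ... */
    mn_unlock();
    free(job);
    return 0;
}
```

With a forced allocation failure, `submit_buggy(true)` leaves `lock_depth` at -1, while `submit_fixed(true)` leaves it at 0; both return `-ENOMEM`.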