Subject: Re: [patch v2 1/2] mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks
From: Tetsuo Handa
To: Michal Hocko, David Rientjes
Cc: Andrew Morton, Andrea Arcangeli, Benjamin Herrenschmidt, Paul Mackerras, Oded Gabbay, Alex Deucher, Christian König, David Airlie, Joerg Roedel, Doug Ledford, Jani Nikula, Mike Marciniszyn, Sean Hefty, Dimitri Sivanich, Boris Ostrovsky, Jérôme Glisse, Paolo Bonzini, Radim Krčmář, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Sat, 16 Dec 2017 15:21:51 +0900
Message-ID: <0c555671-9214-5cb9-0121-5da04faf5329@I-love.SAKURA.ne.jp>
In-Reply-To: <20171215162534.GA16951@dhcp22.suse.cz>

On 2017/12/16 1:25, Michal Hocko wrote:
>> struct mmu_notifier_ops {
>> +	/*
>> +	 * Flags to specify behavior of callbacks for this MMU notifier.
>> +	 * Used to determine which context an operation may be called.
>> +	 *
>> +	 * MMU_INVALIDATE_DOES_NOT_BLOCK: invalidate_{start,end} does not
>> +	 *				  block
>> +	 */
>> +	int flags;
>
> This should be more specific IMHO. What do you think about the following
> wording?
>
> invalidate_{start,end,range} doesn't block on any locks which depend
> directly or indirectly (via lock chain or resources e.g. worker context)
> on a memory allocation.

I disagree. It needlessly complicates validating the correctness.

What if invalidate_{start,end} calls schedule_timeout_idle(10 * HZ)?
schedule_timeout_idle() will not block on any locks which depend directly or
indirectly on a memory allocation, but we are already blocking other memory
allocating threads at mutex_trylock(&oom_lock) in __alloc_pages_may_oom().

This is essentially the same as the "sleeping forever due to
schedule_timeout_killable(1) by SCHED_IDLE thread with oom_lock held" versus
"looping due to mutex_trylock(&oom_lock) by all other allocating threads"
lockup problem. The OOM reaper does not want to get blocked for so long.
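
To illustrate the concern, here is a rough userspace analogy, not kernel code.
It assumes a C11 atomic_flag standing in for oom_lock, and every name in it
(fake_oom_lock, allocator_try_progress, count_stalled_retries) is hypothetical.
The holder never waits on any allocation-dependent lock, yet every trylock
attempt made while it "sleeps" with the lock held fails:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for oom_lock; a userspace analogy only. */
static atomic_flag fake_oom_lock = ATOMIC_FLAG_INIT;

/* Analogue of mutex_trylock(): returns true if the lock was acquired. */
static bool fake_trylock(void)
{
	return !atomic_flag_test_and_set(&fake_oom_lock);
}

static void fake_unlock(void)
{
	atomic_flag_clear(&fake_oom_lock);
}

/*
 * One retry by an allocating thread: it makes progress only if it can
 * take the lock; otherwise it has to back off and retry later.
 */
static bool allocator_try_progress(void)
{
	if (!fake_trylock())
		return false;
	fake_unlock();
	return true;
}

/*
 * The holder takes the lock and then "sleeps" for `ticks` iterations
 * without blocking on anything else. Each iteration models one retry by
 * another allocating thread; the return value counts failed retries.
 */
static int count_stalled_retries(int ticks)
{
	int stalled = 0;

	assert(ticks >= 0);
	while (!fake_trylock())
		;	/* acquire the lock (uncontended in this sketch) */
	for (int i = 0; i < ticks; i++) {
		if (!allocator_try_progress())
			stalled++;
	}
	fake_unlock();
	return stalled;
}
```

In this sketch, count_stalled_retries(n) reports n failed retries for any
n >= 0, and a retry succeeds again only after the holder has dropped the lock.
The length of the sleep alone determines how long everyone else is stalled.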