Date: Wed, 18 Apr 2018 15:44:01 +0200
From: Michal Hocko
To: Tetsuo Handa
Cc: rientjes@google.com, akpm@linux-foundation.org, aarcange@redhat.com, guro@fb.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch v2] mm, oom: fix concurrent munlock and oom reaper unmap
Message-ID: <20180418134401.GF17484@dhcp22.suse.cz>
In-Reply-To: <201804182225.EII57887.OLMHOFVtQSFJOF@I-love.SAKURA.ne.jp>

On Wed 18-04-18 22:25:54, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > > > Can we try a simpler way and get back to what I was suggesting before
> > > > [1] and simply not play tricks with
> > > > 	down_write(&mm->mmap_sem);
> > > > 	up_write(&mm->mmap_sem);
> > > >
> > > > and use the write lock in exit_mmap for oom_victims?
> > >
> > > You mean something like this?
> >
> > or simply hold the write lock until we unmap and free page tables.
>
> That increases possibility of __oom_reap_task_mm() giving up reclaim and
> setting MMF_OOM_SKIP when exit_mmap() is making forward progress, doesn't it?

Yes, it does. But it is not that likely, and it is easily noticeable from
the logs, so we can make the locking protocol more complex if this really
hits too often.

> I think that it is better that __oom_reap_task_mm() does not give up when
> exit_mmap() can make progress. In that aspect, the section protected by
> mmap_sem held for write should be as short as possible.

Sure, but then weigh the complexity on the other side and try to think
whether simpler code which works most of the time is better than a buggy
complex one. The current protocol has 2 followup fixes, which speaks for
itself.

[...]
> > > Then, I'm tempted to call __oom_reap_task_mm() before holding mmap_sem for write.
> > > It would be OK to call __oom_reap_task_mm() at the beginning of __mmput()...
> >
> > I am not sure I understand.
>
> To reduce possibility of __oom_reap_task_mm() giving up reclaim and
> setting MMF_OOM_SKIP.

I still do not understand. Do you want to call __oom_reap_task_mm from
__mmput? If yes, why would you do so when exit_mmap does a stronger
version of it?
-- 
Michal Hocko
SUSE Labs