Date: Tue, 17 Apr 2018 22:20:24 -0700 (PDT)
From: David Rientjes
To: Tetsuo Handa
Cc: Andrew Morton, Michal Hocko, Andrea Arcangeli, Roman Gushchin,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch v2] mm, oom: fix concurrent munlock and oom reaper unmap
In-Reply-To: <201804180447.w3I4lq60017956@www262.sakura.ne.jp>
References: <201804180355.w3I3tM6T001187@www262.sakura.ne.jp>
    <201804180447.w3I4lq60017956@www262.sakura.ne.jp>

On Wed, 18 Apr 2018, Tetsuo Handa wrote:

> > Commit 97b1255cb27c is referencing MMF_OOM_SKIP already being set by
> > exit_mmap().  The only thing this patch changes is where that is done:
> > before or after free_pgtables().  We can certainly move it to before
> > free_pgtables() at the risk of subsequent (and eventually unnecessary) oom
> > kills.  It's not exactly the point of this patch.
> >
> > I have thousands of real-world examples where additional processes were
> > oom killed while the original victim was in free_pgtables().  That's why
> > we've moved the MMF_OOM_SKIP to after free_pgtables().
>
> "we have moved"? No, not yet. Your patch is about to move it.
>

I'm referring to our own kernel: we have thousands of real-world examples
where additional processes have been oom killed while the original victim
was in free_pgtables().  It actually happens about 10-15% of the time in
automated testing where you create a 128MB memcg, fork a canary, and then
fork a >128MB memory hog.  10-15% of the time both processes get oom
killed: the memory hog first (higher rss), the canary second.  The pgtable
stat is unchanged between oom kills.

> My question is: is it guaranteed that munlock_vma_pages_all()/unmap_vmas()/free_pgtables()
> by exit_mmap() are never blocked for memory allocation. Note that exit_mmap() tries to unmap
> all pages while the OOM reaper tries to unmap only safe pages. If there is possibility that
> munlock_vma_pages_all()/unmap_vmas()/free_pgtables() by exit_mmap() are blocked for memory
> allocation, your patch will introduce an OOM livelock.
>

If munlock_vma_pages_all(), unmap_vmas(), or free_pgtables() require
memory to make forward progress, then we have bigger problems :)

I just ran a query of real-world oom kill logs that I have.  In 33,773,705
oom kills, I have no evidence of a thread failing to exit after reaching
exit_mmap().

You may recall from my support of your patch to emit the stack trace when
the oom reaper fails, in https://marc.info/?l=linux-mm&m=152157881518627,
that I have logs of 28,222,058 occurrences of the oom reaper where it
successfully frees memory and the victim exits.

If you'd like to pursue the possibility that exit_mmap() blocks before
freeing memory, which we have somehow been lucky to miss in 33 million
occurrences, I'd appreciate the test case.
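
For reference, the memcg test described above (128MB memcg, a canary, then
a >128MB memory hog) could be sketched roughly as below.  This is not the
actual automated test, which is not shown in this thread.  It assumes a
cgroup v1 memory controller mounted at /sys/fs/cgroup/memory, a made-up
group name "oomtest", a 256MB hog, and root privileges; every name in it
is hypothetical.  With swap enabled the hog may not be killed at all.

```python
import os
import signal
import time

# Assumptions (not from the original mail): cgroup v1 mount point, a
# hypothetical group name, and a 256MB hog (the mail only says ">128MB").
MEMCG_DIR = "/sys/fs/cgroup/memory/oomtest"
LIMIT_BYTES = 128 * 1024 * 1024
HOG_BYTES = 256 * 1024 * 1024
PAGE_SIZE = 4096


def was_oom_killed(status):
    """Return True if a wait status reports death by SIGKILL.

    The oom killer delivers SIGKILL, so this is a necessary (but not
    sufficient) sign that the process was an oom victim.
    """
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL


def spawn_in_memcg(work):
    """Fork a child, move it into the memcg, then run work() in it."""
    pid = os.fork()
    if pid == 0:
        try:
            with open(os.path.join(MEMCG_DIR, "cgroup.procs"), "w") as f:
                f.write(str(os.getpid()))
            work()
        except BaseException:
            os._exit(1)
        os._exit(0)
    return pid


def canary():
    # Small, idle process that should ideally survive the oom kill.
    while True:
        time.sleep(1)


def hog():
    buf = bytearray(HOG_BYTES)
    for off in range(0, HOG_BYTES, PAGE_SIZE):
        buf[off] = 1  # touch every page so it is charged to the memcg


def run_trial():
    """Run one trial; return (hog_killed, canary_killed)."""
    os.makedirs(MEMCG_DIR, exist_ok=True)
    with open(os.path.join(MEMCG_DIR, "memory.limit_in_bytes"), "w") as f:
        f.write(str(LIMIT_BYTES))
    canary_pid = spawn_in_memcg(canary)
    time.sleep(0.5)  # let the canary join the memcg before the hog starts
    hog_pid = spawn_in_memcg(hog)
    _, hog_status = os.waitpid(hog_pid, 0)
    time.sleep(1)  # give a possible second oom kill time to land
    pid, canary_status = os.waitpid(canary_pid, os.WNOHANG)
    if pid == 0:
        # Canary survived: clean it up ourselves and report it as not killed.
        os.kill(canary_pid, signal.SIGKILL)
        os.waitpid(canary_pid, 0)
        return was_oom_killed(hog_status), False
    return was_oom_killed(hog_status), was_oom_killed(canary_status)
```

Calling run_trial() in a loop and counting the trials where both values
come back True would give a rough analogue of the 10-15% figure; the
numbers will depend on the kernel and machine.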