Date: Mon, 17 Jan 2022 12:33:33 +0100
From: Michal Hocko
To: Joel Savitz
Cc: Andrew Morton, linux-kernel, Waiman Long, linux-mm@kvack.org,
    Nico Pache, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
    Darren Hart, Davidlohr Bueso, André Almeida
Subject: Re: [PATCH] mm/oom_kill: wake futex waiters before annihilating victim shared mutex
References: <20211207214902.772614-1-jsavitz@redhat.com> <20211207154759.3f3fe272349c77e0c4aca36f@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

I have only noticed your email now after replying to v3 so our emails
have crossed.

On Fri 14-01-22 09:39:55, Joel Savitz wrote:
> > What has happened to the oom victim and why it has never exited?
>
> What appears to happen is that the oom victim is sent SIGKILL by the
> process that triggers the oom while also being marked as an oom
> victim.
>
> As you mention in your patchset introducing the oom reaper in commit
> aac4536355496 ("mm, oom: introduce oom reaper"), the purpose of the
> oom reaper is to try and free more memory more quickly than it
> otherwise would have been by assuming anonymous or swapped out pages
> won't be needed in the exit path as the owner is already dying.
> However, this assumption is violated by the futex_cleanup() path,
> which needs access to userspace in fetch_robust_entry() when it is
> called in exit_robust_list(). Trace_printk()s in this failure path
> reveal an apparent race between the oom reaper thread reaping the
> victim's mm and the futex_cleanup() path. There may be other ways that
> this race manifests but we have been most consistently able to trace
> that one.

Please let's continue the discussion in the v3 email thread:
http://lkml.kernel.org/r/20220114180135.83308-1-npache@redhat.com
-- 
Michal Hocko
SUSE Labs