Date: Tue, 22 Mar 2022 09:26:52 +0100
From: Michal Hocko
To: Davidlohr Bueso
Cc: Nico Pache, linux-mm@kvack.org, Andrea Arcangeli, Joel Savitz,
    Andrew Morton, linux-kernel@vger.kernel.org, Rafael Aquini,
    Waiman Long, Baoquan He, Christoph von Recklinghausen, Don Dutile,
    "Herton R . Krzesinski", Thomas Gleixner, Ingo Molnar,
    Peter Zijlstra, Darren Hart, Andre Almeida, David Rientjes
Subject: Re: [PATCH v5] mm/oom_kill.c: futex: Close a race between do_exit and the oom_reaper
In-Reply-To: <20220322025724.j3japdo5qocwgchz@offworld>

On Mon 21-03-22 19:57:24, Davidlohr Bueso wrote:
> On Mon, 21 Mar 2022, Nico Pache wrote:
>
> > We could proceed with the V3 approach; however if we are able to find a complete
> > solution that keeps both functionalities (Concurrent OOM Reaping & Robust Futex)
> > working, I dont see why we wouldnt go for it.
>
> Because semantically killing the process is, imo, the wrong thing to do.

I am not sure I follow. The task has been killed by the oom killer. All
we are discussing here is how to preserve the robust list metadata
stored in memory that the oom_reaper normally unmaps in order to
guarantee forward progress.

I can see we have 4 potential solutions:
1) Do not oom_reap oom victims with robust futex metadata in anonymous
   memory. Easy enough, but it could lead to excessive oom killing in
   case the victim gets stuck in the kernel and cannot terminate.
2) Clean up the robust list from the oom_reaper context. This seems
   tricky due to #PF handling from the oom_reaper context, which would
   need to be non-blocking.
3) Filter out vmas which contain the robust list. A simple check for the
   vma range.
4) Internally mark vmas which have to preserve their state during
   oom_reaping. The futex code would somehow have to mark those
   mappings. While this is a more generic solution, I am not sure it is
   a practical approach.
-- 
Michal Hocko
SUSE Labs
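
To make option 3 concrete, here is a minimal userspace model of the
range check, written in Python rather than kernel C. Everything in it
(the Vma class, holds_robust_list, reapable_vmas) is a hypothetical name
made up for illustration, not a kernel API, and the real reaper applies
further criteria that are not modeled here. The sketch also only looks
at the address of the robust list head; entries linked from the head may
live in other mappings, which this check would not cover.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Vma:
    """Toy stand-in for a mapping: the half-open address range [start, end)."""
    start: int
    end: int
    anonymous: bool = True


def holds_robust_list(vma: Vma, robust_head: Optional[int]) -> bool:
    """True if the robust list head address falls inside this mapping."""
    if robust_head is None:  # the task never registered a robust list
        return False
    return vma.start <= robust_head < vma.end


def reapable_vmas(vmas: Iterable[Vma], robust_head: Optional[int]) -> List[Vma]:
    """Mappings the model reaper would unmap.

    Assumption of this sketch: only anonymous mappings are candidates,
    and the one holding the robust list head is skipped.
    """
    return [v for v in vmas
            if v.anonymous and not holds_robust_list(v, robust_head)]


# Example: the head lives in the second mapping, so only the first is reaped.
example = [Vma(0x1000, 0x3000), Vma(0x5000, 0x6000),
           Vma(0x8000, 0x9000, anonymous=False)]
print(reapable_vmas(example, robust_head=0x5010))
```

The cost is that the skipped mapping stays resident until the victim
exits, which is the trade-off between option 3 and plain reaping.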