Date: Tue, 25 Jan 2022 17:42:01 -0800 (PST)
From: David Rientjes
To: Shakeel Butt
cc: Jens Axboe, Pavel Begunkov, Andrew Morton, Linux MM, LKML, io-uring@vger.kernel.org
Subject: Re: [PATCH] mm: io_uring: allow oom-killer from io_uring_setup

On Tue, 25 Jan 2022, Shakeel Butt wrote:

> > > On an overcommitted system which is running multiple workloads of
> > > varying priorities, it is preferred to trigger an oom-killer to kill a
> > > low priority workload than to let the high priority workload receiving
> > > ENOMEMs. On our memory overcommitted systems, we are seeing a lot of
> > > ENOMEMs instead of oom-kills because io_uring_setup callchain is using
> > > __GFP_NORETRY gfp flag which avoids the oom-killer. Let's remove it and
> > > allow the oom-killer to kill a lower priority job.
> > >
> >
> > What is the size of the allocations that io_mem_alloc() is doing?
> >
> > If get_order(size) > PAGE_ALLOC_COSTLY_ORDER, then this will fail even
> > without the __GFP_NORETRY. To make the guarantee that workloads are not
> > receiving ENOMEM, it seems like we'd need to guarantee that allocations
> > going through io_mem_alloc() are sufficiently small.
> >
> > (And if we're really serious about it, then even something like a
> > BUILD_BUG_ON().)
> >
>
> The test case provided to me for which the user was seeing ENOMEMs was
> io_uring_setup() with 64 entries (nothing else).
>
> If I understand rings_size() calculations correctly then the 0 order
> allocation was requested in io_mem_alloc().
>
> For order > PAGE_ALLOC_COSTLY_ORDER, maybe we can use
> __GFP_RETRY_MAYFAIL. It will at least do more aggressive reclaim
> though I think that is a separate discussion. For this issue, we are
> seeing ENOMEMs even for order 0 allocations.
>

Ah, gotcha, thanks for the background. IIUC, io_uring_setup() can be done
with anything with CAP_SYS_NICE so my only concern would be whether this
could be used maliciously on a system not using memcg, but in that case we
can already fork many small processes that consume all memory and oom kill
everything else on the system already.
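
For reference, the order check discussed above can be sketched in userspace C. This is only an illustration, not the kernel's get_order(): the sketch_* names are hypothetical, the 4 KiB page size is an assumption, and the only value taken from the kernel is PAGE_ALLOC_COSTLY_ORDER being 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; other configurations use a different shift. */
#define SKETCH_PAGE_SHIFT 12
/* Mirrors the kernel's PAGE_ALLOC_COSTLY_ORDER, which is 3. */
#define SKETCH_COSTLY_ORDER 3

/*
 * Hypothetical userspace stand-in for get_order(): returns the smallest
 * order such that (page size << order) is at least `size` bytes.
 */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;
	size_t pages;

	assert(size > 0);
	/* Number of whole pages beyond the first that `size` spills into. */
	pages = (size - 1) >> SKETCH_PAGE_SHIFT;
	while (pages) {
		order++;
		pages >>= 1;
	}
	return order;
}

/* True when an allocation of `size` bytes would be a "costly" order. */
static bool sketch_is_costly(size_t size)
{
	return sketch_get_order(size) > SKETCH_COSTLY_ORDER;
}
```

With these assumptions, an array of 64 SQEs at 64 bytes each is 4096 bytes, so sketch_get_order(64 * 64) gives order 0 and sketch_is_costly() is false. Under the same 4 KiB assumption, a request only becomes costly once it exceeds 32 KiB.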