From: Håkon Bugge
To: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, rds-devel@oss.oracle.com
Cc: Jason Gunthorpe, Leon Romanovsky, Saeed Mahameed, Tariq Toukan,
    "David S . Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Tejun Heo, Lai Jiangshan, Allison Henderson, Manjunath Patil,
    Mark Zhang, Håkon Bugge, Chuck Lever, Shiraz Saleem, Yang Li
Subject: [PATCH v2 1/6] workqueue: Inherit NOIO and NOFS alloc flags
Date: Wed, 15 May 2024 14:53:37 +0200
Message-Id: <20240515125342.1069999-2-haakon.bugge@oracle.com>
In-Reply-To: <20240515125342.1069999-1-haakon.bugge@oracle.com>
References: <20240515125342.1069999-1-haakon.bugge@oracle.com>
X-Mailer: git-send-email 2.39.3
X-Mailing-List: linux-kernel@vger.kernel.org

For drivers/modules running inside a memalloc_{noio,nofs}_{save,restore}
region, if a work-queue is created, we make sure work executed on the
work-queue inherits the same flag(s).

This is done in order to conditionally enable drivers to work aligned
with block I/O devices. This commit makes sure that any work queued
later on work-queues created during module initialization, when
current's flags have PF_MEMALLOC_{NOIO,NOFS} set, will inherit the same
flags.

We do this in order to enable drivers to be used as a network block
I/O device. This is needed to support XFS or other file-systems on top
of a raw block device which uses said drivers as the network transport
layer.

Under intense memory pressure, we get memory reclaims. Assume the
file-system reclaims memory and goes to the raw block device, which
calls into said drivers. Now, if regular GFP_KERNEL allocations in the
drivers require reclaims to be fulfilled, we end up in a circular
dependency.

We break this circular dependency by:

1. Forcing all allocations in the drivers to use GFP_NOIO, by means of
   a parenthetic use of memalloc_noio_{save,restore} on all relevant
   entry points.

2. Making sure work-queues inherit current->flags
   wrt. PF_MEMALLOC_{NOIO,NOFS}, such that work executed on the
   work-queue inherits the same flag(s). That is what this commit
   contributes.

Signed-off-by: Håkon Bugge

---

v1 -> v2:
   * Added missing hunk in alloc_workqueue()
---
 include/linux/workqueue.h |  2 ++
 kernel/workqueue.c        | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 158784dd189ab..09ecc692ffcae 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -398,6 +398,8 @@ enum wq_flags {
 	__WQ_DRAINING		= 1 << 16, /* internal: workqueue is draining */
 	__WQ_ORDERED		= 1 << 17, /* internal: workqueue is ordered */
 	__WQ_LEGACY		= 1 << 18, /* internal: create*_workqueue() */
+	__WQ_NOIO		= 1 << 19, /* internal: execute work with NOIO */
+	__WQ_NOFS		= 1 << 20, /* internal: execute work with NOFS */
 
 	/* BH wq only allows the following flags */
 	__WQ_BH_ALLOWS		= WQ_BH | WQ_HIGHPRI,
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index d2dbe099286b9..8eb7562372ce2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -51,6 +51,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -3172,6 +3173,10 @@ __acquires(&pool->lock)
 	unsigned long work_data;
 	int lockdep_start_depth, rcu_start_depth;
 	bool bh_draining = pool->flags & POOL_BH_DRAINING;
+	bool use_noio_allocs = pwq->wq->flags & __WQ_NOIO;
+	bool use_nofs_allocs = pwq->wq->flags & __WQ_NOFS;
+	unsigned long noio_flags;
+	unsigned long nofs_flags;
 #ifdef CONFIG_LOCKDEP
 	/*
 	 * It is permissible to free the struct work_struct from
@@ -3184,6 +3189,12 @@ __acquires(&pool->lock)
 
 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
+	/* Set inherited alloc flags */
+	if (use_noio_allocs)
+		noio_flags = memalloc_noio_save();
+	if (use_nofs_allocs)
+		nofs_flags = memalloc_nofs_save();
+
 	/* ensure we're on the correct CPU */
 	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
 		     raw_smp_processor_id() != pool->cpu);
@@ -3320,6 +3331,12 @@ __acquires(&pool->lock)
 
 	/* must be the last step, see the function comment */
 	pwq_dec_nr_in_flight(pwq, work_data);
+
+	/* Restore alloc flags */
+	if (use_nofs_allocs)
+		memalloc_nofs_restore(nofs_flags);
+	if (use_noio_allocs)
+		memalloc_noio_restore(noio_flags);
 }
 
 /**
@@ -5583,6 +5600,10 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
 
 	/* init wq */
 	wq->flags = flags;
+	if (current->flags & PF_MEMALLOC_NOIO)
+		wq->flags |= __WQ_NOIO;
+	if (current->flags & PF_MEMALLOC_NOFS)
+		wq->flags |= __WQ_NOFS;
 	wq->max_active = max_active;
 	wq->min_active = min(max_active, WQ_DFL_MIN_ACTIVE);
 	wq->saved_max_active = wq->max_active;
-- 
2.45.0
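
For readers who want to experiment with the inheritance idea outside the
kernel, the following is a minimal user-space sketch, not kernel code and
not part of the patch. It assumes a single global task_flags word standing
in for current->flags; every name in it (MODEL_PF_*, MODEL_WQ_*,
scope_save(), scope_restore(), model_alloc_workqueue(),
model_process_one_work()) is hypothetical and only mirrors the shape of
the hunks above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for PF_MEMALLOC_NOIO / PF_MEMALLOC_NOFS. */
#define MODEL_PF_NOIO 0x1u
#define MODEL_PF_NOFS 0x2u
/* Hypothetical stand-ins for the __WQ_NOIO / __WQ_NOFS bits in the patch. */
#define MODEL_WQ_NOIO (1u << 19)
#define MODEL_WQ_NOFS (1u << 20)

/* Models current->flags for the one "task" in this sketch. */
static unsigned int task_flags;

/* Set a scope bit and return its previous state, like memalloc_*_save(). */
static unsigned int scope_save(unsigned int bit)
{
	unsigned int old = task_flags & bit;

	task_flags |= bit;
	return old;
}

/* Put the scope bit back to its saved state, like memalloc_*_restore(). */
static void scope_restore(unsigned int bit, unsigned int old)
{
	task_flags = (task_flags & ~bit) | old;
}

struct model_wq {
	unsigned int flags;
};

/* At creation time, record which scopes the creating task was inside. */
static struct model_wq model_alloc_workqueue(unsigned int flags)
{
	struct model_wq wq = { .flags = flags };

	if (task_flags & MODEL_PF_NOIO)
		wq.flags |= MODEL_WQ_NOIO;
	if (task_flags & MODEL_PF_NOFS)
		wq.flags |= MODEL_WQ_NOFS;
	return wq;
}

typedef void (*work_fn)(void *arg);

/* Run one work item inside the scopes recorded on the work-queue. */
static void model_process_one_work(const struct model_wq *wq,
				   work_fn fn, void *arg)
{
	bool use_noio = wq->flags & MODEL_WQ_NOIO;
	bool use_nofs = wq->flags & MODEL_WQ_NOFS;
	unsigned int noio_old = 0, nofs_old = 0;

	if (use_noio)
		noio_old = scope_save(MODEL_PF_NOIO);
	if (use_nofs)
		nofs_old = scope_save(MODEL_PF_NOFS);

	fn(arg);

	/* Restore in reverse order of the saves. */
	if (use_nofs)
		scope_restore(MODEL_PF_NOFS, nofs_old);
	if (use_noio)
		scope_restore(MODEL_PF_NOIO, noio_old);
}

/* Work item that records the flags it observed while running. */
static void record_flags(void *arg)
{
	*(unsigned int *)arg = task_flags;
}

/* Create a queue inside a NOIO scope, run work on it after the scope ends. */
static unsigned int demo_inherited_flags(void)
{
	unsigned int seen = 0;
	unsigned int old = scope_save(MODEL_PF_NOIO);	/* driver entry point */
	struct model_wq wq = model_alloc_workqueue(0);

	scope_restore(MODEL_PF_NOIO, old);		/* driver exit point */
	model_process_one_work(&wq, record_flags, &seen);
	return seen;
}
```

Calling demo_inherited_flags() from a test harness shows the intended
effect: the work item observes the NOIO bit even though the creating
scope has already been left, and task_flags is clean again afterwards. A
queue created outside any scope leaves the flags untouched while its work
runs.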