Received: by 2002:a05:7412:419a:b0:f3:1519:9f41 with SMTP id i26csp3900796rdh; Tue, 28 Nov 2023 06:53:26 -0800 (PST) X-Google-Smtp-Source: AGHT+IGueGpMp/s4AWT2+CreHtZVL52ed+vBPSy5L1Kjazatnx7rrsp4cnnTMaeNlMcB2uXlX5E7 X-Received: by 2002:a17:902:d50d:b0:1cf:c3f7:7d51 with SMTP id b13-20020a170902d50d00b001cfc3f77d51mr10778547plg.49.1701183206637; Tue, 28 Nov 2023 06:53:26 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701183206; cv=pass; d=google.com; s=arc-20160816; b=zZVuaaOH5qcMTT1RjppBSgc0ikwd1F0LAplIxUhaKo17ZQh0sosj6kXfE9EH3j8v4m NhUClTbYVrUA1d7vM4plKU8GuWyX77AJIGgIoA3Vm6aJ4net2G0x99yXbEpzMANOmDD3 5cc+HkHncm4JbR18oAtf2odGuUY94O+V81MhH7supLuunpWvleZjm0q93CJ/0eMNC4dh JLby/5e9zjJuSgvAr6GZ2igJV8CAVh0E5nOUeDFQzejHCoQo2gVLDDt2/gV4Kgbr+SHa TEkm20IOzKXERspP7jQ6Rmqnqbzy+Ux9fLozEQ8ch/naK3x2YbTyaJubrxhxpHDrKzyS BPLg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:in-reply-to:content-disposition :references:message-id:subject:cc:to:from:date:dkim-signature; bh=Hn8zTSBx0EECv8Se1zHPIwuPb6kgjkjZ5pWXIZ28jvU=; fh=L0FMUhxtLWHW/ozrr+bZNHpweh0padKv7vx2hPUD1JY=; b=dU4LTJUEdFlwR3VI/ifFE7UiHv61RHS1+YQGqhedzTz+v+W9iWkmqmNveZ0DU/iZtE 9JcfD6D+iOpI6qmJ0WiMEAfu7saeNHwjnbcBKC//huMOoCJ1ZSc58CclPEBlffdb70it otearCDKieMYN8/b0oT5IfHOHR804kC1SzVLMwDR15eLZ4w89wTc8yTyfcVbAoRNVzxi Bq9f9f55y2bgiw4bGaT/6xMv8fr5UeJUdmph8ZNjP79XcZb++Q/C+beJoBa1wHIf5nbk 0syvf7dL7h8JtqwBSo96wkovhmki2vLkOJILrXSRhDI1Elm+MindnXoPu+fWVJaZAMqk vdpg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@memverge.com header.s=selector2 header.b=xnWvHOZ2; arc=pass (i=1 spf=pass spfdomain=memverge.com dkim=pass dkdomain=memverge.com dmarc=pass fromdomain=memverge.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=memverge.com Return-Path: Received: from pete.vger.email (pete.vger.email. [23.128.96.36]) by mx.google.com with ESMTPS id o12-20020a170902d4cc00b001cfc0272b48si6957530plg.103.2023.11.28.06.53.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 28 Nov 2023 06:53:26 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) client-ip=23.128.96.36; Authentication-Results: mx.google.com; dkim=pass header.i=@memverge.com header.s=selector2 header.b=xnWvHOZ2; arc=pass (i=1 spf=pass spfdomain=memverge.com dkim=pass dkdomain=memverge.com dmarc=pass fromdomain=memverge.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=memverge.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id C1F59809C11A; Tue, 28 Nov 2023 06:53:23 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346375AbjK1Owl (ORCPT + 99 others); Tue, 28 Nov 2023 09:52:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48046 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346381AbjK1OvQ (ORCPT ); Tue, 28 Nov 2023 09:51:16 -0500 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2054.outbound.protection.outlook.com [40.107.243.54]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2247710E4; Tue, 28 Nov 2023 06:51:22 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=faLTWgLz9q/Ed5nrkYTlLWcqF9tgFPBwjkBqpTy0IRP/WV0kSXNGkxCU/fjeAUtfiIkU/WvdLBcomh1PgFL1Nm6I8/YmNmH8zmWaCHE7sVpRGUipud6ULdynEt52VMfcKpbWDFjFLbwh2rgvZnJJiSnitnBp5kbbJ9HixugnxHnlQHM12+JDXMXgbJei4twG8Q1Vbw4m1jA2A22R2cQsNy7hEg5ASx5SoxGejNdxOTGuqagNimRTEgJgOZM90es2tr7ZfPE1ZD93pbrx7RrndA31FudNZpoo0RWDJgzPFJIkrtTL1BAFcG53lCSVx7gkBXMm2KdRS3XqcivR6/b3yw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Hn8zTSBx0EECv8Se1zHPIwuPb6kgjkjZ5pWXIZ28jvU=; b=RQido1zZtvf3Im3FEdINaxqfUGZoT+ZgLI6KhX8INQRGi9BNRWLYCCzzt9Parpi/issDI1VR1mwNzKtckxuXfyhPrykJ4mr35sypt0ATO7zEu1XmSNgt9ogcGO6Ri2iUzv/snAd1pvnDlHn+V4tm5Lnl3qp2fnc2YFnoKCjXuxn5lbI37Odzg/3KPfKTxjyzOVdRo6DvvpcAVPrGbOMEWGpjyDGDzg2wrvNcrxFtFMU7XlQBvZai5yHqpfTzYlowdyxcGv2URVKBX2wb7vVCemdQaXa5fT+MTynIsNRzkwAfSL4Ts9Rplo48fWuKVPkTWAiHklArS6CvIRac8BIHTw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=memverge.com; dmarc=pass action=none header.from=memverge.com; dkim=pass header.d=memverge.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=memverge.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Hn8zTSBx0EECv8Se1zHPIwuPb6kgjkjZ5pWXIZ28jvU=; b=xnWvHOZ2jxtIMGBkC+TDqJRvnmF9ZWBdjpHr6qyDFRX6PNZdwZ6sgx7Yz+aRr1LkkS/RwRHuQC5ZmMxC7nAyFI7kCtkMuBX9yx8VZTRybH696OA2akn6svRqdMB2aE5sRJbG3s5q1isd728rVtMm8HFyd7ZdENgn3fDGdrHCYeI= Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=memverge.com; Received: from SJ0PR17MB5512.namprd17.prod.outlook.com (2603:10b6:a03:394::19) by CH3PR17MB6665.namprd17.prod.outlook.com (2603:10b6:610:130::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7046.22; Tue, 28 Nov 2023 14:51:17 +0000 Received: from SJ0PR17MB5512.namprd17.prod.outlook.com ([fe80::381c:7f11:1028:15f4]) by SJ0PR17MB5512.namprd17.prod.outlook.com ([fe80::381c:7f11:1028:15f4%5]) with mapi id 15.20.7046.015; Tue, 28 Nov 2023 14:51:17 +0000 Date: Tue, 28 Nov 2023 09:51:08 -0500 From: Gregory Price To: Michal Hocko Cc: Gregory Price , linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, arnd@arndb.de, tglx@linutronix.de, luto@kernel.org, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, tj@kernel.org, ying.huang@intel.com Subject: Re: [RFC PATCH 06/11] mm/mempolicy: modify do_mbind to operate on task argument instead of current Message-ID: References: <20231122211200.31620-1-gregory.price@memverge.com> <20231122211200.31620-7-gregory.price@memverge.com> Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: PH0PR07CA0016.namprd07.prod.outlook.com (2603:10b6:510:5::21) To SJ0PR17MB5512.namprd17.prod.outlook.com (2603:10b6:a03:394::19) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ0PR17MB5512:EE_|CH3PR17MB6665:EE_ X-MS-Office365-Filtering-Correlation-Id: 4810fc7d-0249-4544-5943-08dbf02179a3 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 71PBl1X4/90/P4PmUZBXSRSL+Hw7aFQLuJvIPjHGcji2ngKZNd/GtkNkkDqGjkfAJkq+Zdh1rCgj2a3KZb335g/6TJ0kgMIdz3GEOmtoH4dyQqpsHyZFQh8CYZCZOMbGqeUkaYaZF3+zYGmAyHZkn3Pro2S4N5VhBlU0xGTXez7na20pbRdm5gnCmyCa++4JLW/DjLRzw9LsZMMkBB5AVy69JyaHevlYln6M9UQjB2Km6Zx3HUTaRvnyjbGlP/6XI/BqD0kKf5pVxtnCboTyaHbcxu+c/zf97SeM2ApJCo1zluKpkyN389GC02kqeQhjd8QCnM8yb3uyl1uOQKYCQ0qSlJGj7QXMDyZUKF2EWE4Fqfi+37iatQcVCCh52zEmbEk3hmmsiSEtZ2HJhsR0J9mwO3l3IqVL6kcrs69CqQ+wcHuT9NfY/GwKsuXvMScicayslVfZnbOAAvC+ZfKOev9r7FgELbY0k5Hf3CW1VvSg+ideKGpy/sX172NfcrSiCofa97Q8c/iBZVHIcqFiXhs4iwYTBm4u8At3Ajjv0ui3i2PxQNJrJ+yE+s5/Rx56e0fSKbEExJrKtlG7JhiuDiQZBdtF9BGb7DFLwxeOGytjNBa8YP2yXcwFtpjDLjk6 X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ0PR17MB5512.namprd17.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230031)(136003)(366004)(39850400004)(396003)(376002)(346002)(230273577357003)(230173577357003)(230922051799003)(64100799003)(186009)(1800799012)(451199024)(86362001)(2906002)(5660300002)(6512007)(6506007)(8676002)(4326008)(7416002)(41300700001)(44832011)(8936002)(83380400001)(316002)(38100700002)(36756003)(2616005)(478600001)(6486002)(6666004)(66476007)(66946007)(6916009)(26005)(66556008);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?gyyk/Iy4Cvamq0arg7zCpgtUF4FFWXj0WoXRNm84sE+0BPED70VtMxgGXbBa?= =?us-ascii?Q?Lnti6Q+EDHfH2HmgM9RPBXqQ/HbuBJ2oQWtNsIgISucPBswnFgyN09IzKJze?= =?us-ascii?Q?o0j3qESMs9D0qWGCwRBDIMmr63HCUmWAcHpZR3G0d/JjJyfdxamen/NFTiza?= =?us-ascii?Q?8O7oiQEd0VZ3tcL+4ANF+tLEmvSDYiorZAb3FqqPLhvdb6uW1FvJhnDBvkMT?= =?us-ascii?Q?WMm0JwEj3K1qPqOkeYFXzVMHffXwErKwBcXuWzSyxzd3AFSSPmQ5Rem7I1nP?= =?us-ascii?Q?31I1fa22ehf7jvqrJM6V2pnZWWINAlaM/FqLV8G+CIYQD1Gnq3esU68ssJLn?= =?us-ascii?Q?aBd6LtD0nncrbj3cEXNVgz36jHPTcXa+txZN6ZSX2zRKo9DEMwJ6ZCo4yvSL?= =?us-ascii?Q?1bA8BDAF3yBY8m207DTYKOjwdZfjgAYIe6+iOkdPaftaMjdXeBW/7PA+ZxU8?= =?us-ascii?Q?zOPIxfbF9u8b/E2pUPWPbpDoVRtTlOjwJG6yDpa5x3mjYum2YKBs2HqNL04c?= =?us-ascii?Q?9AyBmEXlpdWfVkO4x/udibgTAGCwEH8toncBUpTISh0dt6nbbShMs5mowHY/?= =?us-ascii?Q?PsgQ9C85FCV4dTrkvpWiMeiybMDW9u9cDsuUVLed2PiOkgaWCD0voUtP90MF?= =?us-ascii?Q?4FkQtZR6JVKXAP1y6SnLfOpo/oCgkKKql+ceOOVVKWK8VoN0drpJSjuHpnS/?= =?us-ascii?Q?lFIMHKJEBCCTZmSMtcL4LsNedX3N5LM1iuy6xcRTjodzdgppp/jxsgD6Cedf?= =?us-ascii?Q?KAbszsU96HqMWYczVdqEHtThu1s0yORWEevt+O7O9ltELea2bzZPcf8FGhSv?= =?us-ascii?Q?pKq9kai+WjIGA1HMfhry9He1sOD6piCfiwlOkT1DCuIolcdpE98nQbJllkqf?= =?us-ascii?Q?84gh3yhSCZ/fBZWGvwZAerwbshvlR42/UnEUvWh00L8VIGuipIUcQZZE00Mc?= =?us-ascii?Q?CWYI+KfoKPiZNqb7fldVe5rLbyN6SwpUWWCECzwS3pfXSzK0JlKF/izuJ2SJ?= =?us-ascii?Q?RVGf3RafSuYN8KnuyaQydpSpPzyQJTSIyb5iiuV7CMJqH0At/GG0PrNKGmo6?= =?us-ascii?Q?VkTyGpdPq4dLo5Hgrq9dzqpD0z4+FojfioyLSK6a/kkQzT4DViOt3+dcyspz?= =?us-ascii?Q?gKPRPKPcO0IPB8uHp9oLKFNXlZLENaDq2rz9E3RQiYVdlvBqOPiNfhquLD6L?= =?us-ascii?Q?EGstIcPx0O50Jdk4LOWSxCC8crkCWZ3Na2jYfdsJhjK4XLUQ6mCqCNzRlBgI?= =?us-ascii?Q?1kGgGhmGO4sz1vVhGCdVxZd6kd8fV5iyqd9R5AfO7SjFrW9xVMYgHPKziQWy?= =?us-ascii?Q?ks33geCxYIGnbNxqHc3t/7E91voG2qSZ2rWO3OHGq47jR4So4xlnlBpTp9H/?= =?us-ascii?Q?dwIS3JkN+BQZx8N7aF0pS2Rmud3xrrmkIgU5de07gD+xKfVrpPlZQBEyhg5N?= =?us-ascii?Q?0gQ0lCCBsGpiJ0U7HgtJWUhZ+JwJR/OadRfuaL6BqJJbUqgvBlJsmt/bnt3m?= =?us-ascii?Q?inDjEzA8ng0PeMfNVfn/If/nwhD7FVno0tJQ7/SsJK5+IEUEZ+4EUlj2dau/?= =?us-ascii?Q?cQxyqVKRtOiZ+Xm3Q/2VcuYEBrhZfCGbZH690IWZ93plADeFF19sDFonN9iU?= =?us-ascii?Q?dA=3D=3D?= X-OriginatorOrg: memverge.com X-MS-Exchange-CrossTenant-Network-Message-Id: 4810fc7d-0249-4544-5943-08dbf02179a3 X-MS-Exchange-CrossTenant-AuthSource: SJ0PR17MB5512.namprd17.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 28 Nov 2023 14:51:16.9337 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 5c90cb59-37e7-4c81-9c07-00473d5fb682 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 2vTYPlIXuVqEoU3frhq0qus2YCCPhKuV+iv6xXe93YXRPvlKMdWY5b3Ik2YiDwznCUE5dfFOaQAJIsAHKrtl+wyrB2VTrwXuKbhSlTigD/w= X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH3PR17MB6665 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Tue, 28 Nov 2023 06:53:23 -0800 (PST) On Tue, Nov 28, 2023 at 03:11:06PM +0100, Michal Hocko wrote: > On Wed 22-11-23 16:11:55, Gregory Price wrote: > [...] > > + * Like get_vma_policy and get_task_policy, must hold alloc/task_lock > > + * while calling this. > > + */ > > +static struct mempolicy *get_task_vma_policy(struct task_struct *task, > > + struct vm_area_struct *vma, > > + unsigned long addr, int order, > > + pgoff_t *ilx) > [...] > > You should add lockdep annotation for alloc_lock/task_lock here for clarity and > also... > > @@ -1844,16 +1899,7 @@ struct mempolicy *__get_vma_policy(struct vm_area_struct *vma, > > struct mempolicy *get_vma_policy(struct vm_area_struct *vma, > > unsigned long addr, int order, pgoff_t *ilx) > > { > > - struct mempolicy *pol; > > - > > - pol = __get_vma_policy(vma, addr, ilx); > > - if (!pol) > > - pol = get_task_policy(current); > > - if (pol->mode == MPOL_INTERLEAVE) { > > - *ilx += vma->vm_pgoff >> order; > > - *ilx += (addr - vma->vm_start) >> (PAGE_SHIFT + order); > > - } > > - return pol; > > + return get_task_vma_policy(current, vma, addr, order, ilx); > > I do not think that all get_vma_policy take task_lock (just random check > dequeue_hugetlb_folio_vma->huge_node->get_vma_policy AFAICS) > hm, i might have gotten turned around on this one. Forgot to check for external references to get_vma_policy. I thought I considered it, but i clearly did not leave myself any notes if I did. This pattern is troublesome, we're holding the task lock during the callback stack in __get_vma_policy - just incase that returns NULL so we can return the task policy instead. If that vma is shared, it will take the vma shared policy lock (sp->lock) I almost want to change this interface to return NULL if the VMA doesn't have one, and change callers to fetch the task policy explicitly instead of implicitly returning the task policy. At least then we'd only take the task lock on an explicit access to the *Task* policy. > Also I do not see policy_nodemask to be handled anywhere. That one is > used along with get_vma_policy (sometimes hidden like in > alloc_pages_mpol). It has a dependency on > cpuset_nodemask_valid_mems_allowed. That means that e.g. mbind on a > remote task would be constrained by current task cpuset when allocating > migration targets for the target task. I am wondering how many other > dependencies like that are lurking there. bah! thought i dug all these out, but i missed alloc_migration_target_by_mpol from do_mbind. I'll need to take another look at the calls to cpusets interfaces to make sure i dig this out. The number of hidden accesses to current is really nasty :[ > -- > Michal Hocko > SUSE Labs