Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp6039263rwb; Sun, 11 Dec 2022 17:52:26 -0800 (PST) X-Google-Smtp-Source: AA0mqf4R7q0mSTz0iKU0YkicfRPtiosT/ALMS86rkXUT//+040r0q5cMIK2O6Y0oPH0stUdYUwi1 X-Received: by 2002:a05:6402:5486:b0:461:aeeb:9ce1 with SMTP id fg6-20020a056402548600b00461aeeb9ce1mr12316664edb.32.1670809946676; Sun, 11 Dec 2022 17:52:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670809946; cv=none; d=google.com; s=arc-20160816; b=0gZRxMldWW27G9Cdrvkx0EJ0rdcVQhnnf+jNWfAxPVPyc5VlZvy2LRTjOcqjkr0+ZO slrGFeaUh0mD+72SwaJhHZYPb1rurrSt9GmtnzAo2f8I3nFWFaoQcCtkRFFV5rl2H5kX oBUhMiNXqLXxAVuunVJZUPxcMHq7/TJp+57OvXF9/HRM4f0pphuazmKel2Jw/2CSkINd B26GfhsJu0C3VOBSKricqP3GWvyVioeUq/obtvJLI7weyz21JTmaNzB8Wvtszj8pEIN0 XCJSUTQhl12kcRGwsKQkgMpPdW9GiELxl8cK5i/GYZ1YHAnSBk1/gf43qBAl1HqZposN dErQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :content-language:references:cc:to:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=C6lnvE9j+zbJeeVsnn0eGEObFI9vcSZBxmPcZEEAOdw=; b=fb2hPsSZ/yHk04WIlzhNo7MqEbXBA9RyCIv6lIuVoxyC6pBM/zI+JlJGDV46gteMoy WGeJ7UzSftpFWoh/e3e4ga+IO57VHJZdYTn/Kshr8MVXi2CU8rGR4w/IK6nUVzOoLnx7 MUKlAdoyLg4havQz/BWtqSieGEKN9MzcNFjGPxV/EWaPQ17nJ7sR5+3tOvbCW8CDZQ29 FJlc1EFIfIItCXzrPKeeOM5hwvgqB3Ms6rFGtep4K2Jtm7fooQtHAsf5H228go9pkpnE IEG0XXSxxKwyXpk4bSNBQa221O+Z508gaKqJZ7bge+sTzZfATOLqxDjPOpDO0RYNuPQw aeFA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=HJ7c5b9Y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q9-20020a056402518900b0046b9dc9bd27si6922166edd.157.2022.12.11.17.52.07; Sun, 11 Dec 2022 17:52:26 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=HJ7c5b9Y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231169AbiLLBp3 (ORCPT + 78 others); Sun, 11 Dec 2022 20:45:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229475AbiLLBp0 (ORCPT ); Sun, 11 Dec 2022 20:45:26 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55AFCD2C3 for ; Sun, 11 Dec 2022 17:44:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670809467; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=C6lnvE9j+zbJeeVsnn0eGEObFI9vcSZBxmPcZEEAOdw=; b=HJ7c5b9Y460/85r3uY64oMbjCg1dTitNOMIJQv80QcWiBXSzI1v4j6Uohy/tOl0nw9GjuH SesSZ8T7nr6YaYHvsZylxvKPamQjCdivrz/buFTFcozmdaAduhvu4lEp6xBWO4sq9m8DAx YUMW4K5PfkcszkVddW3k3nZo/wm+BuM= Received: from mail-pl1-f199.google.com (mail-pl1-f199.google.com [209.85.214.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-621-AQojn-_9MbSzdc-o0KDzGg-1; Sun, 11 Dec 2022 20:44:22 -0500 X-MC-Unique: AQojn-_9MbSzdc-o0KDzGg-1 Received: by mail-pl1-f199.google.com with SMTP id jc4-20020a17090325c400b00189ceee4049so9223597plb.3 for ; Sun, 11 Dec 2022 17:44:22 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=C6lnvE9j+zbJeeVsnn0eGEObFI9vcSZBxmPcZEEAOdw=; b=WZuBtcYGtS3a1mp+c/kV9GuPAphYmGugUBQOL9hziy0aWehFKg40jfZ+PldmEZeAiY OjLVUzcsr0V3IgndPjJolbouleDoCo7Ipr+LPndWHA7R7YOdEiACUsZB7VKIvnDYmRSa hrtOarecgV9Sm2a+xdmQOLo1UlTaCXePQnXmsfotEbQggs29X0F934oB5I9UzbaLyz4t DYEAKg9lvVA2L+hF08I5eGAq/uTKjTHP0Sne75B3YXr6VCxAHfXLsM13jgmzVvy/hqPl gnq8DdiOxDXx1B7+sZt2YGy4jj/cNTUeqiYPdqto0ITPH2WYhn1ilFtHYaU50oW8kmVO XgRg== X-Gm-Message-State: ANoB5pnBjCwGmNiWsv3zeiBpOJv33DEgVPNzF8XFxKiQdpl/guttMyUO MqHwbTLq+d06JV4NBbf9W8LJmoKEX7+b/+Uom4xEWexjAEF97Zt4w/ZnWIWT20mRKsQewCljFJy wmMS1ro7/ihLiax4jSpEWBc7e X-Received: by 2002:a05:6a00:a07:b0:573:3de7:89a with SMTP id p7-20020a056a000a0700b005733de7089amr20821147pfh.4.1670809461637; Sun, 11 Dec 2022 17:44:21 -0800 (PST) X-Received: by 2002:a05:6a00:a07:b0:573:3de7:89a with SMTP id p7-20020a056a000a0700b005733de7089amr20821133pfh.4.1670809461318; Sun, 11 Dec 2022 17:44:21 -0800 (PST) Received: from ?IPV6:2403:580e:4b40:0:7968:2232:4db8:a45e? (2403-580e-4b40--7968-2232-4db8-a45e.ip6.aussiebb.net. [2403:580e:4b40:0:7968:2232:4db8:a45e]) by smtp.gmail.com with ESMTPSA id g28-20020aa79ddc000000b00573769811d6sm4534577pfq.44.2022.12.11.17.44.18 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 11 Dec 2022 17:44:20 -0800 (PST) Message-ID: <669b0d70-6c56-272c-f6bc-586a2b720baa@redhat.com> Date: Mon, 12 Dec 2022 09:44:15 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.5.1 Subject: Re: [PATCH v3 1/5] pid: replace pidmap_lock with xarray lock To: Brian Foster , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: onestero@redhat.com, willy@infradead.org, ebiederm@redhat.com References: <20221202171620.509140-1-bfoster@redhat.com> <20221202171620.509140-2-bfoster@redhat.com> Content-Language: en-US From: Ian Kent In-Reply-To: <20221202171620.509140-2-bfoster@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 3/12/22 01:16, Brian Foster wrote: > As a first step to changing the struct pid tracking code from the > idr over to the xarray, replace the custom pidmap_lock spinlock with > the internal lock associated with the underlying xarray. This is > effectively equivalent to using idr_lock() and friends, but since > the goal is to disentangle from the idr, move directly to the > underlying xarray api. > > Signed-off-by: Matthew Wilcox > Signed-off-by: Brian Foster The xa array switch looks simpler than I expected, looks good. Reviewed-by: Ian Kent > --- > kernel/pid.c | 79 ++++++++++++++++++++++++++-------------------------- > 1 file changed, 40 insertions(+), 39 deletions(-) > > diff --git a/kernel/pid.c b/kernel/pid.c > index 3fbc5e46b721..3622f8b13143 100644 > --- a/kernel/pid.c > +++ b/kernel/pid.c > @@ -86,22 +86,6 @@ struct pid_namespace init_pid_ns = { > }; > EXPORT_SYMBOL_GPL(init_pid_ns); > > -/* > - * Note: disable interrupts while the pidmap_lock is held as an > - * interrupt might come in and do read_lock(&tasklist_lock). > - * > - * If we don't disable interrupts there is a nasty deadlock between > - * detach_pid()->free_pid() and another cpu that does > - * spin_lock(&pidmap_lock) followed by an interrupt routine that does > - * read_lock(&tasklist_lock); > - * > - * After we clean up the tasklist_lock and know there are no > - * irq handlers that take it we can leave the interrupts enabled. > - * For now it is easier to be safe than to prove it can't happen. > - */ > - > -static __cacheline_aligned_in_smp DEFINE_SPINLOCK(pidmap_lock); > - > void put_pid(struct pid *pid) > { > struct pid_namespace *ns; > @@ -129,10 +113,11 @@ void free_pid(struct pid *pid) > int i; > unsigned long flags; > > - spin_lock_irqsave(&pidmap_lock, flags); > for (i = 0; i <= pid->level; i++) { > struct upid *upid = pid->numbers + i; > struct pid_namespace *ns = upid->ns; > + > + xa_lock_irqsave(&ns->idr.idr_rt, flags); > switch (--ns->pid_allocated) { > case 2: > case 1: > @@ -150,8 +135,8 @@ void free_pid(struct pid *pid) > } > > idr_remove(&ns->idr, upid->nr); > + xa_unlock_irqrestore(&ns->idr.idr_rt, flags); > } > - spin_unlock_irqrestore(&pidmap_lock, flags); > > call_rcu(&pid->rcu, delayed_put_pid); > } > @@ -206,7 +191,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid, > } > > idr_preload(GFP_KERNEL); > - spin_lock_irq(&pidmap_lock); > + xa_lock_irq(&tmp->idr.idr_rt); > > if (tid) { > nr = idr_alloc(&tmp->idr, NULL, tid, > @@ -233,7 +218,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid, > nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min, > pid_max, GFP_ATOMIC); > } > - spin_unlock_irq(&pidmap_lock); > + xa_unlock_irq(&tmp->idr.idr_rt); > idr_preload_end(); > > if (nr < 0) { > @@ -266,34 +251,38 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid, > INIT_HLIST_HEAD(&pid->inodes); > > upid = pid->numbers + ns->level; > - spin_lock_irq(&pidmap_lock); > - if (!(ns->pid_allocated & PIDNS_ADDING)) > - goto out_unlock; > for ( ; upid >= pid->numbers; --upid) { > + tmp = upid->ns; > + > + xa_lock_irq(&tmp->idr.idr_rt); > + if (tmp == ns && !(tmp->pid_allocated & PIDNS_ADDING)) { > + xa_unlock_irq(&tmp->idr.idr_rt); > + put_pid_ns(ns); > + goto out_free; > + } > + > /* Make the PID visible to find_pid_ns. */ > - idr_replace(&upid->ns->idr, pid, upid->nr); > - upid->ns->pid_allocated++; > + idr_replace(&tmp->idr, pid, upid->nr); > + tmp->pid_allocated++; > + xa_unlock_irq(&tmp->idr.idr_rt); > } > - spin_unlock_irq(&pidmap_lock); > > return pid; > > -out_unlock: > - spin_unlock_irq(&pidmap_lock); > - put_pid_ns(ns); > - > out_free: > - spin_lock_irq(&pidmap_lock); > while (++i <= ns->level) { > upid = pid->numbers + i; > - idr_remove(&upid->ns->idr, upid->nr); > - } > + tmp = upid->ns; > > - /* On failure to allocate the first pid, reset the state */ > - if (ns->pid_allocated == PIDNS_ADDING) > - idr_set_cursor(&ns->idr, 0); > + xa_lock_irq(&tmp->idr.idr_rt); > > - spin_unlock_irq(&pidmap_lock); > + /* On failure to allocate the first pid, reset the state */ > + if (tmp == ns && tmp->pid_allocated == PIDNS_ADDING) > + idr_set_cursor(&ns->idr, 0); > + > + idr_remove(&tmp->idr, upid->nr); > + xa_unlock_irq(&tmp->idr.idr_rt); > + } > > kmem_cache_free(ns->pid_cachep, pid); > return ERR_PTR(retval); > @@ -301,9 +290,9 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid, > > void disable_pid_allocation(struct pid_namespace *ns) > { > - spin_lock_irq(&pidmap_lock); > + xa_lock_irq(&ns->idr.idr_rt); > ns->pid_allocated &= ~PIDNS_ADDING; > - spin_unlock_irq(&pidmap_lock); > + xa_unlock_irq(&ns->idr.idr_rt); > } > > struct pid *find_pid_ns(int nr, struct pid_namespace *ns) > @@ -647,6 +636,18 @@ SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags) > return fd; > } > > +/* > + * Note: disable interrupts while the xarray lock is held as an interrupt might > + * come in and do read_lock(&tasklist_lock). > + * > + * If we don't disable interrupts there is a nasty deadlock between > + * detach_pid()->free_pid() and another cpu that does xa_lock() followed by an > + * interrupt routine that does read_lock(&tasklist_lock); > + * > + * After we clean up the tasklist_lock and know there are no irq handlers that > + * take it we can leave the interrupts enabled. For now it is easier to be safe > + * than to prove it can't happen. > + */ > void __init pid_idr_init(void) > { > /* Verify no one has done anything silly: */