Received: by 2002:a05:6358:f14:b0:e5:3b68:ec04 with SMTP id b20csp1009111rwj; Thu, 22 Dec 2022 19:08:34 -0800 (PST) X-Google-Smtp-Source: AMrXdXuAk4Av708smXnPc+zvZzFFRSdgzJ1SXBD+64T6QjG5WQa5IffI7SAdyye2N2WIa8osCsEI X-Received: by 2002:a17:906:c2cb:b0:842:32e9:f1e9 with SMTP id ch11-20020a170906c2cb00b0084232e9f1e9mr3900646ejb.69.1671764914573; Thu, 22 Dec 2022 19:08:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671764914; cv=none; d=google.com; s=arc-20160816; b=dEX7+lBAKxKJ3gi8Hrm1nKWK41mV61wlHZldZUbyj6IOyrIla6jWffAR97o+TsWyRn hVyISyVzosShKkZXJrpSQpd8bcChGOh7jl8DQMDmQi70AI+FRKTFw+eEAKZf29mmRIOL dDBwOXb4LKbwaQKlawxX+JP0Q0KkJo4B8S2c+rSvbY1G7YA4NU65vAC2HXINu5nON6td m5DtZ5uiJwiMm0enVEeXWIx1AkCFcVMUtnO/n0fdz5PhU0JnfpbpOi081t+uoOH5Sm+p esY9xmLlJg5vxgvMKXN00HIODi1+5OW59MLp+ZO/QKqvCqtNknpC5P8TRt5N9xorHXqz LwVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=XfTwuOp2z1P9TAcyXMs8UlXEYy4igwWCoCrfoFaxg+o=; b=Enx2Ry8FqkY/baJPTvhU4O97hNLuTV8qWTfMJS6C2RTertUIvs511FPlQCpeYak1wi G4aTIc/FM428u8YeBp8vFf/qcG9RU0ZGRNTOP27GMyTIup/pqAXiCIj87tNtlGKrfUu5 FPOw9mQM3JV8YWBOacGCP3ekqkciWBY/h+OCqHR2rEZ6Gq+Yk9MuDrTOOe7QAo56Wmye /XqTIb/ZcbpoG/O/XMoEX/TAhZiX1TCKLXPqML1V7PT8+/aIgciXu2ldOyoN4iRLbAhz fwgMNToqb9ISYm6ToewRdrsHrUrNK/LFYxK5UGSeJGzNOFdubPoNQ7KIl6nCI7YFaRgK xxmQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=CwPP7uSN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id o22-20020a170906975600b007f1501a8c24si1905081ejy.216.2022.12.22.19.08.18; Thu, 22 Dec 2022 19:08:34 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=CwPP7uSN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229930AbiLWC3S (ORCPT + 67 others); Thu, 22 Dec 2022 21:29:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39104 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229582AbiLWC3Q (ORCPT ); Thu, 22 Dec 2022 21:29:16 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B953162C5; Thu, 22 Dec 2022 18:29:15 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 6D9B1B81F51; Fri, 23 Dec 2022 02:29:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A9B26C433EF; Fri, 23 Dec 2022 02:29:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1671762553; bh=YUA7NW/YjEael4wnH+SUazl990K3DLH51CosWijVaXU=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=CwPP7uSNfIBDnrSmOGAzJTtbTIGpVJAlg8hdLDCfH6V/RqpZC/QYhmjbW3CRvCG/k 0VHOYfkynsELXiOhtb2B9rCYtMLSRu1IBByhgF+QckJ6f7RE7rrBSbB8xeyVqmJ9/i +3QhpEumGAzmk0ZL7Ys3p1vZKjNWChq19EwGDMAVkenlUZs62037gi6j3qML8SsfMb LvQ2NCTco8AtCeOPGRCjb4wzYiupLgNqN2moftVx6gmRkFUXhBrVNUWCSl5HUSeP/9 czNT38cZUT5wsUZZ5/f+o0uAZt5+Cji081BAyr7a+5DIDTHpT6MkFAkrAe6LFCjVzn EhQC0AQTlH7Og== Date: Fri, 23 Dec 2022 11:29:08 +0900 From: Masami Hiramatsu (Google) To: Masami Hiramatsu (Google) Cc: wuqiang , davem@davemloft.net, anil.s.keshavamurthy@intel.com, naveen.n.rao@linux.ibm.com, rostedt@goodmis.org, peterz@infradead.org, akpm@linux-foundation.org, sander@svanheule.net, ebiggers@google.com, dan.j.williams@intel.com, jpoimboe@kernel.org, linux-kernel@vger.kernel.org, lkp@intel.com, mattwu@163.com, Linux Trace Kernel Subject: Re: [PATCH v7 1/5] lib: objpool added: ring-array based lockless MPMC queue Message-Id: <20221223112908.de0ce1febba4f314f379a8de@kernel.org> In-Reply-To: <20221223004701.80fd132089e6a8b14531cf30@kernel.org> References: <20221212123153.190888-1-wuqiang.matt@bytedance.com> <20221212123153.190888-2-wuqiang.matt@bytedance.com> <20221223004701.80fd132089e6a8b14531cf30@kernel.org> X-Mailer: Sylpheed 3.8.0beta1 (GTK+ 2.24.33; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-8.2 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A, RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On Fri, 23 Dec 2022 00:47:01 +0900 Masami Hiramatsu (Google) wrote: > > +/* try to retrieve object from slot */ > > +static inline void *objpool_try_get_slot(struct objpool_slot *os) > > +{ > > + uint32_t *ages = SLOT_AGES(os); > > + void **ents = SLOT_ENTS(os); > > + /* do memory load of head to local head */ > > + uint32_t head = smp_load_acquire(&os->head); > > + > > + /* loop if slot isn't empty */ > > + while (head != READ_ONCE(os->tail)) { > > + uint32_t id = head & os->mask, prev = head; > > + > > + /* do prefetching of object ents */ > > + prefetch(&ents[id]); > > + > > + /* > > + * check whether this item was ready for retrieval ? There's > > + * possibility * in theory * we might retrieve wrong object, > > + * in case ages[id] overflows when current task is sleeping, > > + * but it will take very very long to overflow an uint32_t > > + */ > > We should make sure it is safe in theory. The hot path can be loose but > it must be ensured before use. OS can be used very long time in some time > (e.g. 10 years) and uint32 is too short ... (uint64 is OK) OK, I understand what you concerned here. This means that the ages[id] overflows *and* back to the same value during here. But must objpool_pop() be called under preempt disabled? If not, it should be (and that must not be a big overhead). > > + if (smp_load_acquire(&ages[id]) == head) { > > + /* node must have been udpated by push() */ Please correct me. In my understanding, since the size of ents[] is always bigger than nr_objs, (tail & mask) == (head & mask) only if the ents[] is full and no free object (thus no push() happens) or ents[] is empty (in this case ages[id] != head). Thus the node is not updated if below cmpxchg is succeeded. > > + void *node = READ_ONCE(ents[id]); > > + /* commit and move forward head of the slot */ > > + if (try_cmpxchg_release(&os->head, &head, head + 1)) > > + return node; > > + } > > + > > + /* re-load head from memory and continue trying */ > > + head = READ_ONCE(os->head); > > + /* > > + * head stays unchanged, so it's very likely current pop() > > + * just preempted/interrupted an ongoing push() operation Since this can touch the other CPUs' slot, there should be another ongoing push() running on the same slot (so no need to limit the preempt/interrupt cases.) Also, this happens only when the push() is running on *the empty slot*. Thus we can consider this is empty, and return NULL. Can you update the comment above and make it clear that this exits here because it is empty until ongoing push() is done. Overall, some comments must be clear, but the code itself seems OK to me. Thank you, > > + */ > > + if (head == prev) > > + break; > > + } > > + > > + return NULL; > > +} > > + > > +/** > > + * objpool_pop: allocate an object from objects pool > > Ditto. > > Thank you, > > > + * > > + * args: > > + * @head: object pool > > + * > > + * return: > > + * object: NULL if failed (object pool is empty) > > + * > > + * objpool_pop can be nested, so can be used in any context. > > + */ > > +void *objpool_pop(struct objpool_head *head) > > +{ > > + int i, cpu = raw_smp_processor_id(); > > + void *obj = NULL; > > + > > + for (i = 0; i < num_possible_cpus(); i++) { > > + obj = objpool_try_get_slot(head->cpu_slots[cpu]); > > + if (obj) > > + break; > > + cpu = cpumask_next_wrap(cpu, cpu_possible_mask, -1, 1); > > + } > > + > > + return obj; > > +} > > +EXPORT_SYMBOL_GPL(objpool_pop); > > + > > +/** > > + * objpool_fini: cleanup the whole object pool (releasing all objects) > > + * > > + * args: > > + * @head: object pool to be released > > + * > > + */ > > +void objpool_fini(struct objpool_head *head) > > +{ > > + if (!head->cpu_slots) > > + return; > > + > > + /* release percpu slots */ > > + objpool_fini_percpu_slots(head); > > + > > + /* call user's cleanup callback if provided */ > > + if (head->release) > > + head->release(head, head->context); > > +} > > +EXPORT_SYMBOL_GPL(objpool_fini); > > -- > > 2.34.1 > > > > > -- > Masami Hiramatsu (Google) -- Masami Hiramatsu (Google)