Date: Tue, 29 Sep 2020 14:07:56 +0200
From: Michal Hocko
To: paulmck@kernel.org
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com,
	mingo@kernel.org, jiangshanlai@gmail.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org,
	dhowells@redhat.com, edumazet@google.com, fweisbec@gmail.com,
	oleg@redhat.com, joel@joelfernandes.org, mgorman@techsingularity.net,
	torvalds@linux-foundation.org, "Uladzislau Rezki (Sony)"
Subject: Re: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible
Message-ID: <20200929120756.GC2277@dhcp22.suse.cz>
References: <20200928233041.GA23230@paulmck-ThinkPad-P72> <20200928233102.24265-14-paulmck@kernel.org>
In-Reply-To: <20200928233102.24265-14-paulmck@kernel.org>

On Mon 28-09-20 16:31:01, paulmck@kernel.org wrote:
[...]
> This commit therefore uses preemptible() to determine whether allocation
> is possible at all for double-argument kvfree_rcu().

This deserves a comment, because GFP_ATOMIC is possible for many
!preemptible() contexts. It is the raw_spin_lock, NMIs and likely a few
others that are a problem. You are taking a conservative approach, which
is fine, but it would be good to articulate that explicitly.

> If !preemptible(),
> then allocation is not possible, and kvfree_rcu() falls back to using
> the less cache-friendly rcu_head approach. Even when preemptible(),
> the caller might be involved in reclaim, so the GFP_ flags used by
> double-argument kvfree_rcu() must avoid invoking reclaim processing.

Could you be more specific? Is this about being called directly in the
reclaim context and wanting to prevent a recursion? If that is the case,
do you really need to special case this in any way? Any memory reclaim
will set PF_MEMALLOC, so allocations called from that context will not
perform reclaim. So if you are called from the reclaim path directly,
then you might want to do GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN.
That should handle both the from-the-reclaim and the outside-of-reclaim
contexts just fine (assuming you don't allocate from a !preemptible()
context).
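Something along these lines is what I have in mind. Completely untested
sketch; the alloc_bnode() helper and its can_sleep parameter are made up
here just to illustrate the flag choice, they are not taken from your
patch:

	/*
	 * Untested sketch. PF_MEMALLOC set by the reclaim path already
	 * prevents recursing into reclaim, so a single mask covers both
	 * the called-from-reclaim and the ordinary case:
	 *  - __GFP_NOMEMALLOC avoids dipping into the memory reserves
	 *    that PF_MEMALLOC would otherwise grant,
	 *  - __GFP_NOWARN silences allocation failure splats.
	 */
	static struct kvfree_rcu_bulk_data *alloc_bnode(bool can_sleep)
	{
		gfp_t gfp = can_sleep
			? GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN
			: GFP_ATOMIC | __GFP_NOWARN;

		return (struct kvfree_rcu_bulk_data *)__get_free_page(gfp);
	}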
> Note that single-argument kvfree_rcu() must be invoked in sleepable
> contexts, and that its fallback is the relatively high latency
> synchronize_rcu(). Single-argument kvfree_rcu() therefore uses
> GFP_KERNEL|__GFP_RETRY_MAYFAIL to allow limited sleeping within the
> memory allocator.

[...]
>  static inline bool
> -kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
> +add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> +	unsigned long *flags, void *ptr, bool can_sleep)
>  {
> 	struct kvfree_rcu_bulk_data *bnode;
> +	bool can_alloc_page = preemptible();
> +	gfp_t gfp = (can_sleep ? GFP_KERNEL | __GFP_RETRY_MAYFAIL : GFP_ATOMIC) | __GFP_NOWARN;

This is quite confusing IMHO, at least without a further explanation.
can_sleep is not as much about sleeping as it is about the reclaim
recursion, AFAIU your changelog, right?

> 	int idx;
>
> -	if (unlikely(!krcp->initialized))
> +	*krcp = krc_this_cpu_lock(flags);
> +	if (unlikely(!(*krcp)->initialized))
> 		return false;
>
> -	lockdep_assert_held(&krcp->lock);
> 	idx = !!is_vmalloc_addr(ptr);
>
> 	/* Check if a new block is required. */
> -	if (!krcp->bkvhead[idx] ||
> -		krcp->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
> -		bnode = get_cached_bnode(krcp);
> -		if (!bnode) {
> -			/*
> -			 * To keep this path working on raw non-preemptible
> -			 * sections, prevent the optional entry into the
> -			 * allocator as it uses sleeping locks. In fact, even
> -			 * if the caller of kfree_rcu() is preemptible, this
> -			 * path still is not, as krcp->lock is a raw spinlock.
> -			 * With additional page pre-allocation in the works,
> -			 * hitting this return is going to be much less likely.
> -			 */
> -			if (IS_ENABLED(CONFIG_PREEMPT_RT))
> -				return false;
> -
> -			/*
> -			 * NOTE: For one argument of kvfree_rcu() we can
> -			 * drop the lock and get the page in sleepable
> -			 * context. That would allow to maintain an array
> -			 * for the CONFIG_PREEMPT_RT as well if no cached
> -			 * pages are available.
> -			 */
> -			bnode = (struct kvfree_rcu_bulk_data *)
> -				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> +	if (!(*krcp)->bkvhead[idx] ||
> +		(*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
> +		bnode = get_cached_bnode(*krcp);
> +		if (!bnode && can_alloc_page) {
> +			krc_this_cpu_unlock(*krcp, *flags);
> +			bnode = kmalloc(PAGE_SIZE, gfp);

What is the point of calling kmalloc() for a PAGE_SIZE object? Wouldn't
using the page allocator directly be better?
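I.e. something along these lines. Again completely untested, just to
illustrate what I mean by going to the page allocator directly:

			/*
			 * Untested illustration: for a full page there is
			 * nothing to gain from the slab allocator, so ask
			 * the page allocator for an order-0 page directly.
			 */
			bnode = (struct kvfree_rcu_bulk_data *)
					__get_free_page(gfp);

--
Michal Hocko
SUSE Labs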