Date: Mon, 26 Oct 2020 10:54:28 -0400
From: Steven Rostedt
To: Vlastimil Babka
Cc: Axel Rasmussen, Ingo Molnar, Andrew Morton, Michel Lespinasse, Daniel Jordan, Jann Horn, Chinwen Chang, Davidlohr Bueso, David Rientjes, Yafang Shao, LKML, Linux MM
Subject: Re: [PATCH v4 1/1] mmap_lock: add tracepoints around lock acquisition
Message-ID: <20201026105428.3205d2b0@gandalf.local.home>
References: <20201020184746.300555-1-axelrasmussen@google.com> <20201020184746.300555-2-axelrasmussen@google.com>

On Fri, 23 Oct 2020 19:56:49 +0200
Vlastimil Babka wrote:

> > I'm somewhat sure this code can be called in interrupt context, so I
> > don't think we can
> > use locks to prevent this situation. I think it
> > works like this: say we acquire the lock, an interrupt happens, and
> > then we try to acquire again on the same CPU; we can't sleep, so we're
> > stuck.
>
> Yes, we could perhaps trylock() and if it fails, give up on the memcg path.
>
> > I think we can't kmalloc here (instead of a percpu buffer) either,
> > since I would guess that kmalloc may also acquire mmap_lock itself?
>
> the overhead is not worth it anyway, for a tracepoint
>
> > Is adding local_irq_save()/local_irq_restore() in addition to
> > get_cpu()/put_cpu() sufficient?
>
> If you do that, then I guess you don't need get_cpu()/put_cpu() anymore. But
> it's more costly.
>
> But sounds like we are solving something that the tracing subystem has to solve
> as well to store the trace event data, so maybe Steven has some better idea?

How big of a buffer do you need? The way ftrace reserves buffer space for
events (which, coincidentally, I just talked about today at Open Source
Summit Europe!) is that I simply use local_add_return() on a buffer index.
The stack trace code does this as well.

For a temporary buffer, you would allocate it when the tracepoint is
enabled, which is why we have a form of the TRACE_EVENT macro with "reg"
and "unreg" callbacks, called TRACE_EVENT_FN (for "function"). Make this
temporary buffer per CPU, and size it to handle all the contexts that it
could be called in. For ftrace, we usually make it 4 (normal, softirq, irq
and NMI context).

Ftrace uses local_add_return(), but since I'm guessing the interrupt
context will be done with its buffer once it has written the event, you
don't need to worry about the counter being atomic.
You simply need to do:

	DEFINE_PER_CPU(char *, my_buffer);
	DEFINE_PER_CPU(int, my_buf_idx);

At initialization:

	for_each_possible_cpu(cpu) {
		/* one MY_BUFF_SIZE slot for each context that can nest */
		per_cpu(my_buffer, cpu) = kmalloc(MY_BUFF_SIZE * context_needed, GFP_KERNEL);
		if (!per_cpu(my_buffer, cpu))
			goto out_fail;
		per_cpu(my_buf_idx, cpu) = 0;
	}

Then for the event:

	preempt_disable();
	/* reserve the next slot; idx is the offset just past it */
	idx = this_cpu_add_return(my_buf_idx, MY_BUFF_SIZE);
	current_buffer = this_cpu_read(my_buffer);
	buf = &current_buffer[idx - MY_BUFF_SIZE];
	copy_my_data_to_buffer(buf);
	trace_my_trace_point(buf);
	/* release the slot */
	this_cpu_sub(my_buf_idx, MY_BUFF_SIZE);
	preempt_enable();

Now if an interrupt were to come in, it would do the same thing, but it
would use the part of the buffer after the first MY_BUFF_SIZE bytes, so you
don't need to worry about one corrupting another. Once the index has been
incremented, an interrupt will use the portion of the buffer after the
"allocated" part. And even if it happened right at the
this_cpu_add_return(), the interrupt would put the index back before
returning to the context that it interrupted.

Is this what you need to deal with?

-- Steve
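
The nesting behaviour described above can be tried out in user space. The following is only a rough model, not kernel code: a plain int stands in for the per-CPU index, recursion stands in for an interrupt arriving while a slot is held, and SLOT_SIZE, NR_CONTEXTS, slot_reserve(), slot_release() and emit_event() are hypothetical names made up for this sketch. The overflow check in slot_reserve() is also an addition for the sketch and is not part of the scheme as written above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SLOT_SIZE   64	/* hypothetical stand-in for MY_BUFF_SIZE */
#define NR_CONTEXTS 4	/* normal, softirq, irq and NMI */

static char buffer[SLOT_SIZE * NR_CONTEXTS];	/* models one CPU's buffer */
static int buf_idx;				/* models that CPU's index */

/* Reserve the next slot, or return NULL if nesting is too deep. */
static char *slot_reserve(void)
{
	int idx = (buf_idx += SLOT_SIZE);	/* models this_cpu_add_return() */

	if (idx > SLOT_SIZE * NR_CONTEXTS) {
		buf_idx -= SLOT_SIZE;		/* roll back the failed reservation */
		return NULL;
	}
	return &buffer[idx - SLOT_SIZE];
}

/* Give the most recently reserved slot back. */
static void slot_release(void)
{
	buf_idx -= SLOT_SIZE;			/* models this_cpu_sub() */
}

/*
 * Write an event into a slot, let a nested "interrupt" do the same while
 * the slot is still held, then check that our data survived.
 * Returns 0 on success, -1 if a slot was unavailable or got corrupted.
 */
static int emit_event(int depth, int max_depth)
{
	char expect[SLOT_SIZE];
	char *buf = slot_reserve();
	int ret = 0;

	if (!buf)
		return -1;

	snprintf(expect, sizeof(expect), "event at depth %d", depth);
	strcpy(buf, expect);

	if (depth < max_depth)
		ret = emit_event(depth + 1, max_depth);	/* simulated interrupt */

	if (strcmp(buf, expect) != 0)
		ret = -1;				/* our slot was clobbered */

	slot_release();
	return ret;
}
```

Calling emit_event(0, 3) nests four levels deep, one per context, and each level sees its own slot intact. In both the success and the failure case the index is back at 0 once the outermost call returns, which mirrors the point that an interrupt puts the index back before returning to the context it interrupted.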