From: Kuniyuki Iwashima
Subject: Re: [PATCH v1 net 11/16] net: Fix a data-race around sysctl_mem.
Date: Wed, 6 Jul 2022 09:27:53 -0700
Message-ID: <20220706162753.47894-1-kuniyu@amazon.com>
In-Reply-To: <20220706092711.28ce57e6@gandalf.local.home>
References: <20220706092711.28ce57e6@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

From:   Steven Rostedt
Date:   Wed, 6 Jul 2022 09:27:11 -0400
> On Wed, 6 Jul 2022 09:17:07 -0400
> Steven Rostedt wrote:
>
> > On Tue, 5 Jul 2022 22:21:25 -0700
> > Kuniyuki Iwashima wrote:
> >
> > > --- a/include/trace/events/sock.h
> > > +++ b/include/trace/events/sock.h
> > > @@ -122,9 +122,9 @@ TRACE_EVENT(sock_exceed_buf_limit,
> > >
> > >  	TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld sysctl_rmem=%d rmem_alloc=%d sysctl_wmem=%d wmem_alloc=%d wmem_queued=%d kind=%s",
> > >  		__entry->name,
> > > -		__entry->sysctl_mem[0],
> > > -		__entry->sysctl_mem[1],
> > > -		__entry->sysctl_mem[2],
> > > +		READ_ONCE(__entry->sysctl_mem[0]),
> > > +		READ_ONCE(__entry->sysctl_mem[1]),
> > > +		READ_ONCE(__entry->sysctl_mem[2]),
> >
> > This is not reading anything to do with sysctl. It's reading the content of
> > what was recorded in the ring buffer.
> >
> > That is, the READ_ONCE() here is not necessary, and if anything will break
> > user space parsing, as this is exported to user space to tell it how to
> > read the binary format in the ring buffer.
>
> I take that back. Looking at the actual trace event, it is pointing to
> sysctl memory, which is a major bug.
>
> TRACE_EVENT(sock_exceed_buf_limit,
>
> 	TP_PROTO(struct sock *sk, struct proto *prot, long allocated, int kind),
>
> 	TP_ARGS(sk, prot, allocated, kind),
>
> 	TP_STRUCT__entry(
> 		__array(char, name, 32)
> 		__field(long *, sysctl_mem)
>
> sysctl_mem is a pointer.
>
> 		__field(long, allocated)
> 		__field(int, sysctl_rmem)
> 		__field(int, rmem_alloc)
> 		__field(int, sysctl_wmem)
> 		__field(int, wmem_alloc)
> 		__field(int, wmem_queued)
> 		__field(int, kind)
> 	),
>
> 	TP_fast_assign(
> 		strncpy(__entry->name, prot->name, 32);
>
> 		__entry->sysctl_mem = prot->sysctl_mem;
>
>
> They save the pointer **IN THE RING BUFFER**!!!
>
> 		__entry->allocated = allocated;
> 		__entry->sysctl_rmem = sk_get_rmem0(sk, prot);
> 		__entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
> 		__entry->sysctl_wmem = sk_get_wmem0(sk, prot);
> 		__entry->wmem_alloc = refcount_read(&sk->sk_wmem_alloc);
> 		__entry->wmem_queued = READ_ONCE(sk->sk_wmem_queued);
> 		__entry->kind = kind;
> 	),
>
> 	TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld sysctl_rmem=%d rmem_alloc=%d sysctl_wmem=%d wmem_alloc=%d wmem_queued=%d kind=%s",
> 		__entry->name,
> 		__entry->sysctl_mem[0],
> 		__entry->sysctl_mem[1],
> 		__entry->sysctl_mem[2],
>
> They are now reading a stale pointer, which can be read at any time. That
> is, you get the information of what is in sysctl_mem at the time the ring
> buffer is read (which is useless from user space), and not at the time of
> the event.
>
> Thanks for pointing this out. This needs to be fixed.

For the record, Steve fixed this properly here, so I'll drop the tracing part
in v2.

https://lore.kernel.org/netdev/20220706105040.54fc03b0@gandalf.local.home/
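
To illustrate the problem described in the quoted text, here is a minimal userspace sketch, not kernel code and not necessarily what the linked fix does. It contrasts a record that stores a pointer to live data with one that copies the three values when the event fires. Everything in it is hypothetical and made up for the illustration: the global sysctl_mem array stands in for prot->sysctl_mem, and the struct and function names (entry_by_pointer, entry_by_value, record_by_pointer, record_by_value, update_sysctl, format_values) do not exist in the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for prot->sysctl_mem: limits that may change at any time. */
static long sysctl_mem[3] = { 100, 200, 300 };

/* Problematic layout: the record only keeps a pointer to the live array. */
struct entry_by_pointer {
	long *sysctl_mem;
};

/* Snapshot layout: the record owns a copy taken when the event fires. */
struct entry_by_value {
	long sysctl_mem[3];
};

/* Records the event by saving the address of the live data. */
static void record_by_pointer(struct entry_by_pointer *e, long *src)
{
	assert(e && src);
	e->sysctl_mem = src;
}

/* Records the event by copying the three values into the record itself. */
static void record_by_value(struct entry_by_value *e, const long *src)
{
	assert(e && src);
	memcpy(e->sysctl_mem, src, sizeof(e->sysctl_mem));
}

/* Simulates the limits being rewritten after the event was recorded. */
static void update_sysctl(long min, long pressure, long max)
{
	sysctl_mem[0] = min;
	sysctl_mem[1] = pressure;
	sysctl_mem[2] = max;
}

/* Formats three values the way the TP_printk() format string prints them. */
static int format_values(char *buf, size_t len, const long *v)
{
	return snprintf(buf, len, "sysctl_mem=%ld,%ld,%ld", v[0], v[1], v[2]);
}
```

If both records are taken, update_sysctl() is called, and both are then formatted, the pointer-based record reports the new values while the copied record still reports what was there at event time. In the kernel the pointer variant is worse than merely misleading, because user space is handed an address it cannot dereference.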