Date: Mon, 30 Jul 2018 19:01:22 -0700
From: Joel Fernandes
To: Alexei Starovoitov
Cc: Daniel Colascione, Joel Fernandes, LKML, Tim Murray, Network Development, Lorenzo Colitti, Chenbo Feng, Mathieu Desnoyers, Alexei Starovoitov, Daniel Borkmann
Subject: Re: [PATCH v2] Add BPF_SYNCHRONIZE_MAPS bpf(2) command
Message-ID: <20180731020122.GA22311@joelaf.mtv.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Jul 29, 2018 at 06:51:18PM +0300, Alexei Starovoitov wrote:
> On Thu, Jul 26, 2018 at 7:51 PM, Daniel Colascione wrote:
> > BPF_SYNCHRONIZE_MAPS waits for the release of any references to a BPF
> > map made by a BPF program that is running at the time the
> > BPF_SYNCHRONIZE_MAPS command is issued. The purpose of this command is
> > to provide a means for userspace to replace a BPF map with another,
> > newer version, then ensure that no component is still using the "old"
> > map before manipulating the "old" map in some way.
> >
> > Signed-off-by: Daniel Colascione
> > ---
> >  include/uapi/linux/bpf.h |  9 +++++++++
> >  kernel/bpf/syscall.c     | 13 +++++++++++++
> >  2 files changed, 22 insertions(+)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index b7db3261c62d..5b27e9117d3e 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -75,6 +75,14 @@ struct bpf_lpm_trie_key {
> >         __u8    data[0];        /* Arbitrary size */
> >  };
> >
> > +/* BPF_SYNCHRONIZE_MAPS waits for the release of any references to a
> > + * BPF map made by a BPF program that is running at the time the
> > + * BPF_SYNCHRONIZE_MAPS command is issued. The purpose of this command
>
> that doesn't sound right to me.
> such command won't wait for the release of the references.
> in case of map-in-map the program does not hold
> the references to inner map (only to outer map).

I didn't follow this completely. The userspace program is using the inner map
per your description of the algorithm for using map-in-map to solve the race
conditions that this patch is trying to address.

If you don't mind, I copy-pasted it below from your netdev post:

if you use map-in-map you don't need extra boolean map.
0. bpf prog can do
   inner_map = lookup(map_in_map, key=0);
   lookup(inner_map, your_real_key);
1. user space writes into map_in_map[0] <- FD of new map
2. some cpus are using old inner map and some a new
3. user space does sys_membarrier(CMD_GLOBAL) which will do synchronize_sched()
   which in CONFIG_PREEMPT_NONE=y servers is the same as synchronize_rcu()
   which will guarantee that progs finished.
4. scan old inner map

In step 2, as you mentioned, there are CPUs using different inner maps. So
could you clarify how the synchronize_rcu mechanism will even work if you're
now saying "program does not hold references to the inner maps"?

Thanks!

- Joel
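
To make the ordering in the quoted steps 0 to 4 concrete, here is a minimal single-threaded toy model in C. It is not kernel or libbpf code. Every name in it (struct inner_map, struct cpu, prog_begin, prog_end, swap_inner, toy_synchronize, scan, run_swap_demo) is hypothetical and exists only for illustration. The model assumes, as the quoted step 3 implies, that each program run behaves like a read-side critical section: the program takes no reference on the inner map, and the grace period instead waits for every run that was already in flight when the swap happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS 3
#define NKEYS 4

/* Hypothetical stand-in for an inner BPF map: one counter per key. */
struct inner_map { long value[NKEYS]; };

/* Hypothetical stand-in for one CPU running the BPF program. */
struct cpu {
    bool in_prog;            /* true between prog_begin() and prog_end() */
    struct inner_map *inner; /* pointer fetched in step 0; no refcount taken */
};

static struct inner_map *outer_slot0; /* models map_in_map[0] */
static struct cpu cpus[NCPUS];

/* Step 0, first half: inner_map = lookup(map_in_map, key=0). */
static void prog_begin(int c)
{
    cpus[c].in_prog = true;
    cpus[c].inner = outer_slot0;
}

/* Step 0, second half: touch the inner map, then leave the program. */
static void prog_end(int c, int key)
{
    cpus[c].inner->value[key]++;
    cpus[c].inner = NULL;
    cpus[c].in_prog = false;
}

/* Step 1: user space installs the new inner map and keeps the old one. */
static struct inner_map *swap_inner(struct inner_map *new_map)
{
    struct inner_map *old = outer_slot0;
    outer_slot0 = new_map;
    return old;
}

/* Step 3 (model only): a grace period ends once every program run that was
 * in flight has finished. "Waiting" is modelled by driving those runs to
 * completion. Returns the number of runs that had to be waited for. */
static int toy_synchronize(int key)
{
    int waited = 0;
    for (int c = 0; c < NCPUS; c++) {
        if (cpus[c].in_prog) {
            prog_end(c, key);
            waited++;
        }
    }
    return waited;
}

/* Step 4: scan an inner map by summing its counters. */
static long scan(const struct inner_map *m)
{
    long sum = 0;
    for (int k = 0; k < NKEYS; k++)
        sum += m->value[k];
    return sum;
}

/* Runs steps 0-4 once. Returns the scan of the old map and stores the
 * scan of the new map in *new_total. */
static long run_swap_demo(long *new_total)
{
    struct inner_map map_a = {{0}}, map_b = {{0}};

    outer_slot0 = &map_a;
    for (int c = 0; c < NCPUS; c++)
        cpus[c] = (struct cpu){ false, NULL };

    prog_begin(0);                              /* CPU 0 fetches map_a */
    prog_begin(1);                              /* CPU 1 fetches map_a */
    struct inner_map *old = swap_inner(&map_b); /* step 1 */
    prog_begin(2);                              /* step 2: CPU 2 sees map_b */
    prog_end(2, 0);

    int waited = toy_synchronize(1);            /* step 3 */
    assert(waited == 2);                        /* CPUs 0 and 1 were in flight */

    long old_total = scan(old);                 /* step 4 */

    prog_begin(0);                              /* a later run only sees map_b */
    prog_end(0, 3);
    assert(scan(old) == old_total);             /* old map no longer changes */

    *new_total = scan(&map_b);
    return old_total;
}
```

Calling run_swap_demo() from a main() shows the property in question: the two runs that fetched the old inner map before the swap finish during the modelled grace period, so the scan in step 4 sees their updates, and no later run can reach the old map because the outer slot already points at the new one.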