In-Reply-To: <20180731020629.GB22311@joelaf.mtv.corp.google.com>
References: <20180731020122.GA22311@joelaf.mtv.corp.google.com> <20180731020629.GB22311@joelaf.mtv.corp.google.com>
From: Y Song
Date: Mon, 30 Jul 2018 21:03:18 -0700
Subject: Re: [PATCH v2] Add BPF_SYNCHRONIZE_MAPS bpf(2) command
To: Joel Fernandes
Cc: Alexei Starovoitov, Daniel Colascione, Joel Fernandes, LKML, Tim Murray, Network Development, Lorenzo Colitti, Chenbo Feng, Mathieu Desnoyers, Alexei Starovoitov, Daniel Borkmann
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 30, 2018 at 7:06 PM, Joel Fernandes wrote:
> On Mon, Jul 30, 2018 at 07:01:22PM -0700, Joel Fernandes wrote:
>> On Sun, Jul 29, 2018 at 06:51:18PM +0300, Alexei Starovoitov wrote:
>> > On Thu, Jul 26, 2018 at 7:51 PM, Daniel Colascione wrote:
>> > > BPF_SYNCHRONIZE_MAPS waits for the release of any references to a BPF
>> > > map made by a BPF program that is running at the time the
>> > > BPF_SYNCHRONIZE_MAPS command is issued. The purpose of this command is
>> > > to provide a means for userspace to replace a BPF map with another,
>> > > newer version, then ensure that no component is still using the "old"
>> > > map before manipulating the "old" map in some way.
>> > >
>> > > Signed-off-by: Daniel Colascione
>> > > ---
>> > >  include/uapi/linux/bpf.h |  9 +++++++++
>> > >  kernel/bpf/syscall.c     | 13 +++++++++++++
>> > >  2 files changed, 22 insertions(+)
>> > >
>> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> > > index b7db3261c62d..5b27e9117d3e 100644
>> > > --- a/include/uapi/linux/bpf.h
>> > > +++ b/include/uapi/linux/bpf.h
>> > > @@ -75,6 +75,14 @@ struct bpf_lpm_trie_key {
>> > >  	__u8	data[0];	/* Arbitrary size */
>> > >  };
>> > >
>> > > +/* BPF_SYNCHRONIZE_MAPS waits for the release of any references to a
>> > > + * BPF map made by a BPF program that is running at the time the
>> > > + * BPF_SYNCHRONIZE_MAPS command is issued. The purpose of this command
>> >
>> > that doesn't sound right to me.
>> > such command won't wait for the release of the references.
>> > in case of map-in-map the program does not hold
>> > the references to inner map (only to outer map).
>>
>> I didn't follow this completely.
>>
>> The userspace program is using the inner map per your description of the
>
> Sorry just to correct myself, here I meant "The kernel eBPF program is using
> the inner map on multiple CPUs" instead of "userspace".
>
> thanks,
>
> - Joel
>
>> algorithm for using map-in-map to solve the race conditions that this patch
>> is trying to address:
>>
>> If you don't mind, I copy-pasted it below from your netdev post:
>>
>> if you use map-in-map you don't need extra boolean map.
>> 0. bpf prog can do
>>    inner_map = lookup(map_in_map, key=0);
>>    lookup(inner_map, your_real_key);
>> 1. user space writes into map_in_map[0] <- FD of new map
>> 2. some cpus are using old inner map and some a new
>> 3. user space does sys_membarrier(CMD_GLOBAL) which will do synchronize_sched()
>>    which in CONFIG_PREEMPT_NONE=y servers is the same as synchronize_rcu()
>>    which will guarantee that progs finished.
>> 4. scan old inner map
>>
>> In step 2, as you mentioned there are CPUs using different inner maps. So
>> could you clarify how the synchronize_rcu mechanism will even work if you're
>> now saying "program does not hold references to the inner maps"?

The program only holds references to the outer map, and the outer map
holds references to the inner maps. User space can add or remove an
inner map in a given outer map without changing the
prog <-> outer-map relationship.

>>
>> Thanks!
>>
>> - Joel
>>