References: <20220810171702.74932-1-flaniel@linux.microsoft.com> <20220810171702.74932-2-flaniel@linux.microsoft.com>
From: Alban Crequy
Date: Tue, 16 Aug 2022 14:28:12 +0200
Subject: Re: [RFC PATCH v1 1/3] bpf: Make ring buffer overwritable.
To: Andrii Nakryiko
Cc: Francis Laniel, bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
 Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
 Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, Joanne Koong, Dave Marchevsky, Lorenzo Bianconi,
 Geliang Tang, Hengqi Chen
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 16 Aug 2022 at 04:22, Andrii Nakryiko wrote:
>
> On Wed, Aug 10, 2022 at 10:18 AM Francis Laniel wrote:
> >
> > By default, BPF ring buffer are size bounded, when producers already filled the
> > buffer, they need to wait for the consumer to get those data before adding new
> > ones.
> > In terms of API, bpf_ringbuf_reserve() returns NULL if the buffer is full.
> >
> > This patch permits making BPF ring buffer overwritable.
> > When producers already wrote as many data as the buffer size, they will begin to
> > over write existing data, so the oldest will be replaced.
> > As a result, bpf_ringbuf_reserve() never returns NULL.
> >
>
> Part of BPF ringbuf record (first 8 bytes) stores information like
> record size and offset in pages to the beginning of ringbuf map
> metadata. This is used by consumer to know how much data belongs to
> data record, but also for making sure that
> bpf_ringbuf_reserve()/bpf_ringbuf_submit() work correctly and don't
> corrupt kernel memory.
>
> If we simply allow overwriting this information (and no, spinlock
> doesn't protect from that, you can have multiple producers writing to
> different parts of ringbuf data area in parallel after "reserving"
> their respective records), it completely breaks any sort of
> correctness, both for user-space consumer and kernel-side producers.

The perf ring buffer solved this issue by adding an option to write
data backward with commit 9ecda41acb97 ("perf/core: Add
::write_backward attribute to perf event"). I'd like to see the BPF
ring buffer have a backward option as well to make overwrites work
without corruption. It's not completely clear to me if that will work
but I'd like to explore this with Francis.

(Francis and I work in the same team and we would like to use this for
https://github.com/kinvolk/traceloop).

Best regards,
Alban
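
To illustrate the backward-writing idea, here is a minimal user-space
sketch. It is a toy model, not the perf implementation and not the
proposed BPF change. Everything in it is hypothetical: the names toy_rb,
rb_write and rb_read, the 64-byte data area, and the 4-byte length-only
header (the real BPF ringbuf header is 8 bytes and carries more). It
assumes a single producer and has no locking or memory barriers. The
point it tries to show is that head always points at the header of the
newest record, so a reader can walk from newest to oldest and stop at
the first record that no longer fits in the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RB_SIZE 64u                          /* toy data area, power of two */
#define RB_MASK (RB_SIZE - 1)
#define RB_HDR  ((uint32_t)sizeof(uint32_t)) /* hypothetical header: payload length */

struct toy_rb {
    uint8_t  data[RB_SIZE];
    uint64_t head; /* free-running position; decreases on every write */
};

/* Copy n bytes into the ring starting at free-running position pos. */
static void rb_put(struct toy_rb *rb, uint64_t pos, const void *src, uint32_t n)
{
    const uint8_t *s = src;
    for (uint32_t i = 0; i < n; i++)
        rb->data[(pos + i) & RB_MASK] = s[i];
}

/* Copy n bytes out of the ring starting at free-running position pos. */
static void rb_get(const struct toy_rb *rb, uint64_t pos, void *dst, uint32_t n)
{
    uint8_t *d = dst;
    for (uint32_t i = 0; i < n; i++)
        d[i] = rb->data[(pos + i) & RB_MASK];
}

/* Producer: move head down first, then store header + payload at the new head.
 * Older records are silently overwritten once the buffer has wrapped. */
static void rb_write(struct toy_rb *rb, const void *payload, uint32_t len)
{
    assert(len <= RB_SIZE - RB_HDR);
    rb->head -= RB_HDR + len;
    rb_put(rb, rb->head, &len, RB_HDR);
    rb_put(rb, rb->head + RB_HDR, payload, len);
}

/* Consumer: walk from head (newest) towards older records and store the
 * payload length of each intact record in lens[]. Returns the record count. */
static unsigned rb_read(const struct toy_rb *rb, uint32_t *lens, unsigned max)
{
    uint64_t written = (uint64_t)0 - rb->head;   /* total bytes ever written */
    uint64_t limit = written < RB_SIZE ? written : RB_SIZE;
    uint64_t used = 0;
    unsigned n = 0;

    while (n < max && used + RB_HDR <= limit) {
        uint32_t len;
        rb_get(rb, rb->head + used, &len, RB_HDR);
        if (used + RB_HDR + len > limit)
            break; /* oldest record was partly overwritten: stop here */
        lens[n++] = len;
        used += RB_HDR + len;
    }
    return n;
}
```

Because the newest RB_SIZE bytes are by construction the ones that
survive, any header the reader reaches within that window is intact,
and a record whose end falls outside the window is the one that was
clobbered. With forward writes the reader would instead have to start
from the oldest record, whose header may already be gone.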