Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp8007914rwb; Tue, 13 Dec 2022 00:26:10 -0800 (PST) X-Google-Smtp-Source: AA0mqf4ARUCi8rP0R1sOy1RVuhxA7eAvFaEBckmqnJ+6yNThyCCWfx0C66RMuT2vhmLoNDOoclDN X-Received: by 2002:a05:6402:110c:b0:46f:7453:a99d with SMTP id u12-20020a056402110c00b0046f7453a99dmr6893329edv.6.1670919970509; Tue, 13 Dec 2022 00:26:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670919970; cv=none; d=google.com; s=arc-20160816; b=RW7MQK4MxR0LsiZQnqDegnxsfNqVlpDAV8bqlTn3ehbJIzzZPm4DexUgKbPt6w3dxE gxycj/5i7iaNRe+Krx5kbIKbXj2nM8PFoLQ5PLJ/i/k0tBFgR99g85yjGppH5tIcrAVh 1dk7QMo/biP1gr+OCRF4aBkqTTLwV0L9WW2fKNZEU3C31qr1z1Fsi92adDut9kJBWTI4 kUxxwEy9+VEqRVkkHFR4IoZFoUKM2hwsuNWdt20CItdQeNsqLak4cK8Gslz0Py6O4Vb5 Vf+2ndd+meSVNViyzfhTv62+GGLDRNdWNuS4o45wKSg10iKSZYbQEt2RRIxVdTRq5hUP 3dUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=VFqAyp486VxHBW/rh0z5cNT8CQKi3KWcwqHcpX33JJc=; b=JTHPClI1DeA5iMN3LlCZg3z2w5Kx/kFu0fH2ehP8fJbR2TNhyP3u2NVSJ4SptpVq50 qKZYg4y4lDFfU1rdjeRDt+etVx9PzkOxD3+41NF2PsuZkoV8FaE0qW4cVMrXtisqsFkO XVDbzG8Oj10+fA0omwS5wK8afDF626QBXRH96yv6egueCoFkoN6U55cm/IpiuG+Ars7E IEGH0dAR2OwY6jW2X4CgAxMoNijYKcPCgFCyeuic7KWH3kzSxDqiKB/tJY53NzWlHF+q ghHRpWWjLbaQjDIf8mziaDlN4LH8DZYO/9jR8xvSnQrfQF4txJtNkZbjW7HQSe4HRJKJ vYqw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="HApDkL/d"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j11-20020a05640211cb00b0046c455300edsi12517043edw.292.2022.12.13.00.25.52; Tue, 13 Dec 2022 00:26:10 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="HApDkL/d"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234257AbiLMIAL (ORCPT + 74 others); Tue, 13 Dec 2022 03:00:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32954 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233935AbiLMIAC (ORCPT ); Tue, 13 Dec 2022 03:00:02 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2604817ABA for ; Mon, 12 Dec 2022 23:59:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670918355; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=VFqAyp486VxHBW/rh0z5cNT8CQKi3KWcwqHcpX33JJc=; b=HApDkL/dPOVCslOfSFPS36LR9DB3HdtuZNAXBMZA7ccYP3Hld5dZz13oXDssIchbXYGIWz PsuxxKqSuCPGrqtTSVGo68lQj+HF+b4Hr16PBLy9rH8AOlod3sgAL43ffj0NYwklDCgoLE PUnf0SkFC4RHQFGFyM8u6XEXHCvEygI= Received: from mail-il1-f197.google.com (mail-il1-f197.google.com [209.85.166.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-161-HEAakjrnMXKcwfvZnDkVBQ-1; Tue, 13 Dec 2022 02:59:14 -0500 X-MC-Unique: HEAakjrnMXKcwfvZnDkVBQ-1 Received: by mail-il1-f197.google.com with SMTP id 7-20020a056e0220c700b0030386f0d0e6so6925254ilq.3 for ; Mon, 12 Dec 2022 23:59:14 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=VFqAyp486VxHBW/rh0z5cNT8CQKi3KWcwqHcpX33JJc=; b=sELHGXLGvhGsqghOvofb3DyjxgUCOp5k670OtKzfCxf6VXO1FgKDbFQv5zczhD92b5 Wtdr1ETiLuHGWnoVeB708tsBo1nfSlv6XRQ4614S1DOUNYB9BXBLnihMTY3LvOP/+UFF egy7Q5Rps4sLmRWOUocpDXDXr5IyPoxBc3ypgmvT6x+mKldlb2oF8Z/39HJA1KZ3aFqH ObS5NOeOX4cG89V8exeosH0BK5yxxtOyryOnF7OUlS70BWMCAurdfFIbe3U3/F+X4bkk OuXD9dqx1ZqOK/3f/uQ9/CIhV7NeLaUSFZkfcImWIn2tVXCKXVzF7Gh7wcGWoQHeIDof 5o9Q== X-Gm-Message-State: ANoB5pl3j+uAlMkBllA3ytz134B7AVwld4PItBCulKfOw4IZT+jhrWEy beU86yM1rcy7prlBQzKam1u6XvUIdkVqnyFFAtj7cYs2TwLmJHKNMQrhHcHPyed8fDuWGiaahdv 32DNblaJTJ32CaN/IpjYTTdDGogCS5/0jLi4aVSHL X-Received: by 2002:a05:6e02:de1:b0:303:4414:8ea8 with SMTP id m1-20020a056e020de100b0030344148ea8mr12363607ilj.124.1670918353393; Mon, 12 Dec 2022 23:59:13 -0800 (PST) X-Received: by 2002:a05:6e02:de1:b0:303:4414:8ea8 with SMTP id m1-20020a056e020de100b0030344148ea8mr12363601ilj.124.1670918353170; Mon, 12 Dec 2022 23:59:13 -0800 (PST) MIME-Version: 1.0 References: <20221103155756.687789-1-benjamin.tissoires@redhat.com> <20221103155756.687789-6-benjamin.tissoires@redhat.com> <53f21d98-4ee6-c0e9-1c0a-5fae23c1b9a8@meta.com> <43e6e9ec-3a0c-7238-30b2-daa7e71b169b@meta.com> In-Reply-To: From: Benjamin Tissoires Date: Tue, 13 Dec 2022 08:59:02 +0100 Message-ID: Subject: Re: [PATCH hid v12 05/15] HID: bpf jmp table: simplify the logic of cleaning up programs To: Greg KH Cc: Yonghong Song , Jiri Kosina , Jonathan Corbet , Shuah Khan , Alexei Starovoitov , Tero Kristo , linux-kernel@vger.kernel.org, linux-input@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Dec 13, 2022 at 7:28 AM Greg KH wrote: > > On Mon, Dec 12, 2022 at 10:39:26AM -0800, Yonghong Song wrote: > > > > > > On 12/12/22 10:20 AM, Greg KH wrote: > > > On Mon, Dec 12, 2022 at 09:52:03AM -0800, Yonghong Song wrote: > > > > > > > > > > > > On 12/12/22 9:02 AM, Benjamin Tissoires wrote: > > > > > On Thu, Nov 3, 2022 at 4:58 PM Benjamin Tissoires > > > > > wrote: > > > > > > > > > > > > Kind of a hack, but works for now: > > > > > > > > > > > > Instead of listening for any close of eBPF program, we now > > > > > > decrement the refcount when we insert it in our internal > > > > > > map of fd progs. > > > > > > > > > > > > This is safe to do because: > > > > > > - we listen to any call of destructor of programs > > > > > > - when a program is being destroyed, we disable it by removing > > > > > > it from any RCU list used by any HID device (so it will never > > > > > > be called) > > > > > > - we then trigger a job to cleanup the prog fd map, but we overwrite > > > > > > the removal of the elements to not do anything on the programs, just > > > > > > remove the allocated space > > > > > > > > > > > > This is better than previously because we can remove the map of known > > > > > > programs and their usage count. We now rely on the refcount of > > > > > > bpf, which has greater chances of being accurate. > > > > > > > > > > > > Signed-off-by: Benjamin Tissoires > > > > > > > > > > > > --- > > > > > > > > > > So... I am a little bit embarrassed, but it turns out that this hack > > > > > is not safe enough. > > > > > > > > > > If I compile the kernel with LLVM=1, the function > > > > > bpf_prog_put_deferred() is optimized in a weird way: if we are not in > > > > > irq, the function is inlined into __bpf_prog_put(), but if we are, the > > > > > function is still kept around as it is called in a scheduled work > > > > > item. > > > > > > > > > > This is something I completely overlooked: I assume that if the > > > > > function would be inlined, the HID entrypoint BPF preloaded object > > > > > would not be able to bind, thus deactivating HID-BPF safely. But if a > > > > > function can be both inlined and not inlined, then I have no > > > > > guarantees that my cleanup call will be called. Meaning that a HID > > > > > device might believe there is still a bpf function to call. And things > > > > > will get messy, with kernel crashes and others. > > > > > > > > You should not rely fentry to a static function. This is unstable > > > > as compiler could inline it if that static function is called > > > > directly. You could attach to a global function if it is not > > > > compiled with lto. > > > > > > But now that the kernel does support LTO, how can you be sure this will > > > always work properly? The code author does not know if LTO will kick in > > > and optimize this away or not, that's the linker's job. > > > > Ya, that is right. So for in-kernel bpf programs, attaching to global > > functions are not safe either. For other not-in-kernel bpf programs, it > > may not work but that is user's responsibility to adjust properly > > (to different functions based on a particular build, etc.). > > So if in-kernel bpf programs will not work or are not safe, how will > in-kernel bpf programs properly attach? Sorry if that wasn't clear. Loading a bpf program from the kernel is fine and safe. But it was the use of it that wasn't. In my case, HID-BPF to fix devices is safe (whether the program is loaded from the kernel or from userspace): the bpf JIT/verifier ensures that there are no out of bound read/write and the API is properly defined. But the problem I am facing with the generic bpf implementation is that it is made to be a global processing and to attach to one given function, when I wanted to have a couple function + device. So in this patch, I actually abused BPF to get free event notifications when the bpf program was released. The first implementation (HID: initial BPF implementation) was safer than this patch because I was using BPF for notifications of my internals but I wasn't messing up with the reference count. So if I did not get the events, I wouldn't decrement the bpf_prog and the end result means that the bpf program would stay forever attached to the device. Not user friendly but it doesn't introduce a kernel crash. However, this patch is messing with reference counting of an internal kernel object assuming I would always get the event. This is not the case and so I get read after free errors. TL;DR: (ab)using BPF internally for kernel introspection to manage kernel structures is just plain wrong. Mea culpa. Cheers, Benjamin