Received: by 2002:ab2:7a55:0:b0:1f4:4a7d:290d with SMTP id u21csp187247lqp; Thu, 4 Apr 2024 10:08:43 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCWEk7yky+l9Dv97J65lsqt+OtH2Wz6rKtWu4bhxkPj7eS+JSwyMcjaPVo2wpNaG5bVdmUuxB/xzTev2fS+9rCRcdeO34v2w1hfIQfSKjg== X-Google-Smtp-Source: AGHT+IFKYSsr3C+luOYGiX/VTOz8AgritBQOWVEkrT3QcImTXGD3dC2j2W9nISrf8hBNJvYu+Dxq X-Received: by 2002:a05:6a21:81a8:b0:1a7:3365:d8e1 with SMTP id pd40-20020a056a2181a800b001a73365d8e1mr413327pzb.20.1712250523396; Thu, 04 Apr 2024 10:08:43 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1712250523; cv=pass; d=google.com; s=arc-20160816; b=NDcB9Y+HuqlMUOyenUarRjSqAzhG0SStoFxzmsN7qnRqNktQKbe7U/MtUoo0jESw4D 5fN3uVl0q9pX1w7kNMYM8LgByaZ312ZfFiXPSLuonYWONhVpGXfMC+cfdk8gVnd1v+1p IzxAyFKtglFRirso8mr2ECLfb5YRqf1e076wZP9BHwn2Opn5JhbGianlikOzQjTfBfYM nHoNGn+j73Soktd2kE+DO/KyANVZpViC1dA6QLzQHc2LD55fHYTmxre3LlKPFnX18qOH QrA8aSpSHshxsnVK8Pl59swGkY7ut8A7uuEgInDlIZ3O7iQSQmA8QeIZGaJak9zxwLHH jqnw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:list-unsubscribe:list-subscribe :list-id:precedence:dkim-signature; bh=lbLmsAzzJdGPLAvAsCum54KXU9vJ5+iCYp76EK5DMIc=; fh=0GkD8yoWK6GcQuvAdw9kTE4F3qwbqpEiGW0Rud2o44g=; b=k5ALQIdlLEifDVKOZZraCzvqSSDv6pKBNRdmB6QIeuQNUS0OdNlcYtNj7oSaQlOaIJ MGmAtVngMm++Vpu8W1XT4Y4ZeO4pvhCjXn0Uo71TyHJRD2tAu42X1K5DX1GxycLtfJrk KuyeKa9WjL3witvYDnj70kXl0seodGMPuw98Jzd/anqQ1CLqPXsb5MZx70wJpZOOPR/m IJwGpkLMaGFeYGVfa92tfxNVsfyYvGAC7h5OnKysPAcPRzflAQRJGGCpbAvaKHwUPL4q c6C4VybvM0a0lY6m+6OyJLhyN7huDClrWK7eFqCsPKCUufPREyZpX6MsnuxOpu2RI6qA ssLg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b="aQ/+1JrV"; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-131840-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.48.161 as permitted sender) smtp.mailfrom="linux-kernel+bounces-131840-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [147.75.48.161]) by mx.google.com with ESMTPS id y8-20020a656c08000000b005dc4f9ceccdsi16470721pgu.609.2024.04.04.10.08.42 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Apr 2024 10:08:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-131840-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.48.161 as permitted sender) client-ip=147.75.48.161; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b="aQ/+1JrV"; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-131840-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.48.161 as permitted sender) smtp.mailfrom="linux-kernel+bounces-131840-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id 503D5B2CCE6 for ; Thu, 4 Apr 2024 16:41:27 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 56D9A129E6F; Thu, 4 Apr 2024 16:41:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="aQ/+1JrV" Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B02FB1F922; Thu, 4 Apr 2024 16:41:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.51 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712248863; cv=none; b=aIvp7vUwwywGVg4hdyU3tF3a8gjeSf4a5bXM2AUEYVvN0djZwmz3T4PF52R5jf8yx+XWkCSj2zriDyPViEXkWo0ENSm64C3YqNi4LeXg+oFt2VqJTetEk0bfrtT4+XOmiJ8xvO7rSHJC46c/W/r6FfDNeAY4pFSKO8Rawem8O9w= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712248863; c=relaxed/simple; bh=x/s6vyO2Ku1FEpgpwyNPo6be9Ds/5FEj3IOqROrewf0=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=LiMyexmqrqEt/UvNPhyceqpxeP4XU5ow3YUFYHl9khtAjf+PQgQ2KQ4DC+qWiH5LISoOtckx9YH0LQs6W5qcvhBeZ27OHAWlP8JLooNFjNZBAR28XkzwqrWGcmX+m2P9dqryFaQkUy2P3isWoEKATbymvKArSG1+TZXRWDd9zUY= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=aQ/+1JrV; arc=none smtp.client-ip=209.85.128.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wm1-f51.google.com with SMTP id 5b1f17b1804b1-41550858cabso8380035e9.2; Thu, 04 Apr 2024 09:41:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712248860; x=1712853660; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=lbLmsAzzJdGPLAvAsCum54KXU9vJ5+iCYp76EK5DMIc=; b=aQ/+1JrVRIUJqsPG4P6DjD4SkvkSowefXsG5zplfk5m2hfJ9iUnD/lXLpfvsKoM9l1 iHHWjg2AWJJ7fdzdOdjPxttD/FXwMj54Or0a/a7hAkl9dmafaV2VaqJzmc0To7U5nPcV IHP3djtbJiYtxFSrA2ko4hTZc2UpolQaEG/vl/pcSBAKdWnBlPhtfVblfERog9saqjfE jQ1Wd89MVvq5fUzK2Gh/oH3gHHvdYQYUbdz8KpM0ZR45BXuzs7Ov/saY8pJlVwJabpkr pfpsz93EyHrZp6nKpYvh9Z15SGmkTXnDZTp87QAP6kEhS9Ved9Nl28SeTNLE+x05oSgT D53w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712248860; x=1712853660; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lbLmsAzzJdGPLAvAsCum54KXU9vJ5+iCYp76EK5DMIc=; b=JoG9AmIxsSx0IvWfiO5CUkKPrcJvR+nhywUPpGqkqOWa2d6ZR3scaMjuwHINJQle5b JJLf1ra41L/TO/xEuQm9F8SvW1ybRtzEwH33vVnBn2zLeUCm11ak+0lB0iYJFiSeQbKU vM77YOYytVtyPeY9Gmd/W5lk1oDysQQkG0wNqqfrCJePyODWUlqkAdpuu01a1ElTV7Fd NiaRUUAQavUlwydkJyzF9HUB3DdeojjflscPrbXQLOFbg0YnSXzXvJ5obW0N9/uAWo4j qUD03OqXNOKAqB+Y0gxkZOBZ4ZiOAMiixhs9xRTx7ttRivtRsvBVOmPl2mEPCI3sOYCO wZew== X-Forwarded-Encrypted: i=1; AJvYcCWIrwH15QYkzpM+qrSZW5hKToLYeVciFACXFY5aNqKldqVe2Xajktxhv+9qGnGWtGv9Mq9kM5hE6FTkk+bUnJnlX+YWsv6uVbJ6T1+8mRQuiJUdfvOQA4lZuaDw/cdHh/SFq8lxNfisFoFybj8czx4azMDWXZWO0arSj256EyQJ0G83 X-Gm-Message-State: AOJu0YxB8qkp21110KMFu5GnKFA0bwmcQ5IWgA8jb21MXWleZXncf5MT QhC7Kwagq83yXsraBvFA3GfaL6ssA5OZg8xyzbMQkYyvYMgZeTR6xEe40JdxhzaNGnnIZ7rVJhS cpA/ESttfl2HJVAAwjFdbC+jpwsk= X-Received: by 2002:a5d:58c8:0:b0:341:abd4:a6e8 with SMTP id o8-20020a5d58c8000000b00341abd4a6e8mr2184968wrf.60.1712248859779; Thu, 04 Apr 2024 09:40:59 -0700 (PDT) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20240322-hid-bpf-sleepable-v5-0-179c7b59eaaa@kernel.org> <20240322-hid-bpf-sleepable-v5-1-179c7b59eaaa@kernel.org> In-Reply-To: From: Alexei Starovoitov Date: Thu, 4 Apr 2024 09:40:48 -0700 Message-ID: Subject: Re: [PATCH bpf-next v5 1/6] bpf/helpers: introduce sleepable bpf_timers To: Benjamin Tissoires Cc: Benjamin Tissoires , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan , bpf , LKML , "open list:KERNEL SELFTEST FRAMEWORK" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Apr 4, 2024 at 8:27=E2=80=AFAM Benjamin Tissoires wrote: > > > > > > So we need something like: > > > > struct bpf_hrtimer { > > union { > > struct hrtimer timer; > > + struct work_struct work; > > }; > > struct bpf_map *map; > > struct bpf_prog *prog; > > void __rcu *callback_fn; > > void *value; > > union { > > Are you sure we need an union here? If we get to call kfree_rcu() we > need to have both struct rcu_head and sync_work, not one or the other. why? with an extra flag it's one or the other. In bpf_timer_cancel_and_free() if (flag & SLEEPABLE) { schedule_work() to cancel_work_sync + kfree_rcu } else { hrtimer_cancel kfree_rcu } > > struct rcu_head rcu; > > + struct work_struct sync_work; > > }; > > + u64 flags; // bpf_timer_init() will require BPF_F_TIMER_SLEEPABLE > > If I understand, you want BPF_F_TIMER_SLEEPABLE in bpf_timer_init() > (like in my v2 or v3 IIRC). But that means that once a timer is > initialized it needs to be of one or the other type (especially true > with the first union in this struct). yes. That's an idea. The code to support wq vs timer seems to be diverging more than what we expected initially. It seems cleaner to set it as init time and enforce in other helpers. Also with two work_struct-s we're pushing the sizeof(bpf_hrtimer) too far. It's already at 112 bytes and some people use bpf_timer per flow. So potentially millions of such timers. Adding extra sizeof(struct work_struct)=3D32 * 2 that won't be used is too much. Note that sizeof(struct hrtimer)=3D64, so unions make everything fit nicely. > So should we reject during run time bpf_timer_set_callback() for > sleepable timers and only allow bpf_timer_set_sleepable_cb() for > those? (and the invert in the other case). yes. > This version of the patch allows for one timer to be used as softIRQ > or WQ, depending on the timer_set_callback that is used. But it might > be simpler for the kfree_rcu race to define the bpf_timer to be of one > kind, so we are sure to call the correct kfree method. I think one or another simplifies the code and makes it easier to think through combinations. I'm still contemplating adding new "struct bpf_wq" and new kfuncs to completely separate wq vs timer. The code reuse seems to be relatively small. We can potentially factor out internals of bpf_timer_* into smaller helpers and use them from bpf_timer_* and from new bpf_wq_* kfuncs. One more thing. bpf_timer_cancel() api turned out to be troublesome. Not only it cancels the timer, but drops callback too. It was a surprising behavior for people familiar with kernel timer api-s. We should not repeat this mistake with wq. We can try to fix bpf_timer_cancel() too. If we drop drop_prog_refcnt() from it it shouldn't affect existing bpf_timer users who are forced to do: bpf_timer_cancel() bpf_timer_set_callback() bpf_timer_start() all the time. If/when bpf_timer_cancel() stops dropping the callback such bpf prog won't be affected. So low chance of breaking any prog. wdyt?