Received: by 2002:a05:6a10:6744:0:0:0:0 with SMTP id w4csp2533448pxu; Fri, 9 Oct 2020 21:55:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzj1kuT4zihHxZaMVAV8lc1zSz0+zRF6yXSjU7UutGmeIOVYY9ntGese037Nbq1uhFCTUks X-Received: by 2002:a17:906:3a8c:: with SMTP id y12mr16928264ejd.531.1602305757387; Fri, 09 Oct 2020 21:55:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1602305757; cv=none; d=google.com; s=arc-20160816; b=C73VLHeL4Ie++0/4n8R5dMI2HbeU+exCfNH68C1nZvf/bhItZiEokn3lA4ckxS2ttL jQzUZQIwIHDcAKGhe0wezeyQek/zJdTu13QlQoznKC6U0YiUsSkz/2jJ1eTVeaJUzhup hgUZlfjU3NYpj420CZAngYSBK5xfvvAollAAiPj3lx8fEk8DW+mk7rfiQ1qh6MnmFPUF EdCHJUQ76K35Mr7JjWTZCo+WNvQMlTRGuqKACWtmYyjPgN9KpNRYjY9QrOLRRwEDJT9E Y4VAJSISNMKW0l+Cuu364T/Gz48USn1v1SJzqCfENdkPir7kw1xbzESSc81Jc6IYiJVt R+Zg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=CSRAHb/axz17M+0sWNwy+ixadRZAMXlB5z5CiZ4OuV8=; b=RhdNMLViwVb4vH1pObB2DFHgH7w441dBZXz5CjRhcxXz7A9t1fXGRP5ogFIueBJ7NF hZAFQfmaWnTLY95kYrJ/0dAx+FOtbi14xnMNU4pE7s2qAs6j3R2XDJ+l4yNNJlPbwabw yiBCr79y2EAzzjId0Pc/+2prKrylVH5mMG6SO8Qbm9ZpSmsxpx/fYff7lXOqq13nKu6X zjglOoPOHtDezWmw4g6zP6RqkN4ADHf5JlubYSyl4PF0Dt4wBoXqgjXOLvVldxU0Vk5t pgsQrxmBAM9Ed61SImx5sLEpejLJ3r/XyiqNSZVDaTLsKazL7I4LbF3OtG+3bA1Clint VNew== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HY0u+kX5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id fx22si7660025ejb.299.2020.10.09.21.55.34; Fri, 09 Oct 2020 21:55:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HY0u+kX5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388997AbgJIVbl (ORCPT + 99 others); Fri, 9 Oct 2020 17:31:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34360 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391615AbgJIVaz (ORCPT ); Fri, 9 Oct 2020 17:30:55 -0400 Received: from mail-ed1-x542.google.com (mail-ed1-x542.google.com [IPv6:2a00:1450:4864:20::542]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2B3FC0613D2 for ; Fri, 9 Oct 2020 14:30:54 -0700 (PDT) Received: by mail-ed1-x542.google.com with SMTP id l16so10905548eds.3 for ; Fri, 09 Oct 2020 14:30:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=CSRAHb/axz17M+0sWNwy+ixadRZAMXlB5z5CiZ4OuV8=; b=HY0u+kX5HLC2Qd/hrbktWnWpqOSCxBXT3upvUFDtjn+PQwxihMAxO7b17zrilGpniu Z/H4JmecauoImC4n6z/XuMGPe9m/fKvi3HUsIaSIN5Vh0h9oJizzFej6p8VupLfSBpv1 1DxoGx28bnw9wjP3qaVpNc+QqlJxc/NVthbUpnHP6rzn4hxdW9x4BmvgBCkwRxaV4uG3 TvIrmUEl3tqjzHanRb36NzEAD3F/TD94mNKCpByd9hMTqtXQEnSbVMIkAALhyvoRTHJD dVE++0gF0w5nI42UG8NH4b76H/O3FzWMQwcPWdDfcJxiVuoQxs7pfn6hdE4zvJQarNUM +JGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=CSRAHb/axz17M+0sWNwy+ixadRZAMXlB5z5CiZ4OuV8=; b=JdfNimV8XcygXAbBCHst80NIZ9lHHI6k85BEwwxfwlLYYM/rffJ9Ib5oRpsEe7Y5NI KU+H70sw45gKj9DvA6jHnXhZG7Wl31kwRrg5hZnQbJvPkC3YvwWemd5at8bWHGw2lBFC y/+txNomKfg/0qYbAgPqOP1lcHeAH1agThOQ4eSG1cbfeDsQ7bJrilM7Cv8AqCd2Gjfz jm07znFKKRE+/Hm0iYUFf+9K7FzStMs/KghPoJb3ZQ+j2jw3E8mgiTpkoHROAOQLw5QY K2tylVMG3IGj5TzBoYLxmJVnOEBSiqyoY89WjDeBFGklirCBg8TpPqzqEx6bGTxQXOGa qsog== X-Gm-Message-State: AOAM530D4MHWxxWeOLTW5aamZXBGmLOxKs/p5SnOinyUL2/lf7DHl/aS T8Wqe5K1kYZZhC3TkF88bezY4/wPt6GGGT8lu8dLdQ== X-Received: by 2002:a05:6402:32f:: with SMTP id q15mr1339366edw.230.1602279053404; Fri, 09 Oct 2020 14:30:53 -0700 (PDT) MIME-Version: 1.0 References: <896cd9de97318d20c25edb1297db8c65e1cfdf84.1602263422.git.yifeifz2@illinois.edu> In-Reply-To: <896cd9de97318d20c25edb1297db8c65e1cfdf84.1602263422.git.yifeifz2@illinois.edu> From: Jann Horn Date: Fri, 9 Oct 2020 23:30:27 +0200 Message-ID: Subject: Re: [PATCH v4 seccomp 1/5] seccomp/cache: Lookup syscall allowlist bitmap for fast path To: YiFei Zhu Cc: Linux Containers , YiFei Zhu , bpf , kernel list , Aleksa Sarai , Andrea Arcangeli , Andy Lutomirski , David Laight , Dimitrios Skarlatos , Giuseppe Scrivano , Hubertus Franke , Jack Chen , Josep Torrellas , Kees Cook , Tianyin Xu , Tobin Feldman-Fitzthum , Tycho Andersen , Valentin Rothberg , Will Drewry Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Oct 9, 2020 at 7:15 PM YiFei Zhu wrote: > The overhead of running Seccomp filters has been part of some past > discussions [1][2][3]. Oftentimes, the filters have a large number > of instructions that check syscall numbers one by one and jump based > on that. Some users chain BPF filters which further enlarge the > overhead. A recent work [6] comprehensively measures the Seccomp > overhead and shows that the overhead is non-negligible and has a > non-trivial impact on application performance. > > We observed some common filters, such as docker's [4] or > systemd's [5], will make most decisions based only on the syscall > numbers, and as past discussions considered, a bitmap where each bit > represents a syscall makes most sense for these filters. > > The fast (common) path for seccomp should be that the filter permits > the syscall to pass through, and failing seccomp is expected to be > an exceptional case; it is not expected for userspace to call a > denylisted syscall over and over. > > When it can be concluded that an allow must occur for the given > architecture and syscall pair (this determination is introduced in > the next commit), seccomp will immediately allow the syscall, > bypassing further BPF execution. > > Each architecture number has its own bitmap. The architecture > number in seccomp_data is checked against the defined architecture > number constant before proceeding to test the bit against the > bitmap with the syscall number as the index of the bit in the > bitmap, and if the bit is set, seccomp returns allow. The bitmaps > are all clear in this patch and will be initialized in the next > commit. [...] > Co-developed-by: Dimitrios Skarlatos > Signed-off-by: Dimitrios Skarlatos > Signed-off-by: YiFei Zhu Reviewed-by: Jann Horn