Received: by 2002:a05:6a10:6744:0:0:0:0 with SMTP id w4csp2532431pxu; Fri, 9 Oct 2020 21:53:24 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw4c1w/KIfETetDQv0X6TkUPIC799IV/g/BfXvqkOAxOG6P9BQnJ966guYNQenxj/5z/fFH X-Received: by 2002:a17:906:3e4b:: with SMTP id t11mr17173405eji.420.1602305603964; Fri, 09 Oct 2020 21:53:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1602305603; cv=none; d=google.com; s=arc-20160816; b=XUIrPtEWJvkNcyJ6NMPjqbEJlTmCOhXKeyGVuVi9GzPOd5X2FeAi9OYwX2Jt1dUyhv JGLwuqkZ883Mnt/Y/ZJh7hFF8vrcUN5G+8G65FDmcxWVuKoV7E88l785M7dmbLVtUGx8 Coz0R/nQQm3x7+c8/5ofgT65D4gtQo2hiqJZDJ/yAPla9gs152dg/Oa+euXWCs2tTqBM zDAnHSTpNoyqKjOo3MlzQhxROicfcmyDBzUAI5SL5tCsSAUijBVFl9CY7IuwfApAni3k A3VyfD2Yt96ygx8M8QbcHXAfoiSZrHzwWQwx8t4LGIEEi/VwrI0OCvQSVjH3pjla83Qx gKIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=pbFcqf/DjvQ4JWXqdwx2qoDcjZm2i9Wcon6pZeNDz5o=; b=nKCqGpKYGZAXU8ehonHqOBX8l1P3VNNkIjeRE7F7Gxym8TKOb0j5A7sSO9PAm72cBU 8CKM+sSQYYNrLzcUjojRLwl2nmsG247EOWbBU1VavD+Qm7ujdTTwmsE2zPs6jc2R1xW7 qYyHc5zmli4Dbj/JZ17FAN4H5fUb4i380ov48o0rUSI4kXZE/KqLO39zel+WSo0qXKhh sAHXFFBpBTFjP8D9rpB+Mmc7u4F8B7BkJ0T7dpekOfjSpfYcdoBROJ1a+9Poi1fbXm+0 Xq4aCy2Hklll9SsoPOCnKYoQAGeQItS7UYMb8F1UjH+kg3C7y63xTqXBNLzpcSTNMEWy Dmlw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="ZG7/cR3D"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id g4si7422880ejw.351.2020.10.09.21.53.00; Fri, 09 Oct 2020 21:53:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="ZG7/cR3D"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391593AbgJIVar (ORCPT + 99 others); Fri, 9 Oct 2020 17:30:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34336 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391457AbgJIVar (ORCPT ); Fri, 9 Oct 2020 17:30:47 -0400 Received: from mail-ej1-x641.google.com (mail-ej1-x641.google.com [IPv6:2a00:1450:4864:20::641]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 67040C0613D2 for ; Fri, 9 Oct 2020 14:30:46 -0700 (PDT) Received: by mail-ej1-x641.google.com with SMTP id qp15so15161952ejb.3 for ; Fri, 09 Oct 2020 14:30:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=pbFcqf/DjvQ4JWXqdwx2qoDcjZm2i9Wcon6pZeNDz5o=; b=ZG7/cR3DKUuQ/RPLGoR9GRiVD5sFiQY9qmvwtKGyjQ7dUx6cRoXQxPJrsofVX4GvT8 +bHrvlkc1QdE5fUuR+B4/jEqIhD96HiZRW7ovaFsuN3fEZ+QZA4WfW7Zoh4Ks3/JQ0EC foiF78nQ6mvg4w1M98iFAIEqKbJDxwbcU0l389avKqsHHsxkP1F73Y3V7TTlMKHaLOCS Dm4YqX7AZSogV5VQASZlZfU5CJKMZXVuwmHApTvWmqpz+YKzBi4/BWNPCESa5p7ZbAdQ xYthw3ucPw0eXtEwmhEb7cEy71fk2ZQJ7DXR2iGsteCPu+HY7ggdx/mX75DxDqxJz9QQ OIUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=pbFcqf/DjvQ4JWXqdwx2qoDcjZm2i9Wcon6pZeNDz5o=; b=TO54wliAuz53O0iCeVhxxFGJdZxGeBycrKxGP447Xvmim7xLqn8Jew+q8BsSoFnXnS L4kzFRJO+IZmV7PGQNxWny0zjpRMxAe6XnAqX6y4soquK2SvXBX2LkmajY9oKweR4WSq YKLapvtl4wHx9fp1q1u4u1BdMvyHP8+w3pdf04SDUptMG2xkC8JmoAdDTLB5R47G2FgL +pMclPBQJUhikI4RAQC4yBUjLiTF9bp3J4trZO1EBAV84fmEwtuufk1ysCsW/ON3iDne 24MqOWyBaB17bJ6/lo3zzSFJatxkJsNZ5X15TebKa9nq6XBwFHLBIwkzhXCAj6mr9BfM JGuA== X-Gm-Message-State: AOAM530G7V6IW9VNbEemkaemOGLmLf6u28xz9tlySs0ofimOHsr3tv5n WyrSIbq0G4PiqRdvlZK46vjI7O9PbPW5GvrdAyNZUw== X-Received: by 2002:a17:906:86c3:: with SMTP id j3mr16811843ejy.493.1602279044763; Fri, 09 Oct 2020 14:30:44 -0700 (PDT) MIME-Version: 1.0 References: <1a40458d081ce0d5423eb0282210055496e28774.1602263422.git.yifeifz2@illinois.edu> In-Reply-To: <1a40458d081ce0d5423eb0282210055496e28774.1602263422.git.yifeifz2@illinois.edu> From: Jann Horn Date: Fri, 9 Oct 2020 23:30:18 +0200 Message-ID: Subject: Re: [PATCH v4 seccomp 2/5] seccomp/cache: Add "emulator" to check if filter is constant allow To: YiFei Zhu Cc: Linux Containers , YiFei Zhu , bpf , kernel list , Aleksa Sarai , Andrea Arcangeli , Andy Lutomirski , David Laight , Dimitrios Skarlatos , Giuseppe Scrivano , Hubertus Franke , Jack Chen , Josep Torrellas , Kees Cook , Tianyin Xu , Tobin Feldman-Fitzthum , Tycho Andersen , Valentin Rothberg , Will Drewry Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Oct 9, 2020 at 7:15 PM YiFei Zhu wrote: > > From: YiFei Zhu > > SECCOMP_CACHE will only operate on syscalls that do not access > any syscall arguments or instruction pointer. To facilitate > this we need a static analyser to know whether a filter will > return allow regardless of syscall arguments for a given > architecture number / syscall number pair. This is implemented > here with a pseudo-emulator, and stored in a per-filter bitmap. > > In order to build this bitmap at filter attach time, each filter is > emulated for every syscall (under each possible architecture), and > checked for any accesses of struct seccomp_data that are not the "arch" > nor "nr" (syscall) members. If only "arch" and "nr" are examined, and > the program returns allow, then we can be sure that the filter must > return allow independent from syscall arguments. > > Nearly all seccomp filters are built from these cBPF instructions: > > BPF_LD | BPF_W | BPF_ABS > BPF_JMP | BPF_JEQ | BPF_K > BPF_JMP | BPF_JGE | BPF_K > BPF_JMP | BPF_JGT | BPF_K > BPF_JMP | BPF_JSET | BPF_K > BPF_JMP | BPF_JA > BPF_RET | BPF_K > BPF_ALU | BPF_AND | BPF_K > > Each of these instructions are emulated. Any weirdness or loading > from a syscall argument will cause the emulator to bail. > > The emulation is also halted if it reaches a return. In that case, > if it returns an SECCOMP_RET_ALLOW, the syscall is marked as good. > > Emulator structure and comments are from Kees [1] and Jann [2]. > > Emulation is done at attach time. If a filter depends on more > filters, and if the dependee does not guarantee to allow the > syscall, then we skip the emulation of this syscall. > > [1] https://lore.kernel.org/lkml/20200923232923.3142503-5-keescook@chromium.org/ > [2] https://lore.kernel.org/lkml/CAG48ez1p=dR_2ikKq=xVxkoGg0fYpTBpkhJSv1w-6BG=76PAvw@mail.gmail.com/ [...] > @@ -682,6 +693,150 @@ seccomp_prepare_user_filter(const char __user *user_filter) > return filter; > } > > +#ifdef SECCOMP_ARCH_NATIVE > +/** > + * seccomp_is_const_allow - check if filter is constant allow with given data > + * @fprog: The BPF programs > + * @sd: The seccomp data to check against, only syscall number are arch > + * number are considered constant. nit: s/syscall number are arch number/syscall number and arch number/ > + */ > +static bool seccomp_is_const_allow(struct sock_fprog_kern *fprog, > + struct seccomp_data *sd) > +{ > + unsigned int insns; > + unsigned int reg_value = 0; > + unsigned int pc; > + bool op_res; > + > + if (WARN_ON_ONCE(!fprog)) > + return false; > + > + insns = bpf_classic_proglen(fprog); bpf_classic_proglen() is defined as: #define bpf_classic_proglen(fprog) (fprog->len * sizeof(fprog->filter[0])) so this is wrong - what you want is the number of instructions in the program, what you actually have is the size of the program in bytes. Please instead check for `pc < fprog->len` in the loop condition. > + for (pc = 0; pc < insns; pc++) { > + struct sock_filter *insn = &fprog->filter[pc]; > + u16 code = insn->code; > + u32 k = insn->k; [...] > + } > + > + /* ran off the end of the filter?! */ > + WARN_ON(1); > + return false; > +}