Date: Thu, 29 Nov 2018 16:08:26 -0700
From: Tycho Andersen
To: Kees Cook, Andy Lutomirski
Cc: Oleg Nesterov, "Eric W . Biederman", "Serge E . Hallyn",
    Christian Brauner, Tyler Hicks, Akihiro Suda, Aleksa Sarai,
    linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org,
    linux-api@vger.kernel.org
Subject: Re: [PATCH v8 1/2] seccomp: add a return code to trap to userspace
Message-ID: <20181129230826.GB4676@cisco>
References: <20181029224031.29809-1-tycho@tycho.ws> <20181029224031.29809-2-tycho@tycho.ws>
In-Reply-To: <20181029224031.29809-2-tycho@tycho.ws>

On Mon, Oct 29, 2018 at 04:40:30PM -0600, Tycho Andersen wrote:
> +	resp.id = req.id;
> +	resp.error = -512; /* -ERESTARTSYS */
> +	resp.val = 0;
> +
> +	EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp), 0);

So, it turns out this *doesn't* work, and the reason this test was
passing is because of poor hygiene on my part.
Per the documentation in include/linux/errno.h,

/*
 * These should never be seen by user programs.  To return one of ERESTART*
 * codes, signal_pending() MUST be set.  Note that ptrace can observe these
 * at syscall exit tracing, but they will never be left for the debugged user
 * process to see.
 */
#define ERESTARTSYS	512

So basically, if you respond with -ERESTARTSYS with no signal pending,
you'll leak it to userspace. It turns out this is already possible with
SECCOMP_RET_TRACE (and probably ptrace alone, although I didn't try it
out), see the program below.

The question is: do we care? If so, it seems like we may need to handle
the -ERESTARTSYS-style cases even when there is no signal pending. If
we don't, there's precedent for us to just do the same thing as what
happens for SECCOMP_RET_TRACE, but we should probably at least fix the
docs.

Tycho

/*
 * The header names did not survive archiving; the list below is the set
 * this program needs to build, not necessarily the original list.
 */
#include <elf.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/*
 * Assumed no-op handler: main() installs it for SIGUSR1, but its
 * definition is not present in the archived text.
 */
static void signal_handler(int signo)
{
}

/* Install a filter that returns SECCOMP_RET_TRACE for syscall_nr only. */
static int filter_syscall(int syscall_nr)
{
	struct sock_filter filter[] = {
		BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, syscall_nr, 0, 1),
		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRACE),
		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
	};

	struct sock_fprog bpf_prog = {
		.len = (unsigned short)(sizeof(filter)/sizeof(filter[0])),
		.filter = filter,
	};

	int ret;

	ret = syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 0, &bpf_prog);
	if (ret < 0) {
		perror("seccomp failed");
		return -1;
	}

	return ret;
}

/* x86-64 register layout as returned by PTRACE_GETREGSET/NT_PRSTATUS */
typedef struct {
	uint64_t r15;
	uint64_t r14;
	uint64_t r13;
	uint64_t r12;
	uint64_t bp;
	uint64_t bx;
	uint64_t r11;
	uint64_t r10;
	uint64_t r9;
	uint64_t r8;
	uint64_t ax;
	uint64_t cx;
	uint64_t dx;
	uint64_t si;
	uint64_t di;
	uint64_t orig_ax;
	uint64_t ip;
	uint64_t cs;
	uint64_t flags;
	uint64_t sp;
	uint64_t ss;
	uint64_t fs_base;
	uint64_t gs_base;
	uint64_t ds;
	uint64_t es;
	uint64_t fs;
	uint64_t gs;
} user_regs_struct64;

int
main(int argc, char **argv)
{
	pid_t pid;
	user_regs_struct64 regs;
	struct iovec iov = {.iov_base = &regs, .iov_len = sizeof(regs)};
	int status;

	pid = fork();
	if (pid == 0) {
		if (signal(SIGUSR1, signal_handler) == SIG_ERR) {
			perror("signal");
			exit(1);
		}

		if (filter_syscall(__NR_getpid) < 0)
			exit(1);

		/* i'm lazy, so sue me :) */
		sleep(1);

		errno = 0;
		pid = syscall(__NR_getpid);
		/*
		 * we get:
		 *	getpid(): -1, errno: 512
		 * probably should get
		 *	getpid(): <pid>, errno: 0
		 */
		printf("getpid(): %d, errno: %d\n", pid, errno);
		exit(errno);
	}

	if (ptrace(PTRACE_ATTACH, pid, NULL, 0) < 0) {
		perror("ptrace attach");
		goto out;
	}

	if (waitpid(pid, NULL, 0) != pid) {
		perror("waitpid");
		goto out;
	}

	if (ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACESECCOMP) < 0) {
		perror("ptrace setoptions");
		goto out;
	}

	if (ptrace(PTRACE_CONT, pid, NULL, 0) != 0) {
		perror("ptrace cont");
		goto out;
	}

	if (waitpid(pid, &status, 0) != pid) {
		perror("wait for trace");
		goto out;
	}

	if (status >> 8 != (SIGTRAP | (PTRACE_EVENT_SECCOMP<<8))) {
		printf("got bad trap event?\n");
		goto out;
	}

	if (ptrace(PTRACE_GETREGSET, pid, NT_PRSTATUS, &iov) < 0) {
		perror("getregset");
		goto out;
	}

	/*
	 * Tell the syscall to restart. Per include/linux/errno.h this should
	 * only be set when signal_pending() is set. But we just won't send
	 * any signals to the process, and we'll see this in userspace.
	 */
	regs.ax = -512; /* -ERESTARTSYS */

	/*
	 * This makes the this_syscall < 0 check in __seccomp_filter()
	 * trigger, so we skip the syscall and return whatever is in ax
	 */
	regs.orig_ax = -512; /* -ERESTARTSYS */

	if (ptrace(PTRACE_SETREGSET, pid, NT_PRSTATUS, &iov) < 0) {
		perror("setregset");
		goto out;
	}

	if (ptrace(PTRACE_CONT, pid, NULL, 0) < 0) {
		perror("cont after setregset");
		goto out;
	}

	while (1) {
		if (waitpid(pid, &status, 0) != pid) {
			perror("wait for death");
			goto out;
		}

		if (!WIFSTOPPED(status)) {
			break;
		}

		if (ptrace(PTRACE_CONT, pid, NULL, 0) < 0) {
			perror("cont after setregset");
			goto out;
		}
	}

	if (WIFSIGNALED(status)) {
		printf("didn't exit: %d\n", WTERMSIG(status));
		return 1;
	}

	if (WEXITSTATUS(status)) {
		printf("exited: %d\n", WEXITSTATUS(status));
		return 1;
	}

	return 0;

out:
	kill(pid, SIGKILL);
	return 1;
}
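
For anyone wondering why the child prints "-1, errno: 512" rather than a
raw -512: that is just the usual libc post-processing of the value left
in ax. Below is a rough sketch of that step, assuming the common x86-64
convention that raw returns in [-4095, -1] are treated as errors. The
names wrap_raw_return() and show_leak() are made up for illustration;
this is not glibc's actual code.

```c
#include <assert.h>
#include <errno.h>

/* Assumed upper bound of the error range used by the syscall ABI. */
#define MAX_ERRNO 4095

/*
 * Hypothetical stand-in for what a libc syscall wrapper does with the
 * raw register value: anything in [-MAX_ERRNO, -1] becomes errno and
 * the caller sees -1; everything else is passed through unchanged.
 */
long wrap_raw_return(long raw)
{
	if (raw < 0 && raw >= -MAX_ERRNO) {
		errno = (int)-raw;
		return -1;
	}
	return raw;
}

/*
 * Usage: the tracer left -512 (-ERESTARTSYS) in ax and no signal was
 * pending, so nothing in the kernel rewrote it before returning.
 */
void show_leak(void)
{
	errno = 0;
	assert(wrap_raw_return(-512) == -1);
	assert(errno == 512);
}
```

Nothing in this step knows that 512 is a kernel-internal code, so once
the value reaches ax unmodified it shows up in errno like any other
error.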