Date: Tue, 30 Dec 2014 20:45:20 +0100
From: "Michael Kerrisk (man-pages)"
To: Kees Cook
CC: mtk.manpages@gmail.com, Daniel Borkmann, Linux API, "linux-man@vger.kernel.org", lkml, Will Drewry
Subject: Re: Edited seccomp.2 man page for review [v2]

[CC += Will Drewry; Will, maybe you have input on a point below
(search for your name in the message text)]

Kees,

Thanks for the quick response.

On 12/30/2014 06:16 PM, Kees Cook wrote:
> On Tue, Dec 30, 2014 at 4:14 AM, Michael Kerrisk (man-pages)
> wrote:
>> Hi Kees, (and all),
>>
>> Thanks for your comments on the previous draft of the seccomp(2)
>> man page and (once again) my apologies for the slow follow-up.
>>
>> I have done some further editing of the page. Could you check
>> the revised version below. I have added a number of FIXMEs
>> for points where I'd either like you to check new text that I
>> added (in case it contains errors) or where I hope you can
>> provide answers to questions relating to details that may need
>> clarifying in the page.
>>
>> I've appended the revised page at the foot of this mail.
>> You can also find the branch holding this page in Git at:
>> http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_seccomp
>>
>> Notable changes from the previous draft:
>> * Several new error cases added under ERRORS
>> * New subsection on Seccomp-specific BPF details
>> * Add some detail in discussion of 'siginfo_t' fields
>> * Tweaked comments on BPF program in EXAMPLE section
>> * Added various FIXMEs
>>
>> I also have one API quibble, regarding the name of the
>> SYS_SECCOMP constant; see below.
>>
>> Feedback as inline comments to the below would be great!
>>
>> Cheers,
>>
>> Michael
>>
>> .\" Copyright (C) 2014 Kees Cook
>> .\" and Copyright (C) 2012 Will Drewry
>> .\" and Copyright (C) 2008, 2014 Michael Kerrisk
>> .\"
>> .\" %%%LICENSE_START(VERBATIM)
>> .\" Permission is granted to make and distribute verbatim copies of this
>> .\" manual provided the copyright notice and this permission notice are
>> .\" preserved on all copies.
>> .\"
>> .\" Permission is granted to copy and distribute modified versions of this
>> .\" manual under the conditions for verbatim copying, provided that the
>> .\" entire resulting derived work is distributed under the terms of a
>> .\" permission notice identical to this one.
>> .\"
>> .\" Since the Linux kernel and libraries are constantly changing, this
>> .\" manual page may be incorrect or out-of-date. The author(s) assume no
>> .\" responsibility for errors or omissions, or for damages resulting from
>> .\" the use of the information contained herein. The author(s) may not
>> .\" have taken the same level of care in the production of this manual,
>> .\" which is licensed free of charge, as they might when working
>> .\" professionally.
>> .\"
>> .\" Formatted or processed versions of this manual, if unaccompanied by
>> .\" the source, must acknowledge the copyright and authors of this work.
>> .\" %%%LICENSE_END
>> .\"
>> .TH SECCOMP 2 2014-06-23 "Linux" "Linux Programmer's Manual"
>> .SH NAME
>> seccomp \- operate on Secure Computing state of the process
>> .SH SYNOPSIS
>> .nf
>> .B #include
>> .B #include
>> .B #include
>> .B #include
>> .B #include
>> .\" Kees Cook noted: Anything that uses SECCOMP_RET_TRACE returns will
>> .\" need
>>
>> .BI "int seccomp(unsigned int " operation ", unsigned int " flags \
>> ", void *" args );
>> .fi
>> .SH DESCRIPTION
>> The
>> .BR seccomp ()
>> system call operates on the Secure Computing (seccomp) state of the
>> calling process.
>>
>> Currently, Linux supports the following
>> .IR operation
>> values:
>> .TP
>> .BR SECCOMP_SET_MODE_STRICT
>> The only system calls that the calling thread is permitted to make are
>> .BR read (2),
>> .BR write (2),
>> .BR _exit (2),
>> and
>> .BR sigreturn (2).
>> Other system calls result in the delivery of a
>> .BR SIGKILL
>> signal.
>> Strict secure computing mode is useful for number-crunching
>> applications that may need to execute untrusted byte code, perhaps
>> obtained by reading from a pipe or socket.
>>
>> This operation is available only if the kernel is configured with
>> .BR CONFIG_SECCOMP
>> enabled.
>>
>> The value of
>> .IR flags
>> must be 0, and
>> .IR args
>> must be NULL.
>>
>> This operation is functionally identical to the call:
>>
>>     prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
>> .TP
>> .BR SECCOMP_SET_MODE_FILTER
>> The system calls allowed are defined by a pointer to a Berkeley Packet
>> Filter (BPF) passed via
>> .IR args .
>> This argument is a pointer to a
>> .IR "struct\ sock_fprog" ;
>> it can be designed to filter arbitrary system calls and system call
>> arguments.
>> If the filter is invalid,
>> .BR seccomp ()
>> fails, returning
>> .BR EINVAL
>> in
>> .IR errno .
>>
>> If
>> .BR fork (2)
>> or
>> .BR clone (2)
>> is allowed by the filter, any child processes will be constrained to
>> the same system call filters as the parent.
>> If
>> .BR execve (2)
>> is allowed,
>> the existing filters will be preserved across a call to
>> .BR execve (2).
>>
>> In order to use the
>> .BR SECCOMP_SET_MODE_FILTER
>> operation, either the caller must have the
>> .BR CAP_SYS_ADMIN
>> capability, or the thread must already have the
>> .I no_new_privs
>> bit set.
>> If that bit was not already set by an ancestor of this thread,
>> the thread must make the following call:
>>
>>     prctl(PR_SET_NO_NEW_PRIVS, 1);
>>
>> Otherwise, the
>> .BR SECCOMP_SET_MODE_FILTER
>> operation will fail and return
>> .BR EACCES
>> in
>> .IR errno .
>> This requirement ensures that an unprivileged process cannot apply
>> a malicious filter and then invoke a set-user-ID or
>> other privileged program using
>> .BR execve (2),
>> thus potentially compromising that program.
>> (Such a malicious filter might, for example, cause an attempt to use
>> .BR setuid (2)
>> to set the caller's user IDs to non-zero values to instead
>> return 0 without actually making the system call.
>> Thus, the program might be tricked into retaining superuser privileges
>> in circumstances where it is possible to influence it to do
>> dangerous things because it did not actually drop privileges.)
>>
>> If
>> .BR prctl (2)
>> or
>> .BR seccomp (2)
>> is allowed by the attached filter, further filters may be added.
>> This will increase evaluation time, but allows for further reduction of
>> the attack surface during execution of a thread.
>>
>> The
>> .BR SECCOMP_SET_MODE_FILTER
>> operation is available only if the kernel is configured with
>> .BR CONFIG_SECCOMP_FILTER
>> enabled.
>>
>> When
>> .IR flags
>> is 0, this operation is functionally identical to the call:
>>
>>     prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, args);
>>
>> The recognized
>> .IR flags
>> are:
>> .RS
>> .TP
>> .BR SECCOMP_FILTER_FLAG_TSYNC
>> When adding a new filter, synchronize all other threads of the calling
>> process to the same seccomp filter tree.
>> A "filter tree" is the ordered list of filters attached to a thread.
>> (Attaching identical filters in separate
>> .BR seccomp ()
>> calls results in different filters from this perspective.)
>>
>> If any thread cannot synchronize to the same filter tree,
>> the call will not attach the new seccomp filter,
>> and will fail, returning the first thread ID found that cannot synchronize.
>> Synchronization will fail if another thread in the same process is in
>> .BR SECCOMP_MODE_STRICT
>> or if it has attached new seccomp filters to itself,
>> diverging from the calling thread's filter tree.
>> .RE
>> .SS Filters
>> When adding filters via
>> .BR SECCOMP_SET_MODE_FILTER ,
>> .IR args
>> points to a filter program:
>>
>> .in +4n
>> .nf
>> struct sock_fprog {
>>     unsigned short      len;    /* Number of BPF instructions */
>>     struct sock_filter *filter; /* Pointer to array of
>>                                    BPF instructions */
>> };
>> .fi
>> .in
>>
>> Each program must contain one or more BPF instructions:
>>
>> .in +4n
>> .nf
>> struct sock_filter {            /* Filter block */
>>     __u16 code;                 /* Actual filter code */
>>     __u8  jt;                   /* Jump true */
>>     __u8  jf;                   /* Jump false */
>>     __u32 k;                    /* Generic multiuse field */
>> };
>> .fi
>> .in
>>
>> .\" FIXME I reworded/enhanced the following sentence. Is it okay?
>> When executing the instructions, the BPF program operates on the
>> system call information made available (i.e., use the
>> .BR BPF_ABS
>> addressing mode) as a buffer of the following form:
>
> That looks correct to me, yes.

Okay. Thanks.

>> .in +4n
>> .nf
>> struct seccomp_data {
>>     int   nr;                   /* System call number */
>>     __u32 arch;                 /* AUDIT_ARCH_* value
>>                                    (see ) */
>>     __u64 instruction_pointer;  /* CPU instruction pointer */
>>     __u64 args[6];              /* Up to 6 system call arguments */
>> };
>> .fi
>> .in
>>
>> A seccomp filter returns a 32-bit value consisting of two parts:
>> the most significant 16 bits
>> (corresponding to the mask defined by the constant
>> .BR SECCOMP_RET_ACTION )
>> contain one of the "action" values listed below;
>> the least significant 16 bits (defined by the constant
>> .BR SECCOMP_RET_DATA )
>> are "data" to be associated with this return value.
>>
>> If multiple filters exist, they are all executed,
>> in reverse order of their addition to the filter tree
>> (i.e., the most recently installed filter is executed first).
>> The return value for the evaluation of a given system call is the first-seen
>> .BR SECCOMP_RET_ACTION
>> value of highest precedence (along with its accompanying data)
>> returned by execution of all of the filters.
>>
>> In decreasing order of precedence,
>> the values that may be returned by a seccomp filter are:
>> .TP
>> .BR SECCOMP_RET_KILL
>> This value results in the process exiting immediately
>> without executing the system call.
>> The process terminates as though killed by a
>> .B SIGSYS
>> signal
>> .RI ( not
>> .BR SIGKILL ).
>> .TP
>> .BR SECCOMP_RET_TRAP
>> This value results in the kernel sending a
>> .BR SIGSYS
>> signal to the triggering process without executing the system call.
>> Various fields will be set in the
>> .I siginfo_t
>> structure (see
>> .BR sigaction (2))
>> associated with the signal:
>> .RS
>> .IP * 3
>> .I si_signo
>> will contain
>> .BR SIGSYS .
>> .IP *
>> .IR si_call_addr
>> will show the address of the system call instruction.
>> .IP *
>> .IR si_syscall
>> and
>> .IR si_arch
>> will indicate which system call was attempted.
>> .IP *
>> .I si_code
>> .\" FIXME Why is the constant thus named?
>> .\" All of the other 'si_code'
>> .\" constants are prefixed 'SI_'. Why the inconsistency?
>> will contain
>> .BR SYS_SECCOMP .
>
> Only certain reserved values have the SI_ prefix. All the
> signal-specific values have their signal name as the prefix. See ILL_*
> FPE_* SEGV_* BUS_* TRAP_* CLD_* POLL_* and SYS_*. I see these in
> /usr/include/asm-generic/siginfo.h

Ahh -- yes, of course. Thanks.

>> .IP *
>> .I si_errno
>> will contain the
>> .BR SECCOMP_RET_DATA
>> portion of the filter return value.
>> .RE
>> .IP
>> The program counter will be as though the system call happened
>> (i.e., it will not point to the system call instruction).
>> The return value register will contain an architecture\-dependent value;
>> if resuming execution, set it to something sensible.
>> .\" FIXME Regarding the preceding line, can you give an example(s)
>> .\" of "something sensible"? (Depending on the answer, maybe it
>> .\" might be useful to add some text on this point.)
>
> This means sensible in the context of the syscall made, or the desired
> behavior. For example, setting the return value to ELOOP for something
> like a "bind" syscall isn't very sensible.

Okay -- I did s/sensible/appropriate for the system call/

>> .\"
>> .\" FIXME Please check:
>> .\" In an attempt to make the text clearer, I changed
>> .\" "replacing it with" to "setting the return value register to"
>> .\" Okay?
>> (The architecture dependency is because setting the return value register to
>> .BR ENOSYS
>> could overwrite some useful information.)
>
> Well, the arch dependency is really because _how_ to change the
> register, and the register itself, is different between architectures.
> (i.e. which ptrace call is needed, and which register is being
> changed.) The overwriting of useful information is certainly true too,
> though.

So, revert to the previous wording? Or do you have a suggested
better wording?

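As an aside, the action/data split of the filter return value described
above (and hence what ends up in si_errno for SECCOMP_RET_TRAP) can be
sketched as follows. This is only an illustration: the helper names are
hypothetical and are not part of the page or of any kernel API, and the
tie-breaking rule in ret_pick() assumes that higher precedence corresponds
to a numerically lower action value. Only the SECCOMP_RET_* constants come
from <linux/seccomp.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/seccomp.h>   /* SECCOMP_RET_* constants */

/* Hypothetical helper: combine an action with 16 bits of data. */
static uint32_t ret_compose(uint32_t action, uint32_t data)
{
    assert((action & SECCOMP_RET_DATA) == 0);  /* action carries no data bits */
    return action | (data & SECCOMP_RET_DATA);
}

/* Hypothetical helpers: split a filter return value into its two parts. */
static uint32_t ret_action(uint32_t ret) { return ret & SECCOMP_RET_ACTION; }
static uint32_t ret_data(uint32_t ret)   { return ret & SECCOMP_RET_DATA; }

/*
 * Hypothetical helper: choose between two filter results.
 * Assumption: the numerically lower action has higher precedence,
 * and on a tie the first-seen value ('cur') is kept.
 */
static uint32_t ret_pick(uint32_t cur, uint32_t next)
{
    return ret_action(next) < ret_action(cur) ? next : cur;
}
```

With these helpers, ret_data(ret_compose(SECCOMP_RET_TRAP, 42)) yields 42,
which is the value a SIGSYS handler would then expect to find in si_errno.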
>> .TP
>> .BR SECCOMP_RET_ERRNO
>> This value results in the
>> .B SECCOMP_RET_DATA
>> portion of the filter's return value being passed to user space as the
>> .IR errno
>> value without executing the system call.
>> .TP
>> .BR SECCOMP_RET_TRACE
>> When returned, this value will cause the kernel to attempt to notify a
>> .BR ptrace (2)-based
>> tracer prior to executing the system call.
>> If there is no tracer present,
>> the system call is not executed and returns a failure status with
>> .I errno
>> set to
>> .BR ENOSYS .
>>
>> A tracer will be notified if it requests
>> .BR PTRACE_O_TRACESECCOMP
>> using
>> .IR ptrace(PTRACE_SETOPTIONS) .
>> The tracer will be notified of a
>> .BR PTRACE_EVENT_SECCOMP
>> and the
>> .BR SECCOMP_RET_DATA
>> portion of the filter's return value will be available to the tracer via
>> .BR PTRACE_GETEVENTMSG .
>>
>> The tracer can skip the system call by changing the system call number
>> to \-1.
>> Alternatively, the tracer can change the system call
>> requested by changing the system call to a valid system call number.
>> If the tracer asks to skip the system call, then the system call will
>> appear to return the value that the tracer puts in the return value register.
>>
>> The seccomp check will not be run again after the tracer is notified.
>> (This means that seccomp-based sandboxes
>> .B "must not"
>> allow use of
>> .BR ptrace (2)\(emeven
>> of other
>> sandboxed processes\(emwithout extreme care;
>> .\" FIXME Below, I think it would be helpful to add some words after
>> .\" "to escape", as in "to escape [what?]" I suppose the wording
>> .\" would be something like "to escape the seccomp sandbox mechanism"
>> .\" but perhaps you have a better wording.
>> ptracers can use this mechanism to escape.)
>
> Yeah, that could be further clarified to "... use this mechanism to
> escape from the seccomp sandbox." How does that sound?

Good. Changed.

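For illustration, here is a minimal sketch of how a tracer might recognize
the PTRACE_EVENT_SECCOMP notification mentioned above in a waitpid(2)
status, using the status >> 8 convention documented in ptrace(2). The
helper names are hypothetical, and the synthetic status word (with its
0x7f "stopped" low byte) is an assumption made purely so that the sketch
can be exercised without a tracee. A real tracer would take the status
from waitpid(2) and then fetch the SECCOMP_RET_DATA value with
PTRACE_GETEVENTMSG.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/ptrace.h>   /* PTRACE_EVENT_SECCOMP, PTRACE_EVENT_EXEC */
#include <sys/wait.h>     /* WIFSTOPPED() */

/* Hypothetical helper: does this wait status report a
   PTRACE_EVENT_SECCOMP stop? */
static bool is_seccomp_event_stop(int status)
{
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8));
}

/* Hypothetical helper: build a synthetic "stopped by ptrace event"
   status word; for demonstration only. */
static int synthetic_event_status(int event)
{
    return ((SIGTRAP | (event << 8)) << 8) | 0x7f;
}

/* Demonstration only: no tracee is involved. */
static void seccomp_stop_demo(void)
{
    assert(is_seccomp_event_stop(
               synthetic_event_status(PTRACE_EVENT_SECCOMP)));
    assert(!is_seccomp_event_stop(
               synthetic_event_status(PTRACE_EVENT_EXEC)));
}
```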
>> .TP
>> .BR SECCOMP_RET_ALLOW
>> This value results in the system call being executed.
>> .SH RETURN VALUE
>> On success,
>> .BR seccomp ()
>> returns 0.
>> On error, if
>> .BR SECCOMP_FILTER_FLAG_TSYNC
>> was used,
>> the return value is the ID of the thread
>> that caused the synchronization failure.
>> (This ID is a kernel thread ID of the type returned by
>> .BR clone (2)
>> and
>> .BR gettid (2).)
>> On other errors, \-1 is returned, and
>> .IR errno
>> is set to indicate the cause of the error.
>> .SH ERRORS
>> .BR seccomp ()
>> can fail for the following reasons:
>> .TP
>> .BR EACCES
>> The caller did not have the
>> .BR CAP_SYS_ADMIN
>> capability, or had not set
>> .IR no_new_privs
>> before using
>> .BR SECCOMP_SET_MODE_FILTER .
>> .TP
>> .BR EFAULT
>> .IR args
>> was not a valid address.
>> .TP
>> .BR EINVAL
>> .IR operation
>> is unknown; or
>> .IR flags
>> are invalid for the given
>> .IR operation .
>> .\" FIXME Please review the following
>> .TP
>> .BR EINVAL
>> .I operation
>> included
>> .BR BPF_ABS ,
>> but the specified offset was not aligned to a 32-bit boundary or exceeded
>> .IR "sizeof(struct\ seccomp_data)" .
>> .\" FIXME Please review the following
>> .TP
>> .BR EINVAL
>> .\" See kernel/seccomp.c::seccomp_may_assign_mode() in 3.18 sources
>> A secure computing mode has already been set, and
>> .I operation
>> differs from the existing setting.
>> .\" FIXME Please review the following
>> .TP
>> .BR EINVAL
>> .\" See stub kernel/seccomp.c::seccomp_set_mode_filter() in 3.18 sources
>> .I operation
>> specified
>> .BR SECCOMP_SET_MODE_FILTER ,
>> but the kernel was not built with
>> .B CONFIG_SECCOMP_FILTER
>> enabled.
>> .\" FIXME Please review the following
>> .TP
>> .BR EINVAL
>> .I operation
>> specified
>> .BR SECCOMP_SET_MODE_FILTER ,
>> but the filter program pointed to by
>> .I args
>> was not valid or the length of the filter program was zero or exceeded
>> .B BPF_MAXINSNS
>> (4096) instructions.
>> .BR EINVAL
>> .TP
>> .BR ENOMEM
>> Out of memory.
>> .\" FIXME Please review the following
>> .TP
>> .BR ENOMEM
>> .\" ENOMEM in kernel/seccomp.c::seccomp_attach_filter() in 3.18 sources
>> The total length of all filter programs attached
>> to the calling thread would exceed
>> .B MAX_INSNS_PER_PATH
>> (32768) instructions.
>> Note that for the purposes of calculating this limit,
>> each already existing filter program incurs an
>> overhead penalty of 4 instructions.
>> .TP
>> .BR ESRCH
>> Another thread caused a failure during thread sync, but its ID could not
>> be determined.
>> .SH VERSIONS
>> The
>> .BR seccomp()
>> system call first appeared in Linux 3.17.
>> .\" FIXME . Add glibc version
>> .SH CONFORMING TO
>> The
>> .BR seccomp()
>> system call is a nonstandard Linux extension.
>> .SH NOTES
>> .BR seccomp ()
>> provides a superset of the functionality provided by the
>> .BR prctl (2)
>> .BR PR_SET_SECCOMP
>> operation (which does not support
>> .IR flags ).
>> .\" FIXME Please review the following new subsection {{{
>> .SS Seccomp-specific BPF details
>> Note the following BPF details specific to seccomp filters:
>> .IP * 3
>> The
>> .B BPF_H
>> and
>> .B BPF_B
>> size modifiers are not supported: all operations must load and store
>> (4-byte) words
>> .RB ( BPF_W ).
>> .IP *
>> To access the contents of the
>> .I seccomp_data
>> buffer, use the
>> .B BPF_ABS
>> addressing mode modifier.
>> .\" FIXME What is the significance of the line
>> .\" ftest->code = BPF_LDX | BPF_W | BPF_ABS;
>> .\" in kernel/seccomp.c::seccomp_check_filter()?
>
> This is converting an accumulator load (BPF_LD) into a index load
> (BPF_LDX). I think this is to avoid addressing modes 1 and 2, but Will
> may remember more here. The LD|W|ABS structure is very common, so I
> think this was a way to accept that in the filter, but change it into
> a more limited command.

Will, could you comment?

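To make the constraints on BPF_ABS loads concrete (word-sized loads only,
at offsets aligned to a 32-bit boundary and inside 'struct seccomp_data',
per the EINVAL case in the draft), here is a minimal sketch. The helper
names are hypothetical and are not kernel or libc APIs. Treating an offset
equal to the structure size as invalid is my reading of "exceeded", since
a 4-byte load there would run past the buffer. The sketch says nothing
about how the kernel rewrites the instruction internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <linux/filter.h>    /* struct sock_filter, BPF_STMT, BPF_* */
#include <linux/seccomp.h>   /* struct seccomp_data */

/* Hypothetical helper: is 'off' usable for a BPF_W | BPF_ABS load
   from the seccomp_data buffer? */
static bool seccomp_abs_offset_ok(size_t off)
{
    return off % 4 == 0 && off < sizeof(struct seccomp_data);
}

/* Hypothetical helper: build a 32-bit accumulator load from the
   seccomp_data buffer at byte offset 'off'. */
static struct sock_filter seccomp_load_word(size_t off)
{
    assert(seccomp_abs_offset_ok(off));
    struct sock_filter insn = BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (__u32) off);
    return insn;
}
```

For example, seccomp_load_word(offsetof(struct seccomp_data, arch)) gives
the same instruction as step [0] of the example program in the page.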
>> .IP *
>> The
>> .B BPF_LEN
>> addressing mode modifier yields an immediate mode operand
>> whose value is the size of the
>> .IR seccomp_data
>> buffer.
>> .\" FIXME Any other seccomp-specific BPF details that should be added here?
>> .\"
>> .\" FIXME End of new subsection for review }}}
>
> All the rest of the FIXMEs above (excepting the standing glibc one)
> looks correct to me.

Okay. Thanks.

>> .SH EXAMPLE
>> The program below accepts four or more arguments.
>> The first three arguments are a system call number,
>> a numeric architecture identifier, and an error number.
>> The program uses these values to construct a BPF filter
>> that is used at run time to perform the following checks:
>> .IP [1] 4
>> If the program is not running on the specified architecture,
>> the BPF filter causes system calls to fail with the error
>> .BR ENOSYS .
>> .IP [2]
>> If the program attempts to execute the system call with the specified number,
>> the BPF filter causes the system call to fail, with
>> .I errno
>> being set to the specified error number.
>> .PP
>> The remaining command-line arguments specify
>> the pathname and additional arguments of a program
>> that the example program should attempt to execute using
>> .BR execv (3)
>> (a library function that employs the
>> .BR execve (2)
>> system call).
>> Some example runs of the program are shown below.
>>
>> First, we display the architecture that we are running on (x86-64)
>> and then construct a shell function that looks up system call
>> numbers on this architecture:
>>
>> .nf
>> .in +4n
>> $ \fBuname -m\fP
>> x86_64
>> $ \fBsyscall_nr() {
>>     cat /usr/src/linux/arch/x86/syscalls/syscall_64.tbl | \\
>>     awk '$2 != "x32" && $3 == "'$1'" { print $1 }'
>> }\fP
>> .in
>> .fi
>>
>> When the BPF filter rejects a system call (case [2] above),
>> it causes the system call to fail with the error number
>> specified on the command line.
>> In the experiments shown here, we'll use error number 99:
>>
>> .nf
>> .in +4n
>> $ \fBerrno 99\fP
>> EADDRNOTAVAIL 99 Cannot assign requested address
>> .in
>> .fi
>>
>> In the following example, we attempt to run the command
>> .BR whoami (1),
>> but the BPF filter rejects the
>> .BR execve (2)
>> system call, so that the command is not even executed:
>>
>> .nf
>> .in +4n
>> $ \fBsyscall_nr execve\fP
>> 59
>> $ \fB./a.out\fP
>> Usage: ./a.out []
>> Hint for : AUDIT_ARCH_I386: 0x40000003
>>            AUDIT_ARCH_X86_64: 0xC000003E
>> $ \fB./a.out 59 0xC000003E 99 /bin/whoami\fP
>> execv: Cannot assign requested address
>> .in
>> .fi
>>
>> In the next example, the BPF filter rejects the
>> .BR write (2)
>> system call, so that, although it is successfully started, the
>> .BR whoami (1)
>> command is not able to write output:
>>
>> .nf
>> .in +4n
>> $ \fBsyscall_nr write\fP
>> 1
>> $ \fB./a.out 1 0xC000003E 99 /bin/whoami\fP
>> .in
>> .fi
>>
>> In the final example,
>> the BPF filter rejects a system call that is not used by the
>> .BR whoami (1)
>> command, so it is able to successfully execute and produce output:
>>
>> .nf
>> .in +4n
>> $ \fBsyscall_nr preadv\fP
>> 295
>> $ \fB./a.out 295 0xC000003E 99 /bin/whoami\fP
>> cecilia
>> .in
>> .fi
>> .SS Program source
>> .fi
>> .nf
>> #include
>> #include
>> #include
>> #include
>> #include
>> #include
>> #include
>> #include
>> #include
>>
>> static int
>> install_filter(int syscall_nr, int t_arch, int f_errno)
>> {
>> .\" FIXME In the BPF program below, you use '+' to build the instructions.
>> .\" However, most other BPF example code I see uses '|'. While I
>> .\" assume it's equivalent (i.e., the bit fields are nonoverlapping),
>> .\" was there a reason to use '+' rather than '|'? (To me, the
>> .\" latter is a little clearer in its intent.)
>
> Ah, no, "|" should be used, good catch.

Okay -- all instances of '+' changed to '|'

>> .\"
>> .\" FIXME I expanded comments [0], [1], [2], [3], [4] a little.
>> .\" Are they okay?
>> .\" */
>
> Yup, these look good to me.

Okay. Thanks.

>
>> .\"
>>     struct sock_filter filter[] = {
>>         /* [0] Load architecture from 'seccomp_data' buffer into
>>                accumulator */
>>         BPF_STMT(BPF_LD + BPF_W + BPF_ABS,
>>                  (offsetof(struct seccomp_data, arch))),
>>
>>         /* [1] Jump forward 4 instructions if architecture does not
>>                match 't_arch' */
>>         BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, t_arch, 0, 4),
>>
>>         /* [2] Load system call number from 'seccomp_data' buffer into
>>                accumulator */
>>         BPF_STMT(BPF_LD + BPF_W + BPF_ABS,
>>                  (offsetof(struct seccomp_data, nr))),
>>
>>         /* [3] Jump forward 1 instruction if system call number
>>                does not match 'syscall_nr' */
>>         BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, syscall_nr, 0, 1),
>>
>>         /* [4] Matching architecture and system call: don't execute
>>                the system call, and return 'f_errno' in 'errno' */
>>         BPF_STMT(BPF_RET + BPF_K,
>>                  SECCOMP_RET_ERRNO | (f_errno & SECCOMP_RET_DATA)),
>>
>>         /* [5] Destination of system call number mismatch: allow other
>>                system calls */
>>         BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),
>>
>>         /* [6] Destination of architecture mismatch: kill process */
>>         BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_KILL),
>>     };
>>
>>     struct sock_fprog prog = {
>>         .len = (unsigned short) (sizeof(filter) / sizeof(filter[0])),
>>         .filter = filter,
>>     };
>>
>>     if (seccomp(SECCOMP_SET_MODE_FILTER, 0, &prog)) {
>>         perror("seccomp");
>>         return 1;
>>     }
>>
>>     return 0;
>> }
>>
>> int
>> main(int argc, char **argv)
>> {
>>     if (argc < 5) {
>>         fprintf(stderr, "Usage: "
>>                 "%s []\\n"
>>                 "Hint for : AUDIT_ARCH_I386: 0x%X\\n"
>>                 "           AUDIT_ARCH_X86_64: 0x%X\\n"
>>                 "\\n", argv[0], AUDIT_ARCH_I386, AUDIT_ARCH_X86_64);
>>         exit(EXIT_FAILURE);
>>     }
>>
>>     if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
>>         perror("prctl");
>>         exit(EXIT_FAILURE);
>>     }
>>
>>     if (install_filter(strtol(argv[1], NULL, 0),
>>                        strtol(argv[2], NULL, 0),
>>                        strtol(argv[3], NULL, 0)))
>>         exit(EXIT_FAILURE);
>>
>>     execv(argv[4], &argv[4]);
>>     perror("execv");
>>     exit(EXIT_FAILURE);
>> }
>> .fi
>> .SH SEE ALSO
>> .BR prctl (2),
>> .BR ptrace (2),
>> .BR signal (7),
>> .BR socket (7)
>> .sp
>> The kernel source files
>> .IR Documentation/networking/filter.txt
>> and
>> .IR Documentation/prctl/seccomp_filter.txt .
>> .sp
>> McCanne, S. and Jacobson, V. (1992)
>> .IR "The BSD Packet Filter: A New Architecture for User-level Packet Capture" ,
>> Proceedings of the USENIX Winter 1993 Conference
>> .UR http://www.tcpdump.org/papers/bpf-usenix93.pdf
>> .UE
>
> Thanks for the additional details and clarifications!

Thanks. We're getting close now.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/