From: Alexei Starovoitov
To: "David S. Miller"
Subject: [PATCH v3 net-next 0/2] bpfilter
Date: Mon, 21 May 2018 19:22:28 -0700
Message-ID: <20180522022230.2492505-1-ast@kernel.org>
X-Mailer: git-send-email 2.9.5
X-Mailing-List: linux-kernel@vger.kernel.org

Hi All,

v2->v3:
- followed Luis's suggestion and significantly simplified the first patch
  with shmem_kernel_file_setup+kernel_write. Added kdoc for the new helper
- fixed typos and fixed a race in pipe access by adding a mutex
- tested with bpfilter being 'builtin'. CONFIG_BPFILTER_UMH=y|m both work.
  Interesting to see a usermode executable being embedded inside vmlinux.
- it doesn't hurt to enable bpfilter in .config.
  ip_setsockopt commands are sent to usermode via pipes and -ENOPROTOOPT is
  returned from userspace, so the kernel falls back to the original
  iptables code

v1->v2:
this patch set is almost a full rewrite of the earlier umh modules approach.
The v1 of patches and follow up discussion was covered by LWN:
https://lwn.net/Articles/749108/

I believe the v2 addresses all issues brought up by Andy and others.
Mainly there are zero changes to kernel/module.c
Instead of teaching module loading logic to recognize a special
umh module, let a normal kernel module execute part of its own
.init.rodata as a new user space process (Andy's idea)
Patch 1 introduces this new helper:
int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
Input:
  data + len == executable file
Output:
  struct umh_info {
       struct file *pipe_to_umh;
       struct file *pipe_from_umh;
       pid_t pid;
  };

Advantages vs v1:
- the embedded user mode executable is stored as .init.rodata inside
  normal kernel module.
  These pages are freed when .ko finishes loading
- the elf file is copied into a tmpfs file. The user mode process is swappable.
- the communication between the user mode process and the 'parent' kernel
  module is done via two unix pipes, hence the protocol is not exposed to
  user space
- impossible to launch umh on its own (that was the main issue of v1)
  and impossible to be man-in-the-middle due to pipes
- bpfilter.ko consists of a tiny kernel part that passes the data
  between kernel and umh via pipes and a much bigger umh part that
  does all the work
- 'lsmod' shows bpfilter.ko as usual.
  'rmmod bpfilter' removes the kernel module and kills the corresponding umh
- signed bpfilter.ko covers the whole image including umh code

Few issues:
- the user can still attach to the process and debug it with
  'gdb /proc/pid/exe pid', but 'gdb -p pid' doesn't work.
  (a bit worse compared to v1)
- tinyconfig will notice a small increase in .text
  +766 | TEXT | 7c8b94806bec umh: introduce fork_usermode_blob() helper

Alexei Starovoitov (2):
  umh: introduce fork_usermode_blob() helper
  net: add skeleton of bpfilter kernel module

 fs/exec.c                     |  38 ++++++++++---
 include/linux/binfmts.h       |   1 +
 include/linux/bpfilter.h      |  15 +++++
 include/linux/umh.h           |  12 ++++
 include/uapi/linux/bpfilter.h |  21 +++++++
 kernel/umh.c                  | 125 +++++++++++++++++++++++++++++++++++++++++-
 net/Kconfig                   |   2 +
 net/Makefile                  |   1 +
 net/bpfilter/Kconfig          |  16 ++++++
 net/bpfilter/Makefile         |  30 ++++++++++
 net/bpfilter/bpfilter_kern.c  | 111 +++++++++++++++++++++++++++++++++++++
 net/bpfilter/main.c           |  63 +++++++++++++++++++++
 net/bpfilter/msgfmt.h         |  17 ++++++
 net/ipv4/Makefile             |   2 +
 net/ipv4/bpfilter/Makefile    |   2 +
 net/ipv4/bpfilter/sockopt.c   |  42 ++++++++++++++
 net/ipv4/ip_sockglue.c        |  17 ++++++
 17 files changed, 503 insertions(+), 12 deletions(-)
 create mode 100644 include/linux/bpfilter.h
 create mode 100644 include/uapi/linux/bpfilter.h
 create mode 100644 net/bpfilter/Kconfig
 create mode 100644 net/bpfilter/Makefile
 create mode 100644 net/bpfilter/bpfilter_kern.c
 create mode 100644 net/bpfilter/main.c
 create mode 100644 net/bpfilter/msgfmt.h
 create mode 100644 net/ipv4/bpfilter/Makefile
 create mode 100644 net/ipv4/bpfilter/sockopt.c

-- 
2.9.5
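
For readers who want to see the two-pipe flow in isolation (the parent
forwards a sockopt command, the helper answers -ENOPROTOOPT, the parent falls
back to the legacy path), here is a rough userspace analogy. It is only a
sketch: fork() and pipe() stand in for fork_usermode_blob() and the
kernel-side pipe files, and every demo_* name, the message layout and the
command number are assumptions made for illustration, not taken from the
patches.

```c
/* Userspace analogy only; all demo_* names and layouts are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct demo_request { int32_t is_set; int32_t cmd; };	/* assumed layout */
struct demo_reply { int32_t status; };			/* assumed layout */

/* Loosely mirrors struct umh_info, with fds instead of struct file *. */
struct demo_umh { int to_umh; int from_umh; pid_t pid; };

/* Helper side: answer every request with -ENOPROTOOPT until EOF. */
void demo_helper_loop(int in_fd, int out_fd)
{
	struct demo_request req;
	struct demo_reply rep;

	while (read(in_fd, &req, sizeof(req)) == (ssize_t)sizeof(req)) {
		rep.status = -ENOPROTOOPT;
		if (write(out_fd, &rep, sizeof(rep)) != (ssize_t)sizeof(rep))
			break;
	}
	_exit(0);
}

/* Parent side: create both pipes, fork the helper, keep our pipe ends. */
int demo_start_helper(struct demo_umh *info)
{
	int to_child[2], from_child[2], err;
	pid_t pid;

	assert(info != NULL);
	if (pipe(to_child) < 0)
		return -errno;
	if (pipe(from_child) < 0) {
		err = -errno;
		close(to_child[0]);
		close(to_child[1]);
		return err;
	}
	pid = fork();
	if (pid < 0) {
		err = -errno;
		close(to_child[0]);
		close(to_child[1]);
		close(from_child[0]);
		close(from_child[1]);
		return err;
	}
	if (pid == 0) {
		/* Child keeps only its read end and its write end. */
		close(to_child[1]);
		close(from_child[0]);
		demo_helper_loop(to_child[0], from_child[1]); /* no return */
	}
	close(to_child[0]);
	close(from_child[1]);
	info->to_umh = to_child[1];
	info->from_umh = from_child[0];
	info->pid = pid;
	return 0;
}

/* Send one request and wait for the helper's status. */
int demo_send(struct demo_umh *info, int cmd, int is_set)
{
	struct demo_request req = { .is_set = is_set, .cmd = cmd };
	struct demo_reply rep;

	if (write(info->to_umh, &req, sizeof(req)) != (ssize_t)sizeof(req))
		return -EIO;
	if (read(info->from_umh, &rep, sizeof(rep)) != (ssize_t)sizeof(rep))
		return -EIO;
	return rep.status;
}

/* Returns 1 when the caller should fall back to the legacy code path. */
int demo_setsockopt(struct demo_umh *info, int cmd)
{
	int err = demo_send(info, cmd, 1);

	return err == -ENOPROTOOPT ? 1 : err;
}

/* Closing the write end gives the helper EOF, so it exits; then reap it. */
void demo_stop_helper(struct demo_umh *info)
{
	close(info->to_umh);
	close(info->from_umh);
	waitpid(info->pid, NULL, 0);
}
```

A caller would run demo_start_helper() once, route each command through
demo_setsockopt() and take the legacy path whenever it returns 1, then call
demo_stop_helper() on teardown, which is roughly the role 'rmmod bpfilter'
plays for the real umh.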