From: Alban Crequy
To: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    containers@lists.linux-foundation.org, cgroups@vger.kernel.org
Cc: Alban Crequy
Subject: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino
Date: Sun, 13 May 2018 19:33:18 +0200
Message-Id: <20180513173318.21680-1-alban@kinvolk.io>

From: Alban Crequy

bpf_get_current_cgroup_ino() allows BPF trace programs to get the inode
of the cgroup where the current process resides.

My use case is to get statistics about syscalls done by a specific
Kubernetes container. I have a tracepoint on raw_syscalls/sys_enter and
a BPF map containing the cgroup inode that I want to trace. I use
bpf_get_current_cgroup_ino() and I quickly return from the tracepoint if
the inode is not in the BPF hash map.

Without this BPF helper, I would need to keep track of all pids in the
container. The Netlink proc connector can be used to follow process
creation and destruction but it is racy.

This patch only looks at the memory cgroup, which was enough for me
since each Kubernetes container is placed in a different mem cgroup.
For a generic implementation, I'm not sure how to proceed: it seems I
would need to use 'for_each_root(root)' (see example in
proc_cgroup_show() from kernel/cgroup/cgroup.c) but I don't know if
taking the cgroup mutex is possible in the BPF helper function. It might
be ok in the tracepoint raw_syscalls/sys_enter but could the mutex
already be taken in some other tracepoints?

Signed-off-by: Alban Crequy
---
 include/uapi/linux/bpf.h | 11 ++++++++++-
 kernel/trace/bpf_trace.c | 25 +++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c5ec89732a8d..38ac3959cdf3 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -755,6 +755,14 @@ union bpf_attr {
  *     @addr: pointer to struct sockaddr to bind socket to
  *     @addr_len: length of sockaddr structure
  *     Return: 0 on success or negative error code
+ *
+ * u64 bpf_get_current_cgroup_ino(hierarchy, flags)
+ *     Get the cgroup{1,2} inode of current task under the specified hierarchy.
+ *     @hierarchy: cgroup hierarchy
+ *     @flags: reserved for future use
+ *     Return:
+ *       == 0 error
+ *        > 0 inode of the cgroup
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -821,7 +829,8 @@ union bpf_attr {
 	FN(msg_apply_bytes),		\
 	FN(msg_cork_bytes),		\
 	FN(msg_pull_data),		\
-	FN(bind),
+	FN(bind),			\
+	FN(get_current_cgroup_ino),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 56ba0f2a01db..9bf92a786639 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -524,6 +524,29 @@ static const struct bpf_func_proto bpf_probe_read_str_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_2(bpf_get_current_cgroup_ino, u32, hierarchy, u64, flags)
+{
+	// TODO: pick the correct hierarchy instead of the mem controller
+	struct cgroup *cgrp = task_cgroup(current, memory_cgrp_id);
+
+	if (unlikely(!cgrp))
+		return -EINVAL;
+	if (unlikely(hierarchy))
+		return -EINVAL;
+	if (unlikely(flags))
+		return -EINVAL;
+
+	return cgrp->kn->id.ino;
+}
+
+static const struct bpf_func_proto bpf_get_current_cgroup_ino_proto = {
+	.func           = bpf_get_current_cgroup_ino,
+	.gpl_only       = false,
+	.ret_type       = RET_INTEGER,
+	.arg1_type      = ARG_DONTCARE,
+	.arg2_type      = ARG_DONTCARE,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -564,6 +587,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_prandom_u32_proto;
 	case BPF_FUNC_probe_read_str:
 		return &bpf_probe_read_str_proto;
+	case BPF_FUNC_get_current_cgroup_ino:
+		return &bpf_get_current_cgroup_ino_proto;
 	default:
 		return NULL;
 	}
-- 
2.14.3