Syzkaller reports the following crash:
kernel BUG at include/linux/skbuff.h:2311!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 2 PID: 4615 Comm: syz-executor260 Not tainted 5.10.152-syzkaller #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__skb_pull include/linux/skbuff.h:2311 [inline]
RIP: 0010:__ip_make_skb.cold+0x5b/0x5d net/ipv4/ip_output.c:1507
Code: 79 a2 f9 8b 44 24 3c 89 ee 48 c7 c7 00 2e de 88 41 89 45 70 e8 d1 84 de ff 31 d2 4c 89 ee 48 c7 c7 40 2e de 88 e8 a1 26 ff ff <0f> 0b e8 63 79 a2 f9 e8 4e f7 e2 f9 48 c7 c7 40 39 de 88 e8 5e c1
RSP: 0018:ffff88801e0af698 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88814c408288 RCX: ffffffff87ce00ec
RDX: ffff888024a11ac0 RSI: ffffffff87ceccc6 RDI: 0000000000000001
RBP: 0000000000000028 R08: 000000000000003b R09: ffff8880b8438ba7
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8881436a9cc0
R13: ffff8881436a9cc0 R14: ffff8881436a9e00 R15: dffffc0000000000
FS: 00007f2750eb7700(0000) GS:ffff8880b8500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4438a29010 CR3: 0000000021f58000 CR4: 0000000000350ee0
Call Trace:
ip_finish_skb include/net/ip.h:241 [inline]
udp_push_pending_frames net/ipv4/udp.c:979 [inline]
udp_sendpage+0x36e/0x570 net/ipv4/udp.c:1354
inet_sendpage+0xd3/0x140 net/ipv4/af_inet.c:831
kernel_sendpage.part.0+0x13c/0x280 net/socket.c:3514
kernel_sendpage net/socket.c:3511 [inline]
sock_sendpage+0xe5/0x140 net/socket.c:944
pipe_to_sendpage+0x2af/0x380 fs/splice.c:364
splice_from_pipe_feed fs/splice.c:418 [inline]
__splice_from_pipe+0x3e5/0x840 fs/splice.c:562
splice_from_pipe fs/splice.c:597 [inline]
generic_splice_sendpage+0xd4/0x140 fs/splice.c:743
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0x10f/0x170 fs/splice.c:933
splice_direct_to_actor+0x38f/0x990 fs/splice.c:888
do_splice_direct+0x1b3/0x280 fs/splice.c:976
do_sendfile+0x553/0x10a0 fs/read_write.c:1257
__do_sys_sendfile64 fs/read_write.c:1318 [inline]
__se_sys_sendfile64 fs/read_write.c:1304 [inline]
__x64_sys_sendfile64+0x1d0/0x210 fs/read_write.c:1304
do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x61/0xc6
It was actually found on a 5.10 kernel instance but I didn't find any
upstream commit referencing something of that kind so this bug is likely
to be in upstream, too. The reproducers unfortunately do not work on my
machine, but I'll add one in the next email for additional info.
From here the terminology is used as from the fragment of __ip_make_skb():
---
if (skb->data < skb_network_header(skb))
__skb_pull(skb, skb_network_offset(skb));
while ((tmp_skb = __skb_dequeue(queue)) != NULL) {
__skb_pull(tmp_skb, skb_network_header_len(skb)); <-- BUG is here
---
We get the first fragment (called 'skb') from the queue and then start
getting another fragments ('tmp_skb') from the queue and combine them to
the first fragment.
The problem is that the difference between tmp_skb->len and
tmp_skb->data_len is smaller than the length of skb network (IP) header
and while doing __skb_pull(), tmp_skb->len becomes smaller than
tmp_skb->data_len causing a bug.
Something is probably wrong with IP header layout of the first fragment
(the SKB to which we are combining another ones). It is 40 bytes long, and
the tmp_skb IP header's length is 20 bytes (have a look at debug info
lower). The first fragment, however, can contain some specific IP options
which are stored only in this fragment and are not included into the
following ones. On the other hand, the problem can be with tmp_skb where
something casued its data_len be incorrect.
Maybe there is some sanity check missing while constructing an IP datagram
in ip_append_page()? Additional check of extra IP headers or MTU
values...?
We managed to get some debug info about the failing SKBs.
tmp_skb info:
skb len=1476 headroom=160 headlen=20 tailroom=0
mac=(-1,-1) net=(160,20) trans=180
shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
hash(0xe62f05e3 sw=0 l4=1) proto=0x0000 pkttype=0 iif=0
sk family=2 type=2 proto=17
skb linear: 00000000: 00 00 00 00 00 00 00 00 30 06 a9 86 ff ff ff ff
skb linear: 00000010: 02 00 00 00
skb frag: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00
skb info:
skb len=1476 headroom=168 headlen=241 tailroom=0
mac=(-1,-1) net=(168,40) trans=208
shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
csum(0x3f886391 ip_summed=0 complete_sw=0 valid=0 level=0)
hash(0xe62f05e3 sw=0 l4=1) proto=0x86dd pkttype=0 iif=0
sk family=2 type=2 proto=17
skb linear: 00000000: 00 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00
skb linear: 00000010: 80 44 f9 8a ff ff ff ff 00 00 00 00 00 00 00 00
skb linear: 00000020: 00 00 00 00 00 00 00 00 fe ff ff ff 00 00 00 00
skb linear: 00000030: f3 57 61 ad ca 6f 38 25 00 62 d5 b1 7d e1 d0 94
skb linear: 00000040: 04 ae 20 57 20 1a 06 db 10 92 76 4f 8d 2e af 83
skb linear: 00000050: 91 0c c3 cd b5 d1 96 9e c8 7e c8 e5 90 a9 be aa
skb linear: 00000060: ae f8 7d 1d bb af 99 62 36 3f c9 a3 44 4e 18 fa
skb linear: 00000070: 0e 5f 40 32 59 ad 8b 90 89 df 79 63 13 80 da 2b
skb linear: 00000080: de 47 62 24 61 fd 47 d8 89 4a 74 8a 91 32 aa c6
skb linear: 00000090: ad 59 30 2f 2c 2e 94 9f 83 00 46 5b f9 11 98 a9
skb linear: 000000a0: cb ed ca cb 8d 70 d8 78 4c 95 b6 12 af 8b 81 33
skb linear: 000000b0: 13 58 9d 74 ef 30 91 1d 10 bd 55 22 67 6b b9 43
skb linear: 000000c0: 97 72 5c a2 c7 24 df f4 2c 3f b8 5e cb 3a 10 f6
skb linear: 000000d0: 10 5f 3a 11 32 2a d5 22 b2 14 73 c0 1a df b0 6f
skb linear: 000000e0: 3c 98 91 df bf d2 b6 99 ee 3c fe 91 98 f6 55 20
skb linear: 000000f0: 45
skb frag: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb frag: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
More info:
tmp_skb->len = 1476; tmp_skb->data_len = 1456
skb->len = 1476; skb->data_len = 1235
I actually found some commits with resembling call trace but they don't help
me that much to solve the issue:
10b8a3de603d ("ipv6: the entire IPv6 header chain must fit the first fragment")
e9d3f80935b6 ("net/af_packet: make sure to pull mac header")
501a90c94510 ("inet: protect against too small mtu values.")
Fedor
// https://None.appspot.com/bug?id=8285e071373596d59da12d6c5721d04d100dbe56
// autogenerated by syzkaller (https://github.com/google/syzkaller)
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <dirent.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <net/if.h>
#include <net/if_arp.h>
#include <netinet/in.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/ioctl.h>
#include <sys/mount.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>
#include <linux/capability.h>
#include <linux/futex.h>
#include <linux/genetlink.h>
#include <linux/if_addr.h>
#include <linux/if_ether.h>
#include <linux/if_link.h>
#include <linux/if_tun.h>
#include <linux/in6.h>
#include <linux/ip.h>
#include <linux/neighbour.h>
#include <linux/net.h>
#include <linux/netlink.h>
#include <linux/nl80211.h>
#include <linux/rfkill.h>
#include <linux/rtnetlink.h>
#include <linux/tcp.h>
#include <linux/veth.h>
static unsigned long long procid;
static void sleep_ms(uint64_t ms)
{
usleep(ms * 1000);
}
static uint64_t current_time_ms(void)
{
struct timespec ts;
if (clock_gettime(CLOCK_MONOTONIC, &ts))
exit(1);
return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}
static void use_temporary_dir(void)
{
char tmpdir_template[] = "./syzkaller.XXXXXX";
char* tmpdir = mkdtemp(tmpdir_template);
if (!tmpdir)
exit(1);
if (chmod(tmpdir, 0777))
exit(1);
if (chdir(tmpdir))
exit(1);
}
static void thread_start(void* (*fn)(void*), void* arg)
{
pthread_t th;
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setstacksize(&attr, 128 << 10);
int i = 0;
for (; i < 100; i++) {
if (pthread_create(&th, &attr, fn, arg) == 0) {
pthread_attr_destroy(&attr);
return;
}
if (errno == EAGAIN) {
usleep(50);
continue;
}
break;
}
exit(1);
}
typedef struct {
int state;
} event_t;
static void event_init(event_t* ev)
{
ev->state = 0;
}
static void event_reset(event_t* ev)
{
ev->state = 0;
}
static void event_set(event_t* ev)
{
if (ev->state)
exit(1);
__atomic_store_n(&ev->state, 1, __ATOMIC_RELEASE);
syscall(SYS_futex, &ev->state, FUTEX_WAKE | FUTEX_PRIVATE_FLAG, 1000000);
}
static void event_wait(event_t* ev)
{
while (!__atomic_load_n(&ev->state, __ATOMIC_ACQUIRE))
syscall(SYS_futex, &ev->state, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, 0, 0);
}
static int event_isset(event_t* ev)
{
return __atomic_load_n(&ev->state, __ATOMIC_ACQUIRE);
}
static int event_timedwait(event_t* ev, uint64_t timeout)
{
uint64_t start = current_time_ms();
uint64_t now = start;
for (;;) {
uint64_t remain = timeout - (now - start);
struct timespec ts;
ts.tv_sec = remain / 1000;
ts.tv_nsec = (remain % 1000) * 1000 * 1000;
syscall(SYS_futex, &ev->state, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, 0, &ts);
if (__atomic_load_n(&ev->state, __ATOMIC_ACQUIRE))
return 1;
now = current_time_ms();
if (now - start > timeout)
return 0;
}
}
static bool write_file(const char* file, const char* what, ...)
{
char buf[1024];
va_list args;
va_start(args, what);
vsnprintf(buf, sizeof(buf), what, args);
va_end(args);
buf[sizeof(buf) - 1] = 0;
int len = strlen(buf);
int fd = open(file, O_WRONLY | O_CLOEXEC);
if (fd == -1)
return false;
if (write(fd, buf, len) != len) {
int err = errno;
close(fd);
errno = err;
return false;
}
close(fd);
return true;
}
struct nlmsg {
char* pos;
int nesting;
struct nlattr* nested[8];
char buf[4096];
};
static void netlink_init(struct nlmsg* nlmsg, int typ, int flags,
const void* data, int size)
{
memset(nlmsg, 0, sizeof(*nlmsg));
struct nlmsghdr* hdr = (struct nlmsghdr*)nlmsg->buf;
hdr->nlmsg_type = typ;
hdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | flags;
memcpy(hdr + 1, data, size);
nlmsg->pos = (char*)(hdr + 1) + NLMSG_ALIGN(size);
}
static void netlink_attr(struct nlmsg* nlmsg, int typ, const void* data,
int size)
{
struct nlattr* attr = (struct nlattr*)nlmsg->pos;
attr->nla_len = sizeof(*attr) + size;
attr->nla_type = typ;
if (size > 0)
memcpy(attr + 1, data, size);
nlmsg->pos += NLMSG_ALIGN(attr->nla_len);
}
static void netlink_nest(struct nlmsg* nlmsg, int typ)
{
struct nlattr* attr = (struct nlattr*)nlmsg->pos;
attr->nla_type = typ;
nlmsg->pos += sizeof(*attr);
nlmsg->nested[nlmsg->nesting++] = attr;
}
static void netlink_done(struct nlmsg* nlmsg)
{
struct nlattr* attr = nlmsg->nested[--nlmsg->nesting];
attr->nla_len = nlmsg->pos - (char*)attr;
}
static int netlink_send_ext(struct nlmsg* nlmsg, int sock, uint16_t reply_type,
int* reply_len, bool dofail)
{
if (nlmsg->pos > nlmsg->buf + sizeof(nlmsg->buf) || nlmsg->nesting)
exit(1);
struct nlmsghdr* hdr = (struct nlmsghdr*)nlmsg->buf;
hdr->nlmsg_len = nlmsg->pos - nlmsg->buf;
struct sockaddr_nl addr;
memset(&addr, 0, sizeof(addr));
addr.nl_family = AF_NETLINK;
ssize_t n = sendto(sock, nlmsg->buf, hdr->nlmsg_len, 0,
(struct sockaddr*)&addr, sizeof(addr));
if (n != (ssize_t)hdr->nlmsg_len) {
if (dofail)
exit(1);
return -1;
}
n = recv(sock, nlmsg->buf, sizeof(nlmsg->buf), 0);
if (reply_len)
*reply_len = 0;
if (n < 0) {
if (dofail)
exit(1);
return -1;
}
if (n < (ssize_t)sizeof(struct nlmsghdr)) {
errno = EINVAL;
if (dofail)
exit(1);
return -1;
}
if (hdr->nlmsg_type == NLMSG_DONE)
return 0;
if (reply_len && hdr->nlmsg_type == reply_type) {
*reply_len = n;
return 0;
}
if (n < (ssize_t)(sizeof(struct nlmsghdr) + sizeof(struct nlmsgerr))) {
errno = EINVAL;
if (dofail)
exit(1);
return -1;
}
if (hdr->nlmsg_type != NLMSG_ERROR) {
errno = EINVAL;
if (dofail)
exit(1);
return -1;
}
errno = -((struct nlmsgerr*)(hdr + 1))->error;
return -errno;
}
static int netlink_send(struct nlmsg* nlmsg, int sock)
{
return netlink_send_ext(nlmsg, sock, 0, NULL, true);
}
static int netlink_query_family_id(struct nlmsg* nlmsg, int sock,
const char* family_name, bool dofail)
{
struct genlmsghdr genlhdr;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = CTRL_CMD_GETFAMILY;
netlink_init(nlmsg, GENL_ID_CTRL, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(nlmsg, CTRL_ATTR_FAMILY_NAME, family_name,
strnlen(family_name, GENL_NAMSIZ - 1) + 1);
int n = 0;
int err = netlink_send_ext(nlmsg, sock, GENL_ID_CTRL, &n, dofail);
if (err < 0) {
return -1;
}
uint16_t id = 0;
struct nlattr* attr = (struct nlattr*)(nlmsg->buf + NLMSG_HDRLEN +
NLMSG_ALIGN(sizeof(genlhdr)));
for (; (char*)attr < nlmsg->buf + n;
attr = (struct nlattr*)((char*)attr + NLMSG_ALIGN(attr->nla_len))) {
if (attr->nla_type == CTRL_ATTR_FAMILY_ID) {
id = *(uint16_t*)(attr + 1);
break;
}
}
if (!id) {
errno = EINVAL;
return -1;
}
recv(sock, nlmsg->buf, sizeof(nlmsg->buf), 0);
return id;
}
static int netlink_next_msg(struct nlmsg* nlmsg, unsigned int offset,
unsigned int total_len)
{
struct nlmsghdr* hdr = (struct nlmsghdr*)(nlmsg->buf + offset);
if (offset == total_len || offset + hdr->nlmsg_len > total_len)
return -1;
return hdr->nlmsg_len;
}
static void netlink_add_device_impl(struct nlmsg* nlmsg, const char* type,
const char* name)
{
struct ifinfomsg hdr;
memset(&hdr, 0, sizeof(hdr));
netlink_init(nlmsg, RTM_NEWLINK, NLM_F_EXCL | NLM_F_CREATE, &hdr,
sizeof(hdr));
if (name)
netlink_attr(nlmsg, IFLA_IFNAME, name, strlen(name));
netlink_nest(nlmsg, IFLA_LINKINFO);
netlink_attr(nlmsg, IFLA_INFO_KIND, type, strlen(type));
}
static void netlink_add_device(struct nlmsg* nlmsg, int sock, const char* type,
const char* name)
{
netlink_add_device_impl(nlmsg, type, name);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static void netlink_add_veth(struct nlmsg* nlmsg, int sock, const char* name,
const char* peer)
{
netlink_add_device_impl(nlmsg, "veth", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_nest(nlmsg, VETH_INFO_PEER);
nlmsg->pos += sizeof(struct ifinfomsg);
netlink_attr(nlmsg, IFLA_IFNAME, peer, strlen(peer));
netlink_done(nlmsg);
netlink_done(nlmsg);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static void netlink_add_hsr(struct nlmsg* nlmsg, int sock, const char* name,
const char* slave1, const char* slave2)
{
netlink_add_device_impl(nlmsg, "hsr", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
int ifindex1 = if_nametoindex(slave1);
netlink_attr(nlmsg, IFLA_HSR_SLAVE1, &ifindex1, sizeof(ifindex1));
int ifindex2 = if_nametoindex(slave2);
netlink_attr(nlmsg, IFLA_HSR_SLAVE2, &ifindex2, sizeof(ifindex2));
netlink_done(nlmsg);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static void netlink_add_linked(struct nlmsg* nlmsg, int sock, const char* type,
const char* name, const char* link)
{
netlink_add_device_impl(nlmsg, type, name);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static void netlink_add_vlan(struct nlmsg* nlmsg, int sock, const char* name,
const char* link, uint16_t id, uint16_t proto)
{
netlink_add_device_impl(nlmsg, "vlan", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_attr(nlmsg, IFLA_VLAN_ID, &id, sizeof(id));
netlink_attr(nlmsg, IFLA_VLAN_PROTOCOL, &proto, sizeof(proto));
netlink_done(nlmsg);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static void netlink_add_macvlan(struct nlmsg* nlmsg, int sock, const char* name,
const char* link)
{
netlink_add_device_impl(nlmsg, "macvlan", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
uint32_t mode = MACVLAN_MODE_BRIDGE;
netlink_attr(nlmsg, IFLA_MACVLAN_MODE, &mode, sizeof(mode));
netlink_done(nlmsg);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static void netlink_add_geneve(struct nlmsg* nlmsg, int sock, const char* name,
uint32_t vni, struct in_addr* addr4,
struct in6_addr* addr6)
{
netlink_add_device_impl(nlmsg, "geneve", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_attr(nlmsg, IFLA_GENEVE_ID, &vni, sizeof(vni));
if (addr4)
netlink_attr(nlmsg, IFLA_GENEVE_REMOTE, addr4, sizeof(*addr4));
if (addr6)
netlink_attr(nlmsg, IFLA_GENEVE_REMOTE6, addr6, sizeof(*addr6));
netlink_done(nlmsg);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
#define IFLA_IPVLAN_FLAGS 2
#define IPVLAN_MODE_L3S 2
#undef IPVLAN_F_VEPA
#define IPVLAN_F_VEPA 2
static void netlink_add_ipvlan(struct nlmsg* nlmsg, int sock, const char* name,
const char* link, uint16_t mode, uint16_t flags)
{
netlink_add_device_impl(nlmsg, "ipvlan", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_attr(nlmsg, IFLA_IPVLAN_MODE, &mode, sizeof(mode));
netlink_attr(nlmsg, IFLA_IPVLAN_FLAGS, &flags, sizeof(flags));
netlink_done(nlmsg);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static void netlink_device_change(struct nlmsg* nlmsg, int sock,
const char* name, bool up, const char* master,
const void* mac, int macsize,
const char* new_name)
{
struct ifinfomsg hdr;
memset(&hdr, 0, sizeof(hdr));
if (up)
hdr.ifi_flags = hdr.ifi_change = IFF_UP;
hdr.ifi_index = if_nametoindex(name);
netlink_init(nlmsg, RTM_NEWLINK, 0, &hdr, sizeof(hdr));
if (new_name)
netlink_attr(nlmsg, IFLA_IFNAME, new_name, strlen(new_name));
if (master) {
int ifindex = if_nametoindex(master);
netlink_attr(nlmsg, IFLA_MASTER, &ifindex, sizeof(ifindex));
}
if (macsize)
netlink_attr(nlmsg, IFLA_ADDRESS, mac, macsize);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}
static int netlink_add_addr(struct nlmsg* nlmsg, int sock, const char* dev,
const void* addr, int addrsize)
{
struct ifaddrmsg hdr;
memset(&hdr, 0, sizeof(hdr));
hdr.ifa_family = addrsize == 4 ? AF_INET : AF_INET6;
hdr.ifa_prefixlen = addrsize == 4 ? 24 : 120;
hdr.ifa_scope = RT_SCOPE_UNIVERSE;
hdr.ifa_index = if_nametoindex(dev);
netlink_init(nlmsg, RTM_NEWADDR, NLM_F_CREATE | NLM_F_REPLACE, &hdr,
sizeof(hdr));
netlink_attr(nlmsg, IFA_LOCAL, addr, addrsize);
netlink_attr(nlmsg, IFA_ADDRESS, addr, addrsize);
return netlink_send(nlmsg, sock);
}
static void netlink_add_addr4(struct nlmsg* nlmsg, int sock, const char* dev,
const char* addr)
{
struct in_addr in_addr;
inet_pton(AF_INET, addr, &in_addr);
int err = netlink_add_addr(nlmsg, sock, dev, &in_addr, sizeof(in_addr));
if (err < 0) {
}
}
static void netlink_add_addr6(struct nlmsg* nlmsg, int sock, const char* dev,
const char* addr)
{
struct in6_addr in6_addr;
inet_pton(AF_INET6, addr, &in6_addr);
int err = netlink_add_addr(nlmsg, sock, dev, &in6_addr, sizeof(in6_addr));
if (err < 0) {
}
}
static struct nlmsg nlmsg;
#define DEVLINK_FAMILY_NAME "devlink"
#define DEVLINK_CMD_PORT_GET 5
#define DEVLINK_ATTR_BUS_NAME 1
#define DEVLINK_ATTR_DEV_NAME 2
#define DEVLINK_ATTR_NETDEV_NAME 7
static struct nlmsg nlmsg2;
static void initialize_devlink_ports(const char* bus_name, const char* dev_name,
const char* netdev_prefix)
{
struct genlmsghdr genlhdr;
int len, total_len, id, err, offset;
uint16_t netdev_index;
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sock == -1)
exit(1);
int rtsock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (rtsock == -1)
exit(1);
id = netlink_query_family_id(&nlmsg, sock, DEVLINK_FAMILY_NAME, true);
if (id == -1)
goto error;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = DEVLINK_CMD_PORT_GET;
netlink_init(&nlmsg, id, NLM_F_DUMP, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, DEVLINK_ATTR_BUS_NAME, bus_name, strlen(bus_name) + 1);
netlink_attr(&nlmsg, DEVLINK_ATTR_DEV_NAME, dev_name, strlen(dev_name) + 1);
err = netlink_send_ext(&nlmsg, sock, id, &total_len, true);
if (err < 0) {
goto error;
}
offset = 0;
netdev_index = 0;
while ((len = netlink_next_msg(&nlmsg, offset, total_len)) != -1) {
struct nlattr* attr = (struct nlattr*)(nlmsg.buf + offset + NLMSG_HDRLEN +
NLMSG_ALIGN(sizeof(genlhdr)));
for (; (char*)attr < nlmsg.buf + offset + len;
attr = (struct nlattr*)((char*)attr + NLMSG_ALIGN(attr->nla_len))) {
if (attr->nla_type == DEVLINK_ATTR_NETDEV_NAME) {
char* port_name;
char netdev_name[IFNAMSIZ];
port_name = (char*)(attr + 1);
snprintf(netdev_name, sizeof(netdev_name), "%s%d", netdev_prefix,
netdev_index);
netlink_device_change(&nlmsg2, rtsock, port_name, true, 0, 0, 0,
netdev_name);
break;
}
}
offset += len;
netdev_index++;
}
error:
close(rtsock);
close(sock);
}
#define WIFI_INITIAL_DEVICE_COUNT 2
#define WIFI_MAC_BASE \
{ \
0x08, 0x02, 0x11, 0x00, 0x00, 0x00 \
}
#define WIFI_IBSS_BSSID \
{ \
0x50, 0x50, 0x50, 0x50, 0x50, 0x50 \
}
#define WIFI_IBSS_SSID \
{ \
0x10, 0x10, 0x10, 0x10, 0x10, 0x10 \
}
#define WIFI_DEFAULT_FREQUENCY 2412
#define WIFI_DEFAULT_SIGNAL 0
#define WIFI_DEFAULT_RX_RATE 1
#define HWSIM_CMD_REGISTER 1
#define HWSIM_CMD_FRAME 2
#define HWSIM_CMD_NEW_RADIO 4
#define HWSIM_ATTR_SUPPORT_P2P_DEVICE 14
#define HWSIM_ATTR_PERM_ADDR 22
#define IF_OPER_UP 6
struct join_ibss_props {
int wiphy_freq;
bool wiphy_freq_fixed;
uint8_t* mac;
uint8_t* ssid;
int ssid_len;
};
static int set_interface_state(const char* interface_name, int on)
{
struct ifreq ifr;
int sock = socket(AF_INET, SOCK_DGRAM, 0);
if (sock < 0) {
return -1;
}
memset(&ifr, 0, sizeof(ifr));
strcpy(ifr.ifr_name, interface_name);
int ret = ioctl(sock, SIOCGIFFLAGS, &ifr);
if (ret < 0) {
close(sock);
return -1;
}
if (on)
ifr.ifr_flags |= IFF_UP;
else
ifr.ifr_flags &= ~IFF_UP;
ret = ioctl(sock, SIOCSIFFLAGS, &ifr);
close(sock);
if (ret < 0) {
return -1;
}
return 0;
}
static int nl80211_set_interface(struct nlmsg* nlmsg, int sock,
int nl80211_family, uint32_t ifindex,
uint32_t iftype)
{
struct genlmsghdr genlhdr;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = NL80211_CMD_SET_INTERFACE;
netlink_init(nlmsg, nl80211_family, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(nlmsg, NL80211_ATTR_IFINDEX, &ifindex, sizeof(ifindex));
netlink_attr(nlmsg, NL80211_ATTR_IFTYPE, &iftype, sizeof(iftype));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
return err;
}
static int nl80211_join_ibss(struct nlmsg* nlmsg, int sock, int nl80211_family,
uint32_t ifindex, struct join_ibss_props* props)
{
struct genlmsghdr genlhdr;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = NL80211_CMD_JOIN_IBSS;
netlink_init(nlmsg, nl80211_family, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(nlmsg, NL80211_ATTR_IFINDEX, &ifindex, sizeof(ifindex));
netlink_attr(nlmsg, NL80211_ATTR_SSID, props->ssid, props->ssid_len);
netlink_attr(nlmsg, NL80211_ATTR_WIPHY_FREQ, &(props->wiphy_freq),
sizeof(props->wiphy_freq));
if (props->mac)
netlink_attr(nlmsg, NL80211_ATTR_MAC, props->mac, ETH_ALEN);
if (props->wiphy_freq_fixed)
netlink_attr(nlmsg, NL80211_ATTR_FREQ_FIXED, NULL, 0);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
return err;
}
static int get_ifla_operstate(struct nlmsg* nlmsg, int ifindex)
{
struct ifinfomsg info;
memset(&info, 0, sizeof(info));
info.ifi_family = AF_UNSPEC;
info.ifi_index = ifindex;
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock == -1) {
return -1;
}
netlink_init(nlmsg, RTM_GETLINK, 0, &info, sizeof(info));
int n;
int err = netlink_send_ext(nlmsg, sock, RTM_NEWLINK, &n, true);
close(sock);
if (err) {
return -1;
}
struct rtattr* attr = IFLA_RTA(NLMSG_DATA(nlmsg->buf));
for (; RTA_OK(attr, n); attr = RTA_NEXT(attr, n)) {
if (attr->rta_type == IFLA_OPERSTATE)
return *((int32_t*)RTA_DATA(attr));
}
return -1;
}
static int await_ifla_operstate(struct nlmsg* nlmsg, char* interface,
int operstate)
{
int ifindex = if_nametoindex(interface);
while (true) {
usleep(1000);
int ret = get_ifla_operstate(nlmsg, ifindex);
if (ret < 0)
return ret;
if (ret == operstate)
return 0;
}
return 0;
}
static int nl80211_setup_ibss_interface(struct nlmsg* nlmsg, int sock,
int nl80211_family_id, char* interface,
struct join_ibss_props* ibss_props)
{
int ifindex = if_nametoindex(interface);
if (ifindex == 0) {
return -1;
}
int ret = nl80211_set_interface(nlmsg, sock, nl80211_family_id, ifindex,
NL80211_IFTYPE_ADHOC);
if (ret < 0) {
return -1;
}
ret = set_interface_state(interface, 1);
if (ret < 0) {
return -1;
}
ret = nl80211_join_ibss(nlmsg, sock, nl80211_family_id, ifindex, ibss_props);
if (ret < 0) {
return -1;
}
return 0;
}
static int hwsim80211_create_device(struct nlmsg* nlmsg, int sock,
int hwsim_family,
uint8_t mac_addr[ETH_ALEN])
{
struct genlmsghdr genlhdr;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = HWSIM_CMD_NEW_RADIO;
netlink_init(nlmsg, hwsim_family, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(nlmsg, HWSIM_ATTR_SUPPORT_P2P_DEVICE, NULL, 0);
netlink_attr(nlmsg, HWSIM_ATTR_PERM_ADDR, mac_addr, ETH_ALEN);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
return err;
}
static void initialize_wifi_devices(void)
{
int rfkill = open("/dev/rfkill", O_RDWR);
if (rfkill == -1) {
if (errno != ENOENT && errno != EACCES)
exit(1);
} else {
struct rfkill_event event = {0};
event.type = RFKILL_TYPE_ALL;
event.op = RFKILL_OP_CHANGE_ALL;
if (write(rfkill, &event, sizeof(event)) != (ssize_t)(sizeof(event)))
exit(1);
close(rfkill);
}
uint8_t mac_addr[6] = WIFI_MAC_BASE;
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sock < 0) {
return;
}
int hwsim_family_id =
netlink_query_family_id(&nlmsg, sock, "MAC80211_HWSIM", true);
int nl80211_family_id =
netlink_query_family_id(&nlmsg, sock, "nl80211", true);
uint8_t ssid[] = WIFI_IBSS_SSID;
uint8_t bssid[] = WIFI_IBSS_BSSID;
struct join_ibss_props ibss_props = {.wiphy_freq = WIFI_DEFAULT_FREQUENCY,
.wiphy_freq_fixed = true,
.mac = bssid,
.ssid = ssid,
.ssid_len = sizeof(ssid)};
for (int device_id = 0; device_id < WIFI_INITIAL_DEVICE_COUNT; device_id++) {
mac_addr[5] = device_id;
int ret = hwsim80211_create_device(&nlmsg, sock, hwsim_family_id, mac_addr);
if (ret < 0)
exit(1);
char interface[6] = "wlan0";
interface[4] += device_id;
if (nl80211_setup_ibss_interface(&nlmsg, sock, nl80211_family_id, interface,
&ibss_props) < 0)
exit(1);
}
for (int device_id = 0; device_id < WIFI_INITIAL_DEVICE_COUNT; device_id++) {
char interface[6] = "wlan0";
interface[4] += device_id;
int ret = await_ifla_operstate(&nlmsg, interface, IF_OPER_UP);
if (ret < 0)
exit(1);
}
close(sock);
}
#define DEV_IPV4 "172.20.20.%d"
#define DEV_IPV6 "fe80::%02x"
#define DEV_MAC 0x00aaaaaaaaaa
static void netdevsim_add(unsigned int addr, unsigned int port_count)
{
char buf[16];
sprintf(buf, "%u %u", addr, port_count);
if (write_file("/sys/bus/netdevsim/new_device", buf)) {
snprintf(buf, sizeof(buf), "netdevsim%d", addr);
initialize_devlink_ports("netdevsim", buf, "netdevsim");
}
}
#define WG_GENL_NAME "wireguard"
enum wg_cmd {
WG_CMD_GET_DEVICE,
WG_CMD_SET_DEVICE,
};
enum wgdevice_attribute {
WGDEVICE_A_UNSPEC,
WGDEVICE_A_IFINDEX,
WGDEVICE_A_IFNAME,
WGDEVICE_A_PRIVATE_KEY,
WGDEVICE_A_PUBLIC_KEY,
WGDEVICE_A_FLAGS,
WGDEVICE_A_LISTEN_PORT,
WGDEVICE_A_FWMARK,
WGDEVICE_A_PEERS,
};
enum wgpeer_attribute {
WGPEER_A_UNSPEC,
WGPEER_A_PUBLIC_KEY,
WGPEER_A_PRESHARED_KEY,
WGPEER_A_FLAGS,
WGPEER_A_ENDPOINT,
WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
WGPEER_A_LAST_HANDSHAKE_TIME,
WGPEER_A_RX_BYTES,
WGPEER_A_TX_BYTES,
WGPEER_A_ALLOWEDIPS,
WGPEER_A_PROTOCOL_VERSION,
};
enum wgallowedip_attribute {
WGALLOWEDIP_A_UNSPEC,
WGALLOWEDIP_A_FAMILY,
WGALLOWEDIP_A_IPADDR,
WGALLOWEDIP_A_CIDR_MASK,
};
static void netlink_wireguard_setup(void)
{
const char ifname_a[] = "wg0";
const char ifname_b[] = "wg1";
const char ifname_c[] = "wg2";
const char private_a[] =
"\xa0\x5c\xa8\x4f\x6c\x9c\x8e\x38\x53\xe2\xfd\x7a\x70\xae\x0f\xb2\x0f\xa1"
"\x52\x60\x0c\xb0\x08\x45\x17\x4f\x08\x07\x6f\x8d\x78\x43";
const char private_b[] =
"\xb0\x80\x73\xe8\xd4\x4e\x91\xe3\xda\x92\x2c\x22\x43\x82\x44\xbb\x88\x5c"
"\x69\xe2\x69\xc8\xe9\xd8\x35\xb1\x14\x29\x3a\x4d\xdc\x6e";
const char private_c[] =
"\xa0\xcb\x87\x9a\x47\xf5\xbc\x64\x4c\x0e\x69\x3f\xa6\xd0\x31\xc7\x4a\x15"
"\x53\xb6\xe9\x01\xb9\xff\x2f\x51\x8c\x78\x04\x2f\xb5\x42";
const char public_a[] =
"\x97\x5c\x9d\x81\xc9\x83\xc8\x20\x9e\xe7\x81\x25\x4b\x89\x9f\x8e\xd9\x25"
"\xae\x9f\x09\x23\xc2\x3c\x62\xf5\x3c\x57\xcd\xbf\x69\x1c";
const char public_b[] =
"\xd1\x73\x28\x99\xf6\x11\xcd\x89\x94\x03\x4d\x7f\x41\x3d\xc9\x57\x63\x0e"
"\x54\x93\xc2\x85\xac\xa4\x00\x65\xcb\x63\x11\xbe\x69\x6b";
const char public_c[] =
"\xf4\x4d\xa3\x67\xa8\x8e\xe6\x56\x4f\x02\x02\x11\x45\x67\x27\x08\x2f\x5c"
"\xeb\xee\x8b\x1b\xf5\xeb\x73\x37\x34\x1b\x45\x9b\x39\x22";
const uint16_t listen_a = 20001;
const uint16_t listen_b = 20002;
const uint16_t listen_c = 20003;
const uint16_t af_inet = AF_INET;
const uint16_t af_inet6 = AF_INET6;
const struct sockaddr_in endpoint_b_v4 = {
.sin_family = AF_INET,
.sin_port = htons(listen_b),
.sin_addr = {htonl(INADDR_LOOPBACK)}};
const struct sockaddr_in endpoint_c_v4 = {
.sin_family = AF_INET,
.sin_port = htons(listen_c),
.sin_addr = {htonl(INADDR_LOOPBACK)}};
struct sockaddr_in6 endpoint_a_v6 = {.sin6_family = AF_INET6,
.sin6_port = htons(listen_a)};
endpoint_a_v6.sin6_addr = in6addr_loopback;
struct sockaddr_in6 endpoint_c_v6 = {.sin6_family = AF_INET6,
.sin6_port = htons(listen_c)};
endpoint_c_v6.sin6_addr = in6addr_loopback;
const struct in_addr first_half_v4 = {0};
const struct in_addr second_half_v4 = {(uint32_t)htonl(128 << 24)};
const struct in6_addr first_half_v6 = {{{0}}};
const struct in6_addr second_half_v6 = {{{0x80}}};
const uint8_t half_cidr = 1;
const uint16_t persistent_keepalives[] = {1, 3, 7, 9, 14, 19};
struct genlmsghdr genlhdr = {.cmd = WG_CMD_SET_DEVICE, .version = 1};
int sock;
int id, err;
sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sock == -1) {
return;
}
id = netlink_query_family_id(&nlmsg, sock, WG_GENL_NAME, true);
if (id == -1)
goto error;
netlink_init(&nlmsg, id, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, WGDEVICE_A_IFNAME, ifname_a, strlen(ifname_a) + 1);
netlink_attr(&nlmsg, WGDEVICE_A_PRIVATE_KEY, private_a, 32);
netlink_attr(&nlmsg, WGDEVICE_A_LISTEN_PORT, &listen_a, 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGDEVICE_A_PEERS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_b, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_b_v4,
sizeof(endpoint_b_v4));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[0], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v4,
sizeof(first_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v6,
sizeof(first_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_c, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_c_v6,
sizeof(endpoint_c_v6));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[1], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v4,
sizeof(second_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v6,
sizeof(second_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
err = netlink_send(&nlmsg, sock);
if (err < 0) {
}
netlink_init(&nlmsg, id, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, WGDEVICE_A_IFNAME, ifname_b, strlen(ifname_b) + 1);
netlink_attr(&nlmsg, WGDEVICE_A_PRIVATE_KEY, private_b, 32);
netlink_attr(&nlmsg, WGDEVICE_A_LISTEN_PORT, &listen_b, 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGDEVICE_A_PEERS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_a, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_a_v6,
sizeof(endpoint_a_v6));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[2], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v4,
sizeof(first_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v6,
sizeof(first_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_c, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_c_v4,
sizeof(endpoint_c_v4));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[3], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v4,
sizeof(second_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v6,
sizeof(second_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
err = netlink_send(&nlmsg, sock);
if (err < 0) {
}
netlink_init(&nlmsg, id, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, WGDEVICE_A_IFNAME, ifname_c, strlen(ifname_c) + 1);
netlink_attr(&nlmsg, WGDEVICE_A_PRIVATE_KEY, private_c, 32);
netlink_attr(&nlmsg, WGDEVICE_A_LISTEN_PORT, &listen_c, 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGDEVICE_A_PEERS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_a, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_a_v6,
sizeof(endpoint_a_v6));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[4], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v4,
sizeof(first_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v6,
sizeof(first_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_b, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_b_v4,
sizeof(endpoint_b_v4));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[5], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v4,
sizeof(second_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v6,
sizeof(second_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
err = netlink_send(&nlmsg, sock);
if (err < 0) {
}
error:
close(sock);
}
static void initialize_netdevices(void)
{
char netdevsim[16];
sprintf(netdevsim, "netdevsim%d", (int)procid);
struct {
const char* type;
const char* dev;
} devtypes[] = {
{"ip6gretap", "ip6gretap0"}, {"bridge", "bridge0"},
{"vcan", "vcan0"}, {"bond", "bond0"},
{"team", "team0"}, {"dummy", "dummy0"},
{"nlmon", "nlmon0"}, {"caif", "caif0"},
{"batadv", "batadv0"}, {"vxcan", "vxcan1"},
{"netdevsim", netdevsim}, {"veth", 0},
{"xfrm", "xfrm0"}, {"wireguard", "wg0"},
{"wireguard", "wg1"}, {"wireguard", "wg2"},
};
const char* devmasters[] = {"bridge", "bond", "team", "batadv"};
struct {
const char* name;
int macsize;
bool noipv6;
} devices[] = {
{"lo", ETH_ALEN},
{"sit0", 0},
{"bridge0", ETH_ALEN},
{"vcan0", 0, true},
{"tunl0", 0},
{"gre0", 0},
{"gretap0", ETH_ALEN},
{"ip_vti0", 0},
{"ip6_vti0", 0},
{"ip6tnl0", 0},
{"ip6gre0", 0},
{"ip6gretap0", ETH_ALEN},
{"erspan0", ETH_ALEN},
{"bond0", ETH_ALEN},
{"veth0", ETH_ALEN},
{"veth1", ETH_ALEN},
{"team0", ETH_ALEN},
{"veth0_to_bridge", ETH_ALEN},
{"veth1_to_bridge", ETH_ALEN},
{"veth0_to_bond", ETH_ALEN},
{"veth1_to_bond", ETH_ALEN},
{"veth0_to_team", ETH_ALEN},
{"veth1_to_team", ETH_ALEN},
{"veth0_to_hsr", ETH_ALEN},
{"veth1_to_hsr", ETH_ALEN},
{"hsr0", 0},
{"dummy0", ETH_ALEN},
{"nlmon0", 0},
{"vxcan0", 0, true},
{"vxcan1", 0, true},
{"caif0", ETH_ALEN},
{"batadv0", ETH_ALEN},
{netdevsim, ETH_ALEN},
{"xfrm0", ETH_ALEN},
{"veth0_virt_wifi", ETH_ALEN},
{"veth1_virt_wifi", ETH_ALEN},
{"virt_wifi0", ETH_ALEN},
{"veth0_vlan", ETH_ALEN},
{"veth1_vlan", ETH_ALEN},
{"vlan0", ETH_ALEN},
{"vlan1", ETH_ALEN},
{"macvlan0", ETH_ALEN},
{"macvlan1", ETH_ALEN},
{"ipvlan0", ETH_ALEN},
{"ipvlan1", ETH_ALEN},
{"veth0_macvtap", ETH_ALEN},
{"veth1_macvtap", ETH_ALEN},
{"macvtap0", ETH_ALEN},
{"macsec0", ETH_ALEN},
{"veth0_to_batadv", ETH_ALEN},
{"veth1_to_batadv", ETH_ALEN},
{"batadv_slave_0", ETH_ALEN},
{"batadv_slave_1", ETH_ALEN},
{"geneve0", ETH_ALEN},
{"geneve1", ETH_ALEN},
{"wg0", 0},
{"wg1", 0},
{"wg2", 0},
};
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock == -1)
exit(1);
unsigned i;
for (i = 0; i < sizeof(devtypes) / sizeof(devtypes[0]); i++)
netlink_add_device(&nlmsg, sock, devtypes[i].type, devtypes[i].dev);
for (i = 0; i < sizeof(devmasters) / (sizeof(devmasters[0])); i++) {
char master[32], slave0[32], veth0[32], slave1[32], veth1[32];
sprintf(slave0, "%s_slave_0", devmasters[i]);
sprintf(veth0, "veth0_to_%s", devmasters[i]);
netlink_add_veth(&nlmsg, sock, slave0, veth0);
sprintf(slave1, "%s_slave_1", devmasters[i]);
sprintf(veth1, "veth1_to_%s", devmasters[i]);
netlink_add_veth(&nlmsg, sock, slave1, veth1);
sprintf(master, "%s0", devmasters[i]);
netlink_device_change(&nlmsg, sock, slave0, false, master, 0, 0, NULL);
netlink_device_change(&nlmsg, sock, slave1, false, master, 0, 0, NULL);
}
netlink_device_change(&nlmsg, sock, "bridge_slave_0", true, 0, 0, 0, NULL);
netlink_device_change(&nlmsg, sock, "bridge_slave_1", true, 0, 0, 0, NULL);
netlink_add_veth(&nlmsg, sock, "hsr_slave_0", "veth0_to_hsr");
netlink_add_veth(&nlmsg, sock, "hsr_slave_1", "veth1_to_hsr");
netlink_add_hsr(&nlmsg, sock, "hsr0", "hsr_slave_0", "hsr_slave_1");
netlink_device_change(&nlmsg, sock, "hsr_slave_0", true, 0, 0, 0, NULL);
netlink_device_change(&nlmsg, sock, "hsr_slave_1", true, 0, 0, 0, NULL);
netlink_add_veth(&nlmsg, sock, "veth0_virt_wifi", "veth1_virt_wifi");
netlink_add_linked(&nlmsg, sock, "virt_wifi", "virt_wifi0",
"veth1_virt_wifi");
netlink_add_veth(&nlmsg, sock, "veth0_vlan", "veth1_vlan");
netlink_add_vlan(&nlmsg, sock, "vlan0", "veth0_vlan", 0, htons(ETH_P_8021Q));
netlink_add_vlan(&nlmsg, sock, "vlan1", "veth0_vlan", 1, htons(ETH_P_8021AD));
netlink_add_macvlan(&nlmsg, sock, "macvlan0", "veth1_vlan");
netlink_add_macvlan(&nlmsg, sock, "macvlan1", "veth1_vlan");
netlink_add_ipvlan(&nlmsg, sock, "ipvlan0", "veth0_vlan", IPVLAN_MODE_L2, 0);
netlink_add_ipvlan(&nlmsg, sock, "ipvlan1", "veth0_vlan", IPVLAN_MODE_L3S,
IPVLAN_F_VEPA);
netlink_add_veth(&nlmsg, sock, "veth0_macvtap", "veth1_macvtap");
netlink_add_linked(&nlmsg, sock, "macvtap", "macvtap0", "veth0_macvtap");
netlink_add_linked(&nlmsg, sock, "macsec", "macsec0", "veth1_macvtap");
char addr[32];
sprintf(addr, DEV_IPV4, 14 + 10);
struct in_addr geneve_addr4;
if (inet_pton(AF_INET, addr, &geneve_addr4) <= 0)
exit(1);
struct in6_addr geneve_addr6;
if (inet_pton(AF_INET6, "fc00::01", &geneve_addr6) <= 0)
exit(1);
netlink_add_geneve(&nlmsg, sock, "geneve0", 0, &geneve_addr4, 0);
netlink_add_geneve(&nlmsg, sock, "geneve1", 1, 0, &geneve_addr6);
netdevsim_add((int)procid, 4);
netlink_wireguard_setup();
for (i = 0; i < sizeof(devices) / (sizeof(devices[0])); i++) {
char addr[32];
sprintf(addr, DEV_IPV4, i + 10);
netlink_add_addr4(&nlmsg, sock, devices[i].name, addr);
if (!devices[i].noipv6) {
sprintf(addr, DEV_IPV6, i + 10);
netlink_add_addr6(&nlmsg, sock, devices[i].name, addr);
}
uint64_t macaddr = DEV_MAC + ((i + 10ull) << 40);
netlink_device_change(&nlmsg, sock, devices[i].name, true, 0, &macaddr,
devices[i].macsize, NULL);
}
close(sock);
}
static void initialize_netdevices_init(void)
{
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock == -1)
exit(1);
struct {
const char* type;
int macsize;
bool noipv6;
bool noup;
} devtypes[] = {
{"nr", 7, true},
{"rose", 5, true, true},
};
unsigned i;
for (i = 0; i < sizeof(devtypes) / sizeof(devtypes[0]); i++) {
char dev[32], addr[32];
sprintf(dev, "%s%d", devtypes[i].type, (int)procid);
sprintf(addr, "172.30.%d.%d", i, (int)procid + 1);
netlink_add_addr4(&nlmsg, sock, dev, addr);
if (!devtypes[i].noipv6) {
sprintf(addr, "fe88::%02x:%02x", i, (int)procid + 1);
netlink_add_addr6(&nlmsg, sock, dev, addr);
}
int macsize = devtypes[i].macsize;
uint64_t macaddr = 0xbbbbbb +
((unsigned long long)i << (8 * (macsize - 2))) +
(procid << (8 * (macsize - 1)));
netlink_device_change(&nlmsg, sock, dev, !devtypes[i].noup, 0, &macaddr,
macsize, NULL);
}
close(sock);
}
#define MAX_FDS 30
#define BTPROTO_HCI 1
#define ACL_LINK 1
#define SCAN_PAGE 2
typedef struct {
uint8_t b[6];
} __attribute__((packed)) bdaddr_t;
#define HCI_COMMAND_PKT 1
#define HCI_EVENT_PKT 4
#define HCI_VENDOR_PKT 0xff
struct hci_command_hdr {
uint16_t opcode;
uint8_t plen;
} __attribute__((packed));
struct hci_event_hdr {
uint8_t evt;
uint8_t plen;
} __attribute__((packed));
#define HCI_EV_CONN_COMPLETE 0x03
struct hci_ev_conn_complete {
uint8_t status;
uint16_t handle;
bdaddr_t bdaddr;
uint8_t link_type;
uint8_t encr_mode;
} __attribute__((packed));
#define HCI_EV_CONN_REQUEST 0x04
struct hci_ev_conn_request {
bdaddr_t bdaddr;
uint8_t dev_class[3];
uint8_t link_type;
} __attribute__((packed));
#define HCI_EV_REMOTE_FEATURES 0x0b
struct hci_ev_remote_features {
uint8_t status;
uint16_t handle;
uint8_t features[8];
} __attribute__((packed));
#define HCI_EV_CMD_COMPLETE 0x0e
struct hci_ev_cmd_complete {
uint8_t ncmd;
uint16_t opcode;
} __attribute__((packed));
#define HCI_OP_WRITE_SCAN_ENABLE 0x0c1a
#define HCI_OP_READ_BUFFER_SIZE 0x1005
struct hci_rp_read_buffer_size {
uint8_t status;
uint16_t acl_mtu;
uint8_t sco_mtu;
uint16_t acl_max_pkt;
uint16_t sco_max_pkt;
} __attribute__((packed));
#define HCI_OP_READ_BD_ADDR 0x1009
struct hci_rp_read_bd_addr {
uint8_t status;
bdaddr_t bdaddr;
} __attribute__((packed));
#define HCI_EV_LE_META 0x3e
struct hci_ev_le_meta {
uint8_t subevent;
} __attribute__((packed));
#define HCI_EV_LE_CONN_COMPLETE 0x01
struct hci_ev_le_conn_complete {
uint8_t status;
uint16_t handle;
uint8_t role;
uint8_t bdaddr_type;
bdaddr_t bdaddr;
uint16_t interval;
uint16_t latency;
uint16_t supervision_timeout;
uint8_t clk_accurancy;
} __attribute__((packed));
struct hci_dev_req {
uint16_t dev_id;
uint32_t dev_opt;
};
struct vhci_vendor_pkt {
uint8_t type;
uint8_t opcode;
uint16_t id;
};
#define HCIDEVUP _IOW('H', 201, int)
#define HCISETSCAN _IOW('H', 221, int)
static int vhci_fd = -1;
static void rfkill_unblock_all()
{
int fd = open("/dev/rfkill", O_WRONLY);
if (fd < 0)
exit(1);
struct rfkill_event event = {0};
event.idx = 0;
event.type = RFKILL_TYPE_ALL;
event.op = RFKILL_OP_CHANGE_ALL;
event.soft = 0;
event.hard = 0;
if (write(fd, &event, sizeof(event)) < 0)
exit(1);
close(fd);
}
static void hci_send_event_packet(int fd, uint8_t evt, void* data,
size_t data_len)
{
struct iovec iv[3];
struct hci_event_hdr hdr;
hdr.evt = evt;
hdr.plen = data_len;
uint8_t type = HCI_EVENT_PKT;
iv[0].iov_base = &type;
iv[0].iov_len = sizeof(type);
iv[1].iov_base = &hdr;
iv[1].iov_len = sizeof(hdr);
iv[2].iov_base = data;
iv[2].iov_len = data_len;
if (writev(fd, iv, sizeof(iv) / sizeof(struct iovec)) < 0)
exit(1);
}
static void hci_send_event_cmd_complete(int fd, uint16_t opcode, void* data,
size_t data_len)
{
struct iovec iv[4];
struct hci_event_hdr hdr;
hdr.evt = HCI_EV_CMD_COMPLETE;
hdr.plen = sizeof(struct hci_ev_cmd_complete) + data_len;
struct hci_ev_cmd_complete evt_hdr;
evt_hdr.ncmd = 1;
evt_hdr.opcode = opcode;
uint8_t type = HCI_EVENT_PKT;
iv[0].iov_base = &type;
iv[0].iov_len = sizeof(type);
iv[1].iov_base = &hdr;
iv[1].iov_len = sizeof(hdr);
iv[2].iov_base = &evt_hdr;
iv[2].iov_len = sizeof(evt_hdr);
iv[3].iov_base = data;
iv[3].iov_len = data_len;
if (writev(fd, iv, sizeof(iv) / sizeof(struct iovec)) < 0)
exit(1);
}
static bool process_command_pkt(int fd, char* buf, ssize_t buf_size)
{
struct hci_command_hdr* hdr = (struct hci_command_hdr*)buf;
if (buf_size < (ssize_t)sizeof(struct hci_command_hdr) ||
hdr->plen != buf_size - sizeof(struct hci_command_hdr))
exit(1);
switch (hdr->opcode) {
case HCI_OP_WRITE_SCAN_ENABLE: {
uint8_t status = 0;
hci_send_event_cmd_complete(fd, hdr->opcode, &status, sizeof(status));
return true;
}
case HCI_OP_READ_BD_ADDR: {
struct hci_rp_read_bd_addr rp = {0};
rp.status = 0;
memset(&rp.bdaddr, 0xaa, 6);
hci_send_event_cmd_complete(fd, hdr->opcode, &rp, sizeof(rp));
return false;
}
case HCI_OP_READ_BUFFER_SIZE: {
struct hci_rp_read_buffer_size rp = {0};
rp.status = 0;
rp.acl_mtu = 1021;
rp.sco_mtu = 96;
rp.acl_max_pkt = 4;
rp.sco_max_pkt = 6;
hci_send_event_cmd_complete(fd, hdr->opcode, &rp, sizeof(rp));
return false;
}
}
char dummy[0xf9] = {0};
hci_send_event_cmd_complete(fd, hdr->opcode, dummy, sizeof(dummy));
return false;
}
static void* event_thread(void* arg)
{
while (1) {
char buf[1024] = {0};
ssize_t buf_size = read(vhci_fd, buf, sizeof(buf));
if (buf_size < 0)
exit(1);
if (buf_size > 0 && buf[0] == HCI_COMMAND_PKT) {
if (process_command_pkt(vhci_fd, buf + 1, buf_size - 1))
break;
}
}
return NULL;
}
#define HCI_HANDLE_1 200
#define HCI_HANDLE_2 201
static void initialize_vhci()
{
int hci_sock = socket(AF_BLUETOOTH, SOCK_RAW, BTPROTO_HCI);
if (hci_sock < 0)
exit(1);
vhci_fd = open("/dev/vhci", O_RDWR);
if (vhci_fd == -1)
exit(1);
const int kVhciFd = 202;
if (dup2(vhci_fd, kVhciFd) < 0)
exit(1);
close(vhci_fd);
vhci_fd = kVhciFd;
struct vhci_vendor_pkt vendor_pkt;
if (read(vhci_fd, &vendor_pkt, sizeof(vendor_pkt)) != sizeof(vendor_pkt))
exit(1);
if (vendor_pkt.type != HCI_VENDOR_PKT)
exit(1);
pthread_t th;
if (pthread_create(&th, NULL, event_thread, NULL))
exit(1);
int ret = ioctl(hci_sock, HCIDEVUP, vendor_pkt.id);
if (ret) {
if (errno == ERFKILL) {
rfkill_unblock_all();
ret = ioctl(hci_sock, HCIDEVUP, vendor_pkt.id);
}
if (ret && errno != EALREADY)
exit(1);
}
struct hci_dev_req dr = {0};
dr.dev_id = vendor_pkt.id;
dr.dev_opt = SCAN_PAGE;
if (ioctl(hci_sock, HCISETSCAN, &dr))
exit(1);
struct hci_ev_conn_request request;
memset(&request, 0, sizeof(request));
memset(&request.bdaddr, 0xaa, 6);
*(uint8_t*)&request.bdaddr.b[5] = 0x10;
request.link_type = ACL_LINK;
hci_send_event_packet(vhci_fd, HCI_EV_CONN_REQUEST, &request,
sizeof(request));
struct hci_ev_conn_complete complete;
memset(&complete, 0, sizeof(complete));
complete.status = 0;
complete.handle = HCI_HANDLE_1;
memset(&complete.bdaddr, 0xaa, 6);
*(uint8_t*)&complete.bdaddr.b[5] = 0x10;
complete.link_type = ACL_LINK;
complete.encr_mode = 0;
hci_send_event_packet(vhci_fd, HCI_EV_CONN_COMPLETE, &complete,
sizeof(complete));
struct hci_ev_remote_features features;
memset(&features, 0, sizeof(features));
features.status = 0;
features.handle = HCI_HANDLE_1;
hci_send_event_packet(vhci_fd, HCI_EV_REMOTE_FEATURES, &features,
sizeof(features));
struct {
struct hci_ev_le_meta le_meta;
struct hci_ev_le_conn_complete le_conn;
} le_conn;
memset(&le_conn, 0, sizeof(le_conn));
le_conn.le_meta.subevent = HCI_EV_LE_CONN_COMPLETE;
memset(&le_conn.le_conn.bdaddr, 0xaa, 6);
*(uint8_t*)&le_conn.le_conn.bdaddr.b[5] = 0x11;
le_conn.le_conn.role = 1;
le_conn.le_conn.handle = HCI_HANDLE_2;
hci_send_event_packet(vhci_fd, HCI_EV_LE_META, &le_conn, sizeof(le_conn));
pthread_join(th, NULL);
close(hci_sock);
}
#define XT_TABLE_SIZE 1536
#define XT_MAX_ENTRIES 10
struct xt_counters {
uint64_t pcnt, bcnt;
};
struct ipt_getinfo {
char name[32];
unsigned int valid_hooks;
unsigned int hook_entry[5];
unsigned int underflow[5];
unsigned int num_entries;
unsigned int size;
};
struct ipt_get_entries {
char name[32];
unsigned int size;
uint64_t entrytable[XT_TABLE_SIZE / sizeof(uint64_t)];
};
struct ipt_replace {
char name[32];
unsigned int valid_hooks;
unsigned int num_entries;
unsigned int size;
unsigned int hook_entry[5];
unsigned int underflow[5];
unsigned int num_counters;
struct xt_counters* counters;
uint64_t entrytable[XT_TABLE_SIZE / sizeof(uint64_t)];
};
struct ipt_table_desc {
const char* name;
struct ipt_getinfo info;
struct ipt_replace replace;
};
static struct ipt_table_desc ipv4_tables[] = {
{.name = "filter"}, {.name = "nat"}, {.name = "mangle"},
{.name = "raw"}, {.name = "security"},
};
static struct ipt_table_desc ipv6_tables[] = {
{.name = "filter"}, {.name = "nat"}, {.name = "mangle"},
{.name = "raw"}, {.name = "security"},
};
#define IPT_BASE_CTL 64
#define IPT_SO_SET_REPLACE (IPT_BASE_CTL)
#define IPT_SO_GET_INFO (IPT_BASE_CTL)
#define IPT_SO_GET_ENTRIES (IPT_BASE_CTL + 1)
struct arpt_getinfo {
char name[32];
unsigned int valid_hooks;
unsigned int hook_entry[3];
unsigned int underflow[3];
unsigned int num_entries;
unsigned int size;
};
struct arpt_get_entries {
char name[32];
unsigned int size;
uint64_t entrytable[XT_TABLE_SIZE / sizeof(uint64_t)];
};
struct arpt_replace {
char name[32];
unsigned int valid_hooks;
unsigned int num_entries;
unsigned int size;
unsigned int hook_entry[3];
unsigned int underflow[3];
unsigned int num_counters;
struct xt_counters* counters;
uint64_t entrytable[XT_TABLE_SIZE / sizeof(uint64_t)];
};
struct arpt_table_desc {
const char* name;
struct arpt_getinfo info;
struct arpt_replace replace;
};
static struct arpt_table_desc arpt_tables[] = {
{.name = "filter"},
};
#define ARPT_BASE_CTL 96
#define ARPT_SO_SET_REPLACE (ARPT_BASE_CTL)
#define ARPT_SO_GET_INFO (ARPT_BASE_CTL)
#define ARPT_SO_GET_ENTRIES (ARPT_BASE_CTL + 1)
static void checkpoint_iptables(struct ipt_table_desc* tables, int num_tables,
int family, int level)
{
int fd = socket(family, SOCK_STREAM, IPPROTO_TCP);
if (fd == -1) {
switch (errno) {
case EAFNOSUPPORT:
case ENOPROTOOPT:
return;
}
exit(1);
}
for (int i = 0; i < num_tables; i++) {
struct ipt_table_desc* table = &tables[i];
strcpy(table->info.name, table->name);
strcpy(table->replace.name, table->name);
socklen_t optlen = sizeof(table->info);
if (getsockopt(fd, level, IPT_SO_GET_INFO, &table->info, &optlen)) {
switch (errno) {
case EPERM:
case ENOENT:
case ENOPROTOOPT:
continue;
}
exit(1);
}
if (table->info.size > sizeof(table->replace.entrytable))
exit(1);
if (table->info.num_entries > XT_MAX_ENTRIES)
exit(1);
struct ipt_get_entries entries;
memset(&entries, 0, sizeof(entries));
strcpy(entries.name, table->name);
entries.size = table->info.size;
optlen = sizeof(entries) - sizeof(entries.entrytable) + table->info.size;
if (getsockopt(fd, level, IPT_SO_GET_ENTRIES, &entries, &optlen))
exit(1);
table->replace.valid_hooks = table->info.valid_hooks;
table->replace.num_entries = table->info.num_entries;
table->replace.size = table->info.size;
memcpy(table->replace.hook_entry, table->info.hook_entry,
sizeof(table->replace.hook_entry));
memcpy(table->replace.underflow, table->info.underflow,
sizeof(table->replace.underflow));
memcpy(table->replace.entrytable, entries.entrytable, table->info.size);
}
close(fd);
}
static void reset_iptables(struct ipt_table_desc* tables, int num_tables,
int family, int level)
{
int fd = socket(family, SOCK_STREAM, IPPROTO_TCP);
if (fd == -1) {
switch (errno) {
case EAFNOSUPPORT:
case ENOPROTOOPT:
return;
}
exit(1);
}
for (int i = 0; i < num_tables; i++) {
struct ipt_table_desc* table = &tables[i];
if (table->info.valid_hooks == 0)
continue;
struct ipt_getinfo info;
memset(&info, 0, sizeof(info));
strcpy(info.name, table->name);
socklen_t optlen = sizeof(info);
if (getsockopt(fd, level, IPT_SO_GET_INFO, &info, &optlen))
exit(1);
if (memcmp(&table->info, &info, sizeof(table->info)) == 0) {
struct ipt_get_entries entries;
memset(&entries, 0, sizeof(entries));
strcpy(entries.name, table->name);
entries.size = table->info.size;
optlen = sizeof(entries) - sizeof(entries.entrytable) + entries.size;
if (getsockopt(fd, level, IPT_SO_GET_ENTRIES, &entries, &optlen))
exit(1);
if (memcmp(table->replace.entrytable, entries.entrytable,
table->info.size) == 0)
continue;
}
struct xt_counters counters[XT_MAX_ENTRIES];
table->replace.num_counters = info.num_entries;
table->replace.counters = counters;
optlen = sizeof(table->replace) - sizeof(table->replace.entrytable) +
table->replace.size;
if (setsockopt(fd, level, IPT_SO_SET_REPLACE, &table->replace, optlen))
exit(1);
}
close(fd);
}
static void checkpoint_arptables(void)
{
int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (fd == -1) {
switch (errno) {
case EAFNOSUPPORT:
case ENOPROTOOPT:
return;
}
exit(1);
}
for (unsigned i = 0; i < sizeof(arpt_tables) / sizeof(arpt_tables[0]); i++) {
struct arpt_table_desc* table = &arpt_tables[i];
strcpy(table->info.name, table->name);
strcpy(table->replace.name, table->name);
socklen_t optlen = sizeof(table->info);
if (getsockopt(fd, SOL_IP, ARPT_SO_GET_INFO, &table->info, &optlen)) {
switch (errno) {
case EPERM:
case ENOENT:
case ENOPROTOOPT:
continue;
}
exit(1);
}
if (table->info.size > sizeof(table->replace.entrytable))
exit(1);
if (table->info.num_entries > XT_MAX_ENTRIES)
exit(1);
struct arpt_get_entries entries;
memset(&entries, 0, sizeof(entries));
strcpy(entries.name, table->name);
entries.size = table->info.size;
optlen = sizeof(entries) - sizeof(entries.entrytable) + table->info.size;
if (getsockopt(fd, SOL_IP, ARPT_SO_GET_ENTRIES, &entries, &optlen))
exit(1);
table->replace.valid_hooks = table->info.valid_hooks;
table->replace.num_entries = table->info.num_entries;
table->replace.size = table->info.size;
memcpy(table->replace.hook_entry, table->info.hook_entry,
sizeof(table->replace.hook_entry));
memcpy(table->replace.underflow, table->info.underflow,
sizeof(table->replace.underflow));
memcpy(table->replace.entrytable, entries.entrytable, table->info.size);
}
close(fd);
}
static void reset_arptables()
{
int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (fd == -1) {
switch (errno) {
case EAFNOSUPPORT:
case ENOPROTOOPT:
return;
}
exit(1);
}
for (unsigned i = 0; i < sizeof(arpt_tables) / sizeof(arpt_tables[0]); i++) {
struct arpt_table_desc* table = &arpt_tables[i];
if (table->info.valid_hooks == 0)
continue;
struct arpt_getinfo info;
memset(&info, 0, sizeof(info));
strcpy(info.name, table->name);
socklen_t optlen = sizeof(info);
if (getsockopt(fd, SOL_IP, ARPT_SO_GET_INFO, &info, &optlen))
exit(1);
if (memcmp(&table->info, &info, sizeof(table->info)) == 0) {
struct arpt_get_entries entries;
memset(&entries, 0, sizeof(entries));
strcpy(entries.name, table->name);
entries.size = table->info.size;
optlen = sizeof(entries) - sizeof(entries.entrytable) + entries.size;
if (getsockopt(fd, SOL_IP, ARPT_SO_GET_ENTRIES, &entries, &optlen))
exit(1);
if (memcmp(table->replace.entrytable, entries.entrytable,
table->info.size) == 0)
continue;
} else {
}
struct xt_counters counters[XT_MAX_ENTRIES];
table->replace.num_counters = info.num_entries;
table->replace.counters = counters;
optlen = sizeof(table->replace) - sizeof(table->replace.entrytable) +
table->replace.size;
if (setsockopt(fd, SOL_IP, ARPT_SO_SET_REPLACE, &table->replace, optlen))
exit(1);
}
close(fd);
}
#define NF_BR_NUMHOOKS 6
#define EBT_TABLE_MAXNAMELEN 32
#define EBT_CHAIN_MAXNAMELEN 32
#define EBT_BASE_CTL 128
#define EBT_SO_SET_ENTRIES (EBT_BASE_CTL)
#define EBT_SO_GET_INFO (EBT_BASE_CTL)
#define EBT_SO_GET_ENTRIES (EBT_SO_GET_INFO + 1)
#define EBT_SO_GET_INIT_INFO (EBT_SO_GET_ENTRIES + 1)
#define EBT_SO_GET_INIT_ENTRIES (EBT_SO_GET_INIT_INFO + 1)
struct ebt_replace {
char name[EBT_TABLE_MAXNAMELEN];
unsigned int valid_hooks;
unsigned int nentries;
unsigned int entries_size;
struct ebt_entries* hook_entry[NF_BR_NUMHOOKS];
unsigned int num_counters;
struct ebt_counter* counters;
char* entries;
};
struct ebt_entries {
unsigned int distinguisher;
char name[EBT_CHAIN_MAXNAMELEN];
unsigned int counter_offset;
int policy;
unsigned int nentries;
char data[0] __attribute__((aligned(__alignof__(struct ebt_replace))));
};
struct ebt_table_desc {
const char* name;
struct ebt_replace replace;
char entrytable[XT_TABLE_SIZE];
};
static struct ebt_table_desc ebt_tables[] = {
{.name = "filter"},
{.name = "nat"},
{.name = "broute"},
};
static void checkpoint_ebtables(void)
{
int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (fd == -1) {
switch (errno) {
case EAFNOSUPPORT:
case ENOPROTOOPT:
return;
}
exit(1);
}
for (size_t i = 0; i < sizeof(ebt_tables) / sizeof(ebt_tables[0]); i++) {
struct ebt_table_desc* table = &ebt_tables[i];
strcpy(table->replace.name, table->name);
socklen_t optlen = sizeof(table->replace);
if (getsockopt(fd, SOL_IP, EBT_SO_GET_INIT_INFO, &table->replace,
&optlen)) {
switch (errno) {
case EPERM:
case ENOENT:
case ENOPROTOOPT:
continue;
}
exit(1);
}
if (table->replace.entries_size > sizeof(table->entrytable))
exit(1);
table->replace.num_counters = 0;
table->replace.entries = table->entrytable;
optlen = sizeof(table->replace) + table->replace.entries_size;
if (getsockopt(fd, SOL_IP, EBT_SO_GET_INIT_ENTRIES, &table->replace,
&optlen))
exit(1);
}
close(fd);
}
static void reset_ebtables()
{
int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (fd == -1) {
switch (errno) {
case EAFNOSUPPORT:
case ENOPROTOOPT:
return;
}
exit(1);
}
for (unsigned i = 0; i < sizeof(ebt_tables) / sizeof(ebt_tables[0]); i++) {
struct ebt_table_desc* table = &ebt_tables[i];
if (table->replace.valid_hooks == 0)
continue;
struct ebt_replace replace;
memset(&replace, 0, sizeof(replace));
strcpy(replace.name, table->name);
socklen_t optlen = sizeof(replace);
if (getsockopt(fd, SOL_IP, EBT_SO_GET_INFO, &replace, &optlen))
exit(1);
replace.num_counters = 0;
table->replace.entries = 0;
for (unsigned h = 0; h < NF_BR_NUMHOOKS; h++)
table->replace.hook_entry[h] = 0;
if (memcmp(&table->replace, &replace, sizeof(table->replace)) == 0) {
char entrytable[XT_TABLE_SIZE];
memset(&entrytable, 0, sizeof(entrytable));
replace.entries = entrytable;
optlen = sizeof(replace) + replace.entries_size;
if (getsockopt(fd, SOL_IP, EBT_SO_GET_ENTRIES, &replace, &optlen))
exit(1);
if (memcmp(table->entrytable, entrytable, replace.entries_size) == 0)
continue;
}
for (unsigned j = 0, h = 0; h < NF_BR_NUMHOOKS; h++) {
if (table->replace.valid_hooks & (1 << h)) {
table->replace.hook_entry[h] =
(struct ebt_entries*)table->entrytable + j;
j++;
}
}
table->replace.entries = table->entrytable;
optlen = sizeof(table->replace) + table->replace.entries_size;
if (setsockopt(fd, SOL_IP, EBT_SO_SET_ENTRIES, &table->replace, optlen))
exit(1);
}
close(fd);
}
static void checkpoint_net_namespace(void)
{
checkpoint_ebtables();
checkpoint_arptables();
checkpoint_iptables(ipv4_tables, sizeof(ipv4_tables) / sizeof(ipv4_tables[0]),
AF_INET, SOL_IP);
checkpoint_iptables(ipv6_tables, sizeof(ipv6_tables) / sizeof(ipv6_tables[0]),
AF_INET6, SOL_IPV6);
}
static void reset_net_namespace(void)
{
reset_ebtables();
reset_arptables();
reset_iptables(ipv4_tables, sizeof(ipv4_tables) / sizeof(ipv4_tables[0]),
AF_INET, SOL_IP);
reset_iptables(ipv6_tables, sizeof(ipv6_tables) / sizeof(ipv6_tables[0]),
AF_INET6, SOL_IPV6);
}
static void mount_cgroups(const char* dir, const char** controllers, int count)
{
if (mkdir(dir, 0777)) {
}
char enabled[128] = {0};
int i = 0;
for (; i < count; i++) {
if (mount("none", dir, "cgroup", 0, controllers[i])) {
continue;
}
umount(dir);
strcat(enabled, ",");
strcat(enabled, controllers[i]);
}
if (enabled[0] == 0)
return;
if (mount("none", dir, "cgroup", 0, enabled + 1)) {
}
if (chmod(dir, 0777)) {
}
}
static void setup_cgroups()
{
const char* unified_controllers[] = {"+cpu", "+memory", "+io", "+pids"};
const char* net_controllers[] = {"net", "net_prio", "devices", "blkio",
"freezer"};
const char* cpu_controllers[] = {"cpuset", "cpuacct", "hugetlb", "rlimit"};
if (mkdir("/syzcgroup", 0777)) {
}
if (mkdir("/syzcgroup/unified", 0777)) {
}
if (mount("none", "/syzcgroup/unified", "cgroup2", 0, NULL)) {
}
if (chmod("/syzcgroup/unified", 0777)) {
}
int unified_control =
open("/syzcgroup/unified/cgroup.subtree_control", O_WRONLY);
if (unified_control != -1) {
unsigned i;
for (i = 0;
i < sizeof(unified_controllers) / sizeof(unified_controllers[0]); i++)
if (write(unified_control, unified_controllers[i],
strlen(unified_controllers[i])) < 0) {
}
close(unified_control);
}
mount_cgroups("/syzcgroup/net", net_controllers,
sizeof(net_controllers) / sizeof(net_controllers[0]));
mount_cgroups("/syzcgroup/cpu", cpu_controllers,
sizeof(cpu_controllers) / sizeof(cpu_controllers[0]));
write_file("/syzcgroup/cpu/cgroup.clone_children", "1");
write_file("/syzcgroup/cpu/cpuset.memory_pressure_enabled", "1");
}
static void setup_cgroups_loop()
{
int pid = getpid();
char file[128];
char cgroupdir[64];
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/unified/syz%llu", procid);
if (mkdir(cgroupdir, 0777)) {
}
snprintf(file, sizeof(file), "%s/pids.max", cgroupdir);
write_file(file, "32");
snprintf(file, sizeof(file), "%s/memory.low", cgroupdir);
write_file(file, "%d", 298 << 20);
snprintf(file, sizeof(file), "%s/memory.high", cgroupdir);
write_file(file, "%d", 299 << 20);
snprintf(file, sizeof(file), "%s/memory.max", cgroupdir);
write_file(file, "%d", 300 << 20);
snprintf(file, sizeof(file), "%s/cgroup.procs", cgroupdir);
write_file(file, "%d", pid);
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/cpu/syz%llu", procid);
if (mkdir(cgroupdir, 0777)) {
}
snprintf(file, sizeof(file), "%s/cgroup.procs", cgroupdir);
write_file(file, "%d", pid);
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/net/syz%llu", procid);
if (mkdir(cgroupdir, 0777)) {
}
snprintf(file, sizeof(file), "%s/cgroup.procs", cgroupdir);
write_file(file, "%d", pid);
}
static void setup_cgroups_test()
{
char cgroupdir[64];
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/unified/syz%llu", procid);
if (symlink(cgroupdir, "./cgroup")) {
}
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/cpu/syz%llu", procid);
if (symlink(cgroupdir, "./cgroup.cpu")) {
}
snprintf(cgroupdir, sizeof(cgroupdir), "/syzcgroup/net/syz%llu", procid);
if (symlink(cgroupdir, "./cgroup.net")) {
}
}
static void setup_common()
{
if (mount(0, "/sys/fs/fuse/connections", "fusectl", 0, 0)) {
}
}
static void setup_binderfs()
{
if (mkdir("/dev/binderfs", 0777)) {
}
if (mount("binder", "/dev/binderfs", "binder", 0, NULL)) {
}
}
static void loop();
static void sandbox_common()
{
prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
setsid();
struct rlimit rlim;
rlim.rlim_cur = rlim.rlim_max = (200 << 20);
setrlimit(RLIMIT_AS, &rlim);
rlim.rlim_cur = rlim.rlim_max = 32 << 20;
setrlimit(RLIMIT_MEMLOCK, &rlim);
rlim.rlim_cur = rlim.rlim_max = 136 << 20;
setrlimit(RLIMIT_FSIZE, &rlim);
rlim.rlim_cur = rlim.rlim_max = 1 << 20;
setrlimit(RLIMIT_STACK, &rlim);
rlim.rlim_cur = rlim.rlim_max = 0;
setrlimit(RLIMIT_CORE, &rlim);
rlim.rlim_cur = rlim.rlim_max = 256;
setrlimit(RLIMIT_NOFILE, &rlim);
if (unshare(CLONE_NEWNS)) {
}
if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL)) {
}
if (unshare(CLONE_NEWIPC)) {
}
if (unshare(0x02000000)) {
}
if (unshare(CLONE_NEWUTS)) {
}
if (unshare(CLONE_SYSVSEM)) {
}
typedef struct {
const char* name;
const char* value;
} sysctl_t;
static const sysctl_t sysctls[] = {
{"/proc/sys/kernel/shmmax", "16777216"},
{"/proc/sys/kernel/shmall", "536870912"},
{"/proc/sys/kernel/shmmni", "1024"},
{"/proc/sys/kernel/msgmax", "8192"},
{"/proc/sys/kernel/msgmni", "1024"},
{"/proc/sys/kernel/msgmnb", "1024"},
{"/proc/sys/kernel/sem", "1024 1048576 500 1024"},
};
unsigned i;
for (i = 0; i < sizeof(sysctls) / sizeof(sysctls[0]); i++)
write_file(sysctls[i].name, sysctls[i].value);
}
static int wait_for_loop(int pid)
{
if (pid < 0)
exit(1);
int status = 0;
while (waitpid(-1, &status, __WALL) != pid) {
}
return WEXITSTATUS(status);
}
static void drop_caps(void)
{
struct __user_cap_header_struct cap_hdr = {};
struct __user_cap_data_struct cap_data[2] = {};
cap_hdr.version = _LINUX_CAPABILITY_VERSION_3;
cap_hdr.pid = getpid();
if (syscall(SYS_capget, &cap_hdr, &cap_data))
exit(1);
const int drop = (1 << CAP_SYS_PTRACE) | (1 << CAP_SYS_NICE);
cap_data[0].effective &= ~drop;
cap_data[0].permitted &= ~drop;
cap_data[0].inheritable &= ~drop;
if (syscall(SYS_capset, &cap_hdr, &cap_data))
exit(1);
}
static int do_sandbox_none(void)
{
if (unshare(CLONE_NEWPID)) {
}
int pid = fork();
if (pid != 0)
return wait_for_loop(pid);
setup_common();
initialize_vhci();
sandbox_common();
drop_caps();
initialize_netdevices_init();
if (unshare(CLONE_NEWNET)) {
}
initialize_netdevices();
initialize_wifi_devices();
setup_binderfs();
loop();
exit(1);
}
#define FS_IOC_SETFLAGS _IOW('f', 2, long)
static void remove_dir(const char* dir)
{
int iter = 0;
DIR* dp = 0;
retry:
while (umount2(dir, MNT_DETACH | UMOUNT_NOFOLLOW) == 0) {
}
dp = opendir(dir);
if (dp == NULL) {
if (errno == EMFILE) {
exit(1);
}
exit(1);
}
struct dirent* ep = 0;
while ((ep = readdir(dp))) {
if (strcmp(ep->d_name, ".") == 0 || strcmp(ep->d_name, "..") == 0)
continue;
char filename[FILENAME_MAX];
snprintf(filename, sizeof(filename), "%s/%s", dir, ep->d_name);
while (umount2(filename, MNT_DETACH | UMOUNT_NOFOLLOW) == 0) {
}
struct stat st;
if (lstat(filename, &st))
exit(1);
if (S_ISDIR(st.st_mode)) {
remove_dir(filename);
continue;
}
int i;
for (i = 0;; i++) {
if (unlink(filename) == 0)
break;
if (errno == EPERM) {
int fd = open(filename, O_RDONLY);
if (fd != -1) {
long flags = 0;
if (ioctl(fd, FS_IOC_SETFLAGS, &flags) == 0) {
}
close(fd);
continue;
}
}
if (errno == EROFS) {
break;
}
if (errno != EBUSY || i > 100)
exit(1);
if (umount2(filename, MNT_DETACH | UMOUNT_NOFOLLOW))
exit(1);
}
}
closedir(dp);
for (int i = 0;; i++) {
if (rmdir(dir) == 0)
break;
if (i < 100) {
if (errno == EPERM) {
int fd = open(dir, O_RDONLY);
if (fd != -1) {
long flags = 0;
if (ioctl(fd, FS_IOC_SETFLAGS, &flags) == 0) {
}
close(fd);
continue;
}
}
if (errno == EROFS) {
break;
}
if (errno == EBUSY) {
if (umount2(dir, MNT_DETACH | UMOUNT_NOFOLLOW))
exit(1);
continue;
}
if (errno == ENOTEMPTY) {
if (iter < 100) {
iter++;
goto retry;
}
}
}
exit(1);
}
}
static void kill_and_wait(int pid, int* status)
{
kill(-pid, SIGKILL);
kill(pid, SIGKILL);
for (int i = 0; i < 100; i++) {
if (waitpid(-1, status, WNOHANG | __WALL) == pid)
return;
usleep(1000);
}
DIR* dir = opendir("/sys/fs/fuse/connections");
if (dir) {
for (;;) {
struct dirent* ent = readdir(dir);
if (!ent)
break;
if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
continue;
char abort[300];
snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
ent->d_name);
int fd = open(abort, O_WRONLY);
if (fd == -1) {
continue;
}
if (write(fd, abort, 1) < 0) {
}
close(fd);
}
closedir(dir);
} else {
}
while (waitpid(-1, status, __WALL) != pid) {
}
}
static void setup_loop()
{
setup_cgroups_loop();
checkpoint_net_namespace();
}
static void reset_loop()
{
reset_net_namespace();
}
static void setup_test()
{
prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
setpgrp();
setup_cgroups_test();
write_file("/proc/self/oom_score_adj", "1000");
if (symlink("/dev/binderfs", "./binderfs")) {
}
}
static void close_fds()
{
for (int fd = 3; fd < MAX_FDS; fd++)
close(fd);
}
static void setup_binfmt_misc()
{
if (mount(0, "/proc/sys/fs/binfmt_misc", "binfmt_misc", 0, 0)) {
}
write_file("/proc/sys/fs/binfmt_misc/register", ":syz0:M:0:\x01::./file0:");
write_file("/proc/sys/fs/binfmt_misc/register",
":syz1:M:1:\x02::./file0:POC");
}
static void setup_sysctl()
{
char mypid[32];
snprintf(mypid, sizeof(mypid), "%d", getpid());
struct {
const char* name;
const char* data;
} files[] = {
{"/sys/kernel/debug/x86/nmi_longest_ns", "10000000000"},
{"/proc/sys/kernel/hung_task_check_interval_secs", "20"},
{"/proc/sys/net/core/bpf_jit_kallsyms", "1"},
{"/proc/sys/net/core/bpf_jit_harden", "0"},
{"/proc/sys/kernel/kptr_restrict", "0"},
{"/proc/sys/kernel/softlockup_all_cpu_backtrace", "1"},
{"/proc/sys/fs/mount-max", "100"},
{"/proc/sys/vm/oom_dump_tasks", "0"},
{"/proc/sys/debug/exception-trace", "0"},
{"/proc/sys/kernel/printk", "7 4 1 3"},
{"/proc/sys/net/ipv4/ping_group_range", "0 65535"},
{"/proc/sys/kernel/keys/gc_delay", "1"},
{"/proc/sys/vm/oom_kill_allocating_task", "1"},
{"/proc/sys/kernel/ctrl-alt-del", "0"},
{"/proc/sys/kernel/cad_pid", mypid},
};
for (size_t i = 0; i < sizeof(files) / sizeof(files[0]); i++) {
if (!write_file(files[i].name, files[i].data))
printf("write to %s failed: %s\n", files[i].name, strerror(errno));
}
}
#define NL802154_CMD_SET_SHORT_ADDR 11
#define NL802154_ATTR_IFINDEX 3
#define NL802154_ATTR_SHORT_ADDR 10
static void setup_802154()
{
int sock_route = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock_route == -1)
exit(1);
int sock_generic = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sock_generic < 0)
exit(1);
int nl802154_family_id =
netlink_query_family_id(&nlmsg, sock_generic, "nl802154", true);
for (int i = 0; i < 2; i++) {
char devname[] = "wpan0";
devname[strlen(devname) - 1] += i;
uint64_t hwaddr = 0xaaaaaaaaaaaa0002 + (i << 8);
uint16_t shortaddr = 0xaaa0 + i;
int ifindex = if_nametoindex(devname);
struct genlmsghdr genlhdr;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = NL802154_CMD_SET_SHORT_ADDR;
netlink_init(&nlmsg, nl802154_family_id, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, NL802154_ATTR_IFINDEX, &ifindex, sizeof(ifindex));
netlink_attr(&nlmsg, NL802154_ATTR_SHORT_ADDR, &shortaddr,
sizeof(shortaddr));
int err = netlink_send(&nlmsg, sock_generic);
if (err < 0) {
}
netlink_device_change(&nlmsg, sock_route, devname, true, 0, &hwaddr,
sizeof(hwaddr), 0);
if (i == 0) {
netlink_add_device_impl(&nlmsg, "lowpan", "lowpan0");
netlink_done(&nlmsg);
netlink_attr(&nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(&nlmsg, sock_route);
if (err < 0) {
}
}
}
close(sock_route);
close(sock_generic);
}
struct thread_t {
int created, call;
event_t ready, done;
};
static struct thread_t threads[16];
static void execute_call(int call);
static int running;
static void* thr(void* arg)
{
struct thread_t* th = (struct thread_t*)arg;
for (;;) {
event_wait(&th->ready);
event_reset(&th->ready);
execute_call(th->call);
__atomic_fetch_sub(&running, 1, __ATOMIC_RELAXED);
event_set(&th->done);
}
return 0;
}
static void execute_one(void)
{
int i, call, thread;
for (call = 0; call < 6; call++) {
for (thread = 0; thread < (int)(sizeof(threads) / sizeof(threads[0]));
thread++) {
struct thread_t* th = &threads[thread];
if (!th->created) {
th->created = 1;
event_init(&th->ready);
event_init(&th->done);
event_set(&th->done);
thread_start(thr, th);
}
if (!event_isset(&th->done))
continue;
event_reset(&th->done);
th->call = call;
__atomic_fetch_add(&running, 1, __ATOMIC_RELAXED);
event_set(&th->ready);
if (call == 3 || call == 4)
break;
event_timedwait(&th->done, 50);
break;
}
}
for (i = 0; i < 100 && __atomic_load_n(&running, __ATOMIC_RELAXED); i++)
sleep_ms(1);
close_fds();
}
static void execute_one(void);
#define WAIT_FLAGS __WALL
static void loop(void)
{
setup_loop();
int iter = 0;
for (;; iter++) {
char cwdbuf[32];
sprintf(cwdbuf, "./%d", iter);
if (mkdir(cwdbuf, 0777))
exit(1);
reset_loop();
int pid = fork();
if (pid < 0)
exit(1);
if (pid == 0) {
if (chdir(cwdbuf))
exit(1);
setup_test();
execute_one();
exit(0);
}
int status = 0;
uint64_t start = current_time_ms();
for (;;) {
if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
break;
sleep_ms(1);
if (current_time_ms() - start < 5000)
continue;
kill_and_wait(pid, &status);
break;
}
remove_dir(cwdbuf);
}
}
uint64_t r[2] = {0xffffffffffffffff, 0xffffffffffffffff};
void execute_call(int call)
{
intptr_t res = 0;
switch (call) {
case 0:
memcpy((void*)0x20000080, "/proc/self/exe\000", 15);
res = syscall(__NR_openat, 0xffffff9c, 0x20000080ul, 0ul, 0ul);
if (res != -1)
r[0] = res;
break;
case 1:
res = syscall(__NR_socket, 0xaul, 2ul, 0); // AF_INET6, SOCK_DGRAM
if (res != -1)
r[1] = res;
break;
case 2:
*(uint16_t*)0x20000000 = 0xa;
*(uint16_t*)0x20000002 = htobe16(0);
*(uint32_t*)0x20000004 = htobe32(0);
*(uint8_t*)0x20000008 = 0xfe;
*(uint8_t*)0x20000009 = 0x80;
memset((void*)0x2000000a, 0, 13);
*(uint8_t*)0x20000017 = 4;
*(uint32_t*)0x20000018 = 2;
syscall(__NR_connect, r[1], 0x20000000ul, 0x1cul);
break;
case 3:
*(uint16_t*)0x20000080 = 0xa;
*(uint16_t*)0x20000082 = htobe16(0);
*(uint32_t*)0x20000084 = htobe32(0);
memset((void*)0x20000088, 0, 10);
memset((void*)0x20000092, 255, 2);
*(uint32_t*)0x20000094 = htobe32(0xe0000001);
*(uint32_t*)0x20000098 = 0;
syscall(__NR_connect, r[1], 0x20000080ul, 0x1cul);
{
int i;
for (i = 0; i < 32; i++) {
syscall(__NR_connect, r[1], 0x20000080ul, 0x1cul);
}
}
break;
case 4:
*(uint32_t*)0x20000140 = 2;
syscall(__NR_setsockopt, r[1], 0x29, 1, 0x20000140ul, 4ul); // SO_WIFI_STATUS
break;
case 5:
syscall(__NR_sendfile, r[1], r[0], 0ul, 0xff01ul);
{
int i;
for (i = 0; i < 32; i++) {
syscall(__NR_sendfile, r[1], r[0], 0ul, 0xff01ul);
}
}
break;
}
}
int main(void)
{
syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
setup_sysctl();
setup_cgroups();
setup_binfmt_misc();
setup_802154();
for (procid = 0; procid < 8; procid++) {
if (fork() == 0) {
use_temporary_dir();
do_sandbox_none();
}
}
sleep(1000000);
return 0;
}
On Thu, Jan 5, 2023 at 4:17 PM Fedor Pchelkin <[email protected]> wrote:
>
> Syzkaller reports the following crash:
>
> kernel BUG at include/linux/skbuff.h:2311!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
> CPU: 2 PID: 4615 Comm: syz-executor260 Not tainted 5.10.152-syzkaller #0
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> RIP: 0010:__skb_pull include/linux/skbuff.h:2311 [inline]
> RIP: 0010:__ip_make_skb.cold+0x5b/0x5d net/ipv4/ip_output.c:1507
> Code: 79 a2 f9 8b 44 24 3c 89 ee 48 c7 c7 00 2e de 88 41 89 45 70 e8 d1 84 de ff 31 d2 4c 89 ee 48 c7 c7 40 2e de 88 e8 a1 26 ff ff <0f> 0b e8 63 79 a2 f9 e8 4e f7 e2 f9 48 c7 c7 40 39 de 88 e8 5e c1
> RSP: 0018:ffff88801e0af698 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffff88814c408288 RCX: ffffffff87ce00ec
> RDX: ffff888024a11ac0 RSI: ffffffff87ceccc6 RDI: 0000000000000001
> RBP: 0000000000000028 R08: 000000000000003b R09: ffff8880b8438ba7
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff8881436a9cc0
> R13: ffff8881436a9cc0 R14: ffff8881436a9e00 R15: dffffc0000000000
> FS: 00007f2750eb7700(0000) GS:ffff8880b8500000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4438a29010 CR3: 0000000021f58000 CR4: 0000000000350ee0
> Call Trace:
> ip_finish_skb include/net/ip.h:241 [inline]
> udp_push_pending_frames net/ipv4/udp.c:979 [inline]
> udp_sendpage+0x36e/0x570 net/ipv4/udp.c:1354
> inet_sendpage+0xd3/0x140 net/ipv4/af_inet.c:831
> kernel_sendpage.part.0+0x13c/0x280 net/socket.c:3514
> kernel_sendpage net/socket.c:3511 [inline]
> sock_sendpage+0xe5/0x140 net/socket.c:944
> pipe_to_sendpage+0x2af/0x380 fs/splice.c:364
> splice_from_pipe_feed fs/splice.c:418 [inline]
> __splice_from_pipe+0x3e5/0x840 fs/splice.c:562
> splice_from_pipe fs/splice.c:597 [inline]
> generic_splice_sendpage+0xd4/0x140 fs/splice.c:743
> do_splice_from fs/splice.c:764 [inline]
> direct_splice_actor+0x10f/0x170 fs/splice.c:933
> splice_direct_to_actor+0x38f/0x990 fs/splice.c:888
> do_splice_direct+0x1b3/0x280 fs/splice.c:976
> do_sendfile+0x553/0x10a0 fs/read_write.c:1257
> __do_sys_sendfile64 fs/read_write.c:1318 [inline]
> __se_sys_sendfile64 fs/read_write.c:1304 [inline]
> __x64_sys_sendfile64+0x1d0/0x210 fs/read_write.c:1304
> do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
> entry_SYSCALL_64_after_hwframe+0x61/0xc6
>
> It was actually found on a 5.10 kernel instance but I didn't find any
> upstream commit referencing something of that kind so this bug is likely
> to be in upstream, too. The reproducers unfortunately do not work on my
> machine, but I'll add one in the next email for additional info.
>
> From here the terminology is used as from the fragment of __ip_make_skb():
> ---
> if (skb->data < skb_network_header(skb))
> __skb_pull(skb, skb_network_offset(skb));
> while ((tmp_skb = __skb_dequeue(queue)) != NULL) {
> __skb_pull(tmp_skb, skb_network_header_len(skb)); <-- BUG is here
> ---
>
> We get the first fragment (called 'skb') from the queue and then start
> getting another fragments ('tmp_skb') from the queue and combine them to
> the first fragment.
>
> The problem is that the difference between tmp_skb->len and
> tmp_skb->data_len is smaller than the length of skb network (IP) header
> and while doing __skb_pull(), tmp_skb->len becomes smaller than
> tmp_skb->data_len causing a bug.
>
> Something is probably wrong with IP header layout of the first fragment
> (the SKB to which we are combining another ones). It is 40 bytes long, and
> the tmp_skb IP header's length is 20 bytes (have a look at debug info
> lower). The first fragment, however, can contain some specific IP options
> which are stored only in this fragment and are not included into the
> following ones. On the other hand, the problem can be with tmp_skb where
> something casued its data_len be incorrect.
The assumption in the code that the network header length is the same
for all fragments apparently does not hold.
ip_append_page reserves head room as follows:
fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
[..]
skb_reset_network_header(skb);
skb->transport_header = (skb->network_header +
fragheaderlen);
The reproducer you shared makes no explicit system call that modifies
optional header length, like IP_OPTIONS.
But it opens a PF_INET6/SOCK_DGRAM socket, and yet the stack trace
shows an IPv4 stack.
The repro shows multiple connect calls.
Perhaps it manages to switch between IPv6 and v4-mapped-v6 socket in
between sendmsg/sendpage calls?
Changing the argument to skb_network_header_len in
`__skb_pull(tmp_skb, skb_network_header_len(skb));` might
superficially address the out-of-bounds read. But it should probably
not be possible to switch between IPv6 and IPv4 while a datagram is
being constructed.
> Maybe there is some sanity check missing while constructing an IP datagram
> in ip_append_page()? Additional check of extra IP headers or MTU
> values...?
>
> We managed to get some debug info about the failing SKBs.
>
> tmp_skb info:
> skb len=1476 headroom=160 headlen=20 tailroom=0
> mac=(-1,-1) net=(160,20) trans=180
> shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
> hash(0xe62f05e3 sw=0 l4=1) proto=0x0000 pkttype=0 iif=0
> sk family=2 type=2 proto=17
> skb linear: 00000000: 00 00 00 00 00 00 00 00 30 06 a9 86 ff ff ff ff
> skb linear: 00000010: 02 00 00 00
> skb frag: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00
>
> skb info:
> skb len=1476 headroom=168 headlen=241 tailroom=0
> mac=(-1,-1) net=(168,40) trans=208
> shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
> csum(0x3f886391 ip_summed=0 complete_sw=0 valid=0 level=0)
> hash(0xe62f05e3 sw=0 l4=1) proto=0x86dd pkttype=0 iif=0
> sk family=2 type=2 proto=17
> skb linear: 00000000: 00 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00
> skb linear: 00000010: 80 44 f9 8a ff ff ff ff 00 00 00 00 00 00 00 00
> skb linear: 00000020: 00 00 00 00 00 00 00 00 fe ff ff ff 00 00 00 00
> skb linear: 00000030: f3 57 61 ad ca 6f 38 25 00 62 d5 b1 7d e1 d0 94
> skb linear: 00000040: 04 ae 20 57 20 1a 06 db 10 92 76 4f 8d 2e af 83
> skb linear: 00000050: 91 0c c3 cd b5 d1 96 9e c8 7e c8 e5 90 a9 be aa
> skb linear: 00000060: ae f8 7d 1d bb af 99 62 36 3f c9 a3 44 4e 18 fa
> skb linear: 00000070: 0e 5f 40 32 59 ad 8b 90 89 df 79 63 13 80 da 2b
> skb linear: 00000080: de 47 62 24 61 fd 47 d8 89 4a 74 8a 91 32 aa c6
> skb linear: 00000090: ad 59 30 2f 2c 2e 94 9f 83 00 46 5b f9 11 98 a9
> skb linear: 000000a0: cb ed ca cb 8d 70 d8 78 4c 95 b6 12 af 8b 81 33
> skb linear: 000000b0: 13 58 9d 74 ef 30 91 1d 10 bd 55 22 67 6b b9 43
> skb linear: 000000c0: 97 72 5c a2 c7 24 df f4 2c 3f b8 5e cb 3a 10 f6
> skb linear: 000000d0: 10 5f 3a 11 32 2a d5 22 b2 14 73 c0 1a df b0 6f
> skb linear: 000000e0: 3c 98 91 df bf d2 b6 99 ee 3c fe 91 98 f6 55 20
> skb linear: 000000f0: 45
> skb frag: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> skb frag: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> More info:
> tmp_skb->len = 1476; tmp_skb->data_len = 1456
> skb->len = 1476; skb->data_len = 1235
>
> I actually found some commits with resembling call trace but they don't help
> me that much to solve the issue:
> 10b8a3de603d ("ipv6: the entire IPv6 header chain must fit the first fragment")
> e9d3f80935b6 ("net/af_packet: make sure to pull mac header")
> 501a90c94510 ("inet: protect against too small mtu values.")
>
> Fedor