From: Dmitry Safonov
To: linux-kernel@vger.kernel.org
Cc: Dmitry Safonov, "David S. Miller", Herbert Xu, Steffen Klassert,
    Dmitry Safonov <0x7f454c46@gmail.com>, netdev@vger.kernel.org,
    Andy Lutomirski, Ard Biesheuvel, "H. Peter Anvin", Ingo Molnar,
    John Stultz, "Kirill A. Shutemov", Oleg Nesterov, Stephen Boyd,
    Steven Rostedt, Thomas Gleixner, x86@kernel.org,
    linux-efi@vger.kernel.org, Andrew Morton, Greg Kroah-Hartman,
    Mauro Carvalho Chehab, Shuah Khan, linux-kselftest@vger.kernel.org,
    Eric Paris, Florian Westphal, Jozsef Kadlecsik, Pablo Neira Ayuso,
    Paul Moore, coreteam@netfilter.org, linux-audit@redhat.com,
    netfilter-devel@vger.kernel.org, Fan Du
Subject: [PATCH 00/18] xfrm: Add compat layer
Date: Thu, 26 Jul 2018 03:31:26 +0100
Message-Id: <20180726023144.31066-1-dima@arista.com>
X-Mailer: git-send-email 2.13.6
X-Mailing-List: linux-kernel@vger.kernel.org

Due to a historical mistake, the xfrm user ABI differs between native
and compat applications. The difference is in structure padding and, as
a result, in the size of netlink messages. As the ABI is already visible
to userspace, it cannot be fixed by packing the structures.
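
To make the padding difference concrete, here is a minimal sketch. The
structure `demo_msg` and the helper `demo_abi_size_delta()` are
hypothetical and are not taken from the xfrm headers or from this
series; the `packed` attribute is the GCC/Clang spelling. For this
particular structure, the packed variant has the same size and field
offsets as a 32-bit x86 build would give the plain one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message layout, NOT a real xfrm structure. */
struct demo_msg {
	uint64_t bytes;
	uint32_t flags;
	/* ABIs that align uint64_t to 8 bytes add 4 bytes of tail padding */
};

/* Same fields without tail padding (GCC/Clang attribute, an assumption). */
struct demo_msg_packed {
	uint64_t bytes;
	uint32_t flags;
} __attribute__((packed));

/* How many bytes longer the native message is than the packed one. */
static size_t demo_abi_size_delta(void)
{
	assert(sizeof(struct demo_msg) >= sizeof(struct demo_msg_packed));
	return sizeof(struct demo_msg) - sizeof(struct demo_msg_packed);
}
```

On a typical x86_64 build the delta should be 4 bytes, while the same
code built with -m32 should report 0. A length check on the netlink
payload therefore passes for one ABI and fails for the other.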

The ability of compat applications to manage xfrm tunnels was disabled
by commit 19d7df69fdb2 ("xfrm: Refuse to insert 32 bit userspace socket
policies on 64 bit systems") and commit 74005991b78a ("xfrm: Do not
parse 32bits compiled xfrm netlink msg on 64bits host").

For some wonderful reasons and brilliant architectural decisions in how
the userspace was created, Arista switches still run a 32-bit userspace
on a 64-bit kernel. There is a slow movement towards a full 64-bit
build, but it's not here yet. As the switches need support for ipsec
tunnels, the local kernel reverts the patches mentioned above that
disable xfrm for compat apps. On top of that, there is a bunch of
disgraceful hacks in userspace to work around the size check for netlink
messages and all that jazz.

It looks like we're not the only ones who want compat xfrm; there were
a couple of attempts to make it work:
  https://lkml.org/lkml/2017/1/20/733
  https://patchwork.ozlabs.org/patch/44600/
  http://netdev.vger.kernel.narkive.com/2Gesykj6/patch-net-next-xfrm-correctly-parse-netlink-msg-from-32bits-ip-command-on-64bits-host

All the discussions end in the conclusion that xfrm should have a full
compat layer to work correctly with 32-bit applications on 64-bit
kernels:
  https://lkml.org/lkml/2017/1/23/413
  https://patchwork.ozlabs.org/patch/433279/

In a recent lkml discussion, Linus said that it's worth fixing this
problem rather than giving people an excuse to stay on a 32-bit kernel:
  https://lkml.org/lkml/2018/2/13/752

So, here I add a compat layer to xfrm.
As xfrm uses netlink notifications, the kernel should send them in the
ABI format that the receiving application can parse. The proposed
solution is to record the ABI of the bind() syscall. As an
implementation detail, this creates kernel-hidden netlink groups, not
visible to userspace, for compat applications.
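
The group idea can be sketched roughly as below. This is only an
illustration under stated assumptions, not the code from the patches:
the group count `DEMO_USER_GROUPS`, the helpers `demo_bind_group()` and
`demo_group_abi()`, and the choice to place hidden groups directly after
the visible ones are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: groups 1..DEMO_USER_GROUPS are visible to userspace. */
#define DEMO_USER_GROUPS 8u

enum demo_abi { DEMO_ABI_NATIVE, DEMO_ABI_COMPAT };

/*
 * Pick the in-kernel group for a bind() request. A compat caller is
 * moved to a hidden group past the visible range. Returns 0 when the
 * requested group does not exist.
 */
static uint32_t demo_bind_group(uint32_t requested, bool compat_syscall)
{
	if (requested == 0 || requested > DEMO_USER_GROUPS)
		return 0;
	return compat_syscall ? requested + DEMO_USER_GROUPS : requested;
}

/* Which message layout the listeners of an in-kernel group expect. */
static enum demo_abi demo_group_abi(uint32_t internal_group)
{
	assert(internal_group >= 1 && internal_group <= 2 * DEMO_USER_GROUPS);
	return internal_group > DEMO_USER_GROUPS ? DEMO_ABI_COMPAT
						 : DEMO_ABI_NATIVE;
}
```

With a mapping of this kind, the sender of a notification would build
the message twice, once per layout, and multicast each copy to the
matching group, so every listener only sees the format it was bound
with.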

The first two patches simplify ifdeffery. I submitted them a while ago
and am resending them here for completeness:
  https://lore.kernel.org/lkml/20180717005004.25984-1-dima@arista.com/T/#u

There is also an exhaustive selftest for ipsec tunnels, which also
checks that the kernel correctly parses the structures that differ in
size. It doesn't depend on any library, and the compat version can be
easily built with:
  make CFLAGS=-m32 net/ipsec

Cc: "David S. Miller"
Cc: Herbert Xu
Cc: Steffen Klassert
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: netdev@vger.kernel.org

Dmitry Safonov (18):
  x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT
  compat: Cleanup in_compat_syscall() callers
  selftest/net/xfrm: Add test for ipsec tunnel
  net/xfrm: Add _packed types for compat users
  net/xfrm: Parse userspi_info{,_packed} depending on syscall
  netlink: Do not subscribe to non-existent groups
  netlink: Pass groups pointer to .bind()
  xfrm: Add in-kernel groups for compat notifications
  xfrm: Dump usersa_info in compat/native formats
  xfrm: Send state notifications in compat format too
  xfrm: Add compat support for xfrm_user_expire messages
  xfrm: Add compat support for xfrm_userpolicy_info messages
  xfrm: Add compat support for xfrm_user_acquire messages
  xfrm: Add compat support for xfrm_user_polexpire messages
  xfrm: Check compat acquire listeners in xfrm_is_alive()
  xfrm: Notify compat listeners about policy flush
  xfrm: Notify compat listeners about state flush
  xfrm: Enable compat syscalls

 MAINTAINERS                            |    1 +
 arch/x86/include/asm/compat.h          |    9 +-
 arch/x86/include/asm/ftrace.h          |    4 +-
 arch/x86/kernel/process_64.c           |    4 +-
 arch/x86/kernel/sys_x86_64.c           |   11 +-
 arch/x86/mm/hugetlbpage.c              |    4 +-
 arch/x86/mm/mmap.c                     |    2 +-
 drivers/firmware/efi/efivars.c         |   16 +-
 include/linux/compat.h                 |    4 +-
 include/linux/netlink.h                |    2 +-
 include/net/xfrm.h                     |   14 -
 kernel/audit.c                         |    2 +-
 kernel/time/time.c                     |    2 +-
 net/core/rtnetlink.c                   |   14 +-
 net/core/sock_diag.c                   |   25 +-
 net/netfilter/nfnetlink.c              |   24 +-
 net/netlink/af_netlink.c               |   28 +-
 net/netlink/af_netlink.h               |    4 +-
 net/netlink/genetlink.c                |   26 +-
 net/xfrm/xfrm_state.c                  |    5 -
 net/xfrm/xfrm_user.c                   |  690 ++++++++---
 tools/testing/selftests/net/.gitignore |    1 +
 tools/testing/selftests/net/Makefile   |    1 +
 tools/testing/selftests/net/ipsec.c    | 1987 ++++++++++++++++++++++++++++++++
 24 files changed, 2612 insertions(+), 268 deletions(-)
 create mode 100644 tools/testing/selftests/net/ipsec.c

-- 
2.13.6