Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp2145002ybi; Sun, 28 Jul 2019 02:59:41 -0700 (PDT) X-Google-Smtp-Source: APXvYqziHMu+jBSEYuwjyL6I/O3lSQd9kBbf3ZQQBGQCWllStW05I3kN4pDRF8mlL1LtY7b4yUUk X-Received: by 2002:a65:6104:: with SMTP id z4mr61505949pgu.27.1564307981561; Sun, 28 Jul 2019 02:59:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1564307981; cv=none; d=google.com; s=arc-20160816; b=vh6sx8ytV3ztJ8rdamBZPMF5oXrIkikaeulfYppUis+hD9As0l/Ffnr+4u1IDsZ8bO 5iLdFoIlRXCcwSMjzwo3oRRexjEefHF7D8C15/VoEsnvB3Xlew8beZ8rv/rGCRK6qNTr tjPKoBatWjjhcyo8o9pLS8OgZTsfTUt0YGHjdzCVtO4nOazw4mNka7JyKNOQco/3CcBv fF2qAgU/Zk1Xq9zflZ43mhRw2+QVd3yZxcelKLtJixYf2buTjZYh4yQ6m3wPAoZeXcbE gQzhe2qXDIOO19rLgJ8JmxQyshJbcMAlvrxqu7bpR/40qqsdt+jZqmkZuMWHyXZxtEiZ HOpA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version; bh=CgCTbuELxfZzxTSEknSndN9sTLLuOVb2vt6aeuNUBPg=; b=H0zn0jbV6cznYKIR5d5pGbYXcjL43Gl7Oln3km8Y2Bxcn5wutPtQYwnnC4aMh86yHJ X+cMpix5KfdgRV/TwHOcAamA+v85BCk3MtiXfGxchvWFSrf1jG+g/cCRtyL0kmr0n53P m279V3MQJ4PoFuF0IIwzEmb2B/HKd2mJAtToAILFxlAIqZqGV438hwoBC5KygdBKOFJ4 EUxIZBji6XKYAVRD+aj1Kcf/jW0Lid9JtfPZGp3Ks+HcfKW/GJMdKckQX9MupdNqTGJU QkcA/SVLUUOoxncDlef1/g7z9TR3YxQWSOSpkirIFAap+HbOOuQh8YVKyrZbQfsSU+QW mHAg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x18si25733656plm.292.2019.07.28.02.59.25; Sun, 28 Jul 2019 02:59:41 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726107AbfG1J6R (ORCPT + 99 others); Sun, 28 Jul 2019 05:58:17 -0400 Received: from mail-qt1-f196.google.com ([209.85.160.196]:43524 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726082AbfG1J6R (ORCPT ); Sun, 28 Jul 2019 05:58:17 -0400 Received: by mail-qt1-f196.google.com with SMTP id w17so12550799qto.10 for ; Sun, 28 Jul 2019 02:58:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=CgCTbuELxfZzxTSEknSndN9sTLLuOVb2vt6aeuNUBPg=; b=EGSpe367S6+uVbAREiugKiX0FaDGPBwwp4aooLrTemY5MKdbDUjER8czEU08//2vNn YrPc5zkbxUsTObyHsvvVFUg1OR/iJqsbfNBgrF5ysAozyrAKDiI5Al9hqpgtrJsXebrr bJ6p9qKfq5a9HcExNNPhyeQQ1ei2uHxG/Mh9lGFTg8oKm3eskQbXgMAeQD8q9Z7EnQsD xonGYQa5v7rOf9uhel4vIPUoabrrWbKO83h9ZNttJjitkm8Er/7LNk8PSuA2Cyjn3FrF pH5B48+BYCRoUnYrV9zVg4ozlvmK0FUomtIyLEAWeHLqOzikZnbT7CEMzP8iISYYOsfw J8Jw== X-Gm-Message-State: APjAAAWUbc6LGhWkgRGXGUmFAuciQ69dVJNr00Sw8yKBHEn9XHjWb6mT Gp6qFM7KD/SxhrniGvMlUI+VS6+uip58OF0W5sw= X-Received: by 2002:aed:33a4:: with SMTP id v33mr71591396qtd.18.1564307895541; Sun, 28 Jul 2019 02:58:15 -0700 (PDT) MIME-Version: 1.0 References: <201907221012.41504DCD@keescook> <201907221135.2C2D262D8@keescook> <201907221620.F31B9A082@keescook> <201907231437.DB20BEBD3@keescook> <201907231636.AD3ED717D@keescook> <20190726180103.GE3188@linux.intel.com> In-Reply-To: From: Arnd Bergmann Date: Sun, 28 Jul 2019 11:57:58 +0200 Message-ID: Subject: Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386 To: Andy Lutomirski Cc: Sean Christopherson , Thomas Gleixner , Kees Cook , Vincenzo Frascino , X86 ML , LKML , Paul Bolle Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 27, 2019 at 7:53 PM Andy Lutomirski wrote: > > On Fri, Jul 26, 2019 at 11:01 AM Sean Christopherson > wrote: > > > > On Wed, Jul 24, 2019 at 01:56:34AM +0200, Thomas Gleixner wrote: > > > On Tue, 23 Jul 2019, Kees Cook wrote: > > > > > > > On Wed, Jul 24, 2019 at 12:59:03AM +0200, Thomas Gleixner wrote: > > > > > And as we have sys_clock_gettime64() exposed for 32bit anyway you need to > > > > > deal with that in seccomp independently of the VDSO. It does not make sense > > > > > to treat sys_clock_gettime() differently than sys_clock_gettime64(). They > > > > > both expose the same information, but the latter is y2038 safe. > > > > > > > > Okay, so combining Andy's ideas on aliasing and "more seccomp flags", > > > > we could declare that clock_gettime64() is not filterable on 32-bit at > > > > all without the magic SECCOMP_IGNORE_ALIASES flag or something. Then we > > > > would alias clock_gettime64 to clock_gettime _before_ the first evaluation > > > > (unless SECCOMP_IGNORE_ALIASES is set)? > > > > > > > > (When was clock_gettime64() introduced? Is it too long ago to do this > > > > "you can't filter it without a special flag" change?) > > > > > > clock_gettime64() and the other sys_*time64() syscalls which address the > > > y2038 issue were added in 5.1 > > > > Paul Bolle pointed out that this regression showed up in v5.3-rc1, not > > v5.2. In Paul's case, systemd-journal is failing. > > I think it's getting quite late to start inventing new seccomp > features to fix this. I think the right solution for 5.3 is to change > the 32-bit vdso fallback to use the old clock_gettime, i.e. > clock_gettime32. This is obviously not an acceptable long-term > solution. I think there is something else wrong with the fallback path, it seems to pass the wrong structure in some cases: arch/x86/include/asm/vdso/gettimeofday.h vdso32: static __always_inline long clock_gettime_fallback(clockid_t _clkid, struct __kernel_timespec *_ts) { long ret; asm ( "mov %%ebx, %%edx \n" "mov %[clock], %%ebx \n" "call __kernel_vsyscall \n" "mov %%edx, %%ebx \n" : "=a" (ret), "=m" (*_ts) : "0" (__NR_clock_gettime64), [clock] "g" (_clkid), "c" (_ts) : "edx"); return ret; } arch/x86/include/asm/vdso/gettimeofday.h vdso64: static __always_inline long clock_gettime_fallback(clockid_t _clkid, struct __kernel_timespec *_ts) { long ret; asm ("syscall" : "=a" (ret), "=m" (*_ts) : "0" (__NR_clock_gettime), "D" (_clkid), "S" (_ts) : "rcx", "r11"); return ret; } lib/vdso/gettimeofday.c: static __maybe_unused int __cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res) { struct __kernel_timespec ts; int ret; if (res == NULL) goto fallback; ret = __cvdso_clock_gettime(clock, &ts); if (ret == 0) { res->tv_sec = ts.tv_sec; res->tv_nsec = ts.tv_nsec; } return ret; fallback: return clock_gettime_fallback(clock, (struct __kernel_timespec *)res); } So we get an 'old_timespec32' pointer from user space, and cast it to __kernel_timespec in order to pass it to the low-level function that actually fills in the 64-bit structure. On a little-endian machine, the first four bytes are actually correct here, but this is followed by tv_nsec=0 and 8 more bytes that overwrite whatever comes after the user space 'timespec'. [I missed the typecast as an indication of a bug during my review, sorry about that]. I think adding a clock_gettime32_fallback() function that calls __NR_clock_gettime is both the simplest fix for this bug, and the least ugly way to handle it in the long run. We also need to decide what to do about __cvdso_clock_gettime32() once we add a compile-time option to make all time32 syscalls to return an error. Returning -ENOSYS from the clock_gettime32() fallback is probably a good idea, but for consistency the __vdso_clock_gettime() call should either always return the same in that configuration, or be left out from the vdso build endirely. Arnd