Subject: Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration
To: Mathieu Desnoyers
Cc: Linus Torvalds, Andy Lutomirski, linux-kernel, linux-api,
 Peter Zijlstra, "Paul E. McKenney", Boqun Feng, Andrew Hunter,
 maged michael, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
 Dave Watson, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Andrea Parri,
 "Russell King, ARM Linux", Greg Hackmann, Will Deacon, David Sehr, x86
References: <20171110211249.10742-1-mathieu.desnoyers@efficios.com>
 <885227610.13045.1510351034488.JavaMail.zimbra@efficios.com>
 <617343212.13932.1510592207202.JavaMail.zimbra@efficios.com>
 <4d47fbb8-8f99-19d3-a9cf-66841aeffac3@scylladb.com>
 <4431530.14831.1510672632887.JavaMail.zimbra@efficios.com>
From: Avi Kivity
Organization: ScyllaDB
Message-ID: <690d1bff-8447-a294-3222-d6f134d276a9@scylladb.com>
Date: Tue, 14 Nov 2017 17:42:42 +0200
In-Reply-To: <4431530.14831.1510672632887.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/14/2017 05:17 PM, Mathieu Desnoyers wrote:
> ----- On Nov 14, 2017, at 9:53 AM, Avi Kivity avi@scylladb.com wrote:
>
>> On 11/13/2017 06:56 PM, Mathieu Desnoyers wrote:
>>> ----- On Nov 10, 2017, at 4:57 PM, Mathieu Desnoyers
>>> mathieu.desnoyers@efficios.com wrote:
>>>
>>>> ----- On Nov 10, 2017, at 4:36 PM, Linus Torvalds torvalds@linux-foundation.org
>>>> wrote:
>>>>
>>>>> On Fri, Nov 10, 2017 at 1:12 PM, Mathieu Desnoyers
>>>>> wrote:
>>>>>> x86 can return to user-space through sysexit and sysretq, which are not
>>>>>> core serializing. This breaks expectations from user-space about
>>>>>> sequential consistency from a single-threaded self-modifying program
>>>>>> point of view in specific migration patterns.
>>>>>>
>>>>>> Feedback is welcome,
>>>>> We should check with Intel. I would actually be surprised if the I$
>>>>> can be out of sync with the D$ after a sysretq. It would actually
>>>>> break things like "read code from disk" too in theory.
>>>> That core serializing instruction is not that much about I$ vs D$
>>>> consistency, but rather about the processor speculatively executing code
>>>> ahead of its retirement point. Ref. Intel Architecture Software Developer's
>>>> Manual, Volume 3: System Programming.
>>>>
>>>> 7.1.3. "Handling Self- and Cross-Modifying Code":
>>>>
>>>> "The act of a processor writing data into a currently executing code segment
>>>> with the intent of
>>>> executing that data as code is called self-modifying code. Intel Architecture
>>>> processors exhibit
>>>> model-specific behavior when executing self-modified code, depending upon how
>>>> far ahead of
>>>> the current execution pointer the code has been modified. As processor
>>>> architectures become
>>>> more complex and start to speculatively execute code ahead of the retirement
>>>> point (as in the P6
>>>> family processors), the rules regarding which code should execute, pre- or
>>>> post-modification,
>>>> become blurred. [...]"
>>>>
>>>> AFAIU, this core serializing instruction seems to be needed for use-cases of
>>>> self-modifying code, but not for the initial load of a program from disk,
>>>> as the processor has no way to have speculatively executed any of its
>>>> instructions.
>>> I figured out what you're pointing to: if exec() is executed by a previously
>>> running thread, and there is no core serializing instruction between program
>>> load and return to user-space, the kernel ends up acting like a JIT, indeed.
>> I think that's safe. The kernel has to execute a MOV CR3 instruction
>> before it can execute code loaded by exec, and that is a serializing
>> instruction. Loading and unloading shared libraries is made safe by the
>> IRET executed by page faults (loading) and TLB shootdown IPIs (unloading).
> Very good points! Perhaps those guarantees should be documented somewhere ?
>
>> Directly modifying code in userspace is unsafe if there is some
>> non-coherent instruction cache. Instruction fetch and speculative
>> execution are non-coherent, but they're probably too short (in current
>> processors) to matter. Trace caches are probably large enough, but I
>> don't know whether they are coherent or not.
> Android guys at Google have reproducers of context synchronization issues
> on arm 64 in JIT scenarios. Based on the information I got, flushing the
> instruction caches is not enough: they also need to issue a context
> synchronizing instruction.
>
> Perhaps the current Intel processors may have short enough speculative
> execution and small enough trace caches, but relying on this without
> a clear statement from Intel seems fragile.

A small trace cache is still vulnerable; the question is whether it is
coherent or not.

> I've tried to create a small single-threaded self-modifying loop in
> user-space to trigger a trace cache or speculative execution quirk,
> but I have not succeeded yet. I suspect that I would need to know
> more about the internals of the processor architecture to create the
> right stalls that would allow speculative execution to move further
> ahead, and trigger an incoherent execution flow. Ideas on how to
> trigger this would be welcome.
>

Intel processors resynchronize as soon as you jump (in single-threaded
execution), so you need to update ahead of the current instruction
pointer to see something.

Not sure what quirk you're interested in seeing: executing the old code?
That's not very exciting.
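
To make the last point concrete (a store has to land ahead of the current
instruction pointer, with no jump or serializing instruction in between, for
stale code to be observable), here is a toy model in Python. Everything in it
is an assumption made for illustration: the instruction names, the
PREFETCH_DEPTH value and the run() helper are hypothetical, and the model
deliberately assumes a prefetch queue that is not coherent with stores. It
does not describe any real Intel microarchitecture.

```python
from collections import deque

PREFETCH_DEPTH = 4  # hypothetical window size, chosen only for illustration


def run(program, prefetch_depth=PREFETCH_DEPTH):
    """Run a toy program and return the list of emitted values.

    Hypothetical instruction set:
      ("emit", v)           append v to the output
      ("patch", addr, insn) overwrite memory[addr] with insn (self-modification)
      ("jmp", addr)         jump to addr and discard the prefetch queue
      ("serialize",)        discard the prefetch queue, then fall through
      ("halt",)             stop
    """
    memory = list(program)
    queue = deque()  # (addr, insn) copies fetched ahead of execution
    fetch_pc = 0
    out = []
    while True:
        # Fetch ahead. Assumption: later stores do NOT update these copies.
        while len(queue) < prefetch_depth and fetch_pc < len(memory):
            queue.append((fetch_pc, memory[fetch_pc]))
            fetch_pc += 1
        if not queue:
            break  # ran off the end of memory
        addr, insn = queue.popleft()
        op = insn[0]
        if op == "emit":
            out.append(insn[1])
        elif op == "patch":
            memory[insn[1]] = insn[2]
        elif op == "jmp":
            queue.clear()          # resynchronize on a taken jump
            fetch_pc = insn[1]
        elif op == "serialize":
            queue.clear()          # resynchronize, continue at next address
            fetch_pc = addr + 1
        elif op == "halt":
            break
    return out


NEW = ("emit", "new")

# Patch the very next instruction: it is already in the queue, so the old
# copy is what executes in this model.
stale = [("patch", 1, NEW), ("emit", "old"), ("halt",)]

# Same patch, but with a serializing step or a jump before the target.
serialized = [("patch", 2, NEW), ("serialize",), ("emit", "old"), ("halt",)]
jumped = [("patch", 2, NEW), ("jmp", 2), ("emit", "old"), ("halt",)]
```

In this model run(stale) executes the old instruction, while run(serialized)
and run(jumped) execute the patched one. Shrinking the window with
run(stale, prefetch_depth=1) also yields the patched instruction, which is
the "too short to matter" case discussed above.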