From: Andy Lutomirski
Date: Mon, 14 Jan 2019 19:05:01 -0800
Subject: Re: [PATCH v3 0/6] Static calls
In-Reply-To: <8ca16cca-101d-1d1b-b3da-c9727665fec8@zytor.com>
To: "H. Peter Anvin"
Cc: Jiri Kosina, Linus Torvalds, Josh Poimboeuf, Nadav Amit, Andy Lutomirski, Peter Zijlstra, "the arch/x86 maintainers", Linux List Kernel Mailing, Ard Biesheuvel, Steven Rostedt, Ingo Molnar, Thomas Gleixner, Masami Hiramatsu, Jason Baron, David Laight, Borislav Petkov, Julia Cartwright, Jessica Yu, Rasmus Villemoes, Edward Cree, Daniel Bristot de Oliveira

On Mon, Jan 14, 2019 at 2:55 PM H. Peter Anvin wrote:
>
> I think this sequence ought to work (keep in mind we are already under a
> mutex, so the global data is safe even if we are preempted):

I'm trying to wrap my head around this.  The states are:

0: normal operation
1: writing 0xcc, can be canceled
2: writing final instruction.  The 0xcc was definitely synced to all CPUs.
3: patch is definitely installed but maybe not sync_cored.

>
>         set up page table entries
>         invlpg
>         set up bp patching global data
>
>         cpu = get_cpu()
>
>         bp_old_value = atomic_read(bp_write_addr)
>
>         do {

So we're assuming that the state is 0 here.  A WARN_ON_ONCE to check
that would be nice.

>                 atomic_write(&bp_poke_state, 1)
>
>                 atomic_write(bp_write_addr, 0xcc)
>
>                 mask <- online_cpu_mask - self
>                 send IPIs
>                 wait for mask = 0
>
>         } while (cmpxchg(&bp_poke_state, 1, 2) != 1);
>
>         patch sites, remove breakpoints after patching each one

Not sure what you mean by patch *sites*.  As written, this only
supports one patch site at a time, since there's only one
bp_write_addr, and fixing that may be complicated.  Not fixing it
might also be a scalability problem.

>
>         atomic_write(&bp_poke_state, 3);
>
>         mask <- online_cpu_mask - self
>         send IPIs
>         wait for mask = 0
>
>         atomic_write(&bp_poke_state, 0);
>
>         tear down patching global data
>         tear down page table entries
>
>
>
> The #BP handler would then look like:
>
>         state = cmpxchg(&bp_poke_state, 1, 4);
>         switch (state) {
>                 case 1:
>                 case 4:

What is state 4?

>                         invlpg
>                         cmpxchg(bp_write_addr, 0xcc, bp_old_value)
>                         break;
>                 case 2:
>                         invlpg
>                         complete patch sequence
>                         remove breakpoint
>                         break;

ISTM you might as well change state to 3 here, but it's arguably
unnecessary.

>                 case 3:
>                         /* If we are here, the #BP will go away on its own */
>                         break;
>                 case 0:
>                         /* No patching in progress!!! */
>                         return 0;
>         }
>
>         clear bit in mask
>         return 1;
>
> The IPI handler:
>
>         clear bit in mask
>         sync_core       /* Needed if multiple IPI events are chained */

I really like that this doesn't require fixups -- text_poke_bp() just
works.  But I'm nervous about livelocks or maybe just extreme slowness
under nasty loads.  Suppose some perf NMI code does a static call or
uses a static call.  Now there's a situation where, under high
frequency perf sampling, the patch process might almost always hit the
breakpoint while in state 1.  It'll get reversed and done again, and
we get stuck.  It would be neat if we could get the same "no
deadlocks" property while significantly reducing the chance of a
rollback.
This is why I proposed something where we try to guarantee forward progress by making sure that any NMI code that might spin and wait for other CPUs is guaranteed to eventually sync_core(), clear its bit, and possibly finish a patch. But this is a bit gross.
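
To make the rollback concern concrete, here is a minimal user-space sketch of the state machine discussed above. It is not kernel code and not the proposed implementation: it runs single-threaded, uses C11 atomics in place of the kernel's cmpxchg, and does not model IPIs, invlpg, or sync_core. The names poke_model, model_init, model_bp_handler, model_patch, and POKE_CANCELED are hypothetical. In particular, reading state 4 as "canceled by the #BP handler" is only an assumption inferred from the pseudocode, since its meaning is an open question in this thread. The trapped_rounds parameter is likewise an invented knob that simulates a CPU hitting the 0xcc while the state is still 1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* States as enumerated in the reply above. */
enum poke_state {
    POKE_IDLE          = 0, /* normal operation */
    POKE_BP_WRITING    = 1, /* writing 0xcc, can be canceled */
    POKE_FINAL_WRITING = 2, /* writing final instruction, 0xcc synced */
    POKE_INSTALLED     = 3, /* patch installed, maybe not sync_cored */
    POKE_CANCELED      = 4, /* ASSUMPTION: handler rolled the 0xcc back */
};

#define BP_OPCODE 0xcc

/* Hypothetical model: one byte of "text" stands in for bp_write_addr. */
struct poke_model {
    _Atomic int state;
    _Atomic uint8_t text;
    uint8_t old_value;
    uint8_t new_value;
};

static void model_init(struct poke_model *m, uint8_t initial_text)
{
    atomic_init(&m->state, POKE_IDLE);
    atomic_init(&m->text, initial_text);
    m->old_value = initial_text;
    m->new_value = initial_text;
}

/* Model of the #BP handler; returns true if the trap was ours. */
static bool model_bp_handler(struct poke_model *m)
{
    /* On failure, 'state' receives the current value, like cmpxchg. */
    int state = POKE_BP_WRITING;
    atomic_compare_exchange_strong(&m->state, &state, POKE_CANCELED);

    switch (state) {
    case POKE_BP_WRITING:
    case POKE_CANCELED: {
        /* Roll back: restore the old byte if the 0xcc is still there. */
        uint8_t bp = BP_OPCODE;
        atomic_compare_exchange_strong(&m->text, &bp, m->old_value);
        break;
    }
    case POKE_FINAL_WRITING:
        /* Complete the patch on behalf of the patcher. */
        atomic_store(&m->text, m->new_value);
        break;
    case POKE_INSTALLED:
        break; /* the #BP will go away on its own */
    default:
        return false; /* no patching in progress */
    }
    return true;
}

/* Model of the patcher; returns how many attempts the loop needed. */
static unsigned model_patch(struct poke_model *m, uint8_t new_value,
                            unsigned trapped_rounds)
{
    unsigned attempts = 0;
    int expected;

    /* Stand-in for the WARN_ON_ONCE suggested in the reply. */
    assert(atomic_load(&m->state) == POKE_IDLE);

    m->old_value = atomic_load(&m->text);
    m->new_value = new_value;

    do {
        atomic_store(&m->state, POKE_BP_WRITING);
        atomic_store(&m->text, BP_OPCODE);
        attempts++;
        /* The first IPI round would go here; simulate a trap instead. */
        if (attempts <= trapped_rounds)
            model_bp_handler(m);
        expected = POKE_BP_WRITING;
    } while (!atomic_compare_exchange_strong(&m->state, &expected,
                                             POKE_FINAL_WRITING));

    atomic_store(&m->text, new_value);        /* patch, remove breakpoint */
    atomic_store(&m->state, POKE_INSTALLED);  /* second IPI round follows */
    atomic_store(&m->state, POKE_IDLE);
    return attempts;
}
```

In this model, model_patch(&m, 0xe8, 3) on a freshly initialized poke_model needs four attempts: every trap taken while the state is 1 costs one full retry. If NMIs keep hitting the breakpoint in state 1, the loop has no bound, which is the livelock worry described above.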