Date: Wed, 26 Sep 2018 10:04:38 +0200 (CEST)
From: Thomas Gleixner
To: Peter Zijlstra
Cc: Sebastian Andrzej Siewior, linux-kernel@vger.kernel.org, Daniel Wagner,
    Will Deacon, x86@kernel.org, Linus Torvalds, "H. Peter Anvin",
    Boqun Feng, "Paul E. McKenney"
Subject: Re: [Problem] Cache line starvation
In-Reply-To: <20180926073426.GA31905@hirez.programming.kicks-ass.net>
References: <20180921120226.6xjgr4oiho22ex75@linutronix.de> <20180926073426.GA31905@hirez.programming.kicks-ass.net>

On Wed, 26 Sep 2018, Peter Zijlstra wrote:
> On Fri, Sep 21, 2018 at 02:02:26PM +0200, Sebastian Andrzej Siewior wrote:
> > Instrumentation show always the picture:
> >
> > CPU0                                      CPU1
> > => do_syscall_64                          => do_syscall_64
> > => SyS_ptrace                             => syscall_slow_exit_work
> > => ptrace_check_attach                    => ptrace_do_notify / rt_read_unlock
> > => wait_task_inactive                        rt_spin_lock_slowunlock()
> >    -> while task_running()                   __rt_mutex_unlock_common()
> >   /   check_task_state()                     mark_wakeup_next_waiter()
> >   |     raw_spin_lock_irq(&p->pi_lock);      raw_spin_lock(&current->pi_lock);
> >   |     .                                        .
> >   |     raw_spin_unlock_irq(&p->pi_lock);        .
> >    \  cpu_relax()                                .
> >     -                                            .
> >                                           *IRQ*
> >
> > In the error case we observe that the while() loop is repeated more than
> > 5000 times which indicates that the pi_lock can be acquired. CPU1 on the
> > other side does not make progress waiting for the same lock with interrupts
> > disabled.
>
> I've tried really hard to reproduce this in userspace, but so far have
> not had any luck. Looks to be a real tricky thing to make happen.

It's probably equally tricky to write a reproducer as it was to instrument
the thing.

I assume it's a combination of code sequences on both CPUs which involve
other (unrelated) lock instructions on the way.

Thanks,

	tglx
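
For reference, the access pattern in the quoted trace can be approximated in
userspace roughly as below. This is only a minimal sketch under stated
assumptions, not the reproducer Peter tried: a plain C11 test-and-set spinlock
stands in for pi_lock, two pthreads stand in for CPU0 and CPU1, and all names
(hammer, victim, run_probe, struct probe_result) are made up for illustration.
Userspace cannot disable interrupts, and the sketch contains none of the other
(unrelated) lock instructions mentioned above, so it should not be expected to
show the starvation.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for p->pi_lock: a plain test-and-set spinlock. */
static atomic_flag lock = ATOMIC_FLAG_INIT;
static atomic_bool victim_waiting; /* set once the victim starts spinning */
static atomic_bool victim_done;    /* set once the victim got the lock */

struct probe_result {
	unsigned long hammer_loops; /* lock/unlock rounds seen while the victim waited */
	unsigned long victim_spins; /* failed acquisition attempts by the victim */
};

static struct probe_result result;
static unsigned long max_loops;

static void spin_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
		; /* busy wait */
}

static void spin_unlock(void)
{
	atomic_flag_clear_explicit(&lock, memory_order_release);
}

/* Plays the role of CPU0: take and drop the lock in a tight loop. */
static void *hammer(void *arg)
{
	unsigned long loops = 0;

	(void)arg;
	/* Bounded by max_loops so the victim always gets the lock eventually. */
	for (unsigned long i = 0; i < max_loops && !atomic_load(&victim_done); i++) {
		spin_lock();
		spin_unlock();
		if (atomic_load(&victim_waiting))
			loops++;
	}
	result.hammer_loops = loops;
	return NULL;
}

/* Plays the role of CPU1: one acquisition, counting failed attempts. */
static void *victim(void *arg)
{
	unsigned long spins = 0;

	(void)arg;
	atomic_store(&victim_waiting, true);
	while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
		spins++;
	atomic_store(&victim_done, true);
	spin_unlock();
	result.victim_spins = spins;
	return NULL;
}

/* Run one round; loops caps the number of hammer iterations. */
struct probe_result run_probe(unsigned long loops)
{
	pthread_t h, v;

	max_loops = loops;
	atomic_store(&victim_waiting, false);
	atomic_store(&victim_done, false);
	result = (struct probe_result){0, 0};

	pthread_create(&h, NULL, hammer, NULL);
	pthread_create(&v, NULL, victim, NULL);
	pthread_join(h, NULL);
	pthread_join(v, NULL);
	return result;
}
```

Build with something like "cc -O2 -pthread" and call run_probe() from a small
main(). A large hammer_loops value relative to victim_spins would correspond
to the "loop repeated more than 5000 times" observation; pinning the two
threads to specific CPUs is left out here and would be needed for anything
resembling the reported setup.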