Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp3402204ybi; Mon, 10 Jun 2019 09:38:25 -0700 (PDT) X-Google-Smtp-Source: APXvYqyVU7ZmlZ9zxGWvaBpNaPSiF2DHQu0VioR5XkWtGEJRrksUyOoK1uMiCJF3GByHDmYzwMdt X-Received: by 2002:a17:902:d70a:: with SMTP id w10mr62149819ply.251.1560184705534; Mon, 10 Jun 2019 09:38:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560184705; cv=none; d=google.com; s=arc-20160816; b=MVOm7Dn4hbOabs0E5IDaJRCaUWRKcF3rdupHGBoH7KWwvExEfDFMLORUq75I+pODAF xeeKicOpqhZr+9bcZaG1qhZ2ZS9RY5yO8zhlewLi8Es+ktCg0pIkeENfBhpMSpbELerV qiiuq6tY4vsPTtfbVRhQp37T2xoKsLgtOQ2urxinalA4lHJE32jyOTgGmb0UQoAYkzdT j3CzpAhTZZi0tWdbzXn8rhJeFFtxDqmhL+dYo8FZbHXZ/sckyyMsAaIixe6gZoXHXVvP IUU1NRK5OunMODYvYqf2TBupImHLLWtXm1PPIKpQTNRRt6Kt8h6YrBLkRtQE6rhHewzW yU5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:content-transfer-encoding :content-language:accept-language:references:message-id:date :thread-index:thread-topic:subject:cc:to:from:dkim-signature; bh=kwbMNJnkOhPIfyJw26C4kezNa9FvgHLrAP0YEk4Gueo=; b=y1mjusTkm2H8x5y7t5AjL0WkUwHMpIW1azynNWnvg8ddmx6MC9bK6cB9GBVviJ8pcm HN/N0BdOf1J6cJO1HfM3QyaT9P2wBTZx6cS9vFNAfLEejSIv41AqLbnvT1xubBIQdkxD UbqUVqlLfXt8N6kGlh6z3Zu9gc36aydP6ABn0knVm1/jKiPkT26dk+KDhDch7cy+exCT QHZsGIJzTZBOWOfJ0MB9/0Xi3vtO2QVhVoF9nS+krWDAjXwjACS+ij19QDs0QpEZHm6M n+p4vpg7jKm+qbcgcGWpIPVzhP+lpunPhCXr3l8qUxDFfPKTad1sLn+rJZ+Oi1ZUW7Kj r4fw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@synopsys.com header.s=mail header.b=iD93RKJs; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=synopsys.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e6si10225114pfn.273.2019.06.10.09.38.10; Mon, 10 Jun 2019 09:38:25 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@synopsys.com header.s=mail header.b=iD93RKJs; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=synopsys.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404014AbfFJPzo (ORCPT + 99 others); Mon, 10 Jun 2019 11:55:44 -0400 Received: from dc2-smtprelay2.synopsys.com ([198.182.61.142]:44256 "EHLO smtprelay-out1.synopsys.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2404005AbfFJPzn (ORCPT ); Mon, 10 Jun 2019 11:55:43 -0400 Received: from mailhost.synopsys.com (dc2-mailhost2.synopsys.com [10.12.135.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (No client certificate requested) by smtprelay-out1.synopsys.com (Postfix) with ESMTPS id 79D5AC5896; Mon, 10 Jun 2019 15:55:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=synopsys.com; s=mail; t=1560182141; bh=aihNbKp08qv3DocJmNBn+1cqwnR6GWVl1CV2mAjc24U=; h=From:To:CC:Subject:Date:References:From; b=iD93RKJs/EbchZsrL1WjVDrxDFv9IToe8PnqWY1KtMl6yN80aQqdRjpdKJZxNNKHm DbQJz1DWUGgUBkXgmo6VHAOvHU1k+/SY09ttUqAj62JO2uw/ARIIGT+sX25pJA428H /WyHNbF5/9z6l1GzRJrXlWo4wEAhl607nvt30OH4oiNr725gslsq2exmbKFu48sYEB n+bljA7YRJDwb8HwHgOYl8B5G+m1hymXxAmkai7AXwOKsZu8HuFP+N3ckBtUolbRyE 6Si4gqf9ckH/9DOJya1YWB10LUZiHf/fy535HVbW9vFl757aN78pJazA+sRh5aeNDX HrBCAAQzvm6/g== Received: from US01WEHTC2.internal.synopsys.com (us01wehtc2.internal.synopsys.com [10.12.239.237]) (using TLSv1.2 with cipher AES128-SHA256 (128/128 bits)) (No client certificate requested) by mailhost.synopsys.com (Postfix) with ESMTPS id 3B5E5A0276; Mon, 10 Jun 2019 15:55:42 +0000 (UTC) Received: from us01wembx1.internal.synopsys.com ([169.254.1.22]) by US01WEHTC2.internal.synopsys.com ([10.12.239.237]) with mapi id 14.03.0415.000; Mon, 10 Jun 2019 08:55:41 -0700 From: Vineet Gupta To: Eugeniy Paltsev , Cupertino Miranda CC: Alexey Brodkin , Claudiu Zissulescu , arcml , lkml Subject: Re: ARC Assembler: bundle_align_mode directive support Thread-Topic: ARC Assembler: bundle_align_mode directive support Thread-Index: AQHVHicA2N+x/AVgjkyWYOIr/ofXig== Date: Mon, 10 Jun 2019 15:55:41 +0000 Message-ID: References: <3962a9ad199cea45b1cfadb80be551aab83b7028.camel@synopsys.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.13.184.19] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 6/8/19 11:21 AM, Eugeniy Paltsev wrote:=0A= > Hi Cupertino,=0A= >=0A= > I tried to use ".bundle_align_mode" directive in ARC assembly, but I got = following error:=0A= > ----------------->8--------------=0A= > Assembler messages:=0A= > Error: unknown pseudo-op: `.bundle_align_mode'=0A= > ----------------->8--------------=0A= >=0A= > Is it possible to implement it in ARC assembler?=0A= > There is some context about the reason we want to have it:=0A= >=0A= > I'm trying to add support of jump labels for ARC in linux kernel. Jump la= bels=0A= > provide an interface to generate static branches using self-modifying cod= e.=0A= > This allows us to implement conditional branches where changing branch=0A= > direction is expensive but branch selection is basically 'free'.=0A= >=0A= > There is nuance in current implementation:=0A= > We need to patch code by rewriting 32-bit NOP by 32-bit BRANCH instructio= n (or vice versa).=0A= > It can be easily done with following code:=0A= > ----------------->8--------------=0A= > write_32_bit(new_instruction)=0A= > flush_l1_dcache_range_this_cpu=0A= > invalidate_l1_icache_range_all_cpu=0A= > ----------------->8--------------=0A= >=0A= > I$ update will be atomic in most of cases except the patched instruction = share=0A= > two L1 cache lines (so first 16 bits of instruction are in the one cache = line and=0A= > last 16 bit are in another cache line).=0A= > In such case we can execute half-updated instruction if we are patching l= ive code (and we are unlucky enough :)=0A= =0A= While I understand your need for alignment, I don't see how you can possibl= y=0A= execute stray lines.=0A= dcache flush will be propagated by hardware (SCU) to all cores (as applicab= le) and=0A= the icache cache flush xcall is synchronous and will have to finish on all = cores=0A= before we proceed to execute the cod eitself.=0A= =0A= >=0A= > As of today I simply align by 4 byte instruction which can be patched wit= h ".balign 4" directive:=0A= > ----------------->8--------------=0A= > static __always_inline bool arch_static_branch_jump(struct static_key *ke= y,=0A= > bool branch)=0A= > {=0A= > asm_volatile_goto(".balign 4\n"=0A= > "1:\n"=0A= > "b %l[l_yes]\n" // <- instruction which can be patched=0A= > ".pushsection __jump_table, \"aw\"\n"=0A= > ".word 1b, %l[l_yes], %c0\n"=0A= > ".popsection\n"=0A= > : : "i" (&((char *)key)[branch]) : : l_yes);=0A= >=0A= > return false;=0A= > l_yes:=0A= > return true;=0A= > }=0A= > ----------------->8--------------=0A= >=0A= > In that case patched instruction is aligned with one 16-bit NOP if this i= s required.=0A= > However 'align by 4' directive is much stricter than it actually required= . =0A= =0A= I don't quite understand. Can u write a couple of lines of pseudo assembly = to show=0A= what the issue is.=0A= =0A= > It's enough=0A= > that our 32-bit instruction don't cross l1 cache line boundary.=0A= > That will save us from adding useless NOP padding in most of the cases.= =0A= > It can be implemented with ".bundle_align_mode" directive which isn't sup= ported by ARC AS unfortunately.=0A= =0A= This seems like a reasonable request (contingent to the difficulty of=0A= implementation in binutils). but I can't comprehend why you would need it.= =0A=