Subject: Re: [PATCH] LoongArch: Make -mstrict-align be configurable
To:     Xi Ruoyao <xry111@xry111.site>, WANG Xuerui <kernel@xen0n.name>,
        Huacai Chen <chenhuacai@loongson.cn>,
        Arnd Bergmann <arnd@arndb.de>,
        Huacai Chen <chenhuacai@kernel.org>
Cc:     loongarch@lists.linux.dev, linux-arch@vger.kernel.org,
        Xuefeng Li <lixuefeng@loongson.cn>,
        Guo Ren <guoren@kernel.org>,
        Jiaxun Yang <jiaxun.yang@flygoat.com>,
        linux-kernel@vger.kernel.org
References: <20230202084238.2408516-1-chenhuacai@loongson.cn>
 <5fc85453-1e2c-1f00-7879-1b5fa318c78a@xen0n.name>
 <5303aeda-5c66-ede6-b3ac-7d8ebd73ec70@loongson.cn>
 <b1809500e4d55564a1084a3014fb9603ba3d1438.camel@xry111.site>
From:   Jianmin Lv <lvjianmin@loongson.cn>
Message-ID: <3b17d229-bad4-e6a0-9055-c585dd5a62e4@loongson.cn>
Date:   Mon, 6 Feb 2023 21:13:22 +0800
User-Agent: Mozilla/5.0 (X11; Linux loongarch64; rv:68.0) Gecko/20100101
 Thunderbird/68.7.0
MIME-Version: 1.0
In-Reply-To: <b1809500e4d55564a1084a3014fb9603ba3d1438.camel@xry111.site>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Precedence: bulk


On 2023/2/6 7:18 PM, Xi Ruoyao wrote:
> On Mon, 2023-02-06 at 18:24 +0800, Jianmin Lv wrote:
>> Hi, Xuerui
>>
>> I think the kernels produced with and without -mstrict-align differ
>> mainly in the following ways:
>> - Different size. I built two kernels (vmlinux): the kernel with
>> -mstrict-align is 26533376 bytes and the kernel without
>> -mstrict-align is 26123280 bytes.
>> - Different performance. For example, in the kernel function jhash(), the
>> assembly code snippets with and without -mstrict-align are as follows:
> 
> But there are still questions remaining:
> 
> (1) Is the difference contributed by a bad code generation of GCC?  If
> true, it's better to improve GCC before someone starts to build a distro
> for LA264 as it would benefit the user space as well.
> 
AFAIK, GCC by default (without -mstrict-align) produces target binaries 
that are allowed to perform unaligned accesses, in order to improve user 
space performance (smaller size and faster execution). This default is 
based on the fact that the vast majority of LoongArch CPUs support 
unaligned access.
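
To make the size and speed difference concrete, here is a minimal C
sketch of the kind of access involved. The helper names are hypothetical
and this is not the kernel's jhash() code. My assumption is that, without
-mstrict-align, the compiler may lower the memcpy() form to a single word
load, while with -mstrict-align it has to fall back to something closer
to the byte-wise form. The exact instructions depend on the GCC version
and optimization level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 32 bits from a possibly unaligned address.
 * memcpy() keeps this well-defined C; the compiler picks the instructions. */
uint32_t load_u32_memcpy(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical helper: the same read spelled out as four byte loads
 * combined in little-endian order. This is roughly the shape of code
 * that is needed when unaligned word loads must be avoided. */
uint32_t load_le32_bytewise(const unsigned char *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Small self-check: read from an odd offset so the address is unaligned. */
void check_unaligned_loads(void)
{
    unsigned char buf[8] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x00, 0x00, 0x00 };

    /* The byte-wise result does not depend on host endianness. */
    assert(load_le32_bytewise(buf + 1) == 0x44332211u);

    /* The memcpy() result is in host byte order; on a little-endian
     * host (such as LoongArch) it matches the byte-wise result. */
    uint32_t m = load_u32_memcpy(buf + 1);
    assert(m == 0x44332211u || m == 0x11223344u);
}
```

Compiling such a file with and without -mstrict-align and comparing the
generated assembly (e.g. with gcc -O2 -S) is one way to see the
difference in isolation.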

> (2) Is there some "big bad unaligned access loop" on a hot spot in the
> kernel code?  If true, it may be better to just refactor the C code
> because doing so will benefit all ports, not only LoongArch.  Otherwise,
> it may be unworthy to optimize for some cold paths.
> 
Frankly, I'm not sure whether this kind of hot code exists in the kernel; 
I have only seen the difference in kernel size and in the generated 
assembly code. And I'm afraid that, even if such code exists, it may be 
difficult to judge whether it is legitimately hot code or not.