Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751758AbdFZUud (ORCPT ); Mon, 26 Jun 2017 16:50:33 -0400
Received: from mail-wm0-f65.google.com ([74.125.82.65]:36432 "EHLO
	mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751381AbdFZUu0 (ORCPT ); Mon, 26 Jun 2017 16:50:26 -0400
Date: Mon, 26 Jun 2017 22:57:48 +0200
From: Frans Klaver
To: Julia Lawall
Cc: Joe Perches, Greg Kroah-Hartman, kernel-janitors, Guenter Roeck,
	Yueyao Zhu, Rui Miguel Silva, Guru Das Srinagesh,
	Javier Martinez Canillas, devel@driverdev.osuosl.org,
	"linux-kernel@vger.kernel.org"
Subject: Re: endian bitshift defects [ was: staging: fusb302: don't bitshift __le16 type ]
Message-ID: <20170626205748.GA1899@bugger>
References: <20170616174556.2358-1-fransklaver@gmail.com> <1497653077.10546.23.camel@perches.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.8.3 (2017-05-23)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2495
Lines: 72

On Fri, Jun 23, 2017 at 07:37:28PM -0400, Julia Lawall wrote:
>
>
> On Sat, 24 Jun 2017, Frans Klaver wrote:
>
> > Hm. For some reason the great mail filtering scheme decided to push
> > this past my inbox :-/
> >
> > On Sat, Jun 17, 2017 at 12:44 AM, Joe Perches wrote:
> > > On Fri, 2017-06-16 at 19:45 +0200, Frans Klaver wrote:
> > >> The header field in struct pd_message is declared as an __le16 type. The
> > >> data in the message is supposed to be little endian. This means we don't
> > >> have to go and shift the individual bytes into position when we're
> > >> filling the buffer, we can just copy the contents right away. As an
> > >> added benefit we don't get fishy results on big endian systems anymore.
> > >
> > > Thanks for pointing this out.
> > >
> > > There are several instances of this class of error.
> > >
> > There are other smells around __(le|be) types that show up in staging
> > that might be worth checking in the rest of the kernel as well. e.g.
> > converting to cpu and storing it back into itself (possibly with its
> > bytes reversed), direct assignments without conversion and what else
> > you might have. sparse obviously already flags anything fishy going on
> > with these types, but cannot distinguish between the classes of
> > errors. I'll need to acquaint myself with spatch a bit more to be able
> > to track that down.
>
> If you have concrete code examples, even fake ones, illustrating a class
> of problem, then that would be great.

Alright, I'll describe two fairly simple cases for starters.

One class of issue that I have on top of mind is simply

	__le16 val;
	val = le16_to_cpu(val);

The problem there obviously being that val is supposed to be guaranteed
little endian. Sparse will throw a warning at this. It may also appear
as (or be 'fixed' as)

	__le16 val;
	le16_to_cpus(&val);

Sparse doesn't flag this second version as an issue, while it causes the
same problem. It is especially a potential problem when the value is
stored in driver data.

Another smell that is prevalent, at least in staging, is

	u16 in;
	u16 out;
	out = cpu_to_le16(in);

or in one instance (drivers/staging/fbtft/fbtft-io.c) I saw

	u64 tmp;
	*(u64*)dst = cpu_to_be64(tmp);

Now these aren't necessarily problematic. Usually this type of code is
preparing the data to be sent out in a specific byte ordering, but again
issues may arise if this specifically ordered data is stored somewhere.

I'll leave it at that for now.

Frans