Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp138351pxb; Tue, 2 Feb 2021 01:04:20 -0800 (PST) X-Google-Smtp-Source: ABdhPJzFRaDPEyEAYiDVsf4A1+vfKBaoMP9oM4dVP6shSUCkPGQGOEvU6mCf2uukBImvQuG2ExMJ X-Received: by 2002:a05:6402:1bcc:: with SMTP id ch12mr22530502edb.283.1612256659867; Tue, 02 Feb 2021 01:04:19 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1612256659; cv=none; d=google.com; s=arc-20160816; b=AwiTDLUhbFbhzxKnbRWq7VeCfH42SS+YdeCJ0EevbgtTOa52Rae6GMzNa9OHuvSeGj RH17fyve+HJz/F1pdjfgq5eylgfThIxovcvHLVTDDxpu/X4LPdbNKy+8+jSRdUGdGtLd enB+YU02SzqBVEaHPRWv1+v6La9O+J/GYo1OyYSPhTiquPWgxP2Nz8XsiBcOsKIE84y2 uFLNrtxr47NQoqWUhW9NAB2h631MAU5E83ubE1ksilcC9y+s04WEC6piKs+8kGVy0Ff/ Lc8l1+hQ7xdHodYNVue5ecsqBaGOWbz8abSnyUV9/l+fFiAtHz23Ql6ZNy0mqqdyHO1u 9sBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=Si/dI5lNziAyVoRmOt6q5oX+Pb8j9QNnXS8AF/vdi/A=; b=WidrnEFF3sdtDXy2vzNH90xMtdCwDhdasVjhtPzNL+XIV3TCtfubMSBS3tCi66w2iC +lGO8J5ryJdnm6Q98n8pWbtlL3lMv9E7DbJAK4Ty5fZe189eatjF/42gU7xXtAK9MEjv /JjN/e5p7wenp2Hu0abNL77+nI1iQWRdYSJMhf5gf9dDxyDCEX0cbGAF9VJqO7JhNjVB zlIB2VzjBLidLYCvk4cEr3sqK6k+iiF4hoUgcTWe+12qoRwppD+kRTPH+zwuZdDDCFye LYWZKeKzo95WJUhEXyTJj3FfTp4b0R4Om+gRVgQcasmPAb2lLrTpDVGMw+9GvdOI3xbb pOQQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=F6QniaO9; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y9si1823348edp.33.2021.02.02.01.03.30; Tue, 02 Feb 2021 01:04:19 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=F6QniaO9; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232304AbhBBI67 (ORCPT + 99 others); Tue, 2 Feb 2021 03:58:59 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:23240 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231929AbhBBI65 (ORCPT ); Tue, 2 Feb 2021 03:58:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612256250; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Si/dI5lNziAyVoRmOt6q5oX+Pb8j9QNnXS8AF/vdi/A=; b=F6QniaO9pvvPoGycjt5q+69xVQC5gYdAWKcDFv7KLDsimPZOIMnlWQPQnCZkTqVc7vQR0P RHHYuN4eQ/8V7A9B/OoeSay36rT42ebxat1w0G5Tb3VzaERQTkAm8IQ4eQ3p8j2CML+qPj ry83KEIwBs0Ut82o8ypSn0J8YVarzoE= Received: from mail-wr1-f69.google.com (mail-wr1-f69.google.com [209.85.221.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-169-sS1uv3IfMMGh_2fBHstCvQ-1; Tue, 02 Feb 2021 03:57:23 -0500 X-MC-Unique: sS1uv3IfMMGh_2fBHstCvQ-1 Received: by mail-wr1-f69.google.com with SMTP id u15so2608189wrn.3 for ; Tue, 02 Feb 2021 00:57:23 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=Si/dI5lNziAyVoRmOt6q5oX+Pb8j9QNnXS8AF/vdi/A=; b=SgVh4+/S32v7AQB/rv/Ey/PDVFOt4qHfFNwITjmC/NjnuOxC59j/5hz89Pw5rlhG3K O9HQDldk0IT7JJfc6gy/sVsWm9z6eHuQ0tyZplvnAANw+Qe+6IOlTmVLksmreE9PPWfX Q4T63AUGX0V2/62RWc0yqkMnLRZpJJiXMJQuG7pZUn979aIzFLd1/sFQQmHD2q3GJAQR C3du6j75oeqXlqJ3pN1iJxJsv1j+TRfNPlMmLSC4JeC5o1mjmevYblmptM4hlYrQ8jiQ 6TEPw9LMbpVu5gBEZLV3i21rhtMFIxKfN7MTMuvPfVebaSP9xOCFYjwXsOURDEakLmzd gSdQ== X-Gm-Message-State: AOAM530uJvUVDl3wR9CArdmGXfLKVUUjJ4JB7X7CKNEZByERSXla5VJ+ 9PpicLag5d+3vodw3OLFGr2bpn2Aqc+fBBys/nsHjP5vJMclP17XrilAs+Z0fy3WIvv7F/QfQhu CY1xiWvfnL6h+3vXBVgE7ul5c X-Received: by 2002:a05:6000:192:: with SMTP id p18mr22178812wrx.69.1612256242134; Tue, 02 Feb 2021 00:57:22 -0800 (PST) X-Received: by 2002:a05:6000:192:: with SMTP id p18mr22178791wrx.69.1612256241890; Tue, 02 Feb 2021 00:57:21 -0800 (PST) Received: from ?IPv6:2a01:cb14:499:3d00:cd47:f651:9d80:157a? ([2a01:cb14:499:3d00:cd47:f651:9d80:157a]) by smtp.gmail.com with ESMTPSA id i7sm2269412wmq.2.2021.02.02.00.57.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 02 Feb 2021 00:57:21 -0800 (PST) Subject: Re: [RFC PATCH 12/17] gcc-plugins: objtool: Add plugin to detect switch table on arm64 To: Nick Desaulniers , Josh Poimboeuf Cc: Ard Biesheuvel , Mark Brown , Catalin Marinas , Kees Cook , Linux ARM , linux-efi , linux-hardening@vger.kernel.org, LKML , Mark Rutland , Masahiro Yamada , Michal Marek , Peter Zijlstra , raphael.gault@arm.com, Will Deacon , clang-built-linux , Bill Wendling References: <20210120173800.1660730-13-jthierry@redhat.com> <20210127221557.1119744-1-ndesaulniers@google.com> <20210127232651.rj3mo7c2oqh4ytsr@treble> <20210201214423.dhsma73k7ccscovm@treble> From: Julien Thierry Message-ID: <671f1aa9-975e-1bda-6768-259adbdc24c8@redhat.com> Date: Tue, 2 Feb 2021 09:57:19 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2/2/21 12:17 AM, Nick Desaulniers wrote: > On Mon, Feb 1, 2021 at 1:44 PM Josh Poimboeuf wrote: >> >> On Fri, Jan 29, 2021 at 10:10:01AM -0800, Nick Desaulniers wrote: >>> On Wed, Jan 27, 2021 at 3:27 PM Josh Poimboeuf wrote: >>>> >>>> On Wed, Jan 27, 2021 at 02:15:57PM -0800, Nick Desaulniers wrote: >>>>>> From: Raphael Gault >>>>>> >>>>>> This plugins comes into play before the final 2 RTL passes of GCC and >>>>>> detects switch-tables that are to be outputed in the ELF and writes >>>>>> information in an ".discard.switch_table_info" section which will be >>>>>> used by objtool. >>>>>> >>>>>> Signed-off-by: Raphael Gault >>>>>> [J.T.: Change section name to store switch table information, >>>>>> Make plugin Kconfig be selected rather than opt-in by user, >>>>>> Add a relocation in the switch_table_info that points to >>>>>> the jump operation itself] >>>>>> Signed-off-by: Julien Thierry >>>>> >>>>> Rather than tightly couple this feature to a particular toolchain via >>>>> plugin, it might be nice to consider what features could be spec'ed out >>>>> for toolchains to implement (perhaps via a -f flag). >>>> >>>> The problem is being able to detect switch statement jump table vectors. >>>> >>>> For a given indirect branch (due to a switch statement), what are all >>>> the corresponding jump targets? >>>> >>>> We would need the compiler to annotate that information somehow. >>> >>> Makes sense, the compiler should have this information. How is this >>> problem solved on x86? >> >> Thus far we've been able to successfully reverse engineer it on x86, >> though it hasn't been easy. >> >> There were some particulars for arm64 which made doing so impossible. >> (I don't remember the details.) The main issue is that the tables for arm64 have more indirection than x86. On x86, the dispatching jump instruction fetches the target address from a contiguous array of addresses based on a given offset. So the list of potential targets of the jump is neatly organized in a table (and sure, before link time these are just relocation, but still processable). On arm64 (with GCC at least), what is stored in a table is an array of candidate offsets from the jump instruction. And because arm64 is limited to 32bit instructions, the encoding often requires multiple instructions to compute the target address: ldr<*> x_offset, [x_offsets_table, x_index, ...] // load offset adr x_dest_base, // load target branch for offset 0 add x_dest, x_target_base, x_offset, ... // compute final address br x_dest // jump Where this gets trickier is that (with GCC) the offsets stored in the table might or might not be signed constants (and this can be seen in GCC intermediate representations, but I do not believe this information is output in the final object file). And on top of that, GCC might decide to use offsets that are seen as unsigned during intermediate representation as signed offset by sign extending them in the add instruction. So, to handle this we'd have to track the different operation done with the offset, from the load to the final jump, decoding the instructions and deducing the potential target instructions from the table of offsets. But that is error prone as we don't really know how many instructions can be between the ones doing the address computation, and I remember some messy case of a jump table inside a jump table where tracking the instruction touching one or the other offset would need a lot of corner case handling. And this of course is just for GCC, I haven't looked at what it all looks like on Clang's end. > > I think the details are pertinent to finding a portable solution. The > commit message of this commit in particular doesn't document such > details, such as why such an approach is necessary or how the data is > laid out for objtool to consume it. > Sorry, I will need to make that clearer. The next patch explains it a bit [1] Basically, for simplicity, the plugin creates a new section containing tables (one per jump table) of references to the jump targets, similar to what x86 has, except that in this case this table isn't actually used by runtime code and is discarded at link time. I only chose this to minimize what needed to be changed in objtool and because the format seemed simple enough. But I'm open on some alternative, whether it's a -fjump-table-info option added to compilers with a different format to do the links. The important requirement is to be able to know all the candidate targets for a "br " instruction. [1] https://lkml.org/lkml/2021/1/20/910 Thanks, -- Julien Thierry