Date: Fri, 20 Jun 2003 10:46:46 -0700 (PDT)
From: dean gaudet
To: John Bradford, Jeff Garzik
cc: nuno.silva@vgertech.com, torvalds@transmeta.com, Eli Carter, linux-kernel@vger.kernel.org, samphan@thai.com, vojtech@suse.cz
Subject: Re: Crusoe's persistent translation on linux?

On Fri, 20 Jun 2003, John Bradford wrote:

> Would it be possible, (with relevant documentation), to tune the code
> morphing software for optimum performance of code generated by a
> specific compiler, though?
>
> If a particular version of GCC favours certain constructs and uses
> particular sets of registers for a given piece of code, couldn't we
> optimise for those cases, at the expense of others?  Maybe a
> particular compiler doesn't use certain X86 instructions at all, and
> these could be eliminated altogether?

very little of the translation overhead is involved with decoding the x86
instructions... most of the translation overhead is in scheduling and
optimising for the VLIW.

there are some tricks which you can apply at the x86 level which favour
CMS much more than they favour other processors...
specifically one related to what jeff brought up:

On Fri, 20 Jun 2003, Jeff Garzik wrote:

> Newer CPUs do register renaming in an attempt to avoid the
> register-starved ISA issue.  I presume Xmeta would do something
> similar...

yeah CMS does this internally.

one way you can exploit this for performance: within a basic block (or
within a code path that is most likely to be executed, with a handful of
rarely/never taken branch outs) you can express every sub-expression
completely without worrying about its schedule -- which gives you access
to all the source x86 registers.  CMS will reschedule it to fit onto the
pipeline for you, and rename to internal registers.

this can be a huge help for floating point code when you also unroll the
code.  for example if you are doing a polynomial expansion, you can simply
write this:

	f0 = a0 + x0*(a1 + x0*(a2 + x0*(a3 + x0*a4)));
	f1 = b0 + x1*(a1 + x1*(a2 + x1*(a3 + x1*a4)));

you don't need to schedule the x87 code at all -- CMS will do it for you.

this example is pretty trivial, but if you have sequences which overflow
the x87 register set and require stack operations when scheduled for a
typical x86 processor you can mostly avoid the stack operations when
"scheduling" for CMS.

think of it as a huge reorder buffer.

-dean
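
[editor's sketch, not part of the original post] The polynomial example in the message can be written as a small compilable C function. The name eval_pair, the array layout of the coefficients, and the pointer outputs are assumptions made for illustration; the two expressions are reproduced as written in the post, including b0 as the constant term of the second one. Each result is left as one complete nested expression, with no hand interleaving of the two evaluations.

```c
#include <assert.h>
#include <stddef.h>

/* Editor's illustration only: eval_pair is a hypothetical name, not
 * something from the post. a[0..4] hold a0..a4 from the message. */
static void eval_pair(const double a[5], double b0,
                      double x0, double x1,
                      double *f0, double *f1)
{
    assert(a != NULL && f0 != NULL && f1 != NULL);

    /* Each polynomial is one complete nested (Horner-form) expression.
     * The two evaluations are independent of each other, so nothing here
     * is interleaved or scheduled by hand. */
    *f0 = a[0] + x0 * (a[1] + x0 * (a[2] + x0 * (a[3] + x0 * a[4])));
    *f1 = b0   + x1 * (a[1] + x1 * (a[2] + x1 * (a[3] + x1 * a[4])));
}
```

Calling eval_pair with a = {1, 2, 3, 4, 5}, b0 = 1, x0 = 2 and x1 = 0.5 yields f0 = 129 and f1 = 3.5625. Note that a C compiler applies its own scheduling before any x86 code exists, so whether the emitted x87 sequence keeps this unscheduled shape depends on the compiler and its flags.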