Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site whuxle.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxl!ihnp4!whuxle!mp
From: mp@whuxle.UUCP (Mark Plotnick)
Newsgroups: net.lang.c
Subject: Re: C "optimization"
Message-ID: <254@whuxle.UUCP>
Date: Thu, 23-Feb-84 11:16:46 EST
Article-I.D.: whuxle.254
Posted: Thu Feb 23 11:16:46 1984
Date-Received: Fri, 24-Feb-84 00:42:32 EST
References: mi-cec.212
Organization: Bell Labs, Whippany
Lines: 180

Dan Klein's comparison between C and Bliss really compared pcc with
BLISS-32. Using such results to point out the inferior aspects of C is
like saying UNIX's user interface is horrible based on one's experiences
with version 6 and 3bsd.

I compiled Dan's test programs on the DEC VMS C compiler, version 1.0-09.
This compiler shares its back end with other highly optimizing VMS
compilers such as PL/1. The generated code is in some cases better than
BLISS-32's, and in some cases it's worse than pcc's.

Program 1:

extern int alpha();             test:   .entry  test,^m<r2>
extern int beta();                      moval   (ap),r2
                                        tstl    14(r2)
test(p,q,r,s,t)                         beql    vcg.1
{                                       pushl   10(r2)
    if (t!=0)                           pushl   0C(r2)
        alpha(p,q,r,s);                 pushl   08(r2)
    else                                pushl   04(r2)
        beta(p,q,r,s);                  calls   #4,alpha
}                                       ret
                                vcg.1:  pushl   10(r2)
                                        pushl   0C(r2)
                                        pushl   08(r2)
                                        pushl   04(r2)
                                        calls   #4,beta
                                        ret

The code here is about the same as pcc's. Note that VMS C likes to keep
the address of the base of the argument list in a register, even though
it doesn't really buy anything.

Program 2:

extern int alpha();             test:   .entry  test,^m<r2>
                                        moval   (ap),r2
test(p,q,r,s,t)                         tstl    14(r2)
{                                       beql    vcg.1
    if (t!=0)                           pushl   #0
        alpha(p,q,r,s,0);               pushl   10(r2)
    else                                pushl   0C(r2)
        alpha(p,q,r,s,1);               pushl   08(r2)
}                                       pushl   04(r2)
                                        calls   #5,alpha
                                        ret
                                vcg.1:  pushl   #1
                                        pushl   10(r2)
                                        pushl   0C(r2)
                                        pushl   08(r2)
                                        pushl   04(r2)
                                        calls   #5,alpha
                                        ret

Well, VMS C loses here. It doesn't share identical code sections.
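
For what it's worth, the sharing that Program 2 is missing can be spelled
out by hand in the source. What follows is only a minimal sketch under
some assumptions: alpha() here is a made-up stand-in that returns a value
computed from its arguments (the real alpha is external and its behavior
is unknown), and the names test_original and test_merged are hypothetical,
introduced just so the two forms can be compared.

```c
/* Hypothetical stand-in for the external alpha(); the arithmetic is
   invented purely so that the two versions below can be compared. */
static int alpha(int p, int q, int r, int s, int flag)
{
    return p + q + r + s + 100 * flag;
}

/* Program 2 as posted: two calls that differ only in the last argument.
   (Returning alpha's value is an addition for the sake of comparison.) */
static int test_original(int p, int q, int r, int s, int t)
{
    if (t != 0)
        return alpha(p, q, r, s, 0);
    else
        return alpha(p, q, r, s, 1);
}

/* Hand-merged form: select the one argument that differs, then share
   a single call, so the four common pushes need to appear only once. */
static int test_merged(int p, int q, int r, int s, int t)
{
    int last = (t != 0) ? 0 : 1;

    return alpha(p, q, r, s, last);
}
```

The sketch only shows that the two sources are equivalent. Whether any
particular compiler emits a single push sequence for the merged form is
not something it demonstrates.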

Program 3:

test(a)                         test:   .entry  test,^m<r2>
int a;                                  moval   (ap),r2
{                                       clrl    r1
    int i;                              addl3   #5,04(r2),r0
                                        divl2   #2,r0
    for (i=0; i<=(5+a)/2; i++) ;        cmpl    r1,r0
}                                       bgtr    vcg.2
                                vcg.1:  aobleq  r0,r1,vcg.1
                                vcg.2:  ret

Almost as good as BLISS-32's. It doesn't use BLISS-32's neat trick of
setting the value to be one less than the real starting value and flowing
into the aobleq.

Program 4:

extern alpha();                 test:   .entry  test,^m<>
                                        pushl   #2
test()                                  calls   #1,alpha
{                                       pushl   #5
    int a, b;                           calls   #1,alpha
                                        ret
    a = 2;
    alpha(a);
    b = 5;
    alpha(b);
}

This beats BLISS-32, by pushing the constant values immediately rather
than moving each to a register first.

Program 5:

Identical to BLISS-32's code.

Program 6:

Identical to BLISS-32's code.

Program 7:

#define P1 3                    test:   .entry  test,^m<>
#define P2 5                            cmpl    #3,#5
                                        bgeq    vcg.1
extern int alpha();                     clrl    r0
                                        ret
test()                          vcg.1:  movl    #1,r0
{                                       ret
    int loc1 = P1,
        loc2 = P2;

    if (loc1 < loc2) return 0;
    else return 1;
}

This beats BLISS-32 in the same way as it did with Program 4: it uses
constant values in immediate mode rather than moving them to registers
first. I wonder why it didn't just go further and collapse the program
down to "clrl r0 ; ret".

Program 8:

typedef struct dp *DP;          test:   .entry  test,^m<>
struct dp {                             movl    08(ap),r0
    int d;                              movl    04(r0),r0
    DP p;                               movl    04(r0),r0
}                                       movl    04(r0),r0
                                        movl    @04(r0),(r0)
test(a)                                 ret
DP a;
{
    a->p->p->p->d = a->p->p->p->p->d;
}

Identical to BLISS-32's code, except that the arg is at 8(ap) rather than
4(ap). This is because the function is structure-valued (there is no
semicolon after the struct declaration, so it becomes the return type of
test), and 4(ap) contains the address of some space set aside in the
caller for the return value.

Program 9:

extern alpha();                 test:   .entry  test,^m<>
                                        clrl    ap
test()                                  movl    ap,r0
{                                       pushl   #0
    register int a, b, c;               pushl   ap
                                        pushl   r0
    a = b = c = 0;                      calls   #3,alpha
    alpha(a, b, c);                     ret
}

This code looks truly bizarre, and has to do with VMS C's methods of
register allocation.
Variable "c" has been optimized out of the code, "b" resides in ap (since
we don't have any parameters, we can use ap as just another register),
and "a" resides in r0.

Before you all rush out and switch to VMS because of its C compiler, note
the following (all based on version 1.0-09; I don't know if another
version has come out that fixes these problems):

- it uses registers very heavily, allocating them based on a static
  analysis of the program. It also IGNORES "register" declarations in
  your C code. This may lead to nonoptimal code, as well as slower
  procedure startup (because of all the registers that may need to be
  saved).

- it tries to minimize code space, but not data space. The prime example
  of this is that it always uses a branch table for implementing
  switches, no matter how sparsely the cases are distributed. A switch
  with a "case 0" and a "case 32000" generates an awfully big branch
  table.

- it sometimes dies with a fatal parser error when presented with some
  legal C constructs.

But it's reasonably fast, comes with a very good manual, and can produce
compilation listings with storage maps, cross references, etc.

	Mark Plotnick
	WH 1C-244 x6955