On Monday 12 November 2007 23:25, Noriaki TAKAMIYA wrote:
> Hi,
>
>   sorry, again.
>
> >> Tue, 13 Nov 2007 15:07:02 +0900 (JST)
> >> [Subject: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling,
> >> 64bit-ization] Noriaki TAKAMIYA wrote...
> >>
> > > I'd like to hear the opinion of the author.
> > >
> > > Takamiya-san, what do you think about this change?
> >
> >   For IPsec processing, I think performance is important.
> >
> >   If this fix improves the performance, it is acceptable.
>
>   I misunderstood the meaning. If this fix decreases the performance,
>   I wouldn't prefer this patch(and the below is also one of the
>   reason).

My preferred solution is to make loop unrolling conditional
on CONFIG_CC_OPTIMIZE_FOR_SIZE - and this is what is done
in my (first) patch (see attached). This part:

+#ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
+	while (1) {
+		i -= 8;
+		ROUNDS(i);
+		if (i == 0)
+			break;
+		FLS(i);
+	}
+#else
+	if (i == 32) {
+		ROUNDS(24);
+		FLS(24);
+	}
+	ROUNDS(16);
+	FLS(16);
+	ROUNDS(8);
+	FLS(8);
+	ROUNDS(0);
+#endif

Do you agree that this solution does not look too ugly and
would satisfy both "speed" and "size" camps?

For reference, size and speed numbers again:

All times are in microseconds. Two runs give some idea of test variability.
"Setup NN: NNNNNN NNNNNN" - time taken by 100000 key setups (two runs).
"Encrypt: NNNNNN NNNNNN" - time taken by 1000 encryptions of 8K buffer.
"Decrypt: NNNNNN NNNNNN" - time taken by 1000 decryptions of 8K buffer.
"(matches)" - encrypt/decrypt cycle produced non corrupted plaintext.
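
As a sanity check on the patch fragment quoted above, here is a minimal
stand-alone sketch showing that the rolled and unrolled branches issue the
same sequence of ROUNDS/FLS steps. It assumes the starting value of i is
either 24 or 32 (the unrolled branch only distinguishes i == 32). The trace
buffer, record(), and the function names are made up for illustration, and
ROUNDS/FLS here are stand-ins that only log their argument; they are not the
real macros from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace of executed steps; not part of the real patch. */
enum step_kind { STEP_ROUNDS, STEP_FLS };

struct step {
	enum step_kind kind;
	int i;
};

#define TRACE_MAX 16

struct trace {
	struct step steps[TRACE_MAX];
	size_t len;
};

void record(struct trace *t, enum step_kind kind, int i)
{
	assert(t->len < TRACE_MAX);
	t->steps[t->len].kind = kind;
	t->steps[t->len].i = i;
	t->len++;
}

/* Stand-ins for the real macros: they only log into the local `t`. */
#define ROUNDS(i) record(t, STEP_ROUNDS, (i))
#define FLS(i)    record(t, STEP_FLS, (i))

/* Same shape as the CONFIG_CC_OPTIMIZE_FOR_SIZE branch. */
void schedule_rolled(struct trace *t, int i)
{
	while (1) {
		i -= 8;
		ROUNDS(i);
		if (i == 0)
			break;
		FLS(i);
	}
}

/* Same shape as the #else (unrolled) branch. */
void schedule_unrolled(struct trace *t, int i)
{
	if (i == 32) {
		ROUNDS(24);
		FLS(24);
	}
	ROUNDS(16);
	FLS(16);
	ROUNDS(8);
	FLS(8);
	ROUNDS(0);
}

/* Returns 1 if both traces hold the same steps in the same order. */
int traces_equal(const struct trace *a, const struct trace *b)
{
	size_t n;

	if (a->len != b->len)
		return 0;
	for (n = 0; n < a->len; n++) {
		if (a->steps[n].kind != b->steps[n].kind ||
		    a->steps[n].i != b->steps[n].i)
			return 0;
	}
	return 1;
}

/* Returns 1 if both variants produce the same trace for this start value. */
int unrolling_matches(int start)
{
	struct trace a = { .len = 0 };
	struct trace b = { .len = 0 };

	schedule_rolled(&a, start);
	schedule_unrolled(&b, start);
	return traces_equal(&a, &b);
}
```

Calling unrolling_matches(24) and unrolling_matches(32) covers both starting
values assumed here. Any other start value is outside what the unrolled
branch handles, so the comparison says nothing about it.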

CONFIG_CC_OPTIMIZE_FOR_SIZE is not set:

$ ./camellia
Setup 16:32779 33169 Encrypt:153582 153740 Decrypt:150985 149811 (matches)
Setup 24:49333 48987 Encrypt:197973 198853 Decrypt:201240 197585 (matches)
Setup 32:46700 47680 Encrypt:195650 195800 Decrypt:195450 195469 (matches)
$ ./camellia5
Setup 16:33417 32968 Encrypt:149195 149095 Decrypt:148593 148661 (matches)
Setup 24:50082 50064 Encrypt:201214 199204 Decrypt:197078 197579 (matches)
Setup 32:48938 48824 Encrypt:200231 199545 Decrypt:198954 198996 (matches)
$ ./camellia_64
Setup 16:22247 22473 Encrypt:152321 149860 Decrypt:149058 148451 (matches)
Setup 24:33832 34017 Encrypt:200428 202969 Decrypt:196789 195524 (matches)
Setup 32:32884 32821 Encrypt:200414 200640 Decrypt:197857 195987 (matches)
$ size camellia.o camellia7.o camellia_64.o
   text    data     bss     dec     hex filename
  24586       0       0   24586    600a camellia.o
  21714       0       0   21714    54d2 camellia5.o
  18666       0       0   18666    48ea camellia_64.o

Very small speed loss in camellia -> camellia5, noticeably smaller size.
Big key setup speedup in 64-bit camellia_64, and it is even smaller.
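
To put rough figures on "noticeably smaller" and "big key setup speedup",
the arithmetic can be sketched as below. It uses only numbers from the
output above (text sizes, and the two "Setup 16" runs averaged). The helper
names percent_change, mean2 and print_summary are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from `before` to `after` in percent; negative = smaller. */
double percent_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Average of the two benchmark runs reported per line. */
double mean2(double run1, double run2)
{
	return (run1 + run2) / 2.0;
}

/* Prints the relative changes for the CONFIG_CC_OPTIMIZE_FOR_SIZE=n case. */
void print_summary(void)
{
	/* text sizes in bytes, taken from the `size` output */
	printf("text  camellia -> camellia5:   %+.1f%%\n",
	       percent_change(24586, 21714));
	printf("text  camellia -> camellia_64: %+.1f%%\n",
	       percent_change(24586, 18666));

	/* "Setup 16" times in microseconds, two runs each */
	printf("setup camellia -> camellia_64: %+.1f%%\n",
	       percent_change(mean2(32779, 33169), mean2(22247, 22473)));
}
```

With these inputs the text size shrinks by roughly 12% for camellia5 and
roughly 24% for camellia_64, and the 128-bit key setup time drops by about
a third.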

CONFIG_CC_OPTIMIZE_FOR_SIZE is set:

$ ./camellia_Os
Setup 16:32573 34985 Encrypt:151825 152011 Decrypt:147581 147630 (matches)
Setup 24:48528 49250 Encrypt:196223 199056 Decrypt:198811 196394 (matches)
Setup 32:46650 47538 Encrypt:197466 196412 Decrypt:196290 196550 (matches)
$ ./camellia5_Os
Setup 16:33360 34487 Encrypt:154718 154499 Decrypt:157432 157135 (matches)
Setup 24:53969 54304 Encrypt:205184 205818 Decrypt:210675 208552 (matches)
Setup 32:53064 52904 Encrypt:205350 205439 Decrypt:211654 208468 (matches)
$ ./camellia_64_Os
Setup 16:24696 25894 Encrypt:155903 155747 Decrypt:157385 155696 (matches)
Setup 24:33873 33230 Encrypt:206111 206385 Decrypt:208111 207650 (matches)
Setup 32:32799 32325 Encrypt:209715 205973 Decrypt:207578 207644 (matches)
$ size camellia_Os.o camellia7_Os.o camellia_64_Os.o
   text    data     bss     dec     hex filename
  24586       0       0   24586    600a camellia_Os.o
  15906       0       0   15906    3e22 camellia5_Os.o
  13098       0       0   13098    332a camellia_64_Os.o

~5% speed loss in camellia -> camellia5, much smaller size.
Big key setup speedup in 64-bit camellia_64, and it is even smaller still.
--
vda