Return-Path: From: Siarhei Siamashka To: "ext Brad Midgley" Subject: Re: [PATCH] sbc: fix for overflow bug in quantization code Date: Mon, 22 Dec 2008 13:37:53 +0200 Cc: "Luiz Augusto von Dentz" , jaska.uimonen@nokia.com, "ext Marcel Holtmann" , "linux-bluetooth@vger.kernel.org" References: <200812172243.27442.siarhei.siamashka@nokia.com> <2d5a2c100812191423g301a6aebr103d3602e8ec984c@mail.gmail.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Message-Id: <200812221337.53781.siarhei.siamashka@nokia.com> List-ID: On Saturday 20 December 2008 00:51:52 ext Brad Midgley wrote: > Luiz, > > > So what about the subband filters fixes and 16 bit fixed point, do you > > thing it can be done? I left this out since the last patch was only > > dealing with 8 subband and was 32 bit. > > FYI, I made some testing changes some time ago from 32-bit to 16-bit > integers that didn't improve performance on arm at all... There are a few things to note: 1. Subband filter, while using a noticeable amount of CPU, is not the only bottleneck in SBC encoder. Other parts of code also need to be improved in order to see a major overall performance improvement as a result of subband filter optimization 2. If you want to benefit from 16-bit math on ARM, you need to compile the code with right -mcpu/-march settings. Fast single cycle 16-bit multiplication instructions were introduced on ARM with armv5te instruction set and ARM9E core. If you compile the code for armv4, you will hardly see any improvements at all. Also current code explicitly uses 32-bit multiply-accumulate instruction in inline assembly macro, preventing the compiler from using 16-bit instructions even for the cases where it could. Performance improvement of using 16-bit multiplication instructions on ARM for the subband filter function should be really high, if done right. > we shouldn't sacrifice quality for no real improvement in performance. With the patch from http://marc.info/?l=linux-bluetooth&m=122972547004294&w=2 anyone can already estimate the quality difference between using 16-bit fixed point and 32-bit fixed point. If we take http://media.xiph.org/BBB/BigBuckBunny-stereo.flac , convert it to au format and use in testing, we get the following results (encoding was done with bluez sbc encoder, decoding was done with some conformant reference win32 decoder binary, original file was compared with the decoded result using tiny_psnr tool). First we can try to crank up bitrate to the very maximum and encode the file with the following command line: ./sbcenc -j -S -b 255 BigBuckBunny-stereo.au >BigBuckBunny-stereo.sbc bluez 16-bit fixed point: stddev: 2.55 PSNR: 88.16 bytes:114491016/114491308 bluez 32-bit fixed point: stddev: 1.09 PSNR: 95.56 bytes:114491016/114491308 reference encoder: stddev: 1.09 PSNR: 95.56 bytes:114491016/114491308 Looks like a major difference and 16-bit fixed point looks bad in comparison, right? But this is the artificial case when the compressed file is even bigger than the original. If we try to use more realistic settings similar to the recommended high quality settings from SBC specification (Table 4.7): ./sbcenc -j -S -b 51 BigBuckBunny-stereo.au >BigBuckBunny-stereo.sbc bluez 16-bit fixed point: stddev: 43.82 PSNR: 63.48 bytes:114491016/114491308 bluez 32-bit fixed point: stddev: 43.78 PSNR: 63.49 bytes:114491016/114491308 reference encoder: stddev: 43.37 PSNR: 63.57 bytes:114491016/114491308 Now as you see, effects of having not very precise calculations are completely eclipsed by the quantization and compression artefacts. I think that the difference is really negligible. If anybody wants to help, results of blind ABX tests with a real A2DP headset are very much welcome. But I suspect that nobody will be able to tell the difference between 16-bit and 32-bit version. PS. I still wonder why there is a loss when compared to reference encoder. 32-bit fixed point version should be even more precise than single precision floating point. Maybe there could be another minor bug in the code, or it is just a random deviation and there could be a win for other audio files. > The story may be different on a mips etc core, but I think we should > realize real performance improvements in order to justify lowering > fidelity. I don't know anything about mips at all. -- Best regards, Siarhei Siamashka