From: "Mark A. Greer"
Subject: Re: [PATCH v2] OMAP: AES: Don't idle/start AES device between Encrypt operations
Date: Fri, 17 May 2013 14:14:56 -0700
Message-ID: <20130517211456.GI28484@animalcreek.com>
References: <1368500867-7737-1-git-send-email-joelagnel@ti.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-crypto@vger.kernel.org, linux-omap@vger.kernel.org, khilman@linaro.org
To: Joel A Fernandes
Return-path:
Content-Disposition: inline
In-Reply-To: <1368500867-7737-1-git-send-email-joelagnel@ti.com>
Sender: linux-omap-owner@vger.kernel.org
List-Id: linux-crypto.vger.kernel.org

On Mon, May 13, 2013 at 10:07:47PM -0500, Joel A Fernandes wrote:
> Calling runtime PM API for every block causes serious perf hit to
> crypto operations that are done on a long buffer.
> As crypto is performed on a page boundary, encrypting large buffers can
> cause a series of crypto operations divided by page. The runtime PM API
> is also called those many times.
>
> We call runtime_pm_get_sync only at beginning on the session (cra_init)
> and runtime_pm_put at the end. This result in upto a 50% speedup as below.
> This doesn't make the driver to keep the system awake as runtime get/put
> is only called during a crypto session which completes usually quickly.
>
> Before:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 13310 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 64 size blocks: 13040 aes-128-cbc's in 0.04s
> Doing aes-128-cbc for 3s on 256 size blocks: 9134 aes-128-cbc's in 0.03s
> Doing aes-128-cbc for 3s on 1024 size blocks: 8939 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4299 aes-128-cbc's in 0.00s
>
> After:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 18911 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 64 size blocks: 18878 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 256 size blocks: 11878 aes-128-cbc's in 0.10s
> Doing aes-128-cbc for 3s on 1024 size blocks: 11538 aes-128-cbc's in 0.05s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4857 aes-128-cbc's in 0.03s
>
> While at it, also drop enter and exit pr_debugs, in related code. tracers
> can be used for that.
>
> Tested on a Beaglebone (AM335x SoC) board.
>
> Signed-off-by: Joel A Fernandes
> ---

FWIW,

Acked-by: Mark A. Greer
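
For readers following the quoted commit message, here is a rough user-space sketch of why moving the runtime PM get/put from every page-sized chunk to the session boundaries reduces the number of PM calls. It is not the driver code: `struct fake_dev`, `fake_pm_get()`, `fake_pm_put()` and both counting functions are hypothetical stand-ins that only count calls, and the chunk count is an arbitrary illustration value.

```c
#include <assert.h>

/* Hypothetical stand-in for a device with a runtime PM usage counter. */
struct fake_dev {
	int usage_count;	/* outstanding "get" references */
	int pm_calls;		/* total get + put calls made */
};

/* Stand-ins for the runtime PM get/put calls; they only count. */
static void fake_pm_get(struct fake_dev *d)
{
	d->usage_count++;
	d->pm_calls++;
}

static void fake_pm_put(struct fake_dev *d)
{
	d->usage_count--;
	d->pm_calls++;
}

/* Old scheme: get and put around every page-sized chunk. */
static int pm_calls_per_chunk(int chunks)
{
	struct fake_dev d = { 0, 0 };

	for (int i = 0; i < chunks; i++) {
		fake_pm_get(&d);
		/* ... one chunk would be processed here ... */
		fake_pm_put(&d);
	}
	assert(d.usage_count == 0);	/* every get was balanced by a put */
	return d.pm_calls;
}

/* New scheme: one get at session init, one put at session exit. */
static int pm_calls_per_session(int chunks)
{
	struct fake_dev d = { 0, 0 };

	fake_pm_get(&d);		/* session start (cra_init in the patch) */
	for (int i = 0; i < chunks; i++) {
		/* ... one chunk processed, device already active ... */
	}
	fake_pm_put(&d);		/* session end */
	assert(d.usage_count == 0);
	return d.pm_calls;
}
```

With 16 chunks, the per-chunk variant makes 32 PM calls while the per-session variant makes 2, independent of buffer length. The sketch says nothing about the actual cost of each call on the hardware; it only shows how the call count scales.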