2024-01-03 04:55:19

by Ethan Zhao

Subject: [Regression] [iommu/iova] iova_rbtree_lock contended seriously causing performance bottleneck (bisect done; commit found)

flame graph


queued_spin_lock_slowpath-------------\
do_raw_spin_lock \
_raw_spin_lock_irqsave \
__alloc_and_insert_iova_range \
alloc_iova_fast \
iommu_dma_alloc_iova \
__iommu_dma_map \
iommu_dma_map_page \
ice_tx_map.isra.0 \
ice_xmit_frame_ring \
dev_hard_start_xmit \
sch_direct_xmit \
__dev_xmit_skb \
__dev_queue_xmit \
ip_finish_output2 \
__ip_queue_xmit \
__tcp_transmit_skb \
tcp_write_xmit \
__tcp_push_pending_frames \
tcp_sendmsg_locked \
tcp_sendmsg \
tcp_sendmsg \
sock_write_iter \
do_iter_readv_writev \
do_iter_write \
vfs_writev \
do_writev \
do_syscall_64 \
entry_SYSCALL_64_after_hwframe \
__GI___writev \
[nginx] \
ngx_linux_sendfile_chain \
ngx_http_write_filter \
ngx_output_chain \
ngx_http_send_response \
ngx_http_script_return_code \
[nginx] \
ngx_http_core_run_phases \
ngx_http_process_request \

Setup:
1. Configure nginx on the server with the nginx.conf below (appended at the end of this mail).

2. Send requests to the server with wrk:
./wrk -t 64 -c 1024 -d 40 --latency http://$server_ip:10802/1KB.json

Debugging summary:
Bisect identified commit 371d7955e310 ("iommu/iova: Improve restart logic")
as the cause.

nginx.conf
user nginx;
worker_processes 16;
error_log /var/log/nginx/error.log crit;
pid /var/run/nginx.pid;
events {
worker_connections 4000;
use epoll;
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log off;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
keepalive_requests 20480;
gzip_min_length 10240;
gzip_comp_level 1;
gzip_vary on;
gzip_disable msie6;
gzip_proxied expired no-cache no-store private auth;
gzip_types
# text/html is always compressed by HttpGzipModule
text/css
text/javascript
text/xml
text/plain
text/x-component
application/javascript
application/x-javascript
application/json
application/xml
application/rss+xml
application/atom+xml
font/truetype
font/opentype
application/vnd.ms-fontobject
image/svg+xml;

reset_timedout_connection on;
client_body_timeout 10;
send_timeout 2;
include /etc/nginx/conf.d/*.conf;
server {
listen 10802;
server_name localhost;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
#root /usr/share/nginx/html;
root /ramdisk;
}
location /1KB.json {
return 202 '{"status":"success","result":"\
Hello from NGINX, 2KB test\
Nanchang, which was the capital of Yuzhang Prefecture during the HanDynasty, \
now falls under the jurisdiction of Hongzhou. It straddles the borderof the \
influence of the Ye and Zhen constellations , and is adjacent to theHeng \
and the Lu mountains . The three rivers enfold it like the frontpart \
of a garment and the five lakes encircle it like a girdle. Itcontrols \
nature’s jewels. The radiance of its legendary sword shootsdirectly upward \
between the constellations Niu and Dou. Its talented peopleare outstanding,\
and the spirit of intelligence pervades the place. This wasthe place where Xu \
Ru spent the night on his visit to Chen Fan (10). The mightyHongzhou spreads \
meteors chasing one another.\
"}';
}
}
}
--
2.31.1



2024-01-03 04:57:46

by Ethan Zhao

Subject: Re: [Regression] [iommu/iova] iova_rbtree_lock contended seriously causing performance bottleneck (bisect done; commit found)


On 1/3/2024 12:55 PM, Ethan Zhao wrote:
Part of the mail got lost when it was sent; the missing text is pasted here.

Issue:
When network throughput is high (>1GB/s), nginx performance hits a
bottleneck: most CPU cycles are spent in the spin lock, acquiring
iova_rbtree_lock in __alloc_and_insert_iova_range().
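
For readers who do not know this path: in the flame graph, alloc_iova_fast() ends up in __alloc_and_insert_iova_range(), which is where iova_rbtree_lock is taken. As far as I understand the code, alloc_iova_fast() first tries a per-CPU range cache (rcache) and only falls back to the rbtree allocator on a miss. The sketch below is a toy user-space model of that split, not kernel code. The class ToyIovaDomain, its methods and the slow_path_calls counter are hypothetical names made up for illustration, and the rbtree search is replaced by a simple bump counter.

```python
import threading
from collections import defaultdict


class ToyIovaDomain:
    """Toy model of a cached range allocator (illustrative names, not kernel APIs)."""

    def __init__(self, start_pfn=0):
        self.tree_lock = threading.Lock()   # stands in for iova_rbtree_lock
        self.next_pfn = start_pfn           # stands in for the rbtree search
        self.cpu_cache = defaultdict(list)  # (cpu, size) -> cached start pfns
        self.slow_path_calls = 0            # how often the shared lock was taken

    def alloc_fast(self, cpu, size):
        """Return a start pfn, preferring the per-CPU cache over the lock."""
        cached = self.cpu_cache[(cpu, size)]
        if cached:
            # Fast path: no shared lock is touched.
            return cached.pop()
        # Slow path: every CPU that misses serializes on this one lock.
        with self.tree_lock:
            self.slow_path_calls += 1
            pfn = self.next_pfn
            self.next_pfn += size
            return pfn

    def free_fast(self, cpu, size, pfn):
        """Return a range to the freeing CPU's cache so it can be reused."""
        self.cpu_cache[(cpu, size)].append(pfn)
```

In this model an alloc/free/alloc sequence on one CPU takes the lock only once, while an allocation on a CPU with an empty cache always takes it. With many CPUs missing at the same time, the time spent waiting for that single lock is what a profile would show as queued_spin_lock_slowpath.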


Thanks,

Ethan


2024-01-06 04:25:48

by Ethan Zhao

Subject: Re: [Regression] [iommu/iova] iova_rbtree_lock contended seriously causing performance bottleneck (bisect done; commit found)


On 1/3/2024 12:57 PM, Ethan Zhao wrote:
Please ignore this report. It looks like a negative result (not a real
regression), just a case of running out of rcache.
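
The message does not say how the rcache ran out. One plausible reading, stated here as an assumption, is that the workload kept more ranges outstanding than the cache could hold, so most allocations missed the cache and went to the locked rbtree path. The sketch below is a toy model of that effect only. It collapses the kernel's per-CPU magazines and depot into a single bounded counter, and the function name count_slow_path and its parameters are hypothetical.

```python
def count_slow_path(ops, cache_capacity, prefilled=0):
    """Count allocations that miss a bounded cache.

    ops is a sequence of "alloc" / "free" strings. cache_capacity is the
    maximum number of freed ranges the toy cache can hold. prefilled is the
    number of ranges already in the cache at the start.
    """
    cached = prefilled
    misses = 0
    for op in ops:
        if op == "alloc":
            if cached > 0:
                cached -= 1      # served from the cache, no shared lock
            else:
                misses += 1      # cache is empty: locked slow path
        elif op == "free":
            if cached < cache_capacity:
                cached += 1      # range is kept for reuse
            # else: cache is full, the range goes back to the shared tree
        else:
            raise ValueError(f"unknown op: {op!r}")
    return misses
```

With a capacity of 4, a balanced pattern such as ["alloc", "free"] * 4 misses once, while a burst of 8 allocations, 8 frees and 8 more allocations misses 12 times out of 16. The second pattern is the kind of behaviour that would look like lock contention without any change in the allocator itself.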


Thanks,

Ethan