linux - TCP receiving window size higher than net.core.rmem_max


I am running iperf measurements between two servers connected through a 10 Gbit link. I am trying to correlate the maximum window size that I observe with the system configuration parameters.

In particular, I have observed that the maximum window size is 3 MiB. However, I cannot find the corresponding values in the system files.

By running sysctl -a I get the following values:

net.ipv4.tcp_rmem = 4096        87380   6291456
net.core.rmem_max = 212992

The first value (the max field of net.ipv4.tcp_rmem) tells us that the maximum receiver window size is 6 MiB. However, TCP tends to allocate twice the requested size, so the maximum receiver window size should be 3 MiB, exactly as I have measured it. From man tcp:

Note that TCP actually allocates twice the size of the buffer requested in the setsockopt(2) call, and so a succeeding getsockopt(2) call will not return the same size of buffer as requested in the setsockopt(2) call. TCP uses the extra space for administrative purposes and internal kernel structures, and the /proc file values reflect the larger sizes compared to the actual TCP windows.
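The doubling described in this man page excerpt can be observed from user space. The following is a minimal sketch, not taken from the original post: probe_rcvbuf is a hypothetical helper name, and the value it returns depends on the running kernel and on net.core.rmem_max.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: request `requested` bytes via SO_RCVBUF on a fresh
 * TCP socket and return the size getsockopt() reports afterwards, or -1 on
 * error. On Linux the reported size is normally twice the (clamped) request. */
int probe_rcvbuf(int requested)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int reported = -1;
    socklen_t len = sizeof(reported);

    /* Set the buffer size, then read back what the kernel actually stored. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &reported, &len) < 0)
        reported = -1;

    close(fd);
    return reported;
}
```

Calling probe_rcvbuf(65536) from a small main() and printing the result is enough to compare the requested and reported sizes on a given machine.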

However, the second value, net.core.rmem_max, states that the maximum receiver window size cannot be more than 208 KiB. And this is supposed to be the hard limit, according to man tcp:

tcp_rmem max: the maximum size of the receive buffer used by each TCP socket. This value does not override the global net.core.rmem_max. This is not used to limit the size of the receive buffer declared using SO_RCVBUF on a socket.

So, how come I observe a maximum window size larger than the one specified in net.core.rmem_max?

NB: I have also calculated the bandwidth-latency product: window_size = bandwidth x RTT, which is roughly 3 MiB (10 Gbps @ 2 msec RTT), thus verifying my traffic capture.
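For reference, the bandwidth-latency product can be worked out with integer arithmetic. This is a small sketch; bdp_bytes is a hypothetical name, and the 2 msec RTT is the approximate figure quoted in the question.

```c
#include <stdint.h>

/* Hypothetical helper: bandwidth-delay product in bytes, given the link
 * rate in bits per second and the round-trip time in microseconds. */
uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_usec)
{
    /* bits in flight during one RTT, converted to bytes */
    return bits_per_sec * rtt_usec / 8 / 1000000;
}
```

For 10 Gbps and 2000 microseconds this gives 2,500,000 bytes, or about 2.4 MiB, which is in line with the rough 3 MiB figure given that the RTT is only approximate.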

A quick search of the kernel source turned up:

https://github.com/torvalds/linux/blob/4e5448a31d73d0e944b7adb9049438a09bc332cb/net/ipv4/tcp_output.c

In void tcp_select_initial_window():

    if (wscale_ok) {
        /* Set window scaling on max possible window
         * See RFC1323 for an explanation of the limit to 14
         */
        space = max_t(u32, sysctl_tcp_rmem[2], sysctl_rmem_max);
        space = min_t(u32, space, *window_clamp);
        while (space > 65535 && (*rcv_wscale) < 14) {
            space >>= 1;
            (*rcv_wscale)++;
        }
    }

max_t takes the higher value of its arguments, so the bigger value takes precedence here.
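To see what this means for the values in the question, the scaling loop can be replayed in user space. The sketch below is a model only, not kernel code: select_rcv_wscale and its helpers are hypothetical names, and the window clamp is passed in as a parameter, on the assumption that it is large enough not to matter.

```c
#include <stdint.h>

static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }
static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* User-space model of the window-scale selection: take the larger of the
 * two sysctl limits, cap it by the window clamp, then halve it until it
 * fits in 16 bits, counting the shifts (at most 14). */
unsigned int select_rcv_wscale(uint32_t tcp_rmem_max,
                               uint32_t core_rmem_max,
                               uint32_t window_clamp)
{
    unsigned int wscale = 0;
    uint32_t space = max_u32(tcp_rmem_max, core_rmem_max);

    space = min_u32(space, window_clamp);
    while (space > 65535 && wscale < 14) {
        space >>= 1;
        wscale++;
    }
    return wscale;
}
```

With tcp_rmem_max = 6291456 and core_rmem_max = 212992 the model returns a scale of 7, whereas using 212992 for both arguments returns 2. A scale of 2 could only advertise windows up to about 256 KiB, so the 3 MiB window observed in the question fits the larger value having been used.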

One other reference to sysctl_rmem_max is made where it is used to limit the argument to SO_RCVBUF (in net/core/sock.c).

All other TCP code refers to sysctl_tcp_rmem only.

So without looking deeper into the code, you can conclude that a bigger net.ipv4.tcp_rmem will override net.core.rmem_max in all cases except when setting SO_RCVBUF (whose check can be bypassed using SO_RCVBUFFORCE).
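The SO_RCVBUF exception can be illustrated with a simplified model that combines the limit applied in net/core/sock.c with the doubling described in man tcp. This is a sketch under assumptions: model_so_rcvbuf is a hypothetical name, and MODEL_MIN_RCVBUF is a placeholder for the kernel's minimum buffer size, whose real value depends on the kernel build.

```c
#include <stdint.h>

/* Placeholder for the kernel's lower bound on the receive buffer
 * (an assumption made for illustration, not the real constant). */
#define MODEL_MIN_RCVBUF 2304

/* Model of an explicit SO_RCVBUF request: the argument is first limited
 * to net.core.rmem_max, then doubled, then raised to the minimum size. */
int64_t model_so_rcvbuf(uint32_t requested, uint32_t core_rmem_max)
{
    uint32_t val = requested < core_rmem_max ? requested : core_rmem_max;
    int64_t doubled = (int64_t)val * 2;

    return doubled > MODEL_MIN_RCVBUF ? doubled : MODEL_MIN_RCVBUF;
}
```

Under this model, asking for 3 MiB with net.core.rmem_max = 212992 yields a buffer of only 425984 bytes, which is why an application that sets SO_RCVBUF explicitly can end up with a smaller window than one that leaves the buffer size to the kernel's autotuning.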

