Original post

This has me stumped.

As a learning experience, I wrote an implementation of the cat command and tested it by using another program I wrote that writes random bytes to stdout. While developing both programs, though, I noticed that a buffer size of 4096 gave lower throughput than 65536, which on my machine seems to be a sweet spot, since anything above it degrades throughput again. My question is: what factors might be influencing this? I thought 4096 was a common buffer size for file and pipe I/O, sometimes 8192 (though that was even worse in my tests). I don't know whether it's related, but the page size on my system is 4096.
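
For context, the core of what I'm timing is just a read/write copy loop. This is a simplified sketch rather than my exact code (error handling and option parsing trimmed), with BUFSIZE standing in for the buffer size I keep varying:

#include <unistd.h>

#define BUFSIZE 65536   /* the knob I keep turning: 4096, 8192, 65536, ... */

int
main(void)
{
    static char buf[BUFSIZE];
    ssize_t n;

    /* Copy stdin to stdout until EOF or a read error. */
    while ((n = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;

        /* write(2) may accept less than requested, so loop until done. */
        while (off < n) {
            ssize_t w = write(STDOUT_FILENO, buf + off, (size_t)(n - off));
            if (w < 0)
                return 1;
            off += w;
        }
    }
    return n < 0 ? 1 : 0;
}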

The random byte generator is called ‘fon’, and the testing command chain is as follows:

fon -s 65536 | cat | pv -a > /dev/null

So ‘fon’ writes 64GiB of nonsense (no, that’s not a typo) into cat, and ‘pv -a’ reports the average throughput from cat to /dev/null. Testing against FreeBSD’s cat, I got 425MiB/s; my cat matched that speed with a 4096-byte buffer, but switching to 65536 brings it up to 572MiB/s. Likewise, ‘fon’ performs best at that buffer size, averaging about 600MiB/s of output, while a bigger or smaller buffer reduces the speed.

Just for fun, I even wrote a version of ‘yes’ and compared it to FreeBSD’s. Mine peaked at 6.7GiB/s average, whereas FreeBSD’s peaked at about 4.1GiB/s. Guess what buffer size I used?
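
Roughly, the idea there is to fill a 64KiB buffer with "y\n" once and then write the whole thing per syscall. A sketch of that approach (not the exact code):

#include <string.h>
#include <unistd.h>

#define BUFSIZE 65536

int
main(void)
{
    static char buf[BUFSIZE];
    size_t i;

    /* Fill the buffer once with repeated "y\n" pairs... */
    for (i = 0; i + 1 < sizeof(buf); i += 2)
        memcpy(buf + i, "y\n", 2);

    /* ...then hand the whole block to write(2) over and over. */
    for (;;) {
        if (write(STDOUT_FILENO, buf, i) < 0)
            return 1;
    }
}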

So to sum it up: does anyone have an explanation for why 64KiB buffers are outperforming 4KiB, 8KiB, and even 128KiB or 1MiB buffers? I know very little about the inner workings of this sort of thing, and I don’t want to fall into the trap of assuming all machines will behave the same way (hence my interest in determining the buffer size at runtime). If it helps, I’m running FreeBSD 12.1-RELEASE-p6 on a ThinkPad T430 with 8GB RAM and an Intel i5-2520M, version 1.14.6.
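
For what it's worth, by "determining the buffer size at runtime" I mean something along the lines of asking the kernel for a hint via fstat(2)'s st_blksize and falling back to a default. A rough sketch (the 64KiB fallback is just the sweet spot I measured here, not a claim that it's universal):

#include <sys/stat.h>

#include <stdlib.h>
#include <unistd.h>

/*
 * Ask the kernel for a preferred I/O size on the given descriptor.
 * Fall back to 64KiB, which is just the sweet spot I measured on
 * this machine, not a universal constant.
 */
static size_t
pick_bufsize(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == 0 && st.st_blksize > 0)
        return (size_t)st.st_blksize;
    return 65536;
}

int
main(void)
{
    size_t bufsize = pick_bufsize(STDOUT_FILENO);
    char *buf = malloc(bufsize);

    if (buf == NULL)
        return 1;
    /* ... read/write copy loop using buf and bufsize, as in the cat sketch ... */
    free(buf);
    return 0;
}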