This patch is already in QEMU master, and it helps reduce latency and boost IOPS. I tested it with librbd and see an extra boost, but it should help with any block driver (fio-rbd on the host shows me around a 15% speedup with tcmalloc).