Keyboard Latency Testing - infinitemonkeytheorem

2024-03-27

I recently wanted to experiment with home-row mods. I have a QMK-enabled keyboard, but I want my mods to be portable for when I'm travelling and don't have my external keyboard. There are several remapping tools that can implement home-row mods on Linux, notably KMonad, Kanata, and keyd. While these tools have different feature sets and goals, they all overlap in meeting my needs. The deciding factor for me is latency: I want the tool that adds the least latency to my typing (see Dan Luu's writings about latency). To compare latency between the tools, I wrote a small Python script, which is the subject of this post.

Note about latency testing: robust end-to-end latency testing is done using a circuit that triggers a key-press and a light sensor that catches the actual rendering (see a cool setup by Tristan Hume). This is a great way to determine absolute latency. But for my use, I only care about relative latency (which tool introduces the most latency), so a lightweight method is suitable.

I would like to directly measure the latency (delay) introduced by the remapping tool, from the point it receives my keypress to the time the application receives the keypress, but I don't know how to do that. What I can do is prompt myself to press a key, and measure how long it takes from the start of the prompt to when my script receives the keypress. The measured latency includes roughly three components:

System/OS latency. This is from my keyboard, the OS, my terminal, etc.
My reaction time. Wikipedia says the fastest human reaction times are somewhere between 100ms and 200ms.
Latency introduced by the remapping tool.

While none of these components will be consistent across every keypress, we can assume each has a consistent distribution (the distribution of human reaction times seems to be consistent, at least). Since the delay distributions are consistent, I can directly compare the mean reaction delay using each tool to determine the relative latencies. The differences in mean delay will be the differences in latency between the remapping tools.

Boring Math

\begin{aligned}
l_{s} &= \text{System/OS latency} \\
l_{h} &= \text{Human reaction time} \\
l_{t}^{a} &= \text{Latency from remapping tool A} \\
l_{t}^{b} &= \text{Latency from remapping tool B} \\
m_{t} &= \text{Mean latency from } n \text{ trials} \\
m_{t} &= l_{s} + l_{h} + l_{t}
\end{aligned}

\begin{aligned}
\text{Latency Difference} &= m_{t}^{a} - m_{t}^{b} \\
&= (l_{s} + l_{h} + l_{t}^{a}) - (l_{s} + l_{h} + l_{t}^{b}) \\
&= l_{t}^{a} - l_{t}^{b}
\end{aligned}
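
The cancellation above can be checked with a small simulation. This is only a sketch: the Gaussian shapes, the 20 ms system latency, the 250 ms reaction time, and the two tool latencies (30 ms and 10 ms) are all invented for illustration, and `simulate_mean_delay` is a hypothetical helper, not part of the script in this post.

```python
import random
import statistics


def simulate_mean_delay(tool_latency, trials, rng):
    """Return the mean measured delay (seconds) over simulated keypresses."""
    samples = []
    for _ in range(trials):
        system = rng.gauss(0.020, 0.003)  # invented system/OS latency
        human = rng.gauss(0.250, 0.030)   # invented human reaction time
        samples.append(system + human + tool_latency)
    return statistics.mean(samples)


rng = random.Random(42)  # fixed seed so the run is repeatable
mean_a = simulate_mean_delay(0.030, 20_000, rng)  # hypothetical tool A: 30 ms
mean_b = simulate_mean_delay(0.010, 20_000, rng)  # hypothetical tool B: 10 ms

difference_ms = (mean_a - mean_b) * 1000
print(f"estimated difference: {difference_ms:.1f} ms (true difference: 20.0 ms)")
```

With this many simulated trials the shared components average out and the difference in means lands close to the true 20 ms. With far fewer trials per tool, the same simulation scatters more widely around that value.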

Distribution of human reaction times. Source: Emily Willoughby, CC BY-SA 4.0, via Wikimedia Commons

To measure reaction time, I set up a basic Python script that prompts me to press a key (if you just want to play around with reaction times, check out Human Benchmark). The trick is that the prompt comes after a random delay, which prevents me from accidentally finding a rhythm and reflexively pressing early.

This is done with the following Python code:

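import os
import random
import time

time.sleep(random.random() * 1.5 + 1)  # random 1s - 2.5s delay
start = time.perf_counter()
os.system('read -n 1 -s -r -p "Press any key"')  # block until one key is pressed
print(time.perf_counter() - start)  # seconds from prompt to keypress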
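
As a possible variation, the keypress can be read from Python directly instead of shelling out to `read`. The sketch below assumes a Unix terminal and uses the standard `termios` and `tty` modules. The names `read_single_key` and `measure_reaction` are hypothetical and do not come from the original script. It also discards any keys typed before the prompt, which the snippet above does not do.

```python
import os
import random
import sys
import time


def read_single_key():
    """Block until one key is pressed on a Unix terminal and return its first byte."""
    import termios  # Unix-only modules, imported lazily
    import tty

    fd = sys.stdin.fileno()
    old_settings = termios.tcgetattr(fd)
    try:
        tty.setcbreak(fd)                      # no line buffering, no echo
        termios.tcflush(fd, termios.TCIFLUSH)  # drop keys typed before the prompt
        return os.read(fd, 1)
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)


def measure_reaction(wait_for_key, min_wait=1.0, max_wait=2.5, sleep=time.sleep):
    """Sleep a random time, show the prompt, and return seconds until the keypress."""
    sleep(random.uniform(min_wait, max_wait))
    print("Press any key ", end="", flush=True)
    start = time.perf_counter()
    wait_for_key()
    return time.perf_counter() - start


if __name__ == "__main__" and sys.stdin.isatty():
    print(measure_reaction(read_single_key))
```

Passing the key reader in as `wait_for_key` keeps the timing logic separate from the terminal handling, so the timing function can be exercised without a real keyboard.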

The reaction time from each keypress is measured, and the measurements are then reduced to a mean and a median. I would also like to calculate the mode, but I didn't feel I was working with enough samples to calculate it accurately (this is probably an indication that I don't have enough samples to draw any meaningful conclusion, but ¯\_(ツ)_/¯). I can then compare these statistics between keyd, Kanata, and the baseline of nothing.

While gathering data I did occasionally twitch and get a sub-100ms reaction time, or lose focus and get a 1s reaction time. Outliers were removed with the following code:

# These boundaries were chosen based on my own reaction times.
# They might need tuning on other systems.
delays = [d for d in delays if 0.1 < d < 0.4]
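
The filter and the two statistics can also be combined into one helper using the standard `statistics` module. This is a sketch rather than the code used for the results in this post: `summarize` is a hypothetical name and the sample values are invented.

```python
import statistics


def summarize(delays, low=0.1, high=0.4):
    """Drop delays outside (low, high) seconds and summarize the rest."""
    kept = [d for d in delays if low < d < high]
    return {
        "kept": len(kept),
        "dropped": len(delays) - len(kept),
        "mean": statistics.mean(kept),      # raises StatisticsError if nothing is kept
        "median": statistics.median(kept),  # averages the two middle values for even counts
    }


# Invented sample: one twitch (0.05 s) and one lapse of focus (1.2 s)
sample = [0.05, 0.25, 0.27, 0.29, 0.31, 1.2]
summary = summarize(sample)
print(summary)
```

Reporting how many samples were dropped makes it easier to see when the bounds are too tight for a given person or system.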

For the tests I measured the latencies of pressing my home-row mod key (f) on my base system, keyd, and kanata. I did an additional test with kanata using a different key (j). Each metric was calculated based on 50 keypresses done 10 at a time — I should do more, but it's boring.

	Mean (s)	Median (s)
Base (f)	0.2772	0.2732
keyd (f)	0.3112	0.3099
Kanata (f)	0.3216	0.3174
Kanata (j)*	0.2628	0.2602

*Note: The j Kanata test was done the next day after a good sleep. A quick retest of the Base shows a mean of 0.2537s. I didn't want to go through all 50 again, so the discrepancy stands.
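
To make the comparison concrete, the sketch below subtracts the baseline mean from each tool's mean, using the numbers from the table and the note. Pairing the Kanata (j) run with the retested baseline of 0.2537s is an interpretation of the note, since both were measured on the second day.

```python
# Mean delays in seconds, copied from the table and the note above
day_one_base = 0.2772
day_two_base = 0.2537  # the quick retest mentioned in the note

runs = {
    "keyd (f)": (0.3112, day_one_base),
    "Kanata (f)": (0.3216, day_one_base),
    "Kanata (j)": (0.2628, day_two_base),  # assumed pairing with the retest
}

# Extra delay relative to the matching baseline, in milliseconds
extra_ms = {
    name: round((mean - base) * 1000, 1) for name, (mean, base) in runs.items()
}

for name, delta in extra_ms.items():
    print(f"{name}: +{delta} ms over baseline")
```

On these numbers the two f runs sit roughly 34 ms and 44 ms above their baseline, while the j run sits about 9 ms above its baseline.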

While this isn't the most statistically sound test, the results do show that adding home-row mods can add latency. It seems clear that keyd and Kanata are adding a delay, but there is a catch: with the home-row mod remapping, the keypress isn't triggered until the key-up event (vs. key-down in the base). This means that there is an additional component to the delay: how long it takes me to lift my finger back up after pressing. The j test shows that the tools are not adding meaningful latency to other characters, which suggests that much of the latency difference is the time it takes me to lift my finger off the key. Based on my testing, there does still seem to be a small difference between the keyd and Kanata latencies. Speed is a core goal of keyd, so I'm not surprised that it performs well here.
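
The key-up effect can be illustrated with a toy timing model. This is a deliberately simplified sketch and not how keyd or Kanata are implemented: the function names, the 200 ms hold threshold, and the example timings are all invented.

```python
def plain_emit_time(press, release):
    """A plain key is reported as soon as it goes down."""
    return press


def tap_hold_emit_time(press, release, hold_threshold=0.2):
    """A tap-hold key is reported on key-up, or not at all if it counted as a hold."""
    if release - press >= hold_threshold:
        return None  # treated as a modifier hold, so no character is emitted
    return release


# Invented example: 250 ms reaction time, finger stays down for 40 ms
press = 0.250
release = press + 0.040

plain = plain_emit_time(press, release)
tapped = tap_hold_emit_time(press, release)
print(f"plain key seen at {plain:.3f}s, tap-hold key seen at {tapped:.3f}s")
```

In this model the tap-hold key is always seen later than the plain key by however long the finger stays down, even if the remapping tool itself adds no processing delay.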

At the beginning of this post I said that I would choose a remapping tool based on latency alone, but I ended up just using Kanata. At the time I was setting up my system, keyd didn't quite support my desired configuration, but it does now. I've stayed with Kanata because I like the direction of the project and how responsive jtroo is to new ideas. That said, I admire the design of keyd and its minimalism, and this testing shows that it's worth checking out again.

Please let me know if you find this technique helpful, or if you have any additions to improve it!

Full Code Listing

The up-to-date code, as well as the raw data from my testing, can be found on GitHub.

"""
This script measures the latency of the keyboard input by prompting the user for a
keypress after a random delay of between 1 and 2.5 seconds. The latency is measured as
the time between the prompt and the keypress. The script repeats this process 50 times,
pausing for a break after every 10 keypresses, and prints the mean, median, mode,
standard deviation, max, and min latency.

This is not sufficient for measuring absolute latency, but it is useful for comparing
relative latency between different systems (QMK configurations in my case).

"""

import os
import random
import time

delays = []
try:
    for i in range(50):
        time.sleep(random.random() * 1.5 + 1)
        start = time.perf_counter()
        os.system('read -n 1 -s -r -p "Press any key "')
        delay = time.perf_counter() - start
        print(delay)
        delays.append(delay)

        if (i+1) % 10 == 0:
            os.system('read -n 1 -s -r -p "Take a quick break, press a key when you\'re ready to continue "')
            print()

except KeyboardInterrupt:
    pass

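# Drop outliers (twitches and lapses of focus). These boundaries were chosen
# based on my own reaction times.
delays = [d for d in delays if 0.1 < d < 0.4]
if len(delays) < 2:
    raise SystemExit("Need at least 2 samples after removing outliers")

mean = sum(delays) / len(delays)
ordered = sorted(delays)
mid = len(ordered) // 2
# Average the two middle values when the sample count is even
median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
bucketed = [round(d, 2) for d in delays]
# This is the formula for variance of a sample, rather than
# variance of a population
variance = sum((x - mean) ** 2 for x in delays) / (len(delays) - 1)
print("\nmean:     ", mean)
print("median:   ", median)
print("mode:     ", max(set(bucketed), key=bucketed.count))
print("std. dev.:", variance ** 0.5)
print("max:      ", max(delays))
print("min:      ", min(delays))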