Why Jitter?
When a periodic task fires at a fixed interval — a retry loop, a polling mechanism, a heartbeat — that regularity often becomes its own problem. Synchronized bursts from multiple clients (the thundering herd), queue pile-ups, and statistical profiling are all side effects of fixed delays.
The solution: add jitter. If the wait time varies on every call, the pattern breaks, load flattens, and predictability disappears.
In this post we walk through a pure x64 Assembly function that applies a Linear Congruential Generator (LCG) scramble over an rdtsc entropy seed, then sleeps for a random duration in the [100ms, 1000ms) range.
🔩 The Full Function: _lcg_jitter
_lcg_jitter:
push rax
push rbx
push rcx
push rdx
push rsi
push rdi
push rbp
push r8
push r9
push r10
push r11
push r12
push r13
push r14
push r15
rdtsc ; EAX = TSC low 32 bits
imul eax, eax, 1664525 ; LCG scramble (whitens the low TSC bits)
add eax, 1013904223
xor edx, edx ; zero EDX for div (EDX:EAX dividend)
mov ecx, 900000000 ; mod 900M → [0, 900M)
div ecx
add edx, 100000000 ; shift → [100ms, 1000ms)
sub rsp, 32 ; reserve stack space (aligned)
mov qword [rsp], 0 ; tv_sec = 0
mov qword [rsp+8], rdx ; tv_nsec = computed value
mov rax, 35 ; sys_nanosleep
mov rdi, rsp ; req = &timespec
xor rsi, rsi ; rem = NULL
syscall
add rsp, 32 ; restore stack
pop r15
pop r14
pop r13
pop r12
pop r11
pop r10
pop r9
pop r8
pop rbp
pop rdi
pop rsi
pop rdx
pop rcx
pop rbx
pop rax
ret
📦 Part 1: Register Preservation
push rax
push rbx
; ... all general-purpose registers
pop rax
ret
When this function is called, the caller’s register state must remain untouched. So every general-purpose register is pushed onto the stack at the top and restored in reverse order at the bottom.
Note: Under the System V AMD64 ABI, rbx, rbp, and r12–r15 are callee-saved; the rest are caller-saved. Both sets are saved here for maximum safety, with no assumptions about the call site.
⏱️ Part 2: Entropy Source — rdtsc
rdtsc ; EDX:EAX = Time Stamp Counter
The rdtsc (Read Time-Stamp Counter) instruction reads the processor’s 64-bit cycle counter accumulated since boot. The low 32 bits land in EAX, the high 32 in EDX.
This value changes rapidly — a reliable snapshot of hardware state. However it isn’t used raw: lower bits can carry detectable patterns on some microarchitectures. The LCG step that follows solves this.
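For readers who want to poke at the raw counter from C, the `__rdtsc` intrinsic (declared in `x86intrin.h`, x86-64 only) emits the same instruction; a minimal sketch:

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc intrinsic, x86-64 only */

/* Read the 64-bit Time Stamp Counter -- the same instruction the
   assembly routine uses as its entropy seed. */
static uint64_t read_tsc(void) {
    return __rdtsc();
}
```

Two back-to-back reads already differ by the instruction's own latency, which is exactly the fast-moving property the jitter routine relies on.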
🎲 Part 3: LCG Scramble
imul eax, eax, 1664525
add eax, 1013904223
These two lines implement the classic LCG formula from Numerical Recipes:
X_{n+1} = (a × X_n + c) mod 2^32
| Parameter | Value | Source |
|---|---|---|
| a | 1664525 | Numerical Recipes |
| c | 1013904223 | Numerical Recipes |
| m | 2^32 | Implicit (32-bit overflow) |
imul eax, eax, 1664525 multiplies EAX by the LCG multiplier — overflow is intentional, giving us modular arithmetic for free.
add eax, 1013904223 adds the increment c.
The result: the raw TSC value is transformed into a pseudo-random number with a statistically flatter distribution.
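The same two-instruction transform can be sketched in C, where the mod 2^32 falls out of `uint32_t` wraparound exactly as it falls out of 32-bit EAX overflow:

```c
#include <stdint.h>

/* One step of the Numerical Recipes LCG:
   X_{n+1} = (1664525 * X_n + 1013904223) mod 2^32.
   Unsigned 32-bit overflow supplies the modulus for free. */
static uint32_t lcg_step(uint32_t x) {
    return 1664525u * x + 1013904223u;
}
```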
➗ Part 4: Range Calculation
xor edx, edx ; clear EDX (high half of dividend)
mov ecx, 900000000 ; divisor = 900M nanoseconds
div ecx
add edx, 100000000 ; shift to [100M, 1000M) ns
div ecx divides the 64-bit value EDX:EAX by ECX:
- EAX ← quotient (discarded)
- EDX ← remainder → falls in [0, 900_000_000)
Adding 100_000_000 shifts the window:
[0, 900_000_000) + 100_000_000 = [100_000_000, 1_000_000_000)
In wall-clock terms: a uniformly distributed random sleep between 100 ms and 1000 ms.
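The remainder-plus-offset mapping is easy to verify in C; a sketch mirroring the div/add pair above:

```c
#include <stdint.h>

/* Map a scrambled 32-bit value into [100ms, 1000ms) expressed in ns:
   the remainder mod 900M lands in [0, 900M), then +100M shifts the window. */
static uint32_t jitter_ns(uint32_t scrambled) {
    return scrambled % 900000000u + 100000000u;
}
```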
😴 Part 5: Sleeping with sys_nanosleep
sub rsp, 32
mov qword [rsp], 0 ; tv_sec = 0
mov qword [rsp+8], rdx ; tv_nsec = computed value
mov rax, 35 ; syscall number: nanosleep
mov rdi, rsp ; req pointer
xor rsi, rsi ; rem = NULL
syscall
add rsp, 32
nanosleep(2) expects a pointer to struct timespec:
struct timespec {
time_t tv_sec; // seconds
long tv_nsec; // nanoseconds [0, 999_999_999]
};
The struct is built directly on the stack:
- [rsp] → tv_sec = 0 (sub-second sleep only)
- [rsp+8] → tv_nsec = RDX (our computed random value; the 32-bit add into EDX zero-extends into RDX, so storing the full qword is safe)
The sub rsp, 32 reservation keeps the stack 16-byte aligned and provides comfortable headroom. Only 16 bytes are strictly necessary for one timespec, so 32 bytes is a conservative over-allocation — benign and safe.
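The same syscall is reachable from C through the libc wrapper; a minimal sketch that sleeps and checks the elapsed time against CLOCK_MONOTONIC:

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Sleep for ns nanoseconds (ns < 1e9) via nanosleep(2) and return the
   elapsed wall time in nanoseconds, measured with CLOCK_MONOTONIC. */
static long long sleep_and_measure_ns(long ns) {
    struct timespec req = { .tv_sec = 0, .tv_nsec = ns };
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    nanosleep(&req, NULL);  /* rem = NULL: ignore interruption remainder */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1000000000LL + (t1.tv_nsec - t0.tv_nsec);
}
```

nanosleep(2) guarantees the sleep lasts at least the requested duration (barring signals), so the measured elapsed time is a lower-bounded check.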
📊 Distribution Analysis
The LCG + modular arithmetic combination produces an approximately uniform distribution across the target window. (Strictly speaking, because 2^32 is not a multiple of 900,000,000, remainders below 2^32 mod 900M ≈ 695M occur five times across the input space while the rest occur four times; this modulo bias slightly favors shorter sleeps but is harmless for jitter purposes.)
Range: [100ms, 1000ms)
Expected mean: ~550ms
Standard deviation: ~260ms
LCG period: 2^32 ≈ 4.29 billion states before the cycle repeats
In practice the period never matters here: the generator is reseeded from rdtsc on every call rather than iterated, and even a stateful variant would cycle only after billions of calls.
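A quick Monte Carlo sanity check, sketched in C with the same constants, confirms the mean lands near the middle of the window (slightly below the ideal 550 ms because of the modulo bias):

```c
#include <stdint.h>

/* Iterate the LCG n times, map each state into [100ms, 1000ms) in ns,
   and return the empirical mean in milliseconds. */
static double jitter_mean_ms(uint32_t seed, int n) {
    uint64_t sum = 0;
    uint32_t x = seed;
    for (int i = 0; i < n; i++) {
        x = 1664525u * x + 1013904223u;       /* LCG step */
        sum += x % 900000000u + 100000000u;   /* ns in [100M, 1000M) */
    }
    return (double)sum / (double)n / 1e6;     /* ns -> ms */
}
```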
⚠️ Limitations and Alternatives
| Concern | This Implementation | Alternative |
|---|---|---|
| Cryptographic security | ❌ Not suitable | getrandom(2) syscall |
| Multi-threaded safety | ⚠️ No shared state to protect here | Thread-local seed for stateful LCG |
| Reproducibility | ❌ rdtsc seed differs on every call | Seed with a fixed constant |
| Precision | ✓ Nanosecond-resolution request | Actual wakeup depends on kernel timer granularity |
For non-cryptographic, non-security-sensitive jitter this implementation hits all the marks cleanly.
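For the cryptographic row of the table, getrandom(2) is the natural seed source; a C sketch (the glibc wrapper requires glibc 2.25 or newer):

```c
#include <stdint.h>
#include <sys/random.h>  /* getrandom(2) wrapper, glibc >= 2.25 */

/* Fetch a 32-bit seed from the kernel CSPRNG. Returns 0 on failure
   (and, with probability 2^-32, on success -- acceptable for a sketch). */
static uint32_t secure_seed(void) {
    uint32_t s = 0;
    if (getrandom(&s, sizeof s, 0) != (ssize_t)sizeof s)
        return 0;
    return s;
}
```

With flags = 0, getrandom blocks only until the kernel entropy pool is initialized, so it is safe to call early in a program's lifetime without the /dev/urandom file-descriptor dance.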
Conclusion
This small function elegantly combines three ideas:
- Hardware entropy — rdtsc gives a unique starting point on every call
- LCG scrambling — a two-instruction transform that flattens the distribution
- Direct syscall — kernel scheduling infrastructure accessed without any library overhead
The outcome: externally irregular, internally deterministic — a lightweight timing jitter engine that lives entirely within the CPU and the kernel.
Github: https://github.com/JM00NJ/Sectionless-Craft/tree/main/Jitter
Stay Coded!