schism devlog - 06/18

The plan coming out of last cycle was simple: stop constructing a Beta distribution for every person who gives birth and find something cheaper. The obvious move was binning. Group adherents by heterodoxy, construct one distribution per bin, and sample it for the whole group. Instead of a fresh Beta for every birth in a population of around 75 million, maybe 100 per tick. That's the idea. Getting it to actually work took the rest of the week, and the results are not what I hoped.

The binning idea

The core insight is that heterodoxy is a float between 0 and 1, so you can divide that range into N buckets and sort every adherent into the nearest one. Bin 20 out of 100 represents adherents with heterodoxy around 0.20. Within a bin, everyone is similar enough that a single Beta distribution is a reasonable approximation for the whole group.
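A minimal sketch of the nearest-bin mapping described above. The function name `bin_index` is made up for illustration, and rounding to nearest is an assumption based on "the nearest one"; the real code may truncate instead.

```rust
/// Map a heterodoxy value in [0.0, 1.0] to the nearest of `num_bins` bins.
/// Returns an index in 0..=num_bins, so callers need num_bins + 1 slots.
/// (Hypothetical helper; the name and rounding rule are assumptions.)
fn bin_index(heterodoxy: f64, num_bins: usize) -> usize {
    // Clamp first so stray values outside [0, 1] cannot index out of range.
    let h = heterodoxy.clamp(0.0, 1.0);
    (h * num_bins as f64).round() as usize
}
```

With 100 bins, a heterodoxy of 0.204 lands in bin 20 and 0.206 lands in bin 21. A heterodoxy of exactly 1.0 lands in index 100, which is why rounding to nearest needs one more slot than the bin count.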

I set 100 bins. Here is what the distribution looked like on a mid-run population:

Bin 0: 1 adherents
Bin 1: 36 adherents
Bin 2: 228 adherents
Bin 3: 728 adherents
Bin 4: 1506 adherents
Bin 5: 2697 adherents
...
Bin 19: 25594 adherents
Bin 20: 25729 adherents
Bin 21: 26044 adherents
...
Bin 62: 30 adherents
Bin 63: 32 adherents
Bin 64: 24 adherents
Bin 65: 15 adherents
Bin 66: 13 adherents
... (single digits per bin)
Bin 99: 0 adherents

As expected, the bulk of the population piles up in the first third of the heterodoxy range. Above bin 65 or so you get single digits per bin, which means the top 35% of bins are nearly empty. Running 100 distributions instead of millions is the goal, but in practice only about 55 of those 100 bins ever have anyone in them at a given tick.

The implementation is straightforward in concept. For each non-empty bin: get the average age, look up the corresponding birth rate, figure out how many children are born from that bin, construct a Beta from the bin's mean heterodoxy, and sample it once per birth. Then partition those births across religions by each religion's share of that bin's population.
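The deterministic part of those steps can be sketched as follows. Everything here is illustrative: `Adherent`, `BinPlan`, `plan_bin`, and the `birth_rate` closure are stand-ins, not the sim's real types. The Beta construction and sampling step is left out to keep the sketch dependency-free; it would use `mean_heterodoxy` as its input.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real adherent type.
struct Adherent {
    age: f64,
    heterodoxy: f64,
    religion: u32, // stand-in for ReligionKey
}

/// What one bin contributes this tick, before any Beta sampling.
struct BinPlan {
    mean_heterodoxy: f64,
    births_by_religion: HashMap<u32, u64>,
}

/// `birth_rate` maps an age to expected births per person per tick (assumed shape).
fn plan_bin(bin: &[Adherent], birth_rate: impl Fn(f64) -> f64) -> Option<BinPlan> {
    if bin.is_empty() {
        return None; // empty bins contribute nothing
    }
    let n = bin.len() as f64;
    let mean_age = bin.iter().map(|a| a.age).sum::<f64>() / n;
    let mean_heterodoxy = bin.iter().map(|a| a.heterodoxy).sum::<f64>() / n;

    // Total births for the bin, from the rate at the bin's average age.
    let births = (birth_rate(mean_age) * n).floor();

    // Count members per religion, then split births by each religion's share.
    let mut counts: HashMap<u32, u64> = HashMap::new();
    for a in bin {
        *counts.entry(a.religion).or_insert(0) += 1;
    }
    let births_by_religion = counts
        .into_iter()
        .map(|(r, c)| (r, (births * c as f64 / n).floor() as u64))
        .collect();

    Some(BinPlan { mean_heterodoxy, births_by_religion })
}
```

For a bin of four 25-year-olds, three in one religion and one in another, with a flat rate of 0.5, this plans 2 births for the bin and assigns 1 to the larger religion and 0 to the smaller. Flooring the split loses one birth here.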

Getting it right was tedious

The first version panicked immediately:

thread 'main' (18849250) panicked at src/simulation/adherent.rs:153:22:
called `Option::unwrap()` on a `None` value

Two bugs, found with some debug prints at each step:

First, the or_insert on the bin HashMap was inserting an empty vec![] instead of a vec containing the adherent that triggered the insert. So bins existed but were empty.

Second, the religion totals map had the same issue: or_insert(0) meant the first adherent in a religion per bin was not counted, so you could get division by zero when computing religion percentages.
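Both bugs have the same shape: the insert-on-miss path did not also apply the update. A sketch of the usual `entry` idiom that avoids it, using plain indices and `u32` keys as stand-ins for the real adherent references and `ReligionKey` (the function names are hypothetical):

```rust
use std::collections::HashMap;

/// Group item indices by bin. `bin_of[i]` is the bin of item `i`
/// (hypothetical input shape, not the sim's real data layout).
fn group_by_bin(bin_of: &[usize]) -> HashMap<usize, Vec<usize>> {
    let mut bins: HashMap<usize, Vec<usize>> = HashMap::new();
    for (i, &b) in bin_of.iter().enumerate() {
        // or_insert_with creates the empty Vec only on first sight of the bin;
        // the push runs on every call, so the triggering item is never dropped.
        bins.entry(b).or_insert_with(Vec::new).push(i);
    }
    bins
}

/// Count members per religion; keys are stand-ins for ReligionKey.
fn count_by_religion(religions: &[u32]) -> HashMap<u32, u64> {
    let mut totals: HashMap<u32, u64> = HashMap::new();
    for &r in religions {
        // Increment after or_insert(0) so the first member counts as 1, not 0.
        *totals.entry(r).or_insert(0) += 1;
    }
    totals
}
```

The key point is that `or_insert` returns a mutable reference to the value whether or not it was just inserted, so the push or increment can be chained unconditionally.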

After those fixes, the sim ran but birth rates were wrong. The religion_percentages map was showing values above 1.0:

[advance_adherents] bin 47: religion ReligionKey(1v1): percentage: 2.0
[advance_adherents] bin 38: religion ReligionKey(1v1): percentage: inf

The percentage was being calculated as the bin's total adherents divided by the religion's adherents in that bin, instead of the other way around. Flipping the division fixed it. The inf case was a separate bug where the religion count for a bin was 0 when it should have been 1.
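For reference, a sketch of the share calculation with the division the right way round. `religion_share` is a hypothetical helper; returning `None` for an empty bin is a design choice of this sketch, not something the sim is known to do.

```rust
/// Fraction of a bin's population that belongs to one religion.
/// Returns None for an empty bin rather than dividing by zero.
/// (Hypothetical helper for illustration.)
fn religion_share(religion_count_in_bin: u64, bin_total: u64) -> Option<f64> {
    if bin_total == 0 {
        return None;
    }
    // Part over whole: stays in [0.0, 1.0] as long as the count is a subset of the bin.
    Some(religion_count_in_bin as f64 / bin_total as f64)
}
```

A logged value of 2.0 is what the inverted division gives for, say, a bin of 2 with 1 member of the religion (2 / 1), and inf is what floating-point division gives when the religion count is 0.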

There was also a rounding problem. I was using .ceil() on the number of children born per bin, and on the births-per-religion split. Ceiling on both sides meant I was systematically over-producing births. Switching to .floor() and ignoring the remainder for now simplified it, though I am aware that throws away fractional births without giving them to anyone. Good enough for the moment.
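A small worked example of why the rounding direction matters. `split_births` is a hypothetical helper that splits a bin's births across religions by share and rounds each part with whichever function is passed in.

```rust
/// Split `births` across religions by share, rounding each part with `round`.
/// (Hypothetical helper; shares are assumed to sum to 1.0.)
fn split_births(births: f64, shares: &[f64], round: fn(f64) -> f64) -> Vec<u64> {
    shares.iter().map(|s| round(births * s) as u64).collect()
}
```

With 10 births and shares of 0.5, 0.25, and 0.25, the exact parts are 5, 2.5, and 2.5. Passing `f64::ceil` yields 5 + 3 + 3 = 11, one birth too many; passing `f64::floor` yields 5 + 2 + 2 = 9, one too few. Ceiling inflates the population every tick, while floor quietly drops the remainder.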

The HashMap was a hidden drag

Once births were working correctly, I ran a flamegraph. The Beta sampling stack had dropped to 18%, down from 29% last cycle, which was progress. But 16% was sitting on the binning step itself, specifically on HashMap operations.

The fix was obvious once I saw it. I know the bin count ahead of time and bins are indexed 0 through N. There is no reason to use a HashMap at all. A Vec<Vec<&Adherent>> pre-allocated to num_bins + 1 slots does the same thing with no hashing:

old: 181.29s user 98% cpu 3:11.27 total  (250 gens, HashMap bins)
new:  77.26s user 98% cpu 1:23.37 total  (250 gens, Vec bins)
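A sketch of what the Vec-based binning could look like. The `Adherent` struct and `bin_adherents` function are stand-ins; only the `Vec<Vec<&Adherent>>` shape and the `num_bins + 1` slot count come from the description above, and the nearest-bin rounding is an assumption.

```rust
/// Hypothetical stand-in for the real adherent type.
struct Adherent {
    heterodoxy: f64,
}

/// Sort adherents into `num_bins + 1` pre-allocated slots, indexed directly.
fn bin_adherents(adherents: &[Adherent], num_bins: usize) -> Vec<Vec<&Adherent>> {
    // One empty Vec per slot, created up front; no hashing on insert.
    let mut bins: Vec<Vec<&Adherent>> = vec![Vec::new(); num_bins + 1];
    for a in adherents {
        // Nearest-bin rounding; clamping keeps the index within 0..=num_bins.
        let idx = (a.heterodoxy.clamp(0.0, 1.0) * num_bins as f64).round() as usize;
        bins[idx].push(a);
    }
    bins
}
```

Iterating `bins` then visits bins in index order every time, and empty bins are just empty inner Vecs that the birth step can skip.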

That's a real improvement. But comparing against last cycle's numbers at the same generation count is misleading, because the binned and per-person versions don't produce the same population at generation 250. The birth mechanics changed, so I need to compare time-to-population instead.

I added a panic in the readout to stop the run when it hits roughly 75 million people, matching the endpoint from last cycle's 250-generation run:

old (per-person Beta, 250 gens):  90.78s user 98% cpu 1:37.95 total
new (binned, Vec, to ~75M pop):  149.08s user 98% cpu 2:38.84 total

That is worse. Reaching the same population milestone takes about 60 seconds longer with the binned approach.

Switching the religion partition HashMap to ahash helped a bit:

250 gens, ahash: 64.32s user 99% cpu 1:09.16 total

And sampling the Beta once per bin (for the whole bin's births at once) instead of once per birth shaved another second:

250 gens, batch Beta sample: 63.14s user 4.06s system 99% cpu 1:07.53 total

[Image: flamegraph showing Beta sampling and binning as dominant costs after the Vec refactor]

After all of that, the flamegraph still has Beta::sample as the biggest single cost at around 27%. The binning reduced how often we sample, but sampling itself is still expensive and now there's additional overhead from the binning pass on top.

What went wrong

The binning idea was right in principle, but the math around births per bin is not equivalent to the original per-person logic. In the original version, each person's birth probability is determined by their exact age. In the binned version, I use the average age of the bin, which collapses a lot of variation. A bin with mostly young children and a few peak-fertility adults will produce almost no births under the average-age model, even though the adults would have produced many under the original. That's probably where the population discrepancy comes from, and it means the two approaches are not interchangeable: they produce different histories.
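A toy calculation makes the gap concrete. The birth-rate curve below is invented for illustration (a flat 0.1 for ages 18 to 40, zero otherwise) and is not the sim's real table; the function names are hypothetical too.

```rust
/// Toy birth-rate curve (an assumption, not the sim's table):
/// 0.1 births per tick for ages 18 to 40, zero otherwise.
fn birth_rate(age: f64) -> f64 {
    if (18.0..=40.0).contains(&age) { 0.1 } else { 0.0 }
}

/// Expected births when every person uses their own age.
fn expected_births_per_person(ages: &[f64]) -> f64 {
    ages.iter().map(|&a| birth_rate(a)).sum()
}

/// Expected births when the whole bin uses its average age.
fn expected_births_mean_age(ages: &[f64]) -> f64 {
    if ages.is_empty() {
        return 0.0;
    }
    let mean = ages.iter().sum::<f64>() / ages.len() as f64;
    birth_rate(mean) * ages.len() as f64
}
```

Take a bin of 80 five-year-olds and 20 twenty-five-year-olds. Per person, the 20 adults are expected to produce about 2 births. The bin's average age is 9, which has a rate of zero, so the average-age model produces none. Any non-linear rate curve will show some version of this gap.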

There is also the floating point instability. The run is seeded, but the results still vary between runs. I think it's the combination of HashMap iteration order (Rust's default hasher is randomly seeded per process, so the order changes from run to run) and floating point rounding in the religion percentage math, where adding the same values in a different order can change the last bits of the result. The Vec change might have helped here since Vec iteration order is deterministic, but I haven't confirmed that.
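The order sensitivity is easy to demonstrate in isolation. This sketch only shows the floating-point half of the suspicion; it does not prove that this is what happens in the sim. `sum_in_order` is a hypothetical helper.

```rust
/// Sum values in the order given. f64 addition is not associative,
/// so a different order can give a different result in the last bits.
fn sum_in_order(values: &[f64]) -> f64 {
    values.iter().fold(0.0, |acc, &v| acc + v)
}
```

Summing 0.1, 0.2, 0.3 in that order gives a different f64 than summing 0.3, 0.2, 0.1. If the summation order comes from iterating a randomly seeded HashMap, a seeded RNG alone will not make runs reproducible. Iterating a Vec, a BTreeMap, or a sorted key list fixes the order.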

What's next

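The binning approach as implemented is not clearly better than the original. It's more complex, the math is messier, and the one time-to-population comparison I ran came out about 60 seconds slower, not faster. The later ahash and batch-sampling gains were only measured at 250 generations, so they don't settle the question.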

The note I left myself at the end of the week is probably the right direction: stop simulating individuals at scale entirely. Instead of a list of adherents, use a giant histogram. The population becomes a fixed-size array of buckets, each bucket holding a count of people at that heterodoxy level and age band. A tick updates the counts directly without iterating over any individual person. The cost per tick stays roughly constant regardless of total population size.
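A very rough sketch of what that data structure might look like, to make the idea concrete. Everything here is an assumption: the `Histogram` name, the row-major layout, and the simplification that one tick ages everyone by exactly one band. Births, religion membership, and heterodoxy drift are left out entirely.

```rust
/// Population as counts per (heterodoxy bin, age band).
/// All names, sizes, and the layout are assumptions for illustration.
struct Histogram {
    age_bands: usize,
    counts: Vec<u64>, // row-major: h_bin * age_bands + age_band
}

impl Histogram {
    fn new(heterodoxy_bins: usize, age_bands: usize) -> Self {
        assert!(age_bands > 0, "need at least one age band");
        Histogram { age_bands, counts: vec![0; heterodoxy_bins * age_bands] }
    }

    /// Add `n` people to one bucket.
    fn add(&mut self, h_bin: usize, age_band: usize, n: u64) {
        self.counts[h_bin * self.age_bands + age_band] += n;
    }

    fn total(&self) -> u64 {
        self.counts.iter().sum()
    }

    /// Advance one age band: everyone moves up, the oldest band dies off,
    /// and band 0 is left empty for newborns. The work done depends only
    /// on the number of buckets, not on how many people they hold.
    fn age_one_band(&mut self) {
        for row in self.counts.chunks_mut(self.age_bands) {
            row.rotate_right(1);
            row[0] = 0; // the former oldest band wrapped around; clear it
        }
    }
}
```

With 100 heterodoxy bins and, say, 10 age bands, a tick touches 1,000 counters whether the population is a thousand or a billion.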

That's a significant rewrite of the adherent layer, and I am not sure yet what it means for the per-person features like religion membership and schism mechanics. But continuing to optimize the individual-based model is probably not going to get me to 1000 generations.