schism devlog - 06/16
The main thing this cycle was realizing I had been benchmarking in debug mode the whole time, which is embarrassing. Fixing that gave a free 10x speedup, let me push to 250 generations, and got me far enough to actually profile where the time goes.
--release was sitting right there
I was running cargo run to test the sim. Turns out that builds with the dev profile, which compiles without optimizations. Switching to cargo run --release:
cargo run -- run -n 200 85.13s user 0.75s system 98% cpu 1:27.08 total
cargo run --release -- run -n 200 7.37s user 0.57s system 94% cpu 8.395 total
87 seconds down to 8. Nothing changed in the code. I was just building wrong this whole time.
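A minimal sketch of a guard against repeating this mistake. It relies on `cfg!(debug_assertions)`, which is on by default in the dev profile and off in release (unless a profile overrides it). The function names are hypothetical and not part of the sim.

```rust
/// Returns true when the binary was compiled with debug assertions enabled,
/// which is the default for a plain `cargo run` / `cargo build`.
fn is_debug_build() -> bool {
    cfg!(debug_assertions)
}

/// Prints a warning to stderr when timings would come from an unoptimized
/// build, and returns whether the warning fired.
fn warn_if_debug_build() -> bool {
    let debug = is_debug_build();
    if debug {
        eprintln!("warning: debug build; timings are not representative, use --release");
    }
    debug
}
```

Calling `warn_if_debug_build()` at the top of the `run` subcommand would make a debug-mode benchmark hard to miss.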
That said, 8 seconds for 200 generations is still a problem. Each generation the population grows, and since births are proportional to living adherents, the headcount compounds exponentially. Eventually even a fast build hits a wall. I want to get to 1000 generations eventually, and right now that isn't realistic.
250 generations, actual results
With the speed headroom I bumped the run to 250 generations and let it go:
{
  "totals": {
    "people": 75864062,
    "religions": 7,
    "active": 3,
    "extinct": 4,
    "new_this_generation": 0,
    "mean_heterodoxy": 0.22879343227755156
  },
  "religions": [
    {
      "name": "Outer Terrestrial Spiritbinding",
      "adherents": 0,
      "status": "extinct",
      "founding_date": 4220,
      "extinction_date": 4340,
      "age": 120,
      "parent": "Terrestrial Spiritbinding",
      "new": false
    },
    {
      "name": "Liberal Terrestrial Spiritbinding",
      "adherents": 0,
      "status": "extinct",
      "founding_date": 3440,
      "extinction_date": 3460,
      "age": 20,
      "parent": "Terrestrial Spiritbinding",
      "new": false
    },
    {
      "name": "Monastic Terrestrial Spiritbinding",
      "adherents": 0,
      "status": "extinct",
      "founding_date": 3400,
      "extinction_date": 3420,
      "age": 20,
      "parent": "Terrestrial Spiritbinding",
      "new": false
    },
    {
      "name": "Sanctified Spiritbinding",
      "adherents": 23792,
      "status": "active",
      "founding_date": 2600,
      "extinction_date": null,
      "age": 2400,
      "parent": "Spiritbinding",
      "new": false
    },
    {
      "name": "Lay Spiritbinding",
      "adherents": 0,
      "status": "extinct",
      "founding_date": 1240,
      "extinction_date": 1380,
      "age": 140,
      "parent": "Spiritbinding",
      "new": false
    },
    {
      "name": "Terrestrial Spiritbinding",
      "adherents": 32657,
      "status": "active",
      "founding_date": 980,
      "extinction_date": null,
      "age": 4020,
      "parent": "Spiritbinding",
      "new": false
    },
    {
      "name": "Spiritbinding",
      "adherents": 75807613,
      "status": "active",
      "founding_date": 0,
      "extinction_date": null,
      "age": 5000,
      "parent": "none",
      "new": false
    }
  ]
}
cargo run --release -- run -n 250 90.78s user 5.94s system 98% cpu 1:37.95 total
Seven religions, a real two-generation lineage under Terrestrial Spiritbinding, several extinctions. The naming system from last cycle earns its keep here — you can read the tree straight from the names. Outer, Liberal, and Monastic all branched off Terrestrial Spiritbinding and died quickly. Sanctified went directly off the root and is still alive at age 2400. The structure is actually legible.
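The same tree can also be rebuilt mechanically from the `parent` fields instead of being read off the names. A small sketch, with the (name, parent) pairs copied by hand from the readout; the function names are hypothetical, and the real sim would presumably work from its own structs rather than a hardcoded table.

```rust
use std::collections::BTreeMap;

/// (name, parent) pairs copied from the 250-generation readout; "none" marks the root.
const RELIGIONS: &[(&str, &str)] = &[
    ("Outer Terrestrial Spiritbinding", "Terrestrial Spiritbinding"),
    ("Liberal Terrestrial Spiritbinding", "Terrestrial Spiritbinding"),
    ("Monastic Terrestrial Spiritbinding", "Terrestrial Spiritbinding"),
    ("Sanctified Spiritbinding", "Spiritbinding"),
    ("Lay Spiritbinding", "Spiritbinding"),
    ("Terrestrial Spiritbinding", "Spiritbinding"),
    ("Spiritbinding", "none"),
];

/// Groups each religion under its parent's name, keeping readout order.
fn children_by_parent<'a>(rows: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut map: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(name, parent) in rows {
        map.entry(parent).or_default().push(name);
    }
    map
}

/// Appends the subtree rooted at `node` to `out`, depth-first,
/// indenting two spaces per level.
fn render_tree(map: &BTreeMap<&str, Vec<&str>>, node: &str, depth: usize, out: &mut Vec<String>) {
    out.push(format!("{}{}", "  ".repeat(depth), node));
    if let Some(children) = map.get(node) {
        for child in children {
            render_tree(map, child, depth + 1, out);
        }
    }
}
```

Rendering from "Spiritbinding" puts Sanctified, Lay, and Terrestrial at depth one and Terrestrial's three extinct offshoots at depth two.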
The 250-generation run took about 98 seconds though. Population is over 75 million people by the end, and the sim is iterating over all of them every tick. That's the real problem.
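A back-of-envelope sketch of what the two timings imply. It assumes per-generation work is proportional to headcount and that headcount grows by a constant factor each generation, so total runtime scales roughly like growth^generations. That is a simplification (it ignores fixed startup cost and any change in growth rate), and both function names are hypothetical.

```rust
/// Estimates the per-generation growth factor from two wall-clock timings,
/// under the assumption that total runtime scales like growth^generations.
fn growth_factor(t_short: f64, gens_short: u32, t_long: f64, gens_long: u32) -> f64 {
    let extra_gens = (gens_long - gens_short) as f64;
    (t_long / t_short).powf(1.0 / extra_gens)
}

/// Extrapolates a known runtime to `target_gens` under the same assumption.
fn extrapolate_seconds(t_known: f64, gens_known: u32, growth: f64, target_gens: u32) -> f64 {
    t_known * growth.powi((target_gens - gens_known) as i32)
}
```

Plugging in 8.395 s at 200 generations and 97.95 s at 250 gives a factor of roughly 1.05 per generation. Extrapolating that to 1000 generations lands on the order of 1e18 seconds, which is only meaningful as a sign that the current approach cannot get there.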
Flamegraph: Beta::sample is the bottleneck
I ran a flamegraph on the release build to find out where time actually goes. The breakdown:
What the stack is doing
Everything funnels through Simulation::run, which splits into three real workloads:
1. Adherent::try_birth: 39% (3,043 running, 720 self). This is the hot path by a mile.
2. Beta::sample: 29% (2,255 running, 1,050 self). rand_distr sampling from a Beta distribution, the single biggest concrete cost in the whole program.
3. build_generation_readout: 18% (1,405 running, 484 self). Reporting/aggregation. The expensive child here is BuildHasher::hash_one at 7.6%: SipHasher, the default hasher for std::collections::HashMap. Hashing shows up diffusely across the program, not just in the readout.
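For the hashing share, one commonly used mitigation is a cheaper non-cryptographic hasher. The sketch below is std-only and hand-rolls 64-bit FNV-1a; it is not what the sim does today, `FastMap` and `count_by_religion` are hypothetical names, and it assumes the keys are small trusted values (FNV gives up SipHash's resistance to hash-flooding). Whether it actually moves the needle would need measuring.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// 64-bit FNV-1a: a simple non-cryptographic hash, not DoS-resistant.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// HashMap that uses FNV-1a instead of the default SipHash.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hypothetical readout helper: counts adherents per religion id.
fn count_by_religion(religion_ids: &[u32]) -> FastMap<u32, u64> {
    let mut counts = FastMap::default();
    for &id in religion_ids {
        *counts.entry(id).or_insert(0) += 1;
    }
    counts
}
```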
try_birth is the hot path, and within that, Beta::sample is doing most of the actual work. A Beta draw is not cheap either way: older rand_distr releases implement it as two Gamma draws, and newer ones use a rejection sampler.
Part of the cost is that I'm constructing the distribution fresh per person per tick:
let distr = create_child_heterodoxy_distribution(self, population_mean_heterodoxy, config)
    .context("tried creating child het distr")?;
let heterodoxy = UnitInterval::new(distr.sample(rng));
The a and b parameters for the Beta depend on both the parent's own heterodoxy and the population mean, so they vary per adherent. I'm already precomputing the population mean before the loop, but the distribution itself still gets rebuilt for every single person who gives birth. At 75 million people that adds up fast.
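Since the population mean is fixed within a tick, the only per-adherent input is the parent's heterodoxy, which lives in [0, 1]. One way to amortize construction, if a little quantization error is acceptable, is a per-tick table with one prebuilt distribution per heterodoxy bucket. This is a sketch only: `DistrTable` is a hypothetical type, the `make` closure stands in for whatever builds the Beta from a parent heterodoxy, and it removes construction cost without touching the cost of sampling itself.

```rust
/// Per-tick table of prebuilt values (e.g. distributions), one per
/// parent-heterodoxy bucket over [0, 1].
struct DistrTable<D> {
    buckets: Vec<D>,
}

impl<D> DistrTable<D> {
    /// Builds `n_buckets` entries by calling `make` with each bucket's midpoint.
    fn build(n_buckets: usize, mut make: impl FnMut(f64) -> D) -> Self {
        assert!(n_buckets > 0, "need at least one bucket");
        let buckets = (0..n_buckets)
            .map(|i| make((i as f64 + 0.5) / n_buckets as f64))
            .collect();
        Self { buckets }
    }

    /// Returns the entry whose bucket contains `parent_heterodoxy`,
    /// clamping out-of-range inputs into [0, 1].
    fn lookup(&self, parent_heterodoxy: f64) -> &D {
        let n = self.buckets.len();
        let idx = (parent_heterodoxy.clamp(0.0, 1.0) * n as f64) as usize;
        &self.buckets[idx.min(n - 1)]
    }
}
```

The table would be rebuilt once per tick after the population mean is known, then `lookup` replaces the per-birth construction call.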
Two paths forward: either front-load the distribution construction somehow (hard if the parameters genuinely vary per adherent), or swap in a cheaper approximation. A Beta is the right shape conceptually — bounded on [0, 1], can be skewed — but it doesn't have to be an exact Beta if a cheaper approximation gets close enough. That's the next thing to figure out.
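One candidate for the cheaper approximation is the Kumaraswamy distribution: it is also bounded on [0, 1] and can be skewed, but its CDF inverts in closed form, so a draw is one uniform plus two `powf` calls. A minimal sketch, with caveats: its (a, b) are not the Beta's (a, b) and would need re-fitting, `SplitMix64` is only a stand-in for the sim's real `rng`, and nothing here has been checked against the sim's actual heterodoxy behavior.

```rust
/// Tiny SplitMix64 generator, a stand-in for whatever `rng` the sim passes around.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draws from Kumaraswamy(a, b) by inverting its CDF F(x) = 1 - (1 - x^a)^b,
/// which gives x = (1 - (1 - u)^(1/b))^(1/a) for uniform u.
fn sample_kumaraswamy(a: f64, b: f64, rng: &mut SplitMix64) -> f64 {
    debug_assert!(a > 0.0 && b > 0.0);
    let u = rng.next_f64();
    (1.0 - (1.0 - u).powf(1.0 / b)).powf(1.0 / a)
}
```

With a = 1 and b = 1 this reduces to a uniform draw, which makes a handy sanity check before comparing shapes against the current Beta.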
What's next
The population explosion is also going to need addressing separately from the sampling cost. Even if I halve the cost of Beta::sample, 75 million adherents at generation 250 means 1000 generations is still out of reach. At some point I probably need a population cap or a way to run the sim at a coarser granularity so it doesn't track every individual person at scale. Haven't decided what that looks like yet.
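For the population cap, one possible shape is logistic-style damping: scale the birth probability down as the headcount approaches a carrying capacity. This is a sketch of one option, not a decision; `damped_birth_probability`, `base_prob`, and `cap` are hypothetical names and would map onto whatever the config ends up exposing.

```rust
/// Scales a base birth probability down linearly as `population` approaches
/// `cap`, reaching zero at or above the cap (logistic-style damping).
fn damped_birth_probability(base_prob: f64, population: u64, cap: u64) -> f64 {
    if cap == 0 {
        return 0.0;
    }
    let fullness = population as f64 / cap as f64;
    (base_prob * (1.0 - fullness)).clamp(0.0, 1.0)
}
```

Computing this once per tick from the previous generation's headcount keeps it out of the per-adherent hot path.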
The AWS/Docker scaffolding is also in place now but I'm not touching it until the engine itself is in better shape.