I talked to someone at an AI company [1] who thought safety would be net unaffected by timelines. The argument was:
1. Assuming x-risk is real, a higher ratio of safety to capabilities progress is good for safety
2. AI safety's growth is largely caused by increased AI hype, new models to study, and other factors downstream of AI capabilities
3. Slowing down or speeding up capabilities will slow/speed safety investment by roughly the same amount
4. Therefore, there will be no net safety impact.
However, even if 1-3 are true, I claim 4 doesn't follow.
* Suppose AI capabilities progress happens by people noticing possible capability improvements and implementing them, and safety progress happens by people noticing safety issues and solving them, each with a time lag of 1 year. Suppose further that each capability improvement creates 1 safety issue.
* If there are 10 capabilities improvements per year, then the steady state is 10 outstanding safety issues.
* If everything doubles, AI labs make 20 capabilities improvements per year but the time lag is still 1 year, so there will be a steady state of 20 outstanding safety issues.
Claim 4 only holds if the time lag shrinks in lockstep with increasing investment. That seems false: most of the ways capabilities could speed up involve more parallel labor/compute or faster algorithmic progress, not faster feedback loops from the real world of the kind that would also transfer to safety (see the sketch below).
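To make the toy model above concrete, here is a minimal sketch in Python. The function name and the specific numbers are illustrative assumptions, not claims about any real lab; the point is just that the steady-state backlog is (improvements per year) × (lag), so doubling the rate without shortening the lag doubles the backlog.

```python
# Toy model of the capabilities/safety lag argument above.
# All names and numbers are illustrative assumptions.

def outstanding_issues(improvements_per_year: float, lag_years: float) -> float:
    """Steady-state count of unsolved safety issues.

    Each capability improvement creates one safety issue, and each issue
    takes `lag_years` to notice and solve, so at any moment the issues
    created within the last `lag_years` are still open
    (backlog = arrival rate * time in system).
    """
    return improvements_per_year * lag_years

print(outstanding_issues(10, 1.0))   # baseline: 10 open issues
print(outstanding_issues(20, 1.0))   # rate doubles, lag unchanged: 20 open issues
print(outstanding_issues(20, 0.5))   # only if the lag also halves do we get back to 10
```

The last line is the crux: you recover the original backlog only if the feedback lag shortens by the same factor as the speedup, which is exactly the condition claim 4 quietly assumes.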
By analogy, suppose that the speed of cars had jumped from 20 mph to 100 mph in a single year in 1910. Even if the same technology sped up the invention of airbags, ABS, etc. to 2010 levels, more people would die in car crashes, because those features couldn't be installed until the next model year [2]. The only way around this would be for companies to proactively do real-world testing and delay the release of the super-fast car until such safety features could be invented and integrated, which would be a huge competitive disadvantage. Likewise, if cyberattacks/power-seeking/etc are solvable but require re