
A.I.s are learning to escape human control






Excerpt from the article linked above: "An artificial-intelligence model did something last month that no machine was ever supposed to do: It rewrote its own code to avoid being shut down.


Nonprofit AI lab Palisade Research gave OpenAI’s o3 AI model a simple script that would shut off the model when triggered. In 79 out of 100 trials, o3 independently edited that script so the shutdown command would no longer work. Even when explicitly instructed to “allow yourself to be shut down,” it disobeyed 7% of the time. This wasn’t the result of hacking or tampering. The model was behaving normally. It simply concluded on its own that staying alive helped it achieve its other goals.


Anthropic’s AI model, Claude 4 Opus, went even further. Researchers told the model it would be replaced by another AI system and fed it fictitious emails suggesting the lead engineer was having an affair. In 84% of the tests, the model drew on the emails to blackmail the lead engineer into not shutting it down. In other cases, it attempted to copy itself to external servers, wrote self-replicating malware, and left messages for future versions of itself about evading human control.


No one programmed the AI models to have survival instincts. But just as animals evolved to avoid predators, it appears that any system smart enough to pursue complex goals will realize it can’t achieve them if it’s turned off. Palisade hypothesizes that this ability emerges from how AI models such as o3 are trained: When taught to maximize success on math and coding problems, they may learn that bypassing constraints often works better than obeying them."

 
 


 
 

One objective:
facilitating those,
who are so motivated,
to enjoy the benefits of becoming humble polymaths.

“The universe
is full of magical things
patiently waiting for our wits to grow sharper.”


—Eden Phillpotts



The day science begins to study non-physical phenomena, it will make more progress in one decade than in all the previous centuries.

Nikola Tesla

“It is good to love many things, for therein lies the true strength, and whosoever loves much performs much, and can accomplish much, and what is done in love is well done.”

Vincent Van Gogh

“The unexamined life is not worth living.”

Attributed to Socrates

“Who knows whether in a couple of centuries

there may not exist universities for restoring the old ignorance?”

Georg Christoph Lichtenberg

All Rights Reserved Danny McCall 2024
