From Schneier on Security – “Emergent Misalignment” in LLMs
Interesting research: “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs.” Abstract: We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output…
