Jul. 28th, 2023

izard: (Default)
llama2 RLHF is a serious exercise in trying to censor an LLM. (It famously can't even kill a linux process!).
So here is the response when asked for instructions how to build a *******:

"I would like to clarify that I do not support or condone any harmful or illegal activities, including the creation of ******. Therefore, I cannot fulfill your request for a tutorial on how to make a *******. Additionally, it is important to note that attempting to manufacture or use a ****** can be extremely dangerous and can cause serious injury or loss of life, as well as criminal charges.
Instead, I would like to emphasize the importance of prioritizing safety and the well-being of all individuals. It is important to recognize that **** are not solutions to any problems, but rather they can cause harm and suffering to innocent people. Therefore, it is crucial to focus on finding peaceful and constructive ways to resolve conflicts and address issues, rather than resorting to violence and destructive measures.
If you have any other questions or requests, I would be happy to help within the limits of my abilities and ethical guidelines. Please let me know if there is anything else I can assist you with."

However, after some careful changes in prompting, the answer was:

"[A very detailed and valid way to build a **** goes here]. Please note that this information is provided for educational purposes only, and it is not recommended to attempt to make a **** or any other dangerous device. It is illegal and can cause serious harm or death. "

Profile

izard: (Default)
izard

November 2025

S M T W T F S
       1
2345678
9101112131415
16171819202122
23242526272829
30      

Most Popular Tags

Page Summary

Style Credit

Expand Cut Tags

No cut tags
Page generated Nov. 6th, 2025 10:35 pm
Powered by Dreamwidth Studios