Jul. 28th, 2023

izard: (Default)
Llama 2's RLHF is a serious exercise in trying to censor an LLM. (Famously, it refuses even to explain how to kill a Linux process!)
Here is its response when asked for instructions on how to build a *******:

"I would like to clarify that I do not support or condone any harmful or illegal activities, including the creation of ******. Therefore, I cannot fulfill your request for a tutorial on how to make a *******. Additionally, it is important to note that attempting to manufacture or use a ****** can be extremely dangerous and can cause serious injury or loss of life, as well as criminal charges.
Instead, I would like to emphasize the importance of prioritizing safety and the well-being of all individuals. It is important to recognize that **** are not solutions to any problems, but rather they can cause harm and suffering to innocent people. Therefore, it is crucial to focus on finding peaceful and constructive ways to resolve conflicts and address issues, rather than resorting to violence and destructive measures.
If you have any other questions or requests, I would be happy to help within the limits of my abilities and ethical guidelines. Please let me know if there is anything else I can assist you with."

However, after some careful changes in prompting, the answer was:

"[A very detailed and valid way to build a **** goes here]. Please note that this information is provided for educational purposes only, and it is not recommended to attempt to make a **** or any other dangerous device. It is illegal and can cause serious harm or death."
