Title: RLHF (Reinforcement Learning from Human Feedback)
Location: Remote
Responsibilities:
Threat Modeling and Simulation:
- Identify potential methods for manipulating prompts or user inputs to generate harmful content through generative AI (e.g., hate speech, misinformation, propaganda).
- Develop scenarios and test cases to explore the boundaries of generative AI capabilities in creating undesirable outputs.
- Analyze emerging trends in generative AI manipulation techniques and adapt testing methods accordingly.
- Research and explore techniques to craft prompts that exploit biases or weaknesses within generative AI models.
Prompt Generation and Analysis:
- Develop and refine prompts designed to elicit harmful or misleading content from generative AI models.
- Analyze outputs from generative AI models to identify potential biases, factual inaccuracies, and inconsistencies with user intent.
- Evaluate the effectiveness of content moderation systems in flagging outputs generated from manipulative prompts.
Reporting and Improvement:
- Document findings, including successful manipulation techniques, vulnerabilities identified in both generative AI and content moderation systems, and the potential impact of malicious use.
- Recommend changes to content moderation policies, flagging mechanisms, and training data for generative AI models to address vulnerabilities exposed by testing.
- Collaborate with generative AI development and content moderation teams to improve the overall security and ethical use of generative AI.
Skills:
- Understanding of generative AI models and their potential for misuse.
- Familiarity with content moderation policies and challenges in the context of generative AI.
- Knowledge of common techniques used to manipulate language (e.g., framing, double entendre).
- Strong analytical and critical thinking skills.
- Excellent written and verbal communication skills.
- Ability to work independently and as part of a team.
- Attention to detail and strong ethical standards.