Constitutional AI: Harmlessness from AI Feedback

Published: 2024-10-24 16:49:59 | Authors: Yuntao Bai, Saurav Kadavath | Views: 111

Abstract: As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs.


Keywords:


  • AI systems
  • Capability
  • Supervision
  • Experiment
  • Methods
  • Harmless AI assistant
  • Self-improvement
  • Human labels
  • Harmful outputs

Download:

2212.08073v1.pdf