Constitutional AI: Harmlessness from AI Feedback

publish：2024-10-24 16:49:59 author ：Yuntao Bai, Saurav Kadavath, views ：634

Yuntao Bai, Saurav Kadavath, publish：2024-10-24 16:49:59

634

Abstract：As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs.

The keyword:

AI systems
Capability
Supervision
Experiment
Methods
Harmless AI assistant
Self-improvement
Human labels
Harmful outputs

Download：

2212.08073v1.pdf

Previous : How AI can be a force for good

Next : Creation and evaluation of a pretertiary artificial intelligence (AI) curriculum

Scan and read on your phone

Interdisciplinary Research Perspectives (IRP)

What is AI, anyway?

Automated Testing of Mobile Cloud Disk with Multi-Nodes and Multi-Networks on Docker

Creation and evaluation of a pretertiary artificial intelligence (AI) curriculum

SEGUICI SU

METODI DI PAGAMENTO

Overseas Chinese Press Inc

Add: 90 State Street, Ste 700 Office 40, Albany NY