“This work takes an important step in the right direction,” says Douwe Kiela, a researcher at Hugging Face, an AI company working on open-source language models. He suggests that the feedback-driven training process could be repeated over many rounds to improve the model even further. According to Leike, OpenAI could do this by building on customer feedback.
InstructGPT still makes simple mistakes and sometimes produces irrelevant or nonsensical answers. For example, if it is given a prompt that contains a falsehood, it will take that falsehood to be true. And because it has been trained to do what people ask, InstructGPT will produce far more toxic language than GPT-3 when instructed to do so.
Ehud Reiter, who works on AI for text generation at the University of Aberdeen, UK, welcomes any technique that reduces the amount of misinformation language models produce. But he notes that for some applications, such as AI giving medical advice, no level of falsehood is acceptable. Reiter questions whether large language models based on black-box neural networks could ever guarantee user safety. For that reason, he favors a mix of neural networks and symbolic AI: hard-coded rules that constrain what a model can and cannot say.
Whatever the approach, much remains to be done. “We’re not even close to solving this problem,” says Kiela.