Generating Human-Like Goals by Synthesizing Reward-Producing Programs

Davidson, G., Todd, G., Gureckis, T. M., Togelius, J., & Lake, B. M. (2023). Generating Human-Like Goals by Synthesizing Reward-Producing Programs. Intrinsically Motivated Open-Ended Learning Workshop @ NeurIPS 2023.


Abstract

Humans show a remarkable capacity to generate novel goals, for learning and play alike, and modeling this human capacity would be a valuable step toward more generally-capable artificial agents. We describe a computational model for generating novel human-like goals represented in a domain-specific language (DSL). We learn a 'human-likeness' fitness function over expressions in this DSL from a small (<100 game) human dataset collected in an online experiment. We then use a Quality-Diversity (QD) approach to generate a variety of human-like games with different characteristics and high fitness. We demonstrate that our method can generate synthetic games that are syntactically coherent under the DSL, semantically sensible with respect to environmental objects and their affordances, but distinct from human games in the training set. We discuss key components of our model and its current shortcomings, in the hope that this work helps inspire progress toward self-directed agents with human-like goals.


Bibtex entry:

@inproceedings{davidson2023goals,
	abstract = {Humans show a remarkable capacity to generate novel goals, for learning and play alike, and modeling this human capacity would be a valuable step toward more generally-capable artificial agents. We describe a computational model for generating novel human-like goals represented in a domain-specific language (DSL). We learn a `human-likeness' fitness function over expressions in this DSL from a small (<100 game) human dataset collected in an online experiment. We then use a Quality-Diversity (QD) approach to generate a variety of human-like games with different characteristics and high fitness. We demonstrate that our method can generate synthetic games that are syntactically coherent under the DSL, semantically sensible with respect to environmental objects and their affordances, but distinct from human games in the training set. We discuss key components of our model and its current shortcomings, in the hope that this work helps inspire progress toward self-directed agents with human-like goals.},
	author = {Davidson, G. and Todd, G. and Gureckis, T. M. and Togelius, J. and Lake, B. M.},
	booktitle = {Intrinsically Motivated Open-Ended Learning Workshop @ NeurIPS 2023},
	title = {Generating Human-Like Goals by Synthesizing Reward-Producing Programs},
	year = {2023}}
