[1] |
Learning to summarize from human feedback[EB/OL]. https://arxiv.org/pdf/2009.01325.pdf.
|
[2] |
[US] Steven Pinker. How the Mind Works (心智探奇:人类心智的起源与进化)[M]. Translated by Hao Yaowei. Hangzhou: Zhejiang People's Publishing House, 2016: 303.
|
[3] |
Deep reinforcement learning from human preferences[EB/OL]. https://arxiv.org/abs/1706.03741. 2017-06-12/2023-02-17.
|
[4] |
Rich Sutton. The Bitter Lesson[EB/OL]. http://www.incompleteideas.net/IncIdeas/BitterLesson.html. 2019-03-13/2023-02-15.
|
[5] |
[US] William Gibson. Fragments of a Hologram Rose (全息玫瑰碎片)[M]. Translated by Li Keqin et al. Beijing: Beijing Times Chinese Press, 2021: 190.
|
[6] |
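Ministry of Human Resources and Social Security, State Administration for Market Regulation, and National Bureau of Statistics jointly release 16 new occupations, including intelligent manufacturing engineering technician[EB/OL]. http://www.mohrss.gov.cn/wap/xw/rsxw/202003/t20200302_361093.html. 2020-03-02/2023-02-17.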
|