But the pampa is so immense that “finding the needle in the haystack becomes practically impossible without the help of automation,” said Marcus Freitag, an IBM physicist who collaborated on the project.
“It took nearly a century to discover a total of 430 figurative geoglyphs,” said Masato Sakai, an archaeologist at Yamagata University in Japan who has studied the lines for 30 years.
Dr. Sakai is the lead author of a survey published in September in the Proceedings of the National Academy of Sciences that found 303 previously uncharted geoglyphs in only six months, almost doubling the number that had been mapped as of 2020. The researchers used artificial intelligence in tandem with low-flying drones that covered some 243 square miles. Their conclusions also provided insights into the symbols’ enigmatic purpose.
Huang Gino: True/false questions are harder for the human brain, which is easily swayed by emotion, but multiple-choice is where AI genuinely struggles. I suspect it's because the focus gets dispersed and is harder to converge, though it's still quite impressive, clearly above average. On calculation problems (or rather word problems), unless they are very standard, AI often falls apart completely.
The day before yesterday (9/25), the journal Nature published a grim AI study:
As large language models (LLMs) are trained at ever larger scale, #AI is becoming less reliable.
The research team found that earlier AI models were more likely to dodge questions they couldn't answer, whereas the upgraded versions are more prone to fabricating a plausible-looking (but wrong) answer.
They analyzed three families of large language models: OpenAI's #GPT, Meta's #LLaMA, and #BLOOM (the world's largest open-source language model, developed by the BigScience project). They found that although larger models do answer with higher accuracy overall, among the answers that are not accurate, the proportion of outright errors (as opposed to declining to answer) rises.
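To make that metric concrete, here is a minimal sketch of the distinction the study draws. The labels, counts, and both "models" below are invented for illustration only; the real paper evaluates actual model outputs. Each response is classed as correct, avoidant (the model declines), or incorrect, and we track both overall accuracy and the error share among the non-correct responses:

```python
from collections import Counter

def reliability_profile(labels):
    """Return (overall accuracy, share of errors among non-correct answers).

    labels: list of "correct", "avoidant", or "incorrect" judgments.
    """
    counts = Counter(labels)
    total = len(labels)
    accuracy = counts["correct"] / total
    # Among answers that are NOT correct, how many are confident errors
    # rather than honest avoidance?
    non_correct = counts["avoidant"] + counts["incorrect"]
    error_share = counts["incorrect"] / non_correct if non_correct else 0.0
    return accuracy, error_share

# Toy numbers (made up): the "older" model avoids more often,
# the "newer" one attempts almost everything.
old_model = ["correct"] * 50 + ["avoidant"] * 35 + ["incorrect"] * 15
new_model = ["correct"] * 60 + ["avoidant"] * 5 + ["incorrect"] * 35

print(reliability_profile(old_model))  # (0.5, 0.3)
print(reliability_profile(new_model))  # (0.6, 0.875)
```

In this toy example the newer model is more accurate overall (0.6 vs 0.5), yet when it is not correct it is far more likely to be confidently wrong than to abstain (0.875 vs 0.3), which is exactly the pattern the study describes.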
This tendency also worsens as the questions get harder; GPT-4 in particular insists on answering almost every question, and does so with a convincing air. (See figure.)
Worst of all, when the team asked human raters to judge whether the AI answers were right or wrong, people mistook wrong AI answers for correct ones roughly 10%–40% of the time. In other words, humans are fairly bad at telling true AI answers from false ones.
▌
⋯⋯After reading this study, doesn't AI seem remarkably human:
1. It fancies itself erudite, and finds it ever harder to admit "I don't know" when faced with a question it can't answer 🤷
2. Even when it bluffs outrageously and pretends to know everything, plenty of people still believe it 🤷
(Well, at least this proves that AI really is a child raised by human culture.)
▌ Zhou, L., Schellaert, W., Martínez-Plumed, F. et al. Larger and more instructable language models become less reliable. Nature (2024).