NLP

All posts under tag "NLP"

19 posts total
Sorted by date

Classifying long legal documents using short random chunks

This paper presents a practical way around the input-length limit of Transformer models in domains such as legal documents, where texts run to thousands of tokens. Prior work usually either splits the whole document with a sliding window or relies on a preprocessing step that extracts key sentences. Sliding windows, however, sharply increase computation, and key-sentence extraction needs a domain-specific summarization model, which adds cost. The authors address these problems with a simple but effective strategy called "random chunk sampling". By randomly selecting 48 chunks, the diversity of the whole document...
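The sampling idea in this summary can be sketched in a few lines. This is only an illustration under stated assumptions: the figure of 48 chunks comes from the summary, while the chunk length of 128 tokens, the non-overlapping split, and the function name `sample_random_chunks` are hypothetical choices, not details confirmed by the paper.

```python
import random


def sample_random_chunks(tokens, chunk_len=128, num_chunks=48, seed=None):
    """Return up to `num_chunks` randomly chosen fixed-length chunks.

    chunk_len=128 and the non-overlapping split are assumptions made
    for this sketch; only the count of 48 comes from the summary.
    """
    rng = random.Random(seed)
    # Split the token list into consecutive, non-overlapping chunks.
    chunks = [tokens[i:i + chunk_len] for i in range(0, len(tokens), chunk_len)]
    if len(chunks) <= num_chunks:
        # Short documents are returned whole.
        return chunks
    # Sample chunk positions without replacement, keeping document order.
    picked = sorted(rng.sample(range(len(chunks)), num_chunks))
    return [chunks[i] for i in picked]


document = list(range(10_000))  # stand-in for a tokenized legal document
sampled = sample_random_chunks(document, seed=0)
```

Keeping the sampled chunks in document order is a design choice of this sketch; a classifier that pools chunk representations may not need it.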

Computer Science NLP

Entropy of Telugu

This paper focuses on analyzing the entropy of Telugu on the basis of Indic scripts and their orthography. On the origin and development of Indian scripts, it notes that the Brahmi script is said to have evolved from the Indus script 3,000 years ago and later diverged into the various modern Indian scripts. These scripts are closely related in structure, although their shapes vary. Indic alphabets consist of consonants, vowels, and other symbols, and a single syllable (akshara) is made up of zero to three consonants plus a vowel or other symbol. Each akshara can be pronounced independently, and all...

Computer Science NLP

Do LLMs Judge Distantly Supervised Named Entity Labels Well? Constructing the JudgeWEL Dataset

The judgeWEL paper offers a creative solution to a practical problem: building NER data for Luxembourgish, a low-resource language. Its main strength is that it uses weak supervision in two ways. First, the idea of automatically inferring entity types by linking Wikipedia internal links to the structured metadata in Wikidata scales better than existing rule-based or dictionary-mapping approaches. Because Wikipedia is continuously updated and covers many domains, the approach can pick up newly emerging entities with relatively little effort. Second, the automatic labeling stage...

Computer Science NLP Data

AdaGReS:Adaptive Greedy Context Selection via Redundancy-Aware Scoring for Token-Budgeted RAG

The AdaGReS paper stands out as an attempt to address two core problems of current RAG systems at once: the limited token budget and context redundancy. Traditional top-k retrieval simply selects chunks in score order, so it wastes tokens whenever semantically near-identical sentences are included several times. This is a serious source of performance degradation, especially for large language models (LLMs) with a limited context length. To overcome it, AdaGReS defines a "combined relevance-redundancy objective function". The objective function (1...
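Since the summary breaks off before the objective is stated, the sketch below does not reproduce the AdaGReS objective. It is a generic MMR-style greedy selection under a token budget, shown only to illustrate how trading relevance against redundancy differs from plain top-k. The candidate fields (`tokens`, `relevance`, `vec`), the cosine similarity measure, and the fixed `redundancy_weight` are all assumptions of this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_select(candidates, token_budget, redundancy_weight=0.5):
    """Greedily pick chunk indices that fit the budget.

    Score = relevance - redundancy_weight * (max similarity to any
    already selected chunk). This is an MMR-style stand-in, not the
    objective defined in the paper.
    """
    selected, used = [], 0
    remaining = list(range(len(candidates)))
    while remaining:
        best, best_score = None, float("-inf")
        for i in remaining:
            cand = candidates[i]
            if used + cand["tokens"] > token_budget:
                continue  # would exceed the token budget
            redundancy = max(
                (cosine(cand["vec"], candidates[j]["vec"]) for j in selected),
                default=0.0,
            )
            score = cand["relevance"] - redundancy_weight * redundancy
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break  # nothing else fits
        selected.append(best)
        used += candidates[best]["tokens"]
        remaining.remove(best)
    return selected


candidates = [
    {"tokens": 50, "relevance": 0.90, "vec": [1.0, 0.0]},
    {"tokens": 50, "relevance": 0.85, "vec": [1.0, 0.01]},  # near-duplicate of the first
    {"tokens": 50, "relevance": 0.60, "vec": [0.0, 1.0]},
]
chosen = greedy_select(candidates, token_budget=100)
```

With these toy inputs, plain top-k by relevance would spend the whole budget on the two near-duplicates, while the redundancy penalty makes the greedy pass prefer the third, dissimilar chunk as its second pick.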

Computer Science NLP

Do Large Language Models Know What They Are Capable Of?

This paper is significant because it systematically tests the self-assessment ability of LLMs from the perspective of "metacognition". The researchers first measured how accurately a model recognizes its own limits through a binary judgment called "success prediction". The evaluation metrics include not only plain accuracy but also discrimination measures such as ROC-AUC, which show how well the model balances between over-confidence and under-confidence. The result is that most recent LLMs show high confidence but still record an AUC above chance.
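To make the ROC-AUC point concrete, here is a minimal sketch that computes AUC as the fraction of (success, failure) pairs in which the success received the higher predicted confidence. The labels and confidences are invented for illustration and are not results from the paper.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = the model actually succeeded on the task.
succeeded = [1, 0, 1, 0, 1]
# The model's stated confidence is high everywhere (over-confident)...
confidence = [0.99, 0.95, 0.97, 0.98, 0.96]
# ...yet the ranking still separates successes from failures somewhat.
auc = roc_auc(succeeded, confidence)
```

In this toy case the model is over-confident on every item, yet four of the six success/failure pairs are ranked correctly, so the AUC is above the 0.5 chance level. That is the distinction between calibration and discrimination the summary describes.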

Computer Science NLP Model

R-Debater: Retrieval-Augmented Debate Generation through Argumentative Memory

R-Debater applies the concept of "argumentative memory" to debate generation and thereby overcomes several fundamental limitations of existing LLM-based debate systems. First, general-purpose LLMs gain rich linguistic ability from large-scale pretraining, but their ability to cite specific claims or evidence consistently is limited. In multi-turn debates in particular, where 'stance consistency' and 'evidence-based argument' are required, this leads the model to forget earlier statements or insert inaccurate information. R-Debater builds a separate debate knowledge base that indexes case-style evidence and the course of past debates, and...

Computer Science NLP

Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling

This paper has both academic and practical significance in that it fundamentally redefines the role of memory in multi-step RAG systems. Prior work treated memory as a "passive store" and went no further than compressing retrieved text fragments or concatenating them in sequence. Such approaches only list individual facts and fail to capture complex relations among them, such as causality, common causes, and mutually complementary evidence. As a result, multi-step reasoning over long contexts becomes disconnected, and it is hard to form a global semantic network. HGMEM addresses these problems...

Computer Science NLP Model

Notes on Electronic Lexicography

This paper analyzes the nature and importance of electronic dictionaries from several angles and offers a new perspective on how linguistic data is represented in the digital age. The main content falls into three broad categories. 1. The nature of electronic dictionaries and a reinterpretation of their meaning: an electronic dictionary is defined not simply as a digital variant of a paper dictionary but as a distinctive linguistic resource with new meaning and function. This moves away from the 'paper versus electronic' dichotomy toward an essential view that does not separate text from medium. In the digital environment, electronic dictionaries provide fine-grained meaning through meaning-generation mechanisms, and through this...

NLP Computer Science

Multi-Dimensional Prompt Chaining to Improve Open-Domain Dialogue Generation

Set against the strong performance that large language models (LLMs) have recently shown in dialogue systems, this paper presents a practical approach for closing the quality gap while keeping the deployment and operational advantages of small language models (SLMs). The core idea is to extend 'prompt chaining' along multiple dimensions, so that each dialogue quality factor, Naturalness, Coherence, and Engagingness, is strengthened independently and the factors are then combined harmoniously in the final response. 1. Framework design N...

Computer Science NLP

Can Large Language Models Still Explain Themselves? Investigating the Impact of Quantization on Self-Explanations

This paper can be regarded as the first study to systematically investigate how quantization affects the self-explanation (SE) ability of large language models (LLMs). Prior work focused on the large gains quantization brings in inference speed and memory usage, while its effect on higher-level tasks such as SE, which require the model to explain its internal reasoning to the outside, has been overlooked. To fill this gap, the authors turn to two SE types, namely natural language explanations (NLE) and counterfactual exa...

Computer Science NLP Model

Defensive M2S: Training Guardrail Models on Compressed Multi-turn Conversations

Defensive M2S is significant because it fundamentally addresses a structural limitation of existing guardrail models: they must take the entire conversation history as input. In multi-turn conversations the token count generally grows on the order of O(n²), which creates bottlenecks in GPU memory and compute time, especially in real service scenarios that run beyond 10 turns. The paper handles this with a simple but effective transformation rule called 'Multi-turn to Single-turn (M2S)' compression. Specifically, it keeps only the key utterance of each turn and, to preserve the flow of the conversation, a hyphen (–),...
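One way to read the O(n²) figure is that a guardrail which re-reads the full history at every turn processes a number of tokens that grows quadratically with the number of turns. The sketch below works through that arithmetic. The fixed 100-token turn length and the function name are assumptions of this example, and the summary does not say whether this is exactly the accounting the paper uses.

```python
def tokens_processed_with_full_history(turn_lengths):
    """Total tokens read if the whole history is re-processed each turn."""
    total = 0
    history = 0
    for length in turn_lengths:
        history += length  # history now includes the current turn
        total += history   # the guardrail reads the full history again
    return total


turns = [100] * 10  # hypothetical: 10 turns of 100 tokens each
cumulative = tokens_processed_with_full_history(turns)
single_pass = sum(turns)  # reading the conversation once, e.g. after compression
```

For 10 turns of 100 tokens, re-reading the history every turn costs 100 × (1 + 2 + ... + 10) = 5,500 tokens, against 1,000 for a single pass, and the gap widens quadratically as conversations get longer.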

Computer Science NLP Model

Language as Mathematical Structure: Examining Semantic Field Theory Against Language Games

This paper examines, along two axes, what the recent rapid performance gains of large language models (LLMs) imply for research on semantics. The first axis is the social-constructivist 'language game' approach grounded in the later philosophy of Ludwig Wittgenstein. On this view, meaning is formed by conventional interaction among speakers and by contexts of use, and no formal rule can fully account for it. The second axis is the 'Semantic Field Theory' proposed by the author, which treats language, within a continuous semantic space, as mutually entangled...

Computer Science NLP

Robust Uncertainty Quantification for Factual Generation of Large Language Models

This paper is significant because it approaches the 'hallucination' problem of LLMs from the perspective of uncertainty quantification. Existing uncertainty estimation techniques, such as Bayesian neural networks, MC-Dropout, and ensemble methods, have been validated mainly on standardized QA datasets, and their confidence scores become sharply distorted when a question is deliberately confusing. To overcome this limitation, the authors designed a new evaluation tool called the 'trap question', which contains names of people who do not exist or false facts and so induces the model to generate facts.

Computer Science NLP Model

Modeling Language as a Sequence of Thoughts

This paper points to a fundamental limitation of today's most widely used transformer-based language models: they rely too heavily on "surface-level" token associations. When learning happens only at the token level, the model cannot build consistent representations of entities and events across the whole context. As a result, relational generalization errors arise, such as the "reversal curse" (for example, failing to distinguish "the father begat the son" from "the son begat the father"), and unnecessarily large amounts of data are needed to learn the many expressions that share one meaning. According to cognitive science research, when humans process language, the input stream is treated as a transient...

Computer Science NLP Model

Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time

This paper pinpoints two major bottlenecks in the CoT (Chain-of-Thought) approach that LLMs commonly use for complex problem solving: "excessive token generation" and an "unstable flow of thought". The authors first analyze, step by step, the token sequences that large models (e.g., GPT-NeoX, LLaMA) generate during inference, and annotate which cognitive role each step plays. In doing so, they find in particular that the 'verification' step and the 'backtracking' step are concentrated in separate attention heads...

Computer Science NLP Model

Comparing Approaches to Automatic Summarization in Less-Resourced Languages

This paper systematically examines the state and limits of automatic summarization in less-resourced languages (LRLs). First, zero-shot prompting of large language models (LLMs) was tested across a range of model sizes (e.g., GPT-3.5, LLaMA-7B). Even among models with similar parameter counts, performance varied widely with the language diversity of the pretraining data, the tokenizer design, and differences in prompt engineering. This suggests that LLMs are structured in ways optimized for high-resource languages, which limits how well they generalize to LRLs.

Computer Science NLP

iCLP: Large Language Model Reasoning with Implicit Cognition Latent Planning

iCLP is an attempt to address two limitations of the existing chain-of-thought (CoT) approach at the same time. The first is that humans, when solving a problem, unconsciously draw on compressed patterns extracted from past experience even when they do not write out an explicit plan in text. This kind of implicit cognition offers a possible way to avoid the "hallucination" problem that arises when an LLM generates a textual plan directly. The second is that designing consistent textual plans across diverse domains and question types is unrealistic. iCLP first collects explicit plans, and...

Computer Science NLP Model
