In this episode we look at another important task in NLP: machine translation, or Neural Machine Translation (NMT) when it is done with neural networks. We will use a Sequence to Sequence Recurrent Neural Network (RNN) model.

What is a Sequence to Sequence Model?

sequence to sequence network, in which two recurrent neural networks work together to transform one sequence to another. Credit https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html

A Seq2Seq model consists of two parts:

Encoder: internally an RNN model. It reads the source-language text and converts it into a vector representation (also called the encoder vector, encoder state, or context).
Decoder: also an RNN model internally. It takes that vector representation and generates the desired target-language text from it.

Put simply, it is two RNN models (for example LSTM or GRU) chained together and trained as a single model.
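The encoder and decoder roles described above can be sketched in a few lines of plain Python. This is only an illustrative toy, not the model used in this episode: the weights are random and untrained, the cell is a simple tanh RNN rather than an LSTM or GRU, and names such as `TinyRNNCell`, `encode`, `decode`, and the vocabulary sizes are assumptions made up for the example. A real model would normally be built with a deep learning library such as PyTorch.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def one_hot(index, size):
    """One-hot vector standing in for a word embedding."""
    return [1.0 if i == index else 0.0 for i in range(size)]

class TinyRNNCell:
    """Simple RNN step: h_new = tanh(W_x x + W_h h). Hypothetical toy cell."""
    def __init__(self, input_size, hidden_size, rng):
        self.w_x = make_matrix(hidden_size, input_size, rng)
        self.w_h = make_matrix(hidden_size, hidden_size, rng)

    def step(self, x, h):
        return [math.tanh(a + b)
                for a, b in zip(matvec(self.w_x, x), matvec(self.w_h, h))]

def encode(cell, token_ids, vocab_size, hidden_size):
    """Read the source tokens one by one; the final hidden state is the context."""
    h = [0.0] * hidden_size
    for token in token_ids:
        h = cell.step(one_hot(token, vocab_size), h)
    return h

def decode(cell, w_out, context, sos_id, eos_id, vocab_size, max_len):
    """Start from the context and feed each predicted token back as the next input."""
    h = context
    token = sos_id
    output = []
    for _ in range(max_len):
        h = cell.step(one_hot(token, vocab_size), h)
        scores = matvec(w_out, h)
        token = max(range(vocab_size), key=scores.__getitem__)  # greedy choice
        if token == eos_id:
            break
        output.append(token)
    return output

rng = random.Random(0)
SRC_VOCAB, TGT_VOCAB, HIDDEN = 6, 5, 4   # made-up sizes
SOS, EOS = 0, 1                          # made-up special token ids

encoder_cell = TinyRNNCell(SRC_VOCAB, HIDDEN, rng)
decoder_cell = TinyRNNCell(TGT_VOCAB, HIDDEN, rng)
w_out = make_matrix(TGT_VOCAB, HIDDEN, rng)

context = encode(encoder_cell, [2, 3, 4], SRC_VOCAB, HIDDEN)
translation = decode(decoder_cell, w_out, context, SOS, EOS, TGT_VOCAB, max_len=10)
```

Because the weights are random, `translation` is meaningless here. The point is the data flow: the whole source sentence is squeezed into one `context` vector, and the decoder generates target tokens from that vector alone.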

There are many other techniques for building such models, for example using Attention, feeding the input text in reverse, feeding the input text twice, bidirectional models, and deeper models. Seq2Seq can also be applied to many other tasks, such as Multimodal Machine Translation, Multilingual Machine Translation, Speech Recognition, and Video Captioning. These will be explained later.

What is Teacher Forcing?

1839 caricature by George Cruikshank of a school flogging. Credit https://commons.wikimedia.org/wiki/File:%27February_-_Cutting_Weather_-_Squally%27_-_George_Cruikshank,_1839_-_BL.jpg

When training an RNN decoder, the output at each step is normally fed back in as the input for the next word. But if the model is not yet very good and predicts a wrong output, feeding that wrong output back in as input makes the following outputs wrong too, like falling dominoes. One technique that helps the model learn better is Teacher Forcing.

Teacher Forcing means that during training, instead of feeding only the model's own output back as input, we feed a mix of the correct output (the label) and the model's output (the prediction), in a chosen proportion.

We then gradually raise the share of model outputs and lower the share of labels, until the decoder is fed its own outputs only. Teacher Forcing is a technique that helps the model learn better in the early stage of training.
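The mixing and the gradual change of proportion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the training loop from this episode: `predict_next` is a hypothetical stand-in for one decoder step (a real decoder would also carry a hidden state), and the linear schedule in `teacher_forcing_schedule` is just one possible way to lower the ratio over time.

```python
import random

def teacher_forcing_schedule(epoch, total_epochs, start=1.0, end=0.0):
    """Linearly move the teacher forcing ratio from start to end (assumed schedule)."""
    if total_epochs <= 1:
        return end
    progress = min(epoch, total_epochs - 1) / (total_epochs - 1)
    return start + (end - start) * progress

def decode_for_training(labels, predict_next, ratio, sos_token, rng):
    """Run the decoder over one target sentence.

    At each step the next input is the correct label with probability
    `ratio`, otherwise the model's own prediction.
    """
    inputs, predictions = [], []
    current = sos_token
    for label in labels:
        inputs.append(current)
        prediction = predict_next(current)   # hypothetical decoder step
        predictions.append(prediction)
        current = label if rng.random() < ratio else prediction
    return inputs, predictions

# A deliberately bad toy "model" that always predicts the same wrong token.
def always_wrong(_token):
    return "???"

rng = random.Random(0)
labels = ["i", "am", "cold", "<eos>"]

# Ratio 1.0: every input after <sos> comes from the labels.
forced_inputs, _ = decode_for_training(labels, always_wrong, 1.0, "<sos>", rng)

# Ratio 0.0: every input after <sos> is the model's own (wrong) prediction.
free_inputs, free_preds = decode_for_training(labels, always_wrong, 0.0, "<sos>", rng)

# The ratio drops from 1.0 to 0.0 over five epochs.
ratios = [teacher_forcing_schedule(e, 5) for e in range(5)]
```

With a ratio of 1.0 the decoder always sees the correct previous word, so one early mistake cannot derail the rest of the sentence. With a ratio of 0.0 it sees only its own predictions, which matches how it will be used at inference time.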

Let's get started

Last updated: 28/02/2024 04:27:02


Surapong Kanoktipsatharporn

Solutions Architect at Bua Labs

The ultimate test of your knowledge is your capacity to convey it to another.

What is a Sequence to Sequence model? Neural Machine Translation from French to English with a Sequence to Sequence RNN model, trained with Teacher Forcing – NLP ep.10
