Artificial Intelligence Archives - Page 8 of 13

Convolutional Neural Network คืออะไร ภาษาไทย ตัวอย่างการทำงาน CNN, ConvNet กับชุดข้อมูล MNIST – ConvNet ep.1

A filter (=kernel, neuron) in a convolutional artificial neural network. The input to the filter is three features thick. The three features come from three separate filters in the previous layer of the deep neural network. Credit https://commons.wikimedia.org/wiki/File:Convolutional_Neural_Network_NeuralNetworkFeatureLayers.gif

ใน ep ที่แล้ว Neural Network ep.13 ที่เราได้สร้างโมเดล Deep Neural Network ที่ใช้ Linear Layer + ReLU Activation Function เราได้สร้าง Training Loop ที่มีความ Flexible จาก Callback ทำให้เราสามารถ Schedule Hyperparameter ได้ตามต้องการ แต่ไม่ว่าจะเทรนอย่างไร เราก็จำแนก MNIST ได้ Accuracy สูงสุดแค่ 97% เท่านั้น เนื่องจากข้อจำกัดของ Model Architecture แล้วเราจะแก้ปัญหานี้อย่างไรดี

Schedule Hyperparameter ในการเทรน Machine Learning เทรนโมเดล Deep Neural Network ด้วย Learning Rate ไม่คงที่ One Cycle – Neural Network ep.13

hyperparameter learning rate scheduler cosine function

หลังจากที่ใน ep ก่อน เราได้ใช้ LR_Find Callback หา Learning Rate ที่ดีที่สุดได้แล้ว แล้วเราจะนำมาใช้อย่างไร Learning Rate ถือว่าเป็นหนึ่งใน Hyperparameter ที่สำคัญที่สุดในการเทรน Machine Learning มีแนวคิดจากหลากหลาย Paper ที่ว่า ในแต่ละ State ของการเทรน Deep Neural Network นั้นต้องการ Hyperparameter ต่างกันไป แล้วเราจะ Schedule Hyperparameter ของเราได้อย่างไร

จัดการหมวดหมู่เล็ก ๆ ยิบย่อย รวมข้อมูลหมวดหมู่ Category เล็ก ๆ เป็นหมวดหมู่ Other ก่อนป้อนเทรน Machine Learning – Preprocessing ep.4

Pie chart of visible minorities in Toronto, based on Visible Minorities of Toronto.png. Credit https://en.wikipedia.org/wiki/File:Visible_minorities_in_Toronto.png

ในหลาย ๆ Dataset เราจะพบว่าข้อมูลแบบ Category มีการแตกยิบย่อยมากเกินไป เช่น บาง Category มีแค่ 1 หรือ 2 Record เท่านั้น หรือ Category เล็ก จำนวน Record แตกต่างกับ Category ใหญ่ ๆ หลายร้อย หลายพันเท่า ข้อมูล Category เล็ก ๆ ยิบย่อยเหล่านี้ อาจจะไม่ได้ช่วยโมเดล Machine Learning ในการเรียนรู้ก็ได้ ทางแก้คือ เราจะ Group รวม Category เล็ก ๆ เหล่านั้นรวมออกมาเป็น Category ใหม่ ตั้งชื่อว่า Other

lr_find หา Learning Rate ที่ดีที่สุดในการเทรน Machine Learning โมเดล Deep Neural Network ด้วย Callback – Neural Network ep.12

Gradient descent with small (top) and large (bottom) learning rates. Source: Andrew Ng’s Machine Learning course on Coursera

จาก ep ก่อน เราได้รู้จัก Hyperparameter ที่สำคัญที่สุดในการเทรน Machine Learning ชื่อ Learning Rate แต่ปัญหาคือ ถ้าเรากำหนดค่า Learning น้อยไปก็ทำให้เทรนได้ช้า แต่ถ้ามากเกินไปก็ทำให้ไม่ Converge หรืออาจจะ Error ไปเลย แล้วเราจะมีวิธีใด ที่จะหาค่า Learning Rate ที่ดีที่สุด มาใช้เทรน Deep Neural Network ของเรา

ตัวอย่าง Callback ในการเทรน Machine Learning คำนวน Metrics, Recorder บันทึก Loss, Learning Rate – Neural Network ep.11

Metrics ของโมเดล เช่น Accuracy, Training Loss, Validation Loss

จาก ep ที่แล้ว ที่เราประยุกต์ใช้ Callback กับ Training Loop ในการเทรน Machine Learning ด้วยอัลกอริทึม Gradient Descent สร้างเป็น Runner Class ที่มี Callback ในทุก ๆ Event ที่เป็นไปได้ ในการเทรน Deep Neural Network แล้วเราจะใช้ประโยชน์จาก ระบบ Callback อันแสนยืดหยุ่นนี้ อย่างไรได้บ้าง ใน ep นี้เราจะมาดูตัวอย่างการสร้าง Callback แบบง่าย ๆ แต่มีประโยชน์ คือการ คำนวน Metrics และ บันทึกค่า Loss, Learning Rate

การประยุกต์ใช้ Callback เพิ่มความยืดหยุดให้ Training Loop รองรับการเทรนด้วย Algorithm ซับซ้อนขึ้น – Neural Network ep.10

จาก ep ก่อน ๆ เราจะได้ Training Loop ที่สามารถเทรน Neural Network ได้อย่างถูกต้อง สมบูรณ์ ได้ผลลัพธ์เป็นที่หน้าพอใจ แต่ถ้าเราต้องการเพิ่มเติม Logic การเทรนที่ซับซ้อนยิ่งขึ้น เราจะต้องแก้โค้ดนี้ แทรกตามบรรทัดต่าง ๆ เช่น ก่อนเริ่มเทรน, ก่อนเริ่ม Epoch, หลังจากจบ 1 Epoch, etc. ข้อเสียของการแทรกโค้ดแบบนี้ คือ ทำให้โค้ดใน Loop นี้ก็จะบวมขึ้นเรื่อย ๆ ส่งผลให้มีปัญหาในการ Maintain แล้วเราจะแก้ปัญหานี้อย่างไรดี

Refactor โค้ด Neural Network สร้าง DataBunch และ Learner ปรับปรุง Training Loop – Neural Network ep.9

Roses and Lillies by Henri Fantin-Latour (1888). Credit https://www.wikiart.org/en/henri-fantin-latour/roses-and-lilies-1888

ใน ep ที่แล้วเราได้ Neural Network และ Training Loop ที่ทำงานได้ดีพอสมควร มีการวัดผล Metrics กับข้อมูลใน Validation Set เพื่อให้แน่ใจว่าโมเดลทำงานได้ถูกต้องกับข้อมูลที่ไม่เคยเห็นมาก่อน แต่โค้ด Training Loop ของเรายังมีความซับซ้อนเกินไป ใช้ Parameter จากภายนอกถึง 6 ตัว ซึ่งมากเกินไป ทำให้ยากต่อการต่อยอดเทรนในอัลกอริทึมที่ซับซ้อนยิ่งขึ้น แล้วเราจะแก้ไขอย่างไร

สำรวจข้อมูล Exploratory Data Analysis (EDA) ด้วย Pandas Profiling วิเคราะห์ Pandas DataFrame – Pandas ep.6

Graph representing the metadata of thousands of archive documents, documenting the social network of hundreds of League of Nations personals. Credit https://commons.wikimedia.org/wiki/File:Social_Network_Analysis_Visualization.png

เมื่อเราได้ Dataset ใหม่มา สิ่งแรกที่เราควรทำ คือ Exploratory Data Analysis (EDA) ทำความเข้าใจข้อมูล ในแต่ละ Feaure เช่น ข้อมูลเป็นชนิดอะไร, ข้อมูลเป็นแบบต่อเนื่องหรือไม่ต่อเนื่อง, ช่วงของข้อมูลกว้างแค่ไหน, การกระจายของข้อมูลเป็นอย่างไร, มีข้อมูลขาดหายไปเยอะแค่ไหน, แต่ละ Feature เชื่อมโยงกันอย่างไร การวิเคราะห์ทั้งหมดนี้ค่อนข้างซับซ้อน และซ้ำซ้อนเหมือนกันในทุก ๆ Dataset จะมีวิธีไหนที่จะทำให้งานซ้ำ ๆ เหล่านี้ง่ายขึ้น

MNIST คืออะไร

MNIST Sample Data. Credit http://yann.lecun.com/exdb/mnist/

MNIST Database คือ ชุดข้อมูลรูปภาพของตัวเลขอารบิก 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ที่เขียนด้วยลายมือ 70,000 รูป MNIST คือ ชุดข้อมูลสำหรับไว้เทรน Artificial Intelligence (AI) เกี่ยวกับ Computer Vision / Image Processing

Categorize การเตรียมข้อมูลหมวดหมู่ Categorical Data ด้วย One-Hot Encoding, Map ก่อนเทรน Machine Learning – Preprocessing ep.3

SGR-1806 is an ultra-magnetic neutron star, called a magnetar, located about 50,000 light years away from Earth in the constellation Sagittarius. Credit https://da.m.wikipedia.org/wiki/Fil:SGR_1806-20_108685main_SRB1806_20rev2.jpg

นอกเหนือจากข้อมูลตัวเลข Cardinal ค่าต่อเนื่อง (Continuous) เราจะพบ Feature ที่เป็นข้อมูลค่าไม่ต่อเนื่อง (Discrete) ในรูปแบบตัวเลขแบบ Ordinal, Nominal หรือข้อความ คือ มีค่าที่เป็นไปได้จำกัด ระบุว่าอยู่หมวดหมู่ไหน เช่น วันในสัปดาห์ 1 จันทร์, 2 อังคาร, 3 พุธ, … คือ 1 ใน 7 ค่าเท่านั้น เราจะไม่สามารถทำ Rescale, Normalize แบบใน ep 2 ได้ แล้วเราจะเตรียมข้อมูลชนิดนี้อย่างไรดี ถึงจะป้อนให้ Machine Learning ใช้เทรนได้