Thai character recognition using Support Vector Machines and Kernels

พัฒนชัย เบศรภิญโญวงศ์, 2521-

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/1388

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	บุญเสริม กิจศิริกุล	-
dc.contributor.author	พัฒนชัย เบศรภิญโญวงศ์, 2521-	-
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	-
dc.date.accessioned	2006-08-03T02:21:24Z	-
dc.date.available	2006-08-03T02:21:24Z	-
dc.date.issued	2545	-
dc.identifier.isbn	9741716214	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/1388	-
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2545	en
dc.description.abstract	This thesis improves the recognition accuracy of a Thai OCR program by applying support vector machine (SVM) and kernel techniques to the principal component analysis stage, an important step that extracts the key features of character image data before the result is passed to the recognition component of the OCR program, which then decides which character it is. The new technique is called kernel principal component analysis. The images used in this thesis are divided into two groups: a training set of 8,544 characters and a test set of 1,424 characters. They cover the fonts AngsanaUPC, BrowalliaUPC, CordiaUPC, DilleniaUPC, EucrosiaUPC and FreesiaUPC, each in sizes 14, 16, 18, 20, 22, 24, 28 and 36 points. The experiments show that the Thai OCR program using kernel principal component analysis recognizes characters more accurately than the original Thai OCR program. However, the new method uses more memory and time than the original.	en
dc.description.abstractalternative	This thesis aims to improve the accuracy of a Thai Optical Character Recognition (Thai-OCR) program. We extend the Principal Component Analysis method, which is used to extract features from character images, to a new method called Kernel Principal Component Analysis by using Support Vector Machines and kernels. We divided the data into two groups: a training set of 8,544 character images and a test set of 1,424 character images. The data set consists of character images in six fonts: AngsanaUPC, BrowalliaUPC, CordiaUPC, DilleniaUPC, EucrosiaUPC and FreesiaUPC, each in sizes 14, 16, 18, 20, 22, 24, 28 and 36 points. The experimental results show that the Thai-OCR program using Kernel Principal Component Analysis gives better results than the previous one using the original Principal Component Analysis. However, the new method consumes more memory and processing time.	en
dc.format.extent	947000 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	th	en
dc.publisher	Chulalongkorn University	en
dc.rights	Chulalongkorn University	en
dc.subject	Character recognition (Computers)	en
dc.subject	Optical character recognition	en
dc.subject	Thai language--Alphabet	en
dc.title	Thai character recognition using Support Vector Machines and Kernels	en
dc.title.alternative	Thai character recognition using Support Vector Machines and Kernels	en
dc.type	Thesis	en
dc.degree.name	Master of Science	en
dc.degree.level	Master's Degree	en
dc.degree.discipline	Computer Science	en
dc.degree.grantor	Chulalongkorn University	en
dc.email.advisor	[email protected], [email protected]	-
Appears in Collections:	Eng - Theses
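
The abstract describes extending Principal Component Analysis to Kernel Principal Component Analysis for feature extraction. The record does not include the thesis's implementation, so the following is only a minimal sketch of the general technique in plain Python. The RBF kernel, the `gamma` value, the power-iteration eigensolver, the function names, and the toy 2-D points are all assumptions chosen for illustration and are not taken from the thesis.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice and gamma are assumptions."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def centered_kernel_matrix(samples, kernel):
    """Kernel matrix of the samples, centered in the implicit feature space."""
    n = len(samples)
    k = [[kernel(a, b) for b in samples] for a in samples]
    row_mean = [sum(row) / n for row in k]
    total_mean = sum(row_mean) / n
    # K is symmetric, so its column means equal its row means.
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def top_eigenpairs(matrix, count, iterations=500):
    """Leading eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(matrix)
    m = [row[:] for row in matrix]
    pairs = []
    for c in range(count):
        # Fixed, non-constant start vector, normalized to unit length.
        v = [math.sin(i + 1 + c) for i in range(n)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _ in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * mv[i] for i in range(n))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found before searching for the next.
        for i in range(n):
            for j in range(n):
                m[i][j] -= eigenvalue * v[i] * v[j]
    return pairs


def kernel_pca(samples, n_components, kernel=rbf_kernel):
    """Return (eigenvalues, features) for the training samples."""
    kc = centered_kernel_matrix(samples, kernel)
    pairs = top_eigenpairs(kc, n_components)
    # The projection of training sample i on a component is sqrt(lambda) * v[i].
    features = [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
                for i in range(len(samples))]
    return [lam for lam, _ in pairs], features


# Toy stand-in for character feature vectors: two well-separated clusters.
toy = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
eigenvalues, features = kernel_pca(toy, n_components=2)
```

In this sketch the first extracted feature separates the two toy clusters by sign. The n-by-n kernel matrix grows with the square of the number of training samples, which is consistent with the abstract's remark that the kernel method needs more memory and processing time. Projecting unseen test images would additionally require centering their kernel values against the training statistics, which is omitted here.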

Files in This Item:

File	Description	Size	Format
Patanachai.pdf		1.05 MB	Adobe PDF
