การพัฒนาโปรแกรมการถอดอักษรภาษาอังกฤษเป็นภาษาไทยโดยใช้คลังคำทับศัพท์ของราชบัณฑิตยสถาน

วัลย์วรา ไชยฤกษ์, 2523-

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/2714

Title:	การพัฒนาโปรแกรมการถอดอักษรภาษาอังกฤษเป็นภาษาไทยโดยใช้คลังคำทับศัพท์ของราชบัณฑิตยสถาน
Other Titles:	The development of an English-Thai transliteration program based on transliterated-word corpus of the Royal Institute
Authors:	วัลย์วรา ไชยฤกษ์, 2523-
Advisors:	วิโรจน์ อรุณมานะกุล
Other author:	Chulalongkorn University. Faculty of Arts
Subjects:	Thai language--Transliteration--Computer programs
Issue Date:	2547 BE (2004 CE)
Publisher:	Chulalongkorn University
Abstract:	วิทยานิพนธ์ฉบับนี้มีวัตถุประสงค์เพื่อรวบรวมคำทับศัพท์ภาษาอังกฤษของราชบัณฑิตยสถาน และเปรียบเทียบกฎการทับศัพท์ของราชบัณฑิตยสถานกับข้อมูลจริงที่พบ พร้อมทั้งสร้างโปรแกรมการถอดอักษรภาษาอังกฤษเป็นภาษาไทยโดยใช้คลังคำทับศัพท์ดังกล่าวเพื่อสร้างตารางกฎการถอดอักษรภาษาอังกฤษเป็นภาษาไทยให้กับคอมพิวเตอร์ การวิจัยนี้ได้เก็บรวบรวมคำภาษาอังกฤษและคำทับศัพท์ภาษาไทยจำนวน 10,060 คำจากเอกสารที่เผยแพร่โดยราชบัณฑิตยสถานเพื่อนำมาใช้เป็นคลังคำทับศัพท์ ผลการวิจัยพบว่า กฎการถอดอักษรของราชบัณฑิตยสถานสอดคล้องกับข้อมูลในคลังทับศัพท์ดังนี้ กฎสำหรับสระมีความสอดคล้อง 92.68% สำหรับพยัญชนะต้นสอดคล้อง 97.35% สำหรับพยัญชนะท้ายสอดคล้อง 97.34% และเมื่อทดลองโปรแกรมที่พัฒนาขึ้นโดยใช้กฎที่สร้างขึ้นจากข้อมูล 80% ของคลังคำทับศัพท์ ผลการทดลองกับข้อมูลอีก 20% ปรากฏว่าโปรแกรมสามารถถอดอักษรในระดับคำได้ถูกต้อง 39.53% และถูกต้อง 79.8% ในระดับอักษร ซึ่งแสดงว่าประสิทธิภาพการถอดอักษรของโปรแกรมยังไม่ดีเท่าที่ควร เนื่องมาจากสาเหตุหลายประการ เช่น โปรแกรมไม่ได้นำขอบเขตพยางค์มาพิจารณา ไม่ได้พิจารณาเสียงภาษาอังกฤษ รวมถึงความหลากหลายของคำทับศัพท์ของราชบัณฑิตยสถานที่รวบรวมไว้ในคลังข้อมูล และยังพบว่าในการถอดอักษรจริงนั้น มีการถอดอักษรแบบที่นอกเหนือไปจากที่ราชบัณฑิตยสถานให้ไว้ในตารางเทียบอักษรอยู่จำนวนหนึ่ง
Other Abstract:	This thesis collects English loanwords transliterated into Thai as published by the Royal Institute, compares the Royal Institute's transliteration rules with the collected data, and develops an English-Thai transliteration program that uses the resulting transliterated-word corpus to generate tables of English-Thai transliteration rules for computers. The corpus comprises 10,060 pairs of English words and their Thai transliterations, drawn from documents published by the Royal Institute. The Royal Institute's rules agree with the corpus data as follows: 92.68% for vowels, 97.35% for initial consonants, and 97.34% for final consonants. When the program was run with rules generated from 80% of the corpus and tested on the remaining 20%, it achieved 39.53% accuracy at the word level and 79.8% at the character level. These figures indicate that the program's performance is still unsatisfactory, for several reasons: it does not take syllable boundaries or English pronunciation into account, and the corpus contains variant transliterations published by the Royal Institute. The study also found that actual transliterations include a number of letter correspondences that are not listed in the Royal Institute's transliteration table.
Description:	Thesis (M.A.)--Chulalongkorn University, 2004
Degree Name:	Master of Arts
Degree Level:	Master's Degree
Degree Discipline:	Linguistics
URI:	http://cuir.car.chula.ac.th/handle/123456789/2714
URI:	http://doi.org/10.14457/CU.the.2004.875
ISBN:	9741764898
metadata.dc.identifier.DOI:	10.14457/CU.the.2004.875
Type:	Thesis
Appears in Collections:	Arts - Theses
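
Note on the evaluation described in the abstract: the 80%/20% split and the two accuracy figures can be illustrated with a minimal sketch. The record does not say how the thesis computed character-level accuracy, so the edit-distance definition below is an assumption, and every function name here (`split_corpus`, `edit_distance`, `word_accuracy`, `char_accuracy`) is hypothetical rather than taken from the thesis.

```python
import random


def split_corpus(pairs, train_fraction=0.8, seed=0):
    """Shuffle (english, thai) pairs and split them into train and test lists."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def edit_distance(a, b):
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(predicted, reference):
    """Fraction of words whose predicted transliteration matches exactly."""
    exact = sum(p == r for p, r in zip(predicted, reference))
    return exact / len(reference)


def char_accuracy(predicted, reference):
    """Assumed definition: 1 - (total edit distance / total reference length)."""
    errors = sum(edit_distance(p, r) for p, r in zip(predicted, reference))
    total = sum(len(r) for r in reference)
    return max(0.0, 1 - errors / total)
```

Under this definition a single wrong character makes the whole word count as an error at the word level while costing only one character at the character level, which is one way a gap like the one reported (39.53% versus 79.8%) can arise.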

Files in This Item:

File	Description	Size	Format
Wanwara.pdf		1.16 MB	Adobe PDF
