Classification of Computer Viruses from Binary Code (การคัดแยกไวรัสคอมพิวเตอร์จากรหัสฐานสอง)

ประสิทธิ์ อุษาฟ้าพนัส

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/59622

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	เกริก ภิรมย์โสภา	-
dc.contributor.author	ประสิทธิ์ อุษาฟ้าพนัส	-
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	-
dc.date.accessioned	2018-09-14T05:10:03Z	-
dc.date.available	2018-09-14T05:10:03Z	-
dc.date.issued	2017 (B.E. 2560)	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/59622	-
dc.description	Thesis (M.Eng.)--Chulalongkorn University, 2017 (B.E. 2560)	-
dc.description.abstract	This research presents the use of supervised learning for static detection of previously unseen computer virus files. Three classifiers were tested: random forest, multilayer perceptron, and extreme gradient boosting. The data set consists of 6,319 executable files. Each file was extracted with objdump and then ranked by TF-IDF score to find suitable features. Results were compared by F1 score. The random forest classifier with 20 attributes achieved an F1 score of 0.937, which is 0.031 higher than the baseline, and the extreme gradient boosting classifier with 500 attributes achieved an F1 score of 0.962, which is 0.041 higher than the baseline. It is therefore concluded that the method in this research can improve the precision and recall of the classification.	-
dc.description.abstractalternative	This thesis proposes a supervised machine learning model for detecting previously unseen virus files. Our main focus is on the static analysis approach. To find the best method, we experiment with different types of feature extraction and three classifier algorithms: extreme gradient boosting, random forest, and multilayer perceptron. Our data set contains 6,319 executable files. Each file is extracted with objdump and sorted by TF-IDF score to find the best features. The F1 scores are slightly better than those of the baselines. Random forest with 20 attributes yields an F1 score of 0.937, which is 0.031 more than that of the baseline. The extreme gradient boosting method with 500 attributes achieves an F1 score of 0.962, which is 0.041 more than that of the baseline. We conclude that our approach can improve the precision and recall of the classification.	-
dc.language.iso	th	-
dc.publisher	Chulalongkorn University	-
dc.relation.uri	http://doi.org/10.58837/CHULA.THE.2017.1374	-
dc.rights	Chulalongkorn University	-
dc.subject	ไวรัสคอมพิวเตอร์	-
dc.subject	Computer viruses	-
dc.title	การคัดแยกไวรัสคอมพิวเตอร์จากรหัสฐานสอง	-
dc.title.alternative	Classification of Computer Viruses from Binary Code	-
dc.type	Thesis	-
dc.degree.name	Master of Engineering	-
dc.degree.level	Master's Degree	-
dc.degree.discipline	Computer Engineering	-
dc.degree.grantor	Chulalongkorn University	-
dc.email.advisor	[email protected],[email protected]	-
dc.identifier.DOI	10.58837/CHULA.THE.2017.1374	-
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
5870411921.pdf		2.96 MB	Adobe PDF