Variable length motif discovery for time series data

ปวัน นันทานิช

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/27610

Title:	การค้นพบโมทีฟความยาวแปรผันสำหรับข้อมูลอนุกรมเวลา
Other Titles:	Variable length motif discovery for time series data
Authors:	ปวัน นันทานิช
Advisors:	โชติรัตน์ รัตนามหัทธนะ
Other author:	Chulalongkorn University. Faculty of Engineering
Subjects:	Time series analysis; Data mining; Sequential pattern mining
Issue Date:	2011 (B.E. 2554)
Publisher:	Chulalongkorn University
Abstract:	Time series motif discovery is a branch of time series data mining research concerned with finding interesting patterns called motifs, where a motif is a pair of subsequences in a time series whose shapes are similar. In general, the first step of a motif search always requires a value for the motif length parameter. Existing work pays little attention to the fact that the user must specify the motif length without knowing what that length should be. Different motif lengths lead to the discovery of many different motif patterns. Very few works address the motif length problem and propose algorithms to solve it. However, these algorithms still require an initial motif length as a parameter and introduce further parameters, which makes them complicated to use, and the initial motif length must still be set close to the length of the interesting patterns in the time series. The motif length problem therefore remains unsolved. This thesis proposes an algorithm for the motif length problem that requires no additional parameters and returns a set of “good motifs”, together with clear methods for measuring the quality of the resulting motifs and the efficiency of the algorithm. The proposed algorithm takes only the time series as input and returns a ranked set of “good motifs” for the user to choose from. It discovers all of the planted interesting patterns with high motif quality and reduces the number of possible motifs by more than 99 percent.
Other Abstract:	Time series motif discovery is an increasingly popular research area in time series mining whose main objective is to search for interesting patterns, or motifs. A motif is a pair of time series subsequences whose shapes are very similar to each other. A typical motif discovery algorithm requires a predefined motif length as its parameter. Discovering motifs of arbitrary length introduces another problem: selecting a suitable motif length is non-trivial, since domain knowledge is often required. Only a few works have addressed this motif length problem and proposed algorithms to resolve it. However, these algorithms still require an initial motif length parameter along with many additional predefined parameters, which makes them more complicated to use, and the motif length parameter itself still remains. Thus, this work proposes the first parameter-free motif discovery algorithm, which requires no parameter as input and returns a set of all “Best Motifs” ranked by a proposed scoring function based on the similarity of motif locations and the similarity of motif shapes. The experimental results show that the algorithm can efficiently discover all planted patterns with high quality and can reduce the number of possible motifs by more than 99 percent.
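To illustrate the fixed-length setting that the abstract describes as the typical case, here is a minimal brute-force sketch that finds the closest pair of non-overlapping subsequences for a given length `m`. It is not the algorithm proposed in the thesis. The function names `znorm` and `find_motif_pair`, the use of z-normalized Euclidean distance, the non-overlap rule, and the toy series are all assumptions made for illustration.

```python
import math

def znorm(seq):
    """Z-normalize a subsequence so that only its shape is compared (assumed measure)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n  # flat subsequence carries no shape information
    return [(x - mean) / std for x in seq]

def find_motif_pair(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping length-m subsequences."""
    if m < 2 or len(series) < 2 * m:
        raise ValueError("need m >= 2 and len(series) >= 2 * m")
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(subs)):
        # j starts at i + m so that overlapping (trivial) matches are skipped
        for j in range(i + m, len(subs)):
            d = math.dist(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# Toy series: the same 5-point pattern planted at positions 3 and 12.
pattern = [0.0, 1.0, 3.0, 1.0, 0.0]
series = [0.3, -0.2, 0.1] + pattern + [0.4, -0.1, 0.2, -0.3] + pattern + [0.1, 0.5]

dist, i, j = find_motif_pair(series, m=5)
print(i, j, dist)
```

With `m=5` the search recovers the two planted occurrences. The caller has to know that length in advance, and a different `m` can return a different pair, which is the motif length problem the abstract refers to.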
Description:	Thesis (M.Eng.)--Chulalongkorn University, 255
Degree Name:	Master of Engineering
Degree Level:	Master's Degree
Degree Discipline:	Computer Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/27610
URI:	http://doi.org/10.14457/CU.the.2011.1421
metadata.dc.identifier.DOI:	10.14457/CU.the.2011.1421
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
pawan_nu.pdf		2.44 MB	Adobe PDF
