A Comparison of the Random-Split and Bootstrap Methods for Adjusting P-Values of High-Dimensional Regression Coefficients

บงกชพร เนาวนัติ

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/43950

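Title:	การเปรียบเทียบวิธีการแบ่งข้อมูลอย่างสุ่ม และวิธีบูตสแตรปในการปรับค่าพี-แวลูของสัมประสิทธิ์การถดถอยที่มีมิติสูง (A comparison of the random-split and bootstrap methods for adjusting p-values of high-dimensional regression coefficients)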
Other Titles:	A COMPARISON ON P-VALUE ADJUSTMENT BETWEEN RANDOM – SPLIT AND BOOTSTRAP METHODS IN HIGH DIMENSIONAL REGRESSION
Authors:	บงกชพร เนาวนัติ
Advisors:	วิฐรา พึ่งพาพงศ์
Other author:	Chulalongkorn University. Faculty of Commerce and Accountancy
Subjects:	Random sets; Statistical analysis
Issue Date:	2013 (B.E. 2556)
Publisher:	Chulalongkorn University
Abstract:	This research aims to study and compare approaches for choosing between the Random Split method and the bootstrap method when adjusting the p-values of high-dimensional regression coefficients, and to compare the variable selection performance of the two methods in that setting. The comparison criteria are the number of false positives, the number of false negatives, and the number of regression coefficients found to be nonzero by the hypothesis test of each coefficient. The data were simulated with ratios of sample size to number of independent variables of 10:20, 10:50, 10:100, 100:200, 100:500, 100:1,000, 200:400, 200:1,000, and 200:2,000, with the number of true nonzero coefficients equal to 0.1, 0.25, and 0.45 times the sample size, and with correlation levels among the independent variables of 0, 0.5, and 0.9. In terms of the number of false positives, data splitting with the Random Split method adjusts the p-values of high-dimensional regression coefficients more efficiently than data splitting with the bootstrap method. In terms of the number of false negatives and the number of regression coefficients found to be nonzero by the hypothesis test of each coefficient, however, data splitting with the bootstrap method is more efficient than data splitting with the Random Split method in most cases.
Other Abstract:	The objective of this research is to study and compare the Random-Split and bootstrap methods for p-value adjustment in high-dimensional regression, and to compare their efficiency in variable selection. Three criteria are used for the comparison: the number of false positives, the number of false negatives, and the number of nonzero coefficients. The data were simulated under several settings: ratios of sample size to number of independent variables of 10:20, 10:50, 10:100, 100:200, 100:500, 100:1,000, 200:400, 200:1,000, and 200:2,000; numbers of true nonzero coefficients equal to 0.1, 0.25, and 0.45 times the sample size; and correlation levels among the independent variables of 0, 0.5, and 0.9. In terms of the number of false positives, the simulation results show that data splitting with the Random-Split method is more efficient than the bootstrap method for p-value adjustment in high-dimensional regression. In terms of the number of false negatives and the number of nonzero coefficients, however, data splitting with the bootstrap method is, overall, more efficient than the Random-Split method.
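The simulation design and comparison criteria described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's code: the equicorrelated Gaussian predictors, the coefficient value, the noise level, the rounding rule for the number of nonzero coefficients, and the names `simulate` and `selection_errors` are all hypothetical, and the Random-Split and bootstrap p-value adjustments themselves are not reproduced here.

```python
import itertools
import math
import random

# Settings quoted in the abstract: (sample size n, number of predictors p).
SIZES = [(10, 20), (10, 50), (10, 100),
         (100, 200), (100, 500), (100, 1000),
         (200, 400), (200, 1000), (200, 2000)]
SPARSITY = [0.10, 0.25, 0.45]   # true nonzero coefficients as a fraction of n
CORRELATION = [0.0, 0.5, 0.9]   # correlation level among the predictors


def simulate(n, p, frac, rho, rng, beta_value=1.0, noise_sd=1.0):
    """Draw one data set and return (X, y, indices of true nonzero coefficients).

    Assumptions (not taken from the thesis): equicorrelated standard normal
    predictors, the first k coefficients all equal to beta_value, Gaussian
    noise, and k obtained by rounding frac * n.
    """
    k = max(1, round(frac * n))
    support = set(range(k))
    X, y = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)  # common factor: pairwise correlation rho
        row = [math.sqrt(rho) * shared
               + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
               for _ in range(p)]
        X.append(row)
        signal = beta_value * sum(row[j] for j in support)
        y.append(signal + rng.gauss(0.0, noise_sd))
    return X, y, support


def selection_errors(selected, support):
    """Return (false positives, false negatives, number declared nonzero)."""
    selected = set(selected)
    false_pos = len(selected - support)   # declared nonzero but truly zero
    false_neg = len(support - selected)   # truly nonzero but missed
    return false_pos, false_neg, len(selected)


# Every combination of size, sparsity, and correlation level.
scenarios = list(itertools.product(SIZES, SPARSITY, CORRELATION))
```

For example, with 25 true nonzero coefficients at indices 0 to 24, `selection_errors({0, 1, 30}, set(range(25)))` returns `(1, 23, 3)`: one false positive, 23 false negatives, and three coefficients declared nonzero. Any variable selection procedure can be scored this way once it returns a set of selected indices.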
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2013 (B.E. 2556)
Degree Name:	Master of Science
Degree Level:	Master's degree
Degree Discipline:	Statistics
URI:	http://cuir.car.chula.ac.th/handle/123456789/43950
URI:	http://doi.org/10.14457/CU.the.2013.1403
metadata.dc.identifier.DOI:	10.14457/CU.the.2013.1403
Type:	Thesis
Appears in Collections:	Acctn - Theses

Files in This Item:

File	Description	Size	Format
5581561926.pdf		2.71 MB	Adobe PDF