diff --git a/Yaoli Mao_clean data set_fake sample_video, eye fixation and understanding.csv b/Yaoli Mao_clean data set_fake sample_video, eye fixation and understanding.csv
new file mode 100644
index 0000000..9222424
--- /dev/null
+++ b/Yaoli Mao_clean data set_fake sample_video, eye fixation and understanding.csv
@@ -0,0 +1 @@
+timeframe_Start(ms),subjectID,subjectID_origin,subjectMale_1(Female_0),"videoID(carol-1,simon-2)","itemID(1-42,carol1-21,simon22-42)",itemSuccess_1,fixationYes_1,fixationCounts,fixationDuration_ms,eyeLocation_face,eyeLocation_gesture1,eyeLocation_gesture2,eyeLocation_gesture3,eyeLocation_gesture4,eyeLocation_specialGestureWithRhetoricalQuestions,eyeLocation_face,eyeLocation_visualsupports,eyeLocation_other,overallitemScore_carol(range0-21),overallitemScore_simon(range0-21),overallitemScore(range0-42)
+22000,1,klt1,0,1,1,0,1,2,1000,1,0,1,0,0,0,1,0,0,15,14,29
+23000,1,klt1,0,1,1,0,1,2,500,1,0,0,0,1,0,1,0,0,15,14,29
+24000,1,klt1,0,1,1,0,1,3,800,1,0,0,0,0,0,1,0,0,15,14,29
+25000,1,klt1,0,1,1,0,1,1,900,1,0,1,1,0,0,1,0,0,15,14,29
+26000,1,klt1,0,1,1,0,1,4,600,1,0,0,0,0,0,0,0,0,15,14,29
+27000,1,klt1,0,1,1,0,1,3,700,1,0,0,0,1,0,1,0,0,15,14,29
+28000,1,klt1,0,1,1,0,1,2,560,1,0,0,0,0,1,1,0,1,15,14,29
+29000,1,klt1,0,1,1,0,1,3,562,1,0,1,1,1,1,1,0,0,15,14,29
+30000,1,klt1,0,1,1,0,1,5,900,1,0,0,0,0,0,1,0,0,15,14,29
+31000,1,klt1,0,1,1,0,1,1,1000,1,1,1,1,0,0,0,0,0,15,14,29
+32000,1,klt1,0,1,1,0,1,4,780,0,0,0,0,1,0,0,0,1,15,14,29
+33000,1,klt1,0,1,1,0,1,5,620,0,0,0,0,1,0,0,0,1,15,14,29
+34000,1,klt1,0,1,1,0,1,3,750,0,1,1,1,1,0,0,0,1,15,14,29
+35000,1,klt1,0,1,1,0,1,1,300,0,0,0,1,1,1,0,0,0,15,14,29
+36000,1,klt1,0,1,1,0,1,5,870,0,0,0,1,0,0,1,0,1,15,14,29
+37000,1,klt1,0,1,1,0,1,2,400,1,0,0,1,0,0,1,0,1,15,14,29
+38000,1,klt1,0,1,1,0,1,4,550,0,1,1,0,0,0,1,0,0,15,14,29
+39000,1,klt1,0,1,1,0,1,2,400,1,0,0,0,1,1,1,0,0,15,14,29
+40000,1,klt1,0,1,1,0,1,2,1000,1,0,0,0,0,0,0,0,0,15,14,29
+41000,1,klt1,0,1,1,0,1,2,100,0,0,0,1,0,0,0,0,0,15,14,29
+42000,1,klt1,0,1,1,0,1,1,200,0,0,0,0,1,0,0,0,0,15,14,29
+43000,1,klt1,0,1,1,0,1,1,300,0,0,0,0,0,0,1,0,0,15,14,29
+44000,1,klt1,0,1,1,0,1,3,900,0,0,0,1,1,0,1,0,0,15,14,29
+22000,1,klt1,0,1,1,1,1,2,1000,1,0,1,0,0,0,1,0,0,15,14,29
+23000,1,klt1,0,1,1,1,1,2,500,1,0,0,0,1,0,1,0,0,15,14,29
+24000,1,klt1,0,1,1,1,1,3,800,1,0,0,0,0,0,1,0,0,15,14,29
+25000,1,klt1,0,1,1,1,1,1,900,1,0,1,1,0,0,1,0,0,15,14,29
+26000,1,klt1,0,1,1,1,1,4,600,1,0,0,0,0,0,0,0,0,15,14,29
+27000,1,klt1,0,1,1,1,1,3,700,1,0,0,0,1,0,1,0,0,15,14,29
+28000,1,klt1,0,1,1,1,1,2,560,1,0,0,0,0,1,1,0,1,15,14,29
+29000,1,klt1,0,1,1,1,1,3,562,1,0,1,1,1,1,1,0,0,15,14,29
+30000,1,klt1,0,1,1,1,1,5,900,1,0,0,0,0,0,1,0,0,15,14,29
+31000,1,klt1,0,1,1,1,1,1,1000,1,1,1,1,0,0,0,0,0,15,14,29
+32000,1,klt1,0,1,1,1,1,4,780,0,0,0,0,1,0,0,0,1,15,14,29
+33000,1,klt1,0,1,1,1,1,5,620,0,0,0,0,1,0,0,0,1,15,14,29
+34000,1,klt1,0,1,1,1,1,3,750,0,1,1,1,1,0,0,0,1,15,14,29
+35000,1,klt1,0,1,1,1,1,1,300,0,0,0,1,1,1,0,0,0,15,14,29
+36000,1,klt1,0,1,1,1,1,5,870,0,0,0,1,0,0,1,0,1,15,14,29
+37000,1,klt1,0,1,1,1,1,2,400,1,0,0,1,0,0,1,0,1,15,14,29
+38000,1,klt1,0,1,1,1,1,4,550,0,1,1,0,0,0,1,0,0,15,14,29
+39000,1,klt1,0,1,1,1,1,2,400,1,0,0,0,1,1,1,0,0,15,14,29
+40000,1,klt1,0,1,1,1,1,2,1000,1,0,0,0,0,0,0,0,0,15,14,29
+41000,1,klt1,0,1,1,1,1,2,100,0,0,0,1,0,0,0,0,0,15,14,29
+42000,1,klt1,0,1,1,1,1,1,200,0,0,0,0,1,0,0,0,0,15,14,29
+43000,1,klt1,0,1,1,1,1,1,300,0,0,0,0,0,0,1,0,0,15,14,29
+44000,1,klt1,0,1,1,1,1,3,900,0,0,0,1,1,0,1,0,0,15,14,29
\ No newline at end of file
diff --git a/Yaoli Mao_data description and obstacle_fake sample_video, eye fixation and understanding.txt b/Yaoli Mao_data description and obstacle_fake sample_video, eye fixation and understanding.txt
new file mode 100644
index 0000000..71bc0c5
--- /dev/null
+++ b/Yaoli Mao_data description and obstacle_fake sample_video, eye fixation and understanding.txt
@@ -0,0 +1 @@
+Yaoli:
+Project description/Logic model:
+Educational goal/Assumption: if students pay attention (fixate on the "right" thing, e.g. something that might be semantically related) at a certain point in time, they will understand the content better. I'm developing a model that discovers students' eye fixation patterns on various video features of the speaker (speech-related gestures, facial expressions...) and of the camera shooting, and uses those patterns to predict their understanding of the video content.
+Description of data: The data set will contain, for 28 subjects, 1) eye fixation data (duration and counts) over the time spent watching 2 TED talk videos (each around 10 minutes long), and 2) their scores on comprehension questions answered after watching the videos.
+Aggregation from multiple datasets:
+- Fixation reports for each subject, containing the start and end/duration of each fixation over the timeline of the two videos. One file contains one subject watching one video, 56 files in total.
+- Comprehension survey items, containing the correctness of each item, the total item score by video, and the time frame of the video content that each item corresponds to. One file contains the item-level scores and a second file contains the item mapping onto the video timeline.
+Because combining the multiple datasets takes a lot of manual work, the uploaded data set is a fake example for the first subject, a female who got the first item wrong; that item corresponds to the content of the first video in the time frame 22000ms-44000ms. During this time, she fixated on gesture 1 and face. Her total item score is 15 for the first video (carol).
+Obstacle: I have coded the video features but haven't coded how each subject's fixations load onto these features over the length of the video. I might need to manually use a visualization tool to see whether subjects' fixation x- and y-coordinates map onto visuals in the video.
+Since the comprehension questions can be tied to certain time frames/time points in the video, I'm thinking about building either a question-level model (the score will be binary, right or wrong, and thus logistic) or a student-level model (the score will be continuous, from 0-42).
+Chad Response: Instead of manually using a visualization to see where a subject's eyes are looking during the instruction, maybe you can try some EDM techniques to extract general trends/features instead. Here are some articles that come to mind on how this can be done:
+Q-Matrix Mining
+http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.112.9876&rep=rep1&type=pdf
+Model Discovery
+http://educationaldatamining.org/EDM2012/uploads/procs/Full_Papers/edm2012_full_10.pdf
+LFA
+http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.75.7043&rep=rep1&type=pdf
+http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.9734&rep=rep1&type=pdf
+Additionally, I think building a binary two-class logistic model would be the best way to go for now. Depending on how the information is displayed in the video, you could first build one model to detect which time frame and eye location best correlate with success on one item, and from there build the model to fit the other assessment items.
+Advice taken: Use the item-level binary model for now. Thanks!
+More Obstacles:
+1. How to coordinate the fixation boundaries with the item-to-video boundaries and with the absolute start and end points arbitrarily set for the video (every 1000 ms), and how to automatically extract fixation counts and calculate durations from the individual fixation reports?
+2. How to efficiently code eye location in the videos manually? Is it possible to code time on each location manually?
+3. Discuss possible problems with a single item corresponding to multiple time frames and multiple items corresponding to a single time frame.
\ No newline at end of file
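Reviewer note on More Obstacles item 1: one possible way to extract fixation counts and durations per 1000 ms window automatically is sketched below in Python. This is only a sketch under stated assumptions. The function name bin_fixations, the (start_ms, end_ms) input pairs, and the two overlap rules (a fixation is counted in every window it touches, and only the overlapping part of its duration is credited to that window) are all hypothetical choices, not the layout of the real fixation reports. The item end of 45000 ms assumes the last window starts at 44000 ms, as in the sample CSV.

```python
def bin_fixations(fixations, item_start_ms, item_end_ms, window_ms=1000):
    """Aggregate fixations into fixed-width windows inside one item's time span.

    fixations: iterable of (fix_start_ms, fix_end_ms) pairs (assumed format).
    Returns {window_start_ms: {"count": int, "duration_ms": int}}.
    """
    # One entry per window, so windows without fixations still appear with zeros.
    windows = {
        w: {"count": 0, "duration_ms": 0}
        for w in range(item_start_ms, item_end_ms, window_ms)
    }
    for fix_start, fix_end in fixations:
        # Clip the fixation to the item span; skip it if nothing remains.
        lo = max(fix_start, item_start_ms)
        hi = min(fix_end, item_end_ms)
        if lo >= hi:
            continue
        # Start of the first window that the clipped fixation touches.
        w = item_start_ms + ((lo - item_start_ms) // window_ms) * window_ms
        while w < hi:
            # Credit only the part of the fixation that lies in this window.
            overlap = min(hi, w + window_ms) - max(lo, w)
            windows[w]["count"] += 1
            windows[w]["duration_ms"] += overlap
            w += window_ms
    return windows


# Made-up fixations for illustration only (not taken from any real report).
example_fixations = [
    (21900, 22400),  # starts before the item span, clipped to 22000-22400
    (22800, 23300),  # straddles the 23000 ms window boundary
    (23500, 23600),  # lies entirely inside the 23000 ms window
    (50000, 50500),  # lies outside the item span, ignored
]
windows = bin_fixations(example_fixations, item_start_ms=22000, item_end_ms=45000)
print(windows[22000])  # {'count': 2, 'duration_ms': 600}
print(windows[23000])  # {'count': 2, 'duration_ms': 400}
```

Counting a straddling fixation in both windows inflates the total count across windows, so if each fixation should be counted once, an alternative rule is to count it only in the window where it starts. Either way, the choice should be stated alongside the item-level model.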