Technical Report 2003-05-01
Internetworking and Media Communications Research Laboratories
Department of Computer Science,
Kent State University
http://medianet.kent.edu/technicalreports.html


 

Thesis Title: 

DYNAMIC GAZE SPAN WINDOW BASED FOVEATION FOR PERCEPTUAL MEDIA STREAMING

Oleg Komogortsev
okomogor@cs.kent.edu

Supervisor: Prof. Javed I. Khan
Department of Computer Science
Kent State University

Date Submitted: May 2003


Abstract:

Human vision provides numerous opportunities for video data compression. The human visual field extends about 140 degrees, but only about 2 degrees of it is seen sharply. A fascinating body of research in vision and psychology is geared towards understanding the human visual perception system. This thesis presents a novel eye-gaze enhanced media transcoding system for streaming video. The system includes a video server, a media transcoder capable of real-time performance, a video player, and an eye tracker. It takes in live perceptual information about a subject’s eye position. Eye and head movements are detected with an eye tracker and a magnetic head tracker. A unique challenge of this real-time perceptual adaptation scheme is incorporating fast eye movement mechanisms into a complex MPEG-2 transcoding scheme. An important factor in this perceptually adaptive encoding method is the delay between the time an eye-gaze sample is taken and the time the coding response arrives on the screen. This delay is particularly critical if the video involves network transmission. It also usually increases, because of coding complexity, when large-format media is perceptually transformed. This thesis investigates the feedback delay compensation problem and proposes a novel gaze-interaction-based foveation windowing scheme to solve it. The proposed technique contains 90% of the gazes within a window coverage area of 20-25%. The media transcoder built on this scheme is one of the first eye-gaze based perceptual transcoders. It can be used between the server and the video player in a networked environment. The architecture of the transcoder is designed to allow transmission of both stored and live media. Though the architecture is independent of any media type, the system currently handles the ISO/IEC 13818-2 MPEG-2 standard.
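
The two figures quoted in the abstract, the share of gazes contained and the window coverage area, can be illustrated with a minimal sketch. It is not the thesis implementation: the rectangular window shape, the reading of "coverage" as a fraction of the frame area, and all names (`Window`, `coverage`, `containment`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Window:
    """Hypothetical axis-aligned foveation window, in pixels."""
    cx: float      # window centre, x
    cy: float      # window centre, y
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        # A gaze sample is contained if it lies inside or on the border.
        return (abs(x - self.cx) <= self.width / 2
                and abs(y - self.cy) <= self.height / 2)


def coverage(window: Window, frame_w: int, frame_h: int) -> float:
    """Fraction of the frame area covered by the window, clipped to the frame."""
    left = max(0.0, window.cx - window.width / 2)
    right = min(float(frame_w), window.cx + window.width / 2)
    top = max(0.0, window.cy - window.height / 2)
    bottom = min(float(frame_h), window.cy + window.height / 2)
    area = max(0.0, right - left) * max(0.0, bottom - top)
    return area / (frame_w * frame_h)


def containment(window: Window, gazes: Iterable[Tuple[float, float]]) -> float:
    """Fraction of gaze samples that fall inside the window."""
    samples = list(gazes)
    if not samples:
        return 0.0
    hits = sum(1 for x, y in samples if window.contains(x, y))
    return hits / len(samples)


# Made-up example: a 360x240 window centred in a 720x480 frame.
window = Window(cx=360, cy=240, width=360, height=240)
gazes = [(360, 240), (400, 260), (700, 20), (300, 200)]
print(coverage(window, 720, 480))   # share of the frame kept at high quality
print(containment(window, gazes))   # share of gaze samples inside the window
```

In a delay-compensated setting, the gaze samples passed to `containment` would be those that arrive during the feedback delay after the window was placed; a larger window raises containment at the cost of higher coverage.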


·        “Dynamic Reflex Window Prediction for Perceptual Coding of Wide Format Video Stream With Eye Gaze Tracking,” Javed Khan, Oleg Komogortsev. Proceedings of Circuits, Signal and Systems conference, May 19-21, 2003, Cancun, Mexico.

·        “Contour Approximation can Lead to Faster Object Based Transcoding with Higher Perceptual Quality,” [technical report: January 2003].

·        “Dynamic Gaze Span Window based Foveation for Perceptual Media Streaming,” [technical report: November 2002].

·        “Perceptually Encoded Video Set from Dynamic Reflex Windowing,” [technical report: July 2002].

·        “Resource Adaptive Netcentric Systems on Active Network: A Self-Organizing Video Stream that Auto Morphs Itself while in Transit via a Quasi-Active Network,” Javed I. Khan, Seung S. Yang, Darsan Patel, Oleg Komogortsev, Wansik Oh, Zhong Guo, Q. Gu and Patrick Mail. Proceedings of the DARPA Active Networks Conference and Exposition 2002, San Francisco, CA, May 2002.

·        “Resource Adaptive Netcentric Systems: A Case Study with SONET – a Self-Organizing Network Embedded Transcoder,” Javed I. Khan, Seung S. Yang, Qiong Gu, Darshan Patel, Patrik Mail, Oleg Komogortsev, Wansik Oh, and Zhong Guo. Proceedings of the ACM International Conference on Multimedia, ACM MM’2001, Ottawa, Canada, October 2001.

·        “Impact on Stream Compression and Video Quality Motion Vector Reuse Transcoding,” [technical report: February 2002].

Last Modified: May 2003.