Technical Report 2006-05-01
Internetworking and Media Communications Research Laboratories
Department of Computer Science, Kent State University
http://medianet.kent.edu/technicalreports.html



Encoded Test Video Set for Bird-of-Flock Real-Time Video Object Detection



Zhong Guo and Javed I. Khan

Perceptual Engineering Laboratories
Internetworking and Media Communications Research Laboratories
Department of Computer Science
Kent State University, 233 MSB, Kent, OH 44242

Last Revised May 02, 2006


Abstract

Object-based bit allocation can significantly improve the perceptual quality of extremely compressed video. However, real-time video object detection in large-format, high-fidelity video is computationally daunting. Most such algorithms begin with extensive use of classical bit analysis and thus remain computationally heavy. Based on some recent results in human visual perception, in this paper we present an experimental visual region tracking algorithm designed specifically for perceptual stream coding.

This algorithm exploits the cue order observed in human visual perception to achieve very high computation speed as well as tracking efficiency. Rather than starting from the pixel level, or using any pixel-level processing at all, it employs high-level motion cue and block shape cue analysis to identify, on the motion vector set, signatures of various relative movements between the object of interest, the scene background, and the camera; from these signatures it identifies objects. It then uses predictive cues based on Kalman filters to track the regions. The result is a fast yet highly effective perceptual region tracking algorithm that can operate at stream rate and track regions of perceptually significant objects despite camera movements such as zoom, panning, and translation. We have implemented this algorithm in a live H.263/MPEG-2 perceptual transcoder, and in this paper we share the performance of this implementation. This fast object-aware video rate transcoder is particularly suitable for live streaming and can convert a regular stream into a perceptually coded video stream.

This report contains the experiment clips used to test the performance of this system. The videos are MPEG-2 (ISO/IEC 13818-2) streams. The details of the experiments are in the main publications.

*The technical details of the algorithms are not included here.
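
For readers unfamiliar with the Kalman-filter tracking step mentioned in the abstract, the following is a minimal illustrative sketch only, not the authors' implementation. It assumes a constant-velocity motion model applied to one coordinate of a region centroid, measured in macroblock units. The class name AxisKalman, the helper track, and the noise values q and r are hypothetical choices made for this example; the report does not specify the state model or tuning used in the transcoder.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one centroid coordinate (assumed model)."""
    pos: float          # estimated position, e.g. in macroblock units
    vel: float = 0.0    # estimated velocity per frame
    p00: float = 1.0    # entries of the symmetric 2x2 covariance matrix P
    p01: float = 0.0
    p11: float = 1.0
    q: float = 0.01     # process noise (hypothetical tuning value)
    r: float = 1.0      # measurement noise (hypothetical tuning value)

    def predict(self) -> float:
        # State step x <- F x with F = [[1, 1], [0, 1]] (one frame ahead).
        self.pos += self.vel
        # Covariance step P <- F P F^T + q * I.
        p00 = self.p00 + 2.0 * self.p01 + self.p11 + self.q
        p01 = self.p01 + self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def update(self, z: float) -> float:
        # Measurement model H = [1, 0]: only the position is observed.
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # gain applied to position
        k1 = self.p01 / s              # gain applied to velocity
        y = z - self.pos               # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        # Covariance update P <- (I - K H) P.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos


def track(measurements):
    """Run the filter over a sequence of per-frame centroid measurements."""
    kf = AxisKalman(pos=measurements[0])
    for z in measurements[1:]:
        kf.predict()
        kf.update(z)
    return kf


if __name__ == "__main__":
    # Synthetic centroid moving one macroblock per frame.
    kf = track([float(t) for t in range(30)])
    print(kf.pos, kf.vel, kf.predict())
```

In a tracker of this kind, the value returned by predict() would give the expected region position in the next frame, so that the search for the region on the motion vector set can be restricted to a neighborhood of that position. A two-dimensional centroid could be handled with one such filter per axis under the same assumptions.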

Tracking Efficiency Test Set:

Video Clip | Content Description | File
Toycar | Fixed camera, still background, one object tracked, well-textured, fast-moving, heavy shadow | Toycar.mv2
Mycar-in_parking_lot | Fixed camera, some movements in background, one object tracked, poor-textured, slow-moving | Mycar_in_parkinglot.mv2
Two_tractors | Fixed camera, some movements in background, two objects tracked, well-textured, slow-moving, partial exclusion | Two_tractors.mv2
Walking_people | Fixed camera, still background, three objects tracked, well-textured, deformable shape, slow-moving, illumination change | Walking_people.mv2
Tractor_with_moving_camera | Fast smooth camera movement, well-textured background, one object tracked, well-textured, slow-moving, partial exclusion | Tractor_with_moving_camera.mv2
Plane | Smallest object spanning only a few macroblocks, very shaky camera movement, fast-moving background, one object tracked, poor-textured, irregular-moving | Plane.mv2
Mower | Smooth camera movement, well-textured background, one object tracked, well-textured, slow-moving, exclusion | Mower.mv2
Shaking_camera | Irregular camera movement, poor-textured background, one object tracked, poor-textured, fast-moving | Shaking_camera.mv2



Perceptually Encoded Video Test Set:

Test Description | Sample Clip Name
Detection of One Object | A_one_object_detection.mv2
Detection of Two Objects | B_two_object_detection.mv2
Video with Perceptual Encoding Applied (showing detection) | C_two_object_quality.mv2
Video after Perceptual Encoding Applied (invisible) | D_two_object_quality_noline.mv2
Stream Transcoding Ratio (1/1) with detection | F1_worker2_1_1.mv2
Stream Transcoding Ratio (1/1) (invisible) | F2_worker2_1_1_noline.mv2
Stream Transcoding Ratio (4/1) with detection | G1_workers2_5_1.mv2
Stream Transcoding Ratio (4/1) (invisible) | G2_workers2_5_1_noline.mv2