Welcome to the Multimedia Processing Lab

The Multimedia Processing Lab in the Department of Computer Science at National Taiwan Ocean University was founded in 2009 and is directed by Dr. Jun-Wei Hsieh. The lab is dedicated to intelligent image/video processing, multimedia standards, and related technologies. At present, 3 Ph.D. students, 15 full-time graduate students, and 13 part-time graduate students are members of the research group. Our research topics include:


  • Artificial Intelligence (AI)
  • Deep Learning
  • 3D Printing
  • Pattern Recognition
  • Video Retrieval
  • Video Surveillance
  • Fine Granularity Scalability 
  • MPEG-7/21 & H.26L
  • Multimedia over IP
  • Behavior Analysis
  • Embedded System
  • Intelligent Transport Systems (ITS)



Recent Research

Modeling and Recognizing Action Contexts in Persons Using Sparse Representation

Journal of Visual Communication and Image Representation, Volume 30, July 2015, Pages 252–265

This paper proposes a novel dynamic sparse representation-based classification scheme for analyzing interactions between persons. Occlusion and the difficulty of modeling complicated interactions are the major challenges in person-to-person action analysis. To address the occlusion problem, the proposed scheme represents an action sample in an over-complete dictionary whose base elements are the training samples themselves. This representation is naturally sparse, so errors (caused by environmental changes such as lighting or occlusions) appear sparsely with respect to the training library. Because of this sparsity, the scheme is robust to occlusions and lighting changes. The difficulty of modeling complicated actions can be tackled by adding more examples to the over-complete dictionary. Thus, even when the interaction relations are complicated, the proposed method still recognizes them successfully and can be easily extended to analyze action events among multiple persons.
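The idea of classifying a sample by how well each class's training samples reconstruct it can be illustrated with a minimal sketch. This is not the paper's dynamic scheme: it assumes plain greedy matching pursuit in place of a proper sparse solver, and the function names, feature vectors, and labels are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def matching_pursuit(atoms, y, n_iter=10):
    """Greedy sparse coding of y over unit-norm atoms (assumed solver)."""
    coeffs = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        scores = [dot(residual, a) for a in atoms]
        j = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        if abs(scores[j]) < 1e-12:
            break  # residual is already explained
        coeffs[j] += scores[j]
        residual = [r - scores[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs

def src_classify(train, labels, y, n_iter=10):
    """Pick the class whose training samples alone best reconstruct y."""
    # The dictionary atoms are the (normalized) training samples themselves.
    atoms = [normalize(t) for t in train]
    coeffs = matching_pursuit(atoms, y, n_iter)
    best_label, best_err = None, float("inf")
    for label in sorted(set(labels)):
        # Reconstruct y using only the coefficients belonging to this class.
        recon = [0.0] * len(y)
        for c, atom, atom_label in zip(coeffs, atoms, labels):
            if atom_label == label:
                recon = [r + c * a for r, a in zip(recon, atom)]
        err = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, recon)))
        if err < best_err:
            best_label, best_err = label, err
    return best_label

# Hypothetical toy features for two action classes.
train = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["wave", "wave", "punch", "punch"]
print(src_classify(train, labels, [1.0, 0.05, 0.0]))
```

In this sketch, extending the dictionary is just a matter of appending more rows to `train` and `labels`, which mirrors the remark above about adding examples to handle more complicated actions.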


Human movement analysis around a view circle using time-order similarity distributions

Journal of Visual Communication and Image Representation, Volume 30, July 2015, Pages 22–34

This paper presents a new behavior classification system to analyze human movements around a view circle using time-order similarity distributions. To maintain view invariance, an action is represented in both its spatial domain and its temporal domain. After that, a novel alignment scheme is proposed for aligning each action to a fixed view. With the best view, the task of behavior analysis becomes a string matching problem. One novel idea proposed in this paper is to code a posture using not only its best matched key posture but also other unmatched key postures to form various similarity distributions. Then, recognition of two actions becomes a problem of matching two time-order distributions, which can be solved very effectively by comparing their KL distance via a dynamic programming scheme.
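Matching two time-ordered sequences of distributions with dynamic programming can be sketched as follows. The sketch assumes that each frame is a discrete distribution over key postures, that the KL distance is symmetrized, and that the alignment is a standard dynamic time warping recurrence; the paper's exact cost and recurrence may differ, and all names here are hypothetical.

```python
import math

def symmetric_kl(p, q, eps=1e-9):
    """Symmetrized KL divergence between two discrete distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        pi = max(pi, eps)  # clamp to avoid log(0)
        qi = max(qi, eps)
        # KL(p||q) + KL(q||p) = sum (p_i - q_i) * (log p_i - log q_i)
        total += (pi - qi) * (math.log(pi) - math.log(qi))
    return total

def dtw_distance(seq_a, seq_b):
    """Minimum accumulated KL cost over monotonic alignments of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = symmetric_kl(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical similarity distributions over three key postures.
action = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7]]
slower = [action[0], action[0], action[1], action[2], action[2]]
reverse = action[::-1]
print(dtw_distance(action, slower), dtw_distance(action, reverse))
```

Because the alignment may repeat frames, the same action performed at a different speed keeps a low distance, while a different ordering of postures accumulates cost.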


Vehicle make and model recognition using sparse representation and symmetrical SURFs

Pattern Recognition, Volume 48, Issue 6, June 2015, Pages 1979–1998

This paper presents a new symmetrical SURF descriptor to detect vehicles on roads and then proposes a novel sparsity-based classification scheme to recognize their makes and models. First, for vehicle detection, this paper proposes a symmetry transformation on SURF points to detect all possible matching pairs of symmetrical SURF points. Then, each desired vehicle ROI can be located very accurately from the set of symmetrical matching pairs through a projection technique. The advantages of this scheme are that it needs no background subtraction and that it is extremely efficient in real-time detection tasks. After that, two challenges in vehicle make and model recognition (MMR) must be addressed: the multiplicity and ambiguity problems. The multiplicity problem stems from one vehicle model often having different shapes on the road. The ambiguity problem means that vehicles, even those made by different companies, often share similar shapes. To treat the two problems, a dynamic sparse representation scheme is proposed to represent a vehicle model in an over-complete dictionary whose base elements are the training samples themselves. With the dictionary, a novel Hamming distance classification scheme is proposed to classify vehicle makes and models into detailed classes. Because of the sparsity of the representation and the high noise tolerance of the Hamming code, different vehicle makes and models can be recognized with high accuracy.
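The noise tolerance of Hamming-distance classification can be illustrated with a small, generic nearest-codeword sketch. It is not the paper's scheme: how binary codes are derived from the sparse representation is not described here, so the codebook, class names, and functions below are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))

def classify_by_hamming(code, codebook):
    """Return the class whose code word is closest to the observed code."""
    # Sorting the labels makes tie-breaking deterministic.
    return min(sorted(codebook), key=lambda label: hamming_distance(code, codebook[label]))

# Hypothetical code words for three make/model classes.
codebook = {
    "make_a_model_1": "1110000",
    "make_a_model_2": "0001110",
    "make_b_model_1": "1001011",
}

# One flipped bit relative to "1110000" still maps to the same class.
print(classify_by_hamming("1100000", codebook))
```

As long as the code words are far apart in Hamming distance, a few corrupted bits leave the nearest code word unchanged, which is the sense in which such a classifier tolerates noise.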