Video data mining: An Overview
1. Introduction
Advances in multimedia acquisition and storage technology have led to a tremendous growth in multimedia databases. Multimedia mining deals with the extraction of implicit knowledge, multimedia data relationships or other patterns not explicitly stored in the multimedia data. The management of multimedia data is one of the crucial tasks in data mining owing to the unstructured nature of such data. The main challenge is to handle multimedia data with a complex structure, such as images, multimedia text, video and audio data.
Nowadays people have access to a tremendous amount of video, both on television and on the internet. There is therefore great potential for video-based applications in many areas, including security and surveillance, personal entertainment, medicine, sports, news video, educational programs and movies. Video data contains several kinds of data, such as video, audio and text. The video consists of a sequence of images with some temporal information. The audio consists of speech, music and various special sounds, whereas the textual information represents its linguistic form.
The video content may be classified into three categories, namely:
(i) Low-level feature information, which includes features such as color, texture, shape and so on,
(ii) Syntactic information, which describes the contents of the video, including salient objects, their spatial-temporal positions and the spatial-temporal relations between them, and
(iii) Semantic information, which describes what is happening in the video along with what is perceived by the users.
2. Video data mining
Video data mining deals with the extraction of implicit knowledge, video data relationships, or other patterns not explicitly stored in video databases. It can be considered an extension of still image mining that also mines temporal image sequences. It is a process that not only automatically extracts the content and structure of video, the features of moving objects, and the spatial or temporal correlations of those features, but also discovers patterns of video structure, object activities and video events from vast amounts of video data with little assumption about their contents.
Video mining involves three main tasks: (1) video preprocessing, which produces high-quality video objects such as blocks of pixels, key frames, segments, scenes, moving objects and description text; (2) extraction of the features and semantic information of video objects, such as physical features, motion features, relation features and semantic descriptions of these features; and (3) discovery of video patterns and knowledge using video, audio and text features.
3. Key problems in video data mining
Video data mining is an emerging field that can be defined as the unsupervised discovery of patterns in audio-visual content. Mining video data is even more complicated than mining still image data. It requires tools for discovering relationships between objects or segments within the video components, such as classifying video images based on their contents, extracting patterns in sound, categorizing speech and music, and recognizing and tracking objects in video streams. Existing data-mining tools pose various problems when applied to video databases:
(a) The database model problem: video documents are generally unstructured in semantics and cannot be represented easily via the relational data model. A good video database model is therefore crucial to support more efficient video database management and mining.
(b) Retrieval results based solely on low-level feature extraction are mostly unsatisfactory and unpredictable. The semantic gap between the low-level visual features and the high-level user domain is one of the hurdles in the development of a video data-mining system.
(c) Maintaining data integrity and security in the video database management structure.
These challenges have led to a lot of research and development in the area of video data mining. The main objective of video mining is to extract the significant objects, characters and scenes by determining their frequency of re-occurrence.
4. Video data mining approaches
Recently, there has been a trend of employing various data-mining approaches to explore knowledge from video databases. Consequently, many video mining approaches have been proposed, which can be roughly classified into five categories: video pattern mining, video clustering and classification, video association mining, video content structure mining and video motion mining.
4.1 Video structure mining
Since video data is a kind of unstructured stream, efficient access to video is not an easy task. The main objective of video structure mining is therefore to identify the content structure and patterns so that fast random access to the video database becomes possible. As video structure represents the syntactic-level composition of the video content, its basic structure is represented as a hierarchy constituted by the video program, scene, shot and key frame. Video structure mining is defined as the process of discovering the fundamental logical structure of the preprocessed video program by adopting data-mining methods such as classification, clustering and association rules.
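The shot and key-frame levels of this hierarchy can be illustrated with a minimal sketch. It is not a method taken from this overview: it assumes a common histogram-difference heuristic for cut detection, a hypothetical frame representation (a flat list of grayscale values in 0..255), an untuned threshold, and the middle frame of each shot as its key frame.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: one frame = flattened grayscale pixels in 0..255.
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return a normalized intensity histogram with `bins` equal-width bins."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero for an empty frame
    return [count / total for count in counts]


def detect_shot_boundaries(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return every index i at which frame i is assumed to start a new shot.

    A boundary is declared when the L1 distance between the histograms of
    two consecutive frames exceeds `threshold` (an assumed, untuned value).
    """
    boundaries = []
    previous = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame)
        if previous is not None:
            distance = sum(abs(a - b) for a, b in zip(current, previous))
            if distance > threshold:
                boundaries.append(index)
        previous = current
    return boundaries


def split_into_shots(frames: Sequence[Frame], threshold: float = 0.5) -> List[Dict[str, int]]:
    """Group frame indices into shots and pick the middle frame as key frame."""
    cuts = [0] + detect_shot_boundaries(frames, threshold) + [len(frames)]
    shots = []
    for start, end in zip(cuts, cuts[1:]):
        if start < end:
            shots.append({
                "start": start,
                "end": end - 1,
                "key_frame": (start + end - 1) // 2,
            })
    return shots
```

For example, three dark frames followed by two bright frames would be split into two shots, each with its own key frame. Grouping shots into scenes would need a further step, such as the clustering discussed in Section 4.2.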
4.2 Video clustering and classification
Video clustering and classification are used to cluster and classify video units into different categories. Clustering is a significant unsupervised learning technique for discovering knowledge from a dataset. Clustering video sequences in order to infer and extract activities from a single video stream is an extremely important problem, with significant potential in video indexing, surveillance, activity discovery and event recognition. In video surveillance systems, clustering analysis is used to find patterns and groups of moving objects. Clustering similar shots into one unit eliminates redundancy and, as a result, produces a more concise summary of the video content. Clustering algorithms are categorized into partitioning methods, hierarchical methods, density-based methods, grid-based methods and model-based methods.
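As an illustration of a partitioning method, the sketch below runs a plain k-means over per-shot feature vectors and then keeps one representative shot per cluster, which is one simple way to obtain a more concise summary. The feature vectors, the function names and the choice to seed centroids with the first k vectors are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(features: List[Vector], k: int,
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Cluster shot feature vectors into k groups; returns (labels, centroids).

    Centroids are seeded with the first k vectors, a simplification that
    keeps the example deterministic (real systems usually seed randomly).
    """
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of vectors")
    centroids = [list(vector) for vector in features[:k]]
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: attach each shot to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(vector, centroids[c]))
            for vector in features
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(features, labels) if label == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


def cluster_representatives(features: List[Vector], labels: List[int],
                            centroids: List[List[float]]) -> List[int]:
    """Return, per non-empty cluster, the index of the shot nearest its centroid."""
    representatives = []
    for c, centroid in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            representatives.append(
                min(members, key=lambda i: squared_distance(features[i], centroid))
            )
    return representatives
```

The number of clusters k has to be supplied by the caller here. Hierarchical or density-based methods avoid that requirement at the cost of other parameters.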
4.3 Video association mining
Video association mining is the process of discovering associations in a given video. The video knowledge is explored in two stages: the first is video content processing, in which the video clip is segmented into analysis units and their representative features are extracted; the second is video association mining, which extracts knowledge from the feature descriptors. In video association mining, video processing and existing data-mining algorithms are seamlessly integrated to mine video knowledge.
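The second stage can be sketched as follows, assuming the first stage has already turned each analysis unit (for instance a shot) into a set of symbolic feature descriptors. The descriptor names, the thresholds and the restriction to rules between single items are assumptions of this example, and the support/confidence formulation is the standard one from association rule mining rather than something specified in this overview.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[str, str, float, float]  # (antecedent, consequent, support, confidence)


def item_support(units: List[Set[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of analysis units that contain every descriptor in `itemset`."""
    hits = sum(1 for unit in units if itemset <= unit)
    return hits / len(units)


def mine_pair_rules(units: List[Set[str]], min_support: float = 0.5,
                    min_confidence: float = 0.8) -> List[Rule]:
    """Return rules "antecedent -> consequent" between single descriptors.

    A rule is kept when the pair occurs in at least `min_support` of the
    units and the consequent appears in at least `min_confidence` of the
    units that contain the antecedent.
    """
    items = sorted(set().union(*units))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = item_support(units, frozenset((a, b)))
        if pair_support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            antecedent_support = item_support(units, frozenset((antecedent,)))
            confidence = pair_support / antecedent_support
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


# Hypothetical descriptors for four shots of a news program.
shots = [
    {"anchor", "studio"},
    {"anchor", "studio", "speech"},
    {"field", "crowd"},
    {"anchor", "studio", "speech"},
]
rules = mine_pair_rules(shots)
```

With these four shots, the rule "anchor -> studio" is kept (support 0.75, confidence 1.0), whereas "anchor -> speech" is dropped because only two of the three anchor shots contain speech.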
4.4 Video motion mining
Motion is a key feature that essentially characterizes the contents of a video. It represents the temporal information of the video and is more objective and consistent than other features such as color and texture. Several approaches have been proposed to extract camera motion and motion activity from video sequences. Object tracking algorithms generally assume that the object region in the frames is already known, so the most challenging problem in visual information retrieval is the recognition and detection of objects in moving video.
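A very simple motion activity measure can be sketched with frame differencing. This is an assumed baseline technique, not one attributed to the approaches mentioned above, and it does not separate camera motion from object motion. Frames are again assumed to be flat lists of grayscale values, and the pixel threshold is an arbitrary illustrative value.

```python
from typing import List, Sequence

# Hypothetical representation: one frame = flattened grayscale pixels in 0..255.
Frame = Sequence[int]


def motion_activity(frames: Sequence[Frame], pixel_threshold: int = 25) -> List[float]:
    """Return one activity score per pair of consecutive frames.

    Each score is the fraction of pixels whose intensity changes by more
    than `pixel_threshold` between the two frames.
    """
    activity = []
    for previous, current in zip(frames, frames[1:]):
        if len(previous) != len(current):
            raise ValueError("all frames must have the same number of pixels")
        changed = sum(
            1 for p, c in zip(previous, current) if abs(p - c) > pixel_threshold
        )
        activity.append(changed / (len(current) or 1))
    return activity


# Four tiny 2x2 frames: an object appears in the third frame and then stays.
frames = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
]
scores = motion_activity(frames)  # [0.0, 0.5, 0.0]
```

The scores peak at the moment the object enters the scene. Locating and following the object itself is the harder detection problem described above.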
5. Video data mining applications
The fact that video data is used in many different areas, such as sports, medicine, traffic and education programs, shows how significant it is. The potential applications of video mining include annotation, search, mining of traffic information, event detection and anomaly detection in surveillance video, and pattern or trend analysis and detection. There are four types of video in our daily life, namely (a) produced video, (b) raw video, (c) medical video, and (d) broadcast or prerecorded video.
5.1 Produced video data mining
A produced video is meticulously produced according to a script or plan and is later edited, compiled and distributed for consumption. News videos, dramas and movies are examples of produced video. They have an extremely strong structure but show tremendous variation in production styles, which vary from country to country or from content creator to content creator.
5.2 Raw video data mining
There are two common types of surveillance video used in real-world applications: security video, generally used for property or public areas, and monitoring video, used to monitor traffic flow. Surveillance systems with data-mining techniques are being investigated to find suspicious people capable of indulging in abnormal activities. However, the captured video data is commonly just stored or previewed by operators to find abnormal moving objects or events. Identifying the patterns existing in surveillance applications, building supervised models and detecting abnormal events are difficult tasks.
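To make the idea of abnormal event detection concrete, the sketch below flags time intervals whose activity score deviates strongly from a baseline recorded under normal conditions. It is a simple statistical (z-score) baseline chosen for illustration, not a supervised model and not a method described in this overview. The scores, the baseline and the threshold of three standard deviations are all assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def flag_abnormal(scores: Sequence[float], baseline: Sequence[float],
                  z_threshold: float = 3.0) -> List[int]:
    """Return indices of scores that deviate strongly from the baseline.

    `baseline` holds activity scores observed during normal operation;
    a score is flagged when it lies more than `z_threshold` population
    standard deviations away from the baseline mean.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    flagged = []
    for index, score in enumerate(scores):
        if sigma > 0:
            z = abs(score - mu) / sigma
        else:
            # Constant baseline: any deviation at all counts as abnormal.
            z = 0.0 if score == mu else float("inf")
        if z > z_threshold:
            flagged.append(index)
    return flagged


# Hypothetical per-interval activity scores from a monitored area.
normal_activity = [0.10, 0.12, 0.08, 0.11, 0.09]
observed = [0.10, 0.11, 0.60, 0.09]
suspicious = flag_abnormal(observed, normal_activity)  # [2]
```

Such a detector only marks intervals for an operator to review. Deciding whether a flagged interval really shows suspicious behaviour requires the higher-level semantic analysis discussed in Section 3.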
5.3 Medical video mining
Audio and video processing are integrated to mine medical event information, such as dialog, presentation and clinical operation, from the detected scenes in a medical video database.
5.4 Broadcast or prerecorded video mining
Broadcast video can be regarded as being made up of genres, where a genre is a set of video documents sharing a similar style. The genre of a video is the broad class to which it belongs, e.g. sports, news or cartoon. The content of broadcast video can be conceptually divided into two parts. The first is the semantic content, the story line told by the video, which is split into genre, events and objects. The second is the inherent properties of the digital video medium, termed editing effects.
PutteGowda D,
Asst. Professor,
Dept. of CSE,
ATME College of Engineering, Mysore