File properties:
File name: Service-generated Big Data and Big Data-as-a-Service: An Overview
File size: 3.9 MB
File format: PDF
Last updated: 2018-04-14 01:24:14
Big Data
We live in an on-demand, on-command digital universe,
with data proliferating from institutions, individuals and machines at
a very high rate. This data is categorized as "Big Data" due to its
sheer volume, variety and velocity. Most of this data is
unstructured, quasi-structured or semi-structured, and it is
heterogeneous in nature.
The volume and heterogeneity of the data, together with the speed at
which it is generated, make it difficult for present computing
infrastructure to manage Big Data. Traditional data management,
warehousing and analysis systems lack the tools to analyze this
data.
Due to its specific nature, Big Data is stored in distributed
file system architectures. Apache Hadoop and HDFS are widely
used for storing and managing Big Data. Analyzing Big Data is a
challenging task, as it involves large distributed file systems that
should be fault tolerant, flexible and scalable. MapReduce is
widely used for the efficient analysis of Big Data. Traditional
DBMS techniques such as joins and indexing, and other techniques
such as graph search, are used for classification and clustering of Big
Data. These techniques are being adapted for use in
MapReduce.
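
To illustrate how a traditional DBMS join might be expressed in the
map, shuffle and reduce pattern, here is a minimal single-process
sketch in Python. It is not taken from the paper and does not use the
Hadoop API. The function names (map_phase, shuffle, reduce_phase,
reduce_side_join), the "L"/"R" tags and the sample records are all
hypothetical, and the shuffle step is simulated with an in-memory
dictionary rather than a distributed framework.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, tag, key_index):
    """Emit (join_key, (tag, record)) pairs for one input relation."""
    for record in records:
        yield record[key_index], (tag, record)


def shuffle(pairs):
    """Group mapped values by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Pair every left record with every right record sharing the key."""
    left = [rec for tag, rec in values if tag == "L"]
    right = [rec for tag, rec in values if tag == "R"]
    for l_rec, r_rec in product(left, right):
        yield key, l_rec, r_rec


def reduce_side_join(left_records, right_records):
    """Inner join of two relations on their first field (assumed layout)."""
    pairs = list(map_phase(left_records, "L", 0))
    pairs += list(map_phase(right_records, "R", 0))
    groups = shuffle(pairs)
    results = []
    for key in sorted(groups):
        results.extend(reduce_phase(key, groups[key]))
    return results


# Hypothetical sample data: (user_id, name) and (user_id, item).
users = [(1, "alice"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]
joined = reduce_side_join(users, orders)
```

In this sketch only keys present in both inputs produce output, so it
behaves like an inner join. In a real cluster the grouping by key would
be done by the framework across machines, which is where the fault
tolerance and scalability mentioned above come into play.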
In this paper we suggest various met