Index

A

Abstraction tools

Access to data

Accuracy of data

Activity logs

Algorithms

accuracy
anomalies
data mining
evolution of
real-time results
scenarios
statistical applications
text analytics

Amazon

Amazon S3

Analysis of data. See Data analysis

Anomalies, value of

Apple

Applications

Archives

Artificial intelligence

Astronomy

Auto-categorization

Automated metadata acquisition systems

Availability of data

B

BA. See Business analytics (BA)

BackType

Backup systems

Batch processing

Behavioral analytics

Benefits analysis

Best practices

anomalies
expediency-accuracy tradeoff
high-value opportunities focus
in-memory processing
project management processes
project prerequisites
thinking big
worst practice avoidance

BI. See Business intelligence (BI)

Big Data and Big Data analytics

analysis categories
application platforms
best practices
business case development
challenges
classifications
components
defined
evolution of
examples of
4Vs of
goal setting
introduction
investment in
path to
phases of
potential of
privacy issues
processing
role of
security (See Security)
sources of
storage
team development
technologies (See Technologies)
value of
visualizations

Big Science

BigSheets

Bigtable

Bioinformatics

Biomedical industry

Blekko

Business analytics (BA)

Business case

best practices
data collection and storage options
elements of
introduction

Business intelligence (BI)

as Big Data analytics foundation
Big Data analytics team incorporation
Big Data impact
defined
extract, transform, and load (ETL)
information technology and
in-memory processing
limitations of
marketing campaigns
risk analysis
storage capacity issues
unstructured data
visualizations

Business leads

Business logic

Business objectives

Business rules

C

Capacity of storage systems

Cassandra

Census data

CERN

Citi

Classification of data

Cleaning

Click-stream data

Cloud computing

Cloudera

Combs, Nick

Commodity hardware

Common Crawl Corpus

Communication

Competition

Compliance

Computer security officers (CSOs)

Consulting firms

Core capabilities, data analytics team

Costs

Counterintelligence mind-set

CRUD (create, retrieve, update, delete) applications

Cryptographic keys

Culture, corporate

Customer needs

Cutting, Doug

D

Data

defined
growth in volume of
value of
See also Big Data and Big Data analytics

Data analysis

categories
challenges
complexity of
as critical skill for team members
data accuracy
evolution of
importance of
process
technologies

Database design

Data classification

Data discovery

Data extraction

Data integration

technologies
value creation

Data interpretation

Data manipulation

Data migration

Data mining

components
as critical skill for team members
defined
examples
methods
technologies

Data modeling

Data protection. See Security

Data retention

Data scientists

Data sources

growth of
identification of
importation of data into platform
public information

Data visualization

Data warehouses

DevOps

Discovery of data

Disk cloning

Disruptive technologies

Distributed file systems. See also Hadoop

Dynamo

E

e-commerce

Economist

e-discovery

Education

80Legs

Electronic medical records

compliance
data errors
data extraction
privacy issues
trends

Electronic transactions

EMC Corporation

Employees

data analytics team membership
monitoring of
training

Encryption

Entertainment industry

Entity extraction

Entity relation extraction

Errors

Event-driven data distribution

Evidence-based medicine

Evolution of Big Data

algorithms
current issues
future developments
modern era
origins of

Expectations

Expediency-accuracy tradeoff

External data

Extract, transform, and load (ETL)

Extractiv

F

Facebook

Filters

Financial controllers

Financial sector

Financial transactions

Flexibility of storage systems

4Vs of Big Data

G

Gartner

General Electric (GE)

Gephi

Goal setting

Google

Google Books Ngrams

Google Refine

Governance

Government agencies

Grep

H

Hadoop

advantages and disadvantages of
design and function of
event-processing framework
future
origins of
vendor support
Yahoo’s use

HANA

HBase

HDFS

Health care

Big Data analytics opportunities
Big Data trends
compliance
evolution of Big Data
See also Electronic medical records

Hibernate

High-value opportunities

History. See Evolution of Big Data

Hive

Hollerith Tabulating System

Hortonworks

I

IBM

IDC (International Data Corporation)

IDC Digital Universe Study

Information professionals

Information technology (IT)

Big Data analytics team incorporation
business value focus
database management as percentage of budget
data governance
evolution of
in-memory processing impact
pilot programs
user analysis

In-memory processing

Input-output operations per second (IOPS)

Integration of data

Intellectual property

Interconnected data

Internal data

International Biological Program

International Data Corporation

International Geophysical Year project

Interpretation of data

J

Jahanian, Farnam

JPA

K

Kelly, Nuala O’Connor

Kogan, Caron

L

Labeling of confidential information

Latency of storage systems

Legal issues

LexisNexis Risk Solutions

Liability

Life sciences

LivingSocial

Location-based services

Lockheed Martin

Log-in screens

Logistics

Logs, activity

Loyalty programs

M

Maintenance plans

Manhattan Project

Manipulation of data

Manufacturing, in-memory processing technology

Mapping tools

MapR

MapReduce

advantages
built-in support for integration
defined
Hadoop
relational database management systems

Marketing campaigns

Memory, brain’s capacity

Metadata

Metrics

Mining. See Data mining

Mobile devices

Modeling

Moore’s Law

Mozenda

N

NAS

National Oceanic and Atmospheric Administration (NOAA)

National Science Foundation (NSF)

Natural language recognition

New York Times

Noisy data

NoSQL (Not only SQL)

O

Object-based storage systems

OLAP systems

Oozie

OpenHeatMap

Open source technologies

availability
options
pilot projects
See also Hadoop

Organizational structure

Outsourcing

P

Parallel processing

Patents

Pentaho

Performance measurement

Performance-security tradeoff

Perlowitz, Bill

Pharmaceutical companies

Pig

Pilot projects

Planning

Point-of-sale (POS) data

Predictive analysis

Privacy

Problem identification

Processing

Project management processes

Project planning

Public information sources

Purging of data

Q

Queries

R

RAM-based devices

Real-time analytics

Recruitment of data analytics personnel

Red Hat

Relational database management system (RDBMS)

Research and development (R&D)

Resource description framework (RDF)

Results

Retailers

anomalies
Big Data use
click-stream data
data sources
goal setting
in-memory processing technology
organizational culture

Retention of data

Return on investment (ROI)

Risk analysis

S

SANS

SAP

Scale-out storage solutions

Scaling

Scenarios

Schmidt, Eric

Science

Scope of project

Scrubbing programs

Security

backup systems
challenges
compliance issues
data classification
data retention
intellectual property
rules
technologies

Semantics

event-driven data distribution support
mapping of
technologies
trends

Semistructured data

Sensor data

filtering
growth of
types

Silos

Sloan Digital Sky Survey

Small and medium businesses (SMBs)

Smart meters

Smartphones

Snapshots

Social media

Software. See Technologies

Sources of data. See Data sources

Space program

Specificity of information

Speed-accuracy tradeoff

Spring Data

SQL

limitations
NoSQL integration
scaling

Stale data

Statistical applications

Storage

Storm

Structured data

Success, measurement of

Supplementary information

Supply chain

T

Tableau Public

Taxonomies

Team members

Technologies

application platforms
Cassandra
cloud computing
commodity hardware
decision making
processing power
security
storage
Web-based tools
worst practices
See also Hadoop

Telecommunications

Text analytics

Thin provisioning

T-Mobile

Training

Transportation

Trends

Trusted applications

Turk

Twitter

U

United Parcel Service (UPS)

Unstructured data

complexity of
defined
forms
growth of
project goal setting
social media’s collection
technologies
varieties of

U.S. census

User analysis

Utilities sector

V

Value, extraction of

Variety

Velocity

Vendor lock-in

Veracity

Videos

Video surveillance

Villanustre, Flavio

Visualization

Volume

W

Walt Disney Company

Watson

Web-based technologies

Web sites

click-stream data
logs
traffic distribution

White-box systems

Worst practices

Wyle Laboratories

X

XML

Y

Yahoo