Execution Engines
|
Batch
|
Tez
|
|
Spark
|
A fast and general engine for large-scale data processing.
|
||
Cascading
|
|||
Pig
|
A high-level data-flow platform for Hadoop. Scripts written in Pig Latin are compiled into MapReduce jobs. Use it when your processes are ETL-like.
|
||
MapReduce v1/v2
|
|||
ML, Graph
|
GraphX
|
||
MLlib
|
|||
Mahout
|
A library for machine learning and predictive analytics.
|
||
SQL
|
Drill
|
A schema-free SQL query engine for Hadoop, NoSQL, and cloud storage. It does not use MapReduce.
|
|
Shark
|
|||
Impala
|
|||
Hive
|
SQL-like querying for data stored in Hadoop; it can also be used with HBase. It uses HiveQL. Suited to ad-hoc querying.
|
||
NoSQL & Search
|
Accumulo
|
||
Solr
|
|||
HBase
|
|||
Streaming
|
Storm
|
A free and open source distributed real-time computation system.
|
|
Spark Streaming
|
|||
YARN
|
“Yet Another Resource Negotiator”, sometimes called MapReduce 2.0. Apache YARN decouples resource management from data processing in Hadoop.
|
||
Data Governance & Operations
|
Data Integration & Access
|
Hue
|
|
HttpFS
|
|||
Flume
|
A log collector. It gathers, aggregates, and moves large volumes of log data into Hadoop. Hadoop jobs run in batch, so they take time to run and produce a large amount of log information about job progress.
|
||
Sqoop
|
Transfers bulk data between Hadoop and structured datastores such as relational databases (for example, Oracle).
|
||
Security
|
Knox
|
||
Sentry
|
|||
Workflow & Data Governance
|
Falcon
|
||
Oozie
|
A workflow scheduler for Hadoop jobs.
|
||
Provisioning & Coordination
|
Savanna
|
||
Juju
|
|||
ZooKeeper
|
A centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
|
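Several entries in the table above (Pig, MapReduce v1/v2) rest on the MapReduce model. As a rough illustration only, the sketch below imitates its three steps (map, shuffle, reduce) in plain Python on a single machine. It does not use Hadoop or any Hadoop API, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are made up for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all values by key, as a framework would do
    between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    print(word_count(["Pig generates MapReduce jobs", "batch jobs take time"]))
```

On a real cluster the map and reduce steps run in parallel across many nodes and the shuffle moves data over the network; tools such as Pig exist so that you can describe the data flow without writing these steps by hand.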
Wednesday, January 18, 2017
MapR EcoSystem
https://www.mapr.com/products/product-overview/overview%20