Wednesday, December 2, 2015

Apache Big Data Tools

In this blog post, I list a few Apache Big Data tools, grouped by category, that help solve common big data problems.

Hadoop Family Tools
Project URL
Hadoop http://hadoop.apache.org/
Ambari http://ambari.apache.org/
Avro http://avro.apache.org/
Cascading http://www.cascading.org/projects/cascading/
Chukwa http://chukwa.apache.org/
Flume https://cwiki.apache.org/confluence/display/FLUME/Home
HBase http://hbase.apache.org/
Hadoop Distributed File System https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
Hive http://hive.apache.org/
Hivemall https://github.com/myui/hivemall
Mahout http://mahout.apache.org/
MapReduce http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
Oozie http://oozie.apache.org/
Pig http://pig.apache.org/
Sqoop http://sqoop.apache.org/
Spark http://spark.apache.org/
Tez http://tez.apache.org/
YARN http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
ZooKeeper http://zookeeper.apache.org/
Big Data Analysis Platforms and Tools
Project URL
Disco http://discoproject.org/
HPCC http://hpccsystems.com/
Lumify http://lumify.io/
Pandas http://pandas.pydata.org/
Storm https://storm.apache.org/
Databases/Data Warehouses
Project URL
Blazegraph http://www.systap.com/bigdata
Cassandra http://cassandra.apache.org/
CouchDB http://couchdb.apache.org/
FlockDB https://github.com/twitter/flockdb
Hibari http://hibari.github.com/hibari-doc/
Hypertable http://hypertable.org/
Impala http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html
InfoBright Community Edition http://www.infobright.org/
MongoDB http://www.mongodb.org/
Neo4j http://neo4j.org/
OrientDB http://www.orientdb.org/index.htm
Pivotal Greenplum Database http://pivotal.io/big-data/pivotal-greenplum-database
Riak http://basho.com/riak-0-10-is-full-of-great-stuff/
Redis http://redis.io/
Business Intelligence
Project URL
Talend Open Studio http://www.talend.com/index.php
Jaspersoft http://www.jaspersoft.com/
Pentaho http://community.pentaho.com/
SpagoBI http://www.spagoworld.org/xwiki/bin/view/SpagoWorld/
KNIME http://www.knime.org/
BIRT http://www.eclipse.org/birt/phoenix/
Data Mining
Project URL
DataMelt http://jwork.org/dmelt/
KEEL http://keel.es/
Orange http://orange.biolab.si/
RapidMiner https://rapidminer.com/
Rattle http://rattle.togaware.com/
SPMF http://www.philippe-fournier-viger.com/spmf/
Weka http://www.cs.waikato.ac.nz/~ml/weka/
Query Engines
Project URL
Drill http://drill.apache.org/
Programming Languages
Project URL
R http://www.r-project.org/

Big Data Search
Project URL
Lucene http://lucene.apache.org/core/
Solr http://lucene.apache.org/solr/
In-Memory Technology
Project URL
Ignite http://ignite.apache.org/
Terracotta http://www.terracotta.org/


