In this blog post, I list a number of big data tools, many of them Apache projects, for solving common big data problems. They are grouped by category, each with a link to its project page.
| Hadoop Family Tools | Project URL |
| Hadoop | http://hadoop.apache.org/ |
| Ambari | http://ambari.apache.org/ |
| Avro | http://avro.apache.org/ |
| Cascading | http://www.cascading.org/projects/cascading/ |
| Chukwa | http://chukwa.apache.org/ |
| Flume | https://cwiki.apache.org/confluence/display/FLUME/Home |
| HBase | http://hbase.apache.org/ |
| Hadoop Distributed File System | https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html |
| Hive | http://hive.apache.org/ |
| Hivemall | https://github.com/myui/hivemall |
| Mahout | http://mahout.apache.org/ |
| MapReduce | http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html |
| Oozie | http://oozie.apache.org/ |
| Pig | http://pig.apache.org/ |
| Sqoop | http://sqoop.apache.org/ |
| Spark | http://spark.apache.org/ |
| Tez | http://tez.apache.org/ |
| YARN | http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html |
| ZooKeeper | http://zookeeper.apache.org/ |
| Big Data Analysis Platforms and Tools | Project URL |
| Disco | http://discoproject.org/ |
| HPCC | http://hpccsystems.com/ |
| Lumify | http://lumify.io/ |
| Pandas | http://pandas.pydata.org/ |
| Storm | https://storm.apache.org/ |
| Databases/Data Warehouses | Project URL |
| Blazegraph | http://www.systap.com/bigdata |
| Cassandra | http://cassandra.apache.org/ |
| CouchDB | http://couchdb.apache.org/ |
| FlockDB | https://github.com/twitter/flockdb |
| Hibari | http://hibari.github.com/hibari-doc/ |
| Hypertable | http://hypertable.org/ |
| Impala | http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html |
| InfoBright Community Edition | http://www.infobright.org/ |
| MongoDB | http://www.mongodb.org/ |
| Neo4j | http://neo4j.org/ |
| OrientDB | http://www.orientdb.org/index.htm |
| Pivotal Greenplum Database | http://pivotal.io/big-data/pivotal-greenplum-database |
| Riak | http://basho.com/riak-0-10-is-full-of-great-stuff/ |
| Redis | http://redis.io/ |
| Business Intelligence | Project URL |
| Talend Open Studio | http://www.talend.com/index.php |
| Jaspersoft | http://www.jaspersoft.com/ |
| Pentaho | http://community.pentaho.com/ |
| SpagoBI | http://www.spagoworld.org/xwiki/bin/view/SpagoWorld/ |
| KNIME | http://www.knime.org/ |
| BIRT | http://www.eclipse.org/birt/phoenix/ |
| Data Mining | Project URL |
| DataMelt | http://jwork.org/dmelt/ |
| KEEL | http://keel.es/ |
| Orange | http://orange.biolab.si/ |
| RapidMiner | https://rapidminer.com/ |
| Rattle | http://rattle.togaware.com/ |
| SPMF | http://www.philippe-fournier-viger.com/spmf/ |
| Weka | http://www.cs.waikato.ac.nz/~ml/weka/ |
| Query Engines | Project URL |
| Drill | http://drill.apache.org/ |
| Programming Languages | Project URL |
| R | http://www.r-project.org/ |
| Big Data Search | Project URL |
| Lucene | http://lucene.apache.org/core/ |
| Solr | http://lucene.apache.org/solr/ |
| In-Memory Technology | Project URL |
| Ignite | http://ignite.apache.org/ |
| Terracotta | http://www.terracotta.org/ |