BDEv: Big Data Evaluator


BDEv Features

BDEv provides the following features, all of which run automatically.

  • BDEv performs:
    • Automatic configuration of the frameworks
    • Deployment of the frameworks using different cluster sizes
    • Execution of predefined and user-defined workloads (batch and interactive modes)
  • BDEv automatically records:
    • Output and runtime of the workloads
    • Performance and scalability results
    • Configuration and log directories of the frameworks
    • Resource utilization stats using dstat/dool and BDWatchdog
    • CPU power consumption stats using RAPL and turbostat
    • Microarchitecture-level events using OProfile
    • Energy consumption stats using HPE iLO technology
    • Automatically generated graphs
  • BDEv allows you to:
    • Execute the workloads using different frameworks and cluster sizes
    • Unify the configuration of the different frameworks
    • Configure timeouts for the workloads
    • Use high-performance resources such as IP over InfiniBand
    • Use multiple disks to store intermediate data for the frameworks
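
As a rough illustration of what running workloads across different frameworks and cluster sizes under a timeout involves, the sketch below enumerates every combination and runs a command with a time limit. It is not BDEv's implementation: the function names `plan_experiments` and `run_with_timeout` and the example values are hypothetical, and the command executed here is only a stand-in for a real workload.

```python
import itertools
import subprocess
import sys


def plan_experiments(frameworks, cluster_sizes, benchmarks):
    """Return every (framework, cluster_size, benchmark) combination."""
    return list(itertools.product(frameworks, cluster_sizes, benchmarks))


def run_with_timeout(command, timeout_s):
    """Run a command and return 'ok', 'failed', or 'timeout'."""
    try:
        result = subprocess.run(command, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Hypothetical selection: two frameworks, two cluster sizes, one benchmark.
    plan = plan_experiments(["Hadoop-YARN", "Spark"], [4, 8], ["Wordcount"])
    for framework, size, benchmark in plan:
        # Stand-in command; a real run would launch the workload on the cluster.
        status = run_with_timeout([sys.executable, "-c", "pass"], timeout_s=30)
        print(f"{framework} on {size} nodes, {benchmark}: {status}")
```

Enumerating the full cross product up front makes it easy to see how many runs an evaluation will need before any cluster time is spent.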

Supported frameworks in BDEv 3.9

Framework        Version                                              Deploy mode        Network interfaces
Hadoop           1.2.1                                                JobTracker         GbE / IPoIB
Hadoop-YARN      2.10.2 / 3.1.4 / 3.2.4 / 3.3.6 / 3.4.0               YARN               GbE / IPoIB
Hadoop-UDA       1.2.1                                                JobTracker         IPoIB
Hadoop-UDA-YARN  2.10.2                                               YARN               IPoIB
Flame-MR         1.1 / 1.2                                            YARN               GbE / IPoIB
RDMA-Hadoop      0.9.9                                                JobTracker         GbE / IPoIB
RDMA-Hadoop-2    1.2.0 / 1.3.5                                        YARN               GbE / IPoIB
RDMA-Hadoop-3    0.9.1                                                YARN               GbE / IPoIB
DataMPI          0.6.0                                                Standalone         GbE / IPoIB
Spark            1.6.3 / 2.4.8 / 3.2.4 / 3.3.4 / 3.5.1                Standalone / YARN  GbE / IPoIB
RDMA-Spark       0.9.5                                                Standalone / YARN  GbE / IPoIB
Flink            1.14.6 / 1.15.4 / 1.16.3 / 1.17.2 / 1.18.1 / 1.19.0  Standalone / YARN  GbE / IPoIB
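
The table above can also be read as a compatibility matrix. The sketch below encodes a subset of its rows and checks whether a chosen combination is listed; the `SUPPORTED` dictionary and the `is_supported` helper are hypothetical names used for illustration only and are not part of BDEv.

```python
# Subset of rows copied from the table of supported frameworks.
SUPPORTED = {
    "Hadoop": {"deploy": {"JobTracker"}, "network": {"GbE", "IPoIB"}},
    "Hadoop-UDA": {"deploy": {"JobTracker"}, "network": {"IPoIB"}},
    "Hadoop-UDA-YARN": {"deploy": {"YARN"}, "network": {"IPoIB"}},
    "DataMPI": {"deploy": {"Standalone"}, "network": {"GbE", "IPoIB"}},
    "Spark": {"deploy": {"Standalone", "YARN"}, "network": {"GbE", "IPoIB"}},
    "Flink": {"deploy": {"Standalone", "YARN"}, "network": {"GbE", "IPoIB"}},
}


def is_supported(framework, deploy_mode, network):
    """Return True if the combination appears in the SUPPORTED subset."""
    entry = SUPPORTED.get(framework)
    if entry is None:
        return False
    return deploy_mode in entry["deploy"] and network in entry["network"]


if __name__ == "__main__":
    print(is_supported("Spark", "YARN", "GbE"))
    print(is_supported("Hadoop-UDA", "JobTracker", "GbE"))
```

Note that the Hadoop-UDA variants are listed with IPoIB only, so a GbE selection for them is rejected by this check.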

Supported benchmarks in BDEv 3.9

Benchmark             Input generator                 Source code
                                                      Hadoop           DataMPI           Spark         Flink
Testdfsio             -                               Hadoop examples  -                 -             -
Wordcount             RGen (Hadoop RandomTextWriter)  Hadoop examples  DataMPI examples  BDEv*         BDEv*
Sort                  RGen (Hadoop RandomTextWriter)  Hadoop examples  DataMPI examples  BDEv*         BDEv*
Grep                  RGen (Hadoop RandomTextWriter)  Hadoop examples  DataMPI examples  BDEv*         BDEv*
Terasort              RGen (Hadoop TeraGen)           Hadoop examples  DataMPI examples  BDEv*         BDEv*
TPCx-HS               RGen (TPC HSGen)                TPC              -                 TPC           BDEv
Pagerank              RGen                            Pegasus          -                 BDEv*         BDEv*
Connected Components  RGen                            Pegasus          -                 BDEv*         BDEv*
Bayes                 RGen                            Apache Mahout    -                 BDEv*         -
Kmeans                RGen (Mahout GenKMeansDataset)  Apache Mahout    -                 BDEv*         BDEv*
Aggregation           RGen                            Apache Hive      -                 BDEv          -
Join                  RGen                            Apache Hive      -                 BDEv          -
Scan                  RGen                            Apache Hive      -                 BDEv          -
Command               User-defined                    User-defined     User-defined      User-defined  User-defined
*Adapted from Spark and Flink examples when possible