Course Project Report

Distributed Database Implementation with MySQL and DB2 as its underlying DBMS

Objective 1: Compare query speed
Objective 2: Compare ease of use

About MySQL

Open source
Most widly used Relational DBMS
Easy to configure and set up
Provides connectors for many languages

Amazon Web Services (AWS) EC2 Instances

Instance IDOSDBMSNumber
MySQL #Red Hat 4.1.2MySQL 5.0.454
DB2 #openSUSE 4.1.2DB2 Express-C 9.7.14

MySQL Configuration

General Setup
Create a user with password

# useradd <user name>

Later changed the home directory of the new user

# usermod -d /mnt/home/<user name> -m <user name>

Change the data directory of MySQL to the /mnt directory

Stop MySQL

    # service mysqld stop

Change the /etc/my.cnf configuration file

    :
    datadir=/var/lib/mysql => /mnt/var/lib/mysql
    :

Regenerate the system tables

    # mysql_install_db

Restart MySQL

    # service mysqld start
Create a MySQL user with password

mysql> create user <user name>

Data Preparation
Download the Java connector for MySQL, mysql-connector-java-5.1.12-bin.jar
Compile my program on node1 with the connector
Set up databases with create database command

node1 is used as both the catalog node and a data node

mysql> create database <database name>

Grant privilege to the MySQL user

mysql> grant all on <database name>.* to <user name>

showdatabases1

showdatabases2

showdatabases3

showdatabases4

Create tables with the distributed DB implementation

config.txt:

catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421

numnodes=4

node1.driver=com.mysql.jdbc.Driver
node1.hostname=jdbc:mysql://localhost:3306/mydb1
node1.username=mysqlinst1
node1.passwd=ics421

node2.driver=com.mysql.jdbc.Driver
node2.hostname=jdbc:mysql://10.212.133.242:3306/mydb2
node2.username=mysqlinst2
node2.passwd=ics421

node3.driver=com.mysql.jdbc.Driver
node3.hostname=jdbc:mysql://10.245.210.70:3306/mydb3
node3.username=mysqlinst3
node3.passwd=ics421

node4.driver=com.mysql.jdbc.Driver
node4.hostname=jdbc:mysql://10.245.221.220:3306/mydb4
node4.username=mysqlinst4
node4.passwd=ics421
                    

sql.txt:

CREATE TABLE Detection ( 
   objID BIGINT NOT NULL, 
   detectID BIGINT NOT NULL,
   filterID SMALLINT NOT NULL,
   imageID BIGINT NOT NULL,
   obsTime FLOAT NOT NULL DEFAULT -999,
   raObs FLOAT NOT NULL DEFAULT -999,
   decObs FLOAT NOT NULL DEFAULT -999,
   mag REAL NOT NULL DEFAULT -999,
   sky REAL NOT NULL DEFAULT -999,
   sgSep REAL NOT NULL DEFAULT -999 ) ;
   
CREATE TABLE Object ( 
   objID BIGINT NOT NULL,
   htmID BIGINT NOT NULL,
   zoneID INT NOT NULL,
   ra FLOAT NOT NULL,
   de FLOAT NOT NULL,
   cx FLOAT NOT NULL,
   cy FLOAT NOT NULL,
   cz FLOAT NOT NULL,
   lambda FLOAT NOT NULL DEFAULT -999,
   beta FLOAT NOT NULL DEFAULT -999,
   l FLOAT NOT NULL DEFAULT -999,
   b FLOAT NOT NULL DEFAULT -999,
   lsg FLOAT NOT NULL DEFAULT -999,
   bsg FLOAT NOT NULL DEFAULT -999,
   gMagBest REAL NOT NULL DEFAULT -999,
   rMagBest REAL NOT NULL DEFAULT -999,
   iMagBest REAL NOT NULL DEFAULT -999,
   zMagBest REAL NOT NULL DEFAULT -999,
   yMagBest REAL NOT NULL DEFAULT -999,
   grColor REAL NOT NULL DEFAULT -999,
   riColor REAL NOT NULL DEFAULT -999,
   izColor REAL NOT NULL DEFAULT -999,
   zyColor REAL NOT NULL DEFAULT -999,
   sgSep REAL NOT NULL DEFAULT -999 )
                    

showtables1

showtables2

showtables3

showtables4

dtables-mysql

Mount the data machine as a network drive with command

# sshfs db2inst1@ip-10-245-189-242:/mnt/astronomy </path/to/a/local/directory>

Create a script to read a directory and load csv files
for every .gz file f in directory
    gunzip f
    load_into_target_table(f)
    gzip f
end for

load_into_target_table(file f):
    load data local infile '/path/to/the/file/0000001.obj.csv' 
        into table Object 
        fields terminated by ',' 
        optionally enclosed by '"' 
        lines terminated by '\n';
                
time-row-count-mysql1
time-row-count-mysql2
time-row-count-mysql3
time-row-count-mysql4

DB2 Machine Setup

General Setup
A system user db2inst1 is already created
A DB2 user db2inst1 is already created
Default data directory is on the root partition

db2 => create database <database name> on path/to/the/100GB/directory/datadir

Data Preparation
Download the Java connector for DB2, db2-jcc.jar
Compile my program on node1 with the connector
Set up databases with create database command

node1 is used as both the catalog node and a data node

Create tables with the distributed DB implementation

config.txt:

catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421

numnodes=4

node1.driver=com.ibm.db2.jcc.DB2Driver
node1.hostname=jdbc:db2://localhost:50001/mydb1
node1.username=db2inst1
node1.passwd=ics421

node2.driver=com.ibm.db2.jcc.DB2Driver
node2.hostname=jdbc:db2://10.195.47.21:50001/mydb2
node2.username=db2inst1
node2.passwd=ics421

node3.driver=com.ibm.db2.jcc.DB2Driver
node3.hostname=jdbc:db2://10.195.11.198:50001/mydb3
node3.username=db2inst1
node3.passwd=ics421

node4.driver=com.ibm.db2.jcc.DB2Driver
node4.hostname=jdbc:db2://10.195.11.193:50001/mydb4
node4.username=db2inst1
node4.passwd=ics421
                    

sql.txt:

CREATE TABLE Detection ( 
   objID BIGINT NOT NULL, 
   detectID BIGINT NOT NULL,
   filterID SMALLINT NOT NULL,
   imageID BIGINT NOT NULL,
   obsTime FLOAT NOT NULL DEFAULT -999,
   raObs FLOAT NOT NULL DEFAULT -999,
   decObs FLOAT NOT NULL DEFAULT -999,
   mag REAL NOT NULL DEFAULT -999,
   sky REAL NOT NULL DEFAULT -999,
   sgSep REAL NOT NULL DEFAULT -999 ) 
COMPRESS YES NOT LOGGED INITIALLY;
   
CREATE TABLE Object ( 
   objID BIGINT NOT NULL PRIMARY KEY,
   htmID BIGINT NOT NULL,
   zoneID INT NOT NULL,
   ra DOUBLE NOT NULL,
   de DOUBLE NOT NULL,
   cx DOUBLE NOT NULL,
   cy DOUBLE NOT NULL,
   cz DOUBLE NOT NULL,
   lambda FLOAT NOT NULL DEFAULT -999,
   beta FLOAT NOT NULL DEFAULT -999,
   l FLOAT NOT NULL DEFAULT -999,
   b FLOAT NOT NULL DEFAULT -999,
   lsg FLOAT NOT NULL DEFAULT -999,
   bsg FLOAT NOT NULL DEFAULT -999,
   gMagBest REAL NOT NULL DEFAULT -999,
   rMagBest REAL NOT NULL DEFAULT -999,
   iMagBest REAL NOT NULL DEFAULT -999,
   zMagBest REAL NOT NULL DEFAULT -999,
   yMagBest REAL NOT NULL DEFAULT -999,
   grColor REAL NOT NULL DEFAULT -999,
   riColor REAL NOT NULL DEFAULT -999,
   izColor REAL NOT NULL DEFAULT -999,
   zyColor REAL NOT NULL DEFAULT -999,
   sgSep REAL NOT NULL DEFAULT -999 ) 
COMPRESS YES NOT LOGGED INITIALLY;
                    

listtables1

listtables2

listtables3

listtables4

dtables-db2

Copy data from the data machine using secure copy

# scp db2inst1@10.245.189.242:path/on/the/remote/drive path/to/the/local/folder

Create a script to read a directory and load csv files
for every .gz file f in directory
    gunzip f
    db2 connect to <database name> user <user name> using <password>
    load_into_target_table(f)
    gzip f
end for

load_into_target_table(file f):
    db2 load from <file path> of del insert into <table name>
                
time-row-count-db21
time-row-count-db22
time-row-count-db23
time-row-count-db24

Comparison of DB2 and MySQL

Experiments of Section I
Experiment 1
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt
select count(*) from object;
                
Result
row-count-jade-mysql
row-count-jade-db2-2
Experiment 2
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt (query 100)
SELECT O.objID, O.ra, O.de, O.htmid, O.zoneid
FROM Object O
WHERE
(   SIN(RADIANS(O.de)) * SIN(RADIANS( +0.5))
  + COS(RADIANS(O.de)) * COS(RADIANS( +0.5))
     * COS(RADIANS((O.ra) - (67.5)))
) >=  COS(RADIANS( 1.0/60.0));
                
Result
query-100-jade-mysql
query-100-jade-db2
Experiment 3
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt (query 101)
SELECT O.objID, O.ra, O.de, O.htmid, O.zoneid
FROM Object O
WHERE
(O.zoneid BETWEEN FLOOR((90.0 + ( +0.5) - ( 1.0/60.0))/0.008333)
            AND FLOOR((90.0 + ( +0.5) + ( 1.0/60.0))/0.008333))
AND
(   SIN(RADIANS(O.de)) * SIN(RADIANS( +0.5))
  + COS(RADIANS(O.de)) * COS(RADIANS( +0.5))
     * COS(RADIANS((O.ra) - (67.5)))
) >=  COS(RADIANS( 1.0/60.0));
                
Result
query-101-jade-mysql
query-101-jade-db2
Experiments of Section II
Experiment 1
Performance Comparison with and without indexes
Experiment is done on just one data node
Result
index-experiment-mysql
index-experiment-db2
Experiment 2
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt (query 101)
SELECT O.objID, O.ra, O.de, O.htmid, O.zoneid
FROM Object O
WHERE
(O.zoneid BETWEEN FLOOR((90.0 + ( +0.5) - ( 1.0/60.0))/0.008333)
            AND FLOOR((90.0 + ( +0.5) + ( 1.0/60.0))/0.008333))
AND
(   SIN(RADIANS(O.de)) * SIN(RADIANS( +0.5))
  + COS(RADIANS(O.de)) * COS(RADIANS( +0.5))
     * COS(RADIANS((O.ra) - (67.5)))
) >=  COS(RADIANS( 1.0/60.0));
                
Result
query-101-jade-mysql-index
query-101-jade-db2-index
Experiment 3
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt (query 100)
SELECT O.objID, O.ra, O.de, O.htmid, O.zoneid
FROM Object O
WHERE
(   SIN(RADIANS(O.de)) * SIN(RADIANS( +0.5))
  + COS(RADIANS(O.de)) * COS(RADIANS( +0.5))
     * COS(RADIANS((O.ra) - (67.5)))
) >=  COS(RADIANS( 1.0/60.0));
                
Script
for i from 0 to 99
    execute: (time ./run.sh config.txt sql.txt > /dev/null) 2>&1 | grep real
                
Analysis
MySQL (s)DB2 (s)
045.55856.925
145.87659.460
245.93658.074
346.02454.029
448.33656.999
548.63652.168
:::
<snip><snip><snip>
:::
9647.51652.349
9748.45255.929
9849.48558.173
9949.52059.229
Average48.64756.490
Minimum45.55851.870
Maximum51.43259.779
Experiment 4
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt (query 101)
SELECT O.objID, O.ra, O.de, O.htmid, O.zoneid
FROM Object O
WHERE
(O.zoneid BETWEEN FLOOR((90.0 + ( +0.5) - ( 1.0/60.0))/0.008333)
            AND FLOOR((90.0 + ( +0.5) + ( 1.0/60.0))/0.008333))
AND
(   SIN(RADIANS(O.de)) * SIN(RADIANS( +0.5))
  + COS(RADIANS(O.de)) * COS(RADIANS( +0.5))
     * COS(RADIANS((O.ra) - (67.5)))
) >=  COS(RADIANS( 1.0/60.0));
                
Script
for i from 0 to 99
    execute: (time ./run.sh config.txt sql.txt > /dev/null) 2>&1 | grep real
                
Analysis
MySQL (s)DB2 (s)
03.33813.809
13.31013.029
23.23413.269
33.40214.131
43.23313.164
53.05412.551
:::
<snip><snip><snip>
:::
963.25413.134
973.08912.991
983.23813.052
993.13413.146
Average3.25213.287
Minimum3.05311.936
Maximum3.40213.871
Experiment 5
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt (query 102)
SELECT O.objID, O.ra, O.de, O.htmid, O.zoneid
FROM Object O
WHERE
(O.ra BETWEEN ((67.5)-( 1.0/60.0)) AND ((67.5)+( 1.0/60.0)))
AND
(O.de BETWEEN (( +0.5)-( 1.0/60.0)) AND (( +0.5)+( 1.0/60.0)))
AND
(   SIN(RADIANS(O.de)) * SIN(RADIANS( +0.5))
  + COS(RADIANS(O.de)) * COS(RADIANS( +0.5))
     * COS(RADIANS((O.ra) - (67.5)))
) >=  COS(RADIANS( 1.0/60.0));
                
Script
for i from 0 to 99
    execute: (time ./run.sh config.txt sql.txt > /dev/null) 2>&1 | grep real
                
Analysis
MySQL (s)DB2 (s)
01.2965.750
11.3545.540
21.3375.624
31.3545.683
41.3175.349
51.3545.729
:::
<snip><snip><snip>
:::
961.3346.167
971.3655.856
981.3806.563
991.3725.661
Average1.2965.534
Minimum1.0594.778
Maximum1.4106.827
Experiment 6
config.txt (MySQL)
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/catalog
catalog.username=mysqlinst1
catalog.passwd=ics421
                
config.txt (DB2)
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://localhost:50001/catalog
catalog.username=db2inst1
catalog.passwd=ics421
                
sql.txt (query 100)
SELECT count (*) from Object;
                
Script
for i from 0 to 99
    execute: (time ./run.sh config.txt sql.txt > /dev/null) 2>&1 | grep real
                
Analysis
MySQL (s)DB2 (s)
01.3249.057
11.2189.917
21.2069.253
31.2099.608
41.0819.790
51.2589.624
:::
<snip><snip><snip>
:::
961.3069.224
971.2069.409
981.2699.345
991.2829.745
Average1.1959.420
Minimum0.9788.957
Maximum1.33310.317

Summary

Query Speed Comparison
chart
Ease of Use Comparison
Overall, two DBMSs work similarly
Java connectors are practically identical
Query syntax is similar
Command line tool syntax is different
DB2 command line processor is stateful, MySQL is stateless
MySQL standard output is a little cleaner

Source

Valid HTML 4.01 Valid CSS