Pursuing

Alfred’s weblog

An severe performance impact by subselect in sql “from” clause (DB2)

Started working on the auto-pricing process performance test, but an very strange thing blocks me: a very simple query runs in very very bad performance, only 1 record inserted per second! The database is DB2 8.5 on zLinux server which runs in normal CPU utilization (35% only).

Originally, I thought there must be some problem in the database, so bad performance should not happen even the SQL hasn’t been optimized. But the feedback from DBA is that he checked all things in DB, but can not find any abnormal indication. I really felt overwhelmed!

But today, I finally acknowledge that the bad performance is really caused by my query:

// less 1 record inserted per second, 30000 records insertion need 10 hours to complete (timeout…)
INSERT INTO WWPRT.CABLEPRODUCT_PRICETYPE_JOIN_CN (CABLEPRODUCTID, PRICETYPE, SLAVE) (
     SELECT cableproduct.ID, pricetypes.PRICETYPE, ‘N’
     FROM WWPRT.CABLE_PRODUCT_JOIN_CN AS cableproduct,
     (
          SELECT DISTINCT usprice.PRICETYPE FROM WWPRT.PRICE_CN usprice
          JOIN WWPRT.CONF_PRICETYPE pt ON pt.ID = usprice.PRICETYPE
          WHERE usprice.CABLEID = ‘USCABF3FVT01’
          AND pt.DOMAIN IN ( SELECT DISTINCT DOMAIN FROM WWPRT.CONF_GAP )

     ) AS pricetypes
     WHERE CABLEID = ‘GEOANNF3FVT1’
     AND EXISTS (
         ……
     )
)

The arch-criminal is the subselect statement in “from” clause, the subselect query is performed for each record in WWPRT.CABLE_PRODUCT_JOIN_CN table. (awful!)

I changed it to a new way, using temporary table which query is executed only once.

// 30000 records insertion only costs 9 seconds to complete!
WITH US_PRICETYPES (PRICETYPE) AS (
    SELECT DISTINCT usprice.PRICETYPE FROM WWPRT.PRICE_CN usprice
    JOIN WWPRT.CONF_PRICETYPE pt ON pt.ID = usprice.PRICETYPE
    LEFT OUTER JOIN WWPRT.CONF_GAP gap ON pt.DOMAIN = gap.DOMAIN
    WHERE usprice.CABLEID = ‘USCABF3FVT01’
    AND gap.DOMAIN IS NOT NULL
)
SELECT COUNT(*) FROM NEW TABLE
(INSERT INTO WWPRT.CABLEPRODUCT_PRICETYPE_JOIN_CN (CABLEPRODUCTID, PRICETYPE, SLAVE) (
     SELECT cableproduct.ID, pricetypes.PRICETYPE, ‘N’
     FROM WWPRT.CABLE_PRODUCT_JOIN_CN AS cableproduct,
     US_PRICETYPES AS pricetypes
     WHERE CABLEID = ‘GEOANNF3FVT1’
     AND EXISTS (
         ……
     )
))

The improvement is awesome! From 10 hours to 10 seconds! Although it happens in DB2, but I think this case should also exist in other types of database, but not verified.

April 30, 2009 Posted by | Specialty | , , | Leave a comment

An Derby Properties and Connection URL Example

Have been using Derby for a long time, which is a very good choice as memory database or embedded database, let me backup some useful Derby DB Properties here:

Developer can use a derby.properties file in the application classpath or javadb classpath to configure the database, these are some configuration examples in the file:

—————————- Derby derby.properties information —————————-
–### Derby System Home Directory
derby.system.home=D:/workspace/Boulevard/BoulevardDB

–### Derby Authentication Configuration and Turn On
derby.connection.requireAuthentication=true
derby.authentication.provider=BUILTIN
derby.database.propertiesOnly=true

–### User/Password Definition (System Level)
derby.user.alfred=alfred

–### User Authorization Definition (System Level)
derby.database.defaultAccessMode=fullAccess
derby.fullAccessUsers=cube

–### Some Others Configuration
derby.infolog.append=true
derby.storage.pageSize=8192
derby.storage.pageReservedSpace=60
derby.database.defaultConnectionMode
derby.locks.deadlockTimeout=20
derby.locks.waitTimeout=30
—————————- Derby derby.properties information —————————-

You have 2 ways to start the database:

1. Start embedded Derby from java application. The database is active from the first time you connect to it, and become inactive when your java application stop; and any other application outside of the same JVM can not access the database instance.

To connect to embedded Derby by URL:
——————————- Derby connection information ——————————-
— ### Use ij command to connect embeded database; if not exist, create it.
CONNECT ‘jdbc:derby:data;create=true;user=alfred;password=alfred’ AS CUBE;
— ### Use ij command to connect embeded database; if not exist, don’t create it, throw out error.
CONNECT ‘jdbc:derby:data;user=alfred;password=alfred’ AS CUBE;
——————————- Derby connection information ——————————-

2. Start Derby network server. The network server instance can only be started by command “…\javadb\bin\startNetworkServer.bat”, and it runs as a server which will not stop when your application stops; and other applications in any JVM can also access the database locally or remotely.

To connect to Derby network server:
——————————- Derby connection information ——————————-
— ### Use ij command to connect network server database, reading default repository;
— ### If not exist, create it.
CONNECT ‘jdbc:derby://localhost:1527/data;create=true;user=alfred;password=alfred’ AS BOULEVARD;
— ### Use ij command to connect network server database; reading default repository;
— ### If not exist, don’t create it, throw out error.
CONNECT ‘jdbc:derby://localhost:1527/data;user=alfred;password=alfred’ AS BOULEVARD;
— ### Use ij command to connect network server database, reading specific repository;
— ### If not exist, create it.
CONNECT ‘jdbc:derby://localhost:1527/D:/workspace/Boulevard/BoulevardDB/data;create=true;user=alfred;password=alfred’ AS BOULEVARD;
— ### Use ij command to connect network server database, reading specific repository;
— ### If not exist, don’t create it, throw out error.
CONNECT ‘jdbc:derby://localhost:1527/D:/workspace/Boulevard/BoulevardDB/data;user=alfred;password=alfred’ AS BOULEVARD;
——————————- Derby connection information ——————————-

April 16, 2009 Posted by | Specialty | , , | Leave a comment

Some lessons learned from my TDD practice

Attended 2 days TDD course in last Dec. and started to use it in my project from Jan. These are some lessons learned from my TDD practices so far:

1. For any change into the code behavior, must think and write the test first (of course, this is the core principle of TDD);

2. In an ideal way, the TDD can totally replace low level design. But according to the result of a survey for TDD developers, most of new TDD developers found it was really hard to write effective test case directly without any low level design. As a trade off, developer can document a simplified low level design before coding test case. (The low level design should not be too detail, Simple Design is one of practices in eXtreme Programming, which is one part of Agile)

3. If developer write low level design first before coding test case, list out needed TDD test cases in the design document. Two benefits:
     a. Enforce developer to think about how to test it when he think about the design;
     b. Make sure the developer will not forget any test when he is coding.

4. Test case code and running result should also be reviewed during code review activity.

5. Unit test cases must be frequently executed to ensure all tests can be passed at the end of every day (it’s better to adopt Continuous Integration if possible, CC can automatically run all test cases periodically and generate feedback);

6. Isolate the test data within one test class as much as possible, in order to make sure changes in one test will not impact others, and all tests can be re-run easily by another one; and the developer must make sure the test case running should not be impacted by local environment configuration or data.

7. Feel comfortable only when test case is written for function or business scenario, not for java class or method. You know, actually TDD is a business driven design/development approach.

More to be added…

April 16, 2009 Posted by | Specialty | , , , | 1 Comment