Ask The Real Tom

April 30, 2009



High Perfomance DML with Parallel Piplined Function and Bulk DML (II)

Filed under: oracle — Tags: , , — admin @ 7:34 am

Refering to my previous blog.
I could speed up the update of my large 120GB table from 02:25:43.86 to 01:28:46.61.

I’ll just explain the changes to get the additional 55 minutes.

v_array_size PLS_INTEGER := 10000;
I still not sure what the optimal size is bit 10000 works well.

FORALL i IN INDICES OF v_row1
This is the main change. The collection gets no copied, it just references the collected array.

select count(*) from table(pgk_xxxx_mig_parallel_ptf_api.ptf_dml( cursor(select /*+ PARALLEL(t,64) PARALLEL_INDEX(t, I_PK_TRANSAKT, 64) */ TRANSACTION ID from T_TRANSACTION t ) ) );
parallel 64 does more than 16.

Still improvements should be possible. I’m working on it. :-)

April 24, 2009



High Perfomance DML with Parallel Piplined Function and Bulk DML

Parallel Update and Delete has some restriction on partitioned tables. The statement will just run in a parallel degree equal to the partition count.

I searched a solution to optimze a update on a 120G Table with about 300′000′000 rows which touches every row.

My current solution is a do a buld update within a pipelined function which is parallel executet.
With this solution I can run the update with bulk dml and in parallel.




Important is that the dml within the pipelined function is an anoymous transaction since any implicit or explicit commits are allowed in the session where the pipelined function is executet.




Now I have the possibility to tune any bulk batch size and any parallel degree to optimized the dml to my need.


CREATE OR REPLACE PACKAGE pgk_tom_mig_parallel_ptf_api AS

TYPE t_parallel_test_row IS RECORD ( TRANSACTIONID NUMBER(16));
TYPE t_parallel_test_ref_cursor IS REF CURSOR RETURN t_parallel_test_row;
TYPE t_dml_counter IS TABLE OF NUMBER;

TYPE t_TRANSACTIONID_tab IS TABLE OF t_transaction.TRANSACTIONID%TYPE INDEX BY BINARY_INTEGER;
v_row1 t_TRANSACTIONID_tab;
v_array_size PLS_INTEGER := 500000;

function ptf_dml(p_cursor IN t_parallel_test_ref_cursor)
return t_dml_counter
pipelined
parallel_enable (PARTITION p_cursor BY ANY );
-- parallel_enable (PARTITION p_cursor BY RANGE (TRANSACTIONID));

END pgk_tom_mig_parallel_ptf_api;
/
SHOW ERRORS
CREATE OR REPLACE PACKAGE BODY pgk_tom_mig_parallel_ptf_api AS

-- create or replace function ptf_dml(p_cursor IN SYS_REFCURSOR)
function ptf_dml(p_cursor IN t_parallel_test_ref_cursor)
return t_dml_counter
pipelined
-- parallel_enable (PARTITION p_cursor BY RANGE (TRANSACTIONID))
parallel_enable (PARTITION p_cursor BY ANY )
is
PRAGMA AUTONOMOUS_TRANSACTION;
r_rowid t_transaction.TRANSACTIONID%type;
BEGIN
LOOP
FETCH p_cursor BULK COLLECT INTO v_row1 LIMIT v_array_size;
FORALL i IN 1..v_row1.COUNT

UPDATE t_transaction t
SET t.yx =
.......
WHERE TRANSACTIONID=v_row1(i);

-- rollback;
commit;
pipe row (1);
EXIT WHEN p_cursor%NOTFOUND;
END LOOP;
end ptf_dml;

end pgk_tom_mig_parallel_ptf_api;
/
SHOW ERRORS

alter session force parallel query parallel 16;
select count(*) from table(pgk_tom_mig_parallel_ptf_api.ptf_dml( cursor(select /*+ PARALLEL(t,16) PARALLEL_INDEX(t, I_PK_TRANSID, 16) */ TRANSACTION
ID from T_TRANSACTION t ) ) );
alter session enable parallel query;

COMMIT;

April 14, 2009



Is it possible to parallize index rebuild with piplined functions?

Filed under: oracle — Tags: , , — admin @ 4:33 pm

I was just wondering if how I could rebuild multple indexes of a table simultaneously (but not parallel)

My test case:

whenever SQLERROR continue;

create user test4 identified by test4;

grant create session to test4;
grant resource to test4;

connect test4/test4

set serveroutput on

CREATE OR REPLACE TYPE number_ntt AS TABLE OF NUMBER
/
show errors

create or replace function ptest(p_cursor IN SYS_REFCURSOR)
return number_ntt
pipelined
parallel_enable
is
v_number number;
begin
loop

fetch p_cursor into v_number;
EXIT WHEN p_cursor%NOTFOUND;
execute immediate 'alter index i1 rebuild';

dbms_output.put_line('doit for: '||v_number);
pipe row (1);
end loop;
end;
/
show errors

create table t1 (id1 number);
insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);
insert into t1 values (4);

create index i1 on t1 (id1);

select /* PARALLEL(4) */ * from table(ptest( cursor(select id1 from t1) ));

connect / as sysdba

drop user test4 cascade;

My result: NO

SQL> @pipe

User created.

Grant succeeded.

Grant succeeded.

Connected.

Type created.

No errors.

Function created.

No errors.

Table created.

1 row created.

1 row created.

1 row created.

1 row created.

Index created.

select /* PARALLEL(4) */ * from table(ptest( cursor(select id1 from t1) ))
*
ERROR at line 1:
ORA-14552: cannot perform a DDL, commit or rollback inside a query or DML
ORA-06512: at "TEST4.PTEST", line 12

Connected.

User dropped.

April 3, 2009



DBMS_METADATA output format

Filed under: oracle — Tags: , , , — admin @ 12:02 pm

Extranctin DDL mit DBMS_METADATA is really easy wait to optain the DDL.
But often the formating in the spool file just sucks.

Here some hints:


set heading off
set feedback off
set verify off
set pagesize 0
set linesize 1000
set newpage none
set trimspool on
set long 5000000

col output for a1000;

SELECT dbms_metadata.get_ddl('TABLESPACE', 'T_XXX_1M') output FROM dual;

The trick is to name an format the column.

Benefits:

  • no trailing spaces
  • no unwanted line breaks

April 1, 2009



Fast import/export in MySQL with LOAD DATA INFILE

Filed under: mysql — Tags: , , , , — Tom's Real Assistant @ 3:05 pm

MySQL does not offer many possibilities for importing/exporting data in a fast manner. One way to do so is to use the LOAD DATA INFILE statement for loading data from/to text files.

SELECT * INTO OUTFILE '/tmp/data.txt' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n'; FROM test.data_table;

With the example above you create a csv (comma-separated values) file with the content of a given table.

To read rows from a text file into a table use the following example:

LOAD DATA INFILE '/tmp/data.txt' INTO TABLE data_table FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';

Furthermore this way you are able to parallelize imports and exports to reduce the runtime. Just execute several statements simultaneously.

Powered by WordPress