“enq: CR - block range reuse ckpt”
I think the problem is this: when doing parallel DML, Oracle uses direct path reads to read the data. To make sure the data being read is consistent, though, it needs to checkpoint the blocks it is going to read. The CR enqueue seems to be used to synchronize CKPT with the PX slaves reading the data. It goes like this: a PX slave asks CKPT to checkpoint the range of blocks it is about to read (and possibly enqueues on CR in order to do so), CKPT in turn orders DBWR to write all dirty blocks in the requested range, waits for the write to complete, and signals the PX slave that the blocks are consistent and can be read. Repeat as necessary.

Now, with a really large buffer cache, a good portion of it dirty due to recent DML on the source tables, busy CPUs and relatively slow writes, this checkpointing can build up into a major performance inhibitor when those recently touched tables are read in direct path mode. You have 8 cores and 8 parallel slaves, so CKPT, DBWR and pretty much all the other background processes are competing with them for CPU time and I/O bandwidth. They all have the same priority, so a PX slave can easily push CKPT or DBWR off the CPU, only to go straight back to sleep on an enqueue held by the process it just preempted.
Things to try here:
1. Do a checkpoint before running the insert and see if this makes any difference.
2. Lower the degree of parallelism to 5-6.
3. If you have more than one DBWR configured, set db_writer_processes back to 1.
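
In case it helps, here is roughly how those three things could look in SQL*Plus. Treat it as a sketch, not a recipe: target_table and source_table are made-up names standing in for your own tables, the degree of 6 is just one value from the range suggested above, and the first query is only there so you can compare the wait before and after each change.

```sql
-- How much time has been spent on this event so far (instance-wide)?
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event = 'enq: CR - block range reuse ckpt';

-- 1. Flush dirty buffers up front, so the PX slaves have less to wait for.
ALTER SYSTEM CHECKPOINT;

-- 2. Rerun the insert with a lower degree of parallelism.
--    target_table and source_table are hypothetical placeholders.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(t 6) */ INTO target_table t
SELECT /*+ PARALLEL(s 6) */ *
FROM   source_table s;

COMMIT;

-- 3. Check how many DBWR processes are configured...
SELECT value
FROM   v$parameter
WHERE  name = 'db_writer_processes';

-- ...and set it back to 1 if it is higher. This is a static parameter,
-- so the change goes to the spfile and needs an instance restart.
ALTER SYSTEM SET db_writer_processes = 1 SCOPE = SPFILE;
```

It is probably best to try these one at a time and recheck the first query after each, otherwise you will not know which change made the difference.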