0 votes
in Salesforce by (11.9k points)

I am developing a Java application that queries tables which may hold over 1,000,000 records. I have tried to make it as efficient as possible, but I only achieve about 5,000 records a minute on average, with a peak of 10,000 at one point. I have tried reverse engineering the Data Loader, and my code seems very similar, but still no luck.

Is threading a viable solution here? I have tried it, but with minimal results.

I have been reading up on this and have applied everything that seems possible (compressing requests/responses, threads, etc.), but I cannot achieve Data Loader-like speeds.

Note that the queryMore method seems to be the bottleneck.

Does anyone have any code samples or experiences they can share to steer me in the right direction?


1 Answer

0 votes
by (32.1k points)

An approach I've used in the past is to query only for the IDs that you want. You can then parallelize the retrieve() calls across several threads.

It looks something like this:

(query thread) > BlockingQueue > (thread pool doing retrieve()) > BlockingQueue

The first thread does query() and queryMore() as quickly as it can, writing every ID it gets into the first BlockingQueue. As far as I know, queryMore() isn't something you should call concurrently, so there's no way to parallelize this step. If lock contention becomes an issue, you may wish to package the IDs into batches of a few hundred. A thread pool can then make simultaneous retrieve() calls on the IDs to get all the fields for the SObjects and put the results in a second queue for the rest of your app to deal with.
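
A minimal sketch of that pipeline in plain Java is below. It makes no real Salesforce calls: fetchIdPage() and retrieveRecords() are hypothetical stand-ins for query()/queryMore() and retrieve(), and the class name, queue capacity, and batch size are assumptions chosen for illustration. The calling thread plays the role of the single query thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ParallelRetrievePipeline {
    // Sentinel batch (compared by identity) that tells a worker to stop.
    private static final List<String> POISON = new ArrayList<>();

    // Hypothetical stand-in for query()/queryMore(): returns one page of IDs.
    static List<String> fetchIdPage(int offset, int pageSize, int total) {
        List<String> page = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + pageSize, total); i++) {
            page.add("id-" + i);
        }
        return page;
    }

    // Hypothetical stand-in for retrieve(): returns one record per ID.
    static List<String> retrieveRecords(List<String> ids) {
        List<String> records = new ArrayList<>();
        for (String id : ids) {
            records.add("record-for-" + id);
        }
        return records;
    }

    // Assumes batchSize > 0 and workers > 0.
    public static List<String> run(int total, int batchSize, int workers) {
        // Bounded queue of ID batches, so the query side cannot run far ahead.
        BlockingQueue<List<String>> idBatches = new LinkedBlockingQueue<>(64);
        // Output queue that the rest of the application would consume.
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            for (int w = 0; w < workers; w++) {
                pool.execute(() -> {
                    try {
                        while (true) {
                            List<String> batch = idBatches.take();
                            if (batch == POISON) {
                                break;
                            }
                            results.addAll(retrieveRecords(batch));
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Single sequential "query thread": pages are fetched one at a time.
            for (int offset = 0; offset < total; offset += batchSize) {
                idBatches.put(fetchIdPage(offset, batchSize, total));
            }
            // One sentinel per worker so every worker terminates.
            for (int w = 0; w < workers; w++) {
                idBatches.put(POISON);
            }
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return new ArrayList<>(results);
    }

    public static void main(String[] args) {
        List<String> records = run(1000, 200, 4);
        System.out.println("retrieved " + records.size() + " records");
    }
}
```

Putting whole batches on the queue, rather than single IDs, is the "bunches of a few hundred" idea: each take() hands a worker enough IDs for one retrieve() call, so the queue lock is touched far less often. Results arrive in no particular order, since workers finish independently.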
