Among the many new goodies in the Summer ’09 release is a powerful new feature for batch processing of your database records. Tasks that process large data volumes without any active human intervention can take advantage of this feature. As an example, consider the task of validating addresses in your contacts when you may have millions of contact records. A batch job is ideal for this scenario, since you can start the job and then continue to work, or even log off, while it continues to execute.
To use this functionality, you need to implement the Database.Batchable interface. You can find an example of its usage in the Apex Code Developer’s Guide. The Database.Batchable interface has three methods that you need to implement, as shown below:
global class MyBatchTest implements Database.Batchable {
    global Database.QueryLocator start() { ... }
    global void executeBatch(SObject[] batch) { ... }
    global void finish() { ... }
}
The start() method determines the set of records that will be processed by the executeBatch method. You would need to construct a SOQL query and return a QueryLocator object. For example,
return Database.getQueryLocator('SELECT Name, MailingAddress FROM Contact');
would return all contact records for processing. You can, of course, make the query as selective as you wish with additional filter criteria. The QueryLocator object can return at most fifty million records.
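As a rough illustration of a more selective query, the sketch below restricts the batch to contacts in one country. The method signature follows the one shown in this post; the filter on MailingCountry and the value 'USA' are assumptions chosen purely for the example.

```apex
// Sketch only: the WHERE clause is a hypothetical filter, not part of the original example.
global Database.QueryLocator start() {
    // Single quotes inside an Apex string literal are escaped with a backslash.
    return Database.getQueryLocator(
        'SELECT Name, MailingAddress FROM Contact WHERE MailingCountry = \'USA\''
    );
}
```

Only the records matching the filter would then be handed to executeBatch.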
To start a batch job, you create an instance of this class and pass it to the Database.executeBatch method.
MyBatchTest b = new MyBatchTest( ... );
ID myBatchJobID = Database.executeBatch(b);
When you call Database.executeBatch with your instance, the system enqueues the job for processing and returns an ID. When the system is ready to execute the job, it calls the start method and then calls the executeBatch method once for each chunk of 200 records. So if the QueryLocator returns 1000 records, the executeBatch method will be called five times. The batch job runs with the permissions of the user who enqueued it. The finish method is called after all records have been processed and can be used for post-processing tasks such as sending out e-mails.
The ID returned by the Database.executeBatch method can be used to monitor the status of the job programmatically by querying the AsyncApexJob object. You can also monitor the job under Setup->Monitoring->Apex Jobs.
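To make the programmatic monitoring concrete, here is a minimal sketch of a status check. It assumes myBatchJobID holds the ID returned by Database.executeBatch, and that the Status, JobItemsProcessed, TotalJobItems and NumberOfErrors fields are available on the job record in your org; treat the field list as an assumption to verify against the documentation for your release.

```apex
// Sketch only: myBatchJobID is assumed to be the ID returned by Database.executeBatch(b).
AsyncApexJob job = [
    SELECT Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
    FROM AsyncApexJob
    WHERE Id = :myBatchJobID
];

// Write the job's progress to the debug log.
System.debug('Status: ' + job.Status
    + ', batches processed: ' + job.JobItemsProcessed
    + ' of ' + job.TotalJobItems
    + ', errors: ' + job.NumberOfErrors);
```

Running this again later would show the counts advancing as the system works through the chunks.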
The documentation has additional details on usage, governor limits, and a few best practices. A common question is whether jobs can be scheduled to run at a certain time or periodically (for example, every day at midnight). This feature is not (yet) available. Also, the batch Apex feature is still in preview mode and has to be explicitly provisioned for your org. If you need this feature, please contact support with a short description of your use case.
Finally, I would encourage you to sign up for the Summer ’09 preview. It has a lot of other cool new features!
