Introduction
The Salesforce Bulk API processes large volumes of data asynchronously, which makes it useful for loads that would exceed the limits of the synchronous APIs. Salesforce later introduced Bulk API 2.0 (called v2 in this guide) to simplify bulk processing. This guide outlines the differences between the two versions.
Bulk API v1
- Functionality: Asynchronous processing of large data volumes.
- Governor Limits:
- Maximum of 5 batches per rolling 24-hour period for free Developer Edition.
- Maximum of 15,000 batches per rolling 24-hour period for an org.
- A batch can have 10,000 records at most.
- A batch can be at most 10 MB in size and contain at most 10,000,000 characters.
- Features:
- Supports CSV, XML, and JSON data formats.
- Requires the use of batches.
- Supports insert, update, upsert, delete, hard delete, and query operations.
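Because v1 leaves batching to the caller, a client typically has to slice its records before uploading them. The following is a minimal sketch of that step, assuming the 10,000-record cap listed above. The names split_into_batches and MAX_RECORDS_PER_BATCH are illustrative and not part of any Salesforce library, and the sketch does not check the batch size in bytes.

```python
from typing import Iterator, List, Sequence

# Assumed per-batch record cap, taken from the v1 limits listed above.
MAX_RECORDS_PER_BATCH = 10_000


def split_into_batches(
    records: Sequence[dict], batch_size: int = MAX_RECORDS_PER_BATCH
) -> Iterator[List[dict]]:
    """Yield consecutive slices of `records`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


if __name__ == "__main__":
    # Hypothetical input: 25,000 account rows to be loaded.
    rows = [{"Name": f"Account {i}"} for i in range(25_000)]
    sizes = [len(batch) for batch in split_into_batches(rows)]
    print(sizes)  # [10000, 10000, 5000]
```

Each yielded list would then be serialized and submitted as its own batch within a single v1 job.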
Bulk API v2
- Functionality: A simplified process for inserting, updating, deleting, or querying large sets of data.
- Governor Limits:
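- No batch limit to manage directly, since Salesforce creates batches internally.
- Ingest jobs can upload up to 150 million records per rolling 24-hour period.
- Uploaded job data can be at most 150 MB after base64 encoding, so Salesforce recommends keeping each raw file under 100 MB.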
- Features:
- Only CSV is supported for job data (no XML or JSON); JSON is used only for job requests and responses.
- Eliminates the need for the user to define batches; Salesforce automatically splits the data into batches.
- Provides a simplified status check.
- Supports insert, update, upsert, delete, hard delete, and query operations.
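As a rough sketch of the v2 ingest flow, the Python below builds (but does not send) the sequence of REST requests for one job: create the job, upload CSV data, mark the upload complete, then poll the job. The API_VERSION value, the PlannedRequest class, and the plan_ingest_job function are illustrative assumptions. The job_id placeholder stands in for the id that Salesforce returns when the job is created.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Assumption: substitute whichever API version your org supports.
API_VERSION = "v58.0"


@dataclass
class PlannedRequest:
    """A description of one HTTP call; nothing is sent over the network."""
    method: str
    path: str
    content_type: Optional[str]
    body: Optional[str]


def plan_ingest_job(sobject: str, operation: str, csv_data: str,
                    job_id: str = "<jobId>") -> List[PlannedRequest]:
    """Return the four requests that make up a simple v2 ingest job."""
    base = f"/services/data/{API_VERSION}/jobs/ingest"
    create_body = json.dumps({
        "object": sobject,
        "operation": operation,
        "contentType": "CSV",
        "lineEnding": "LF",
    })
    return [
        # 1. Create the job; the response carries the real job id.
        PlannedRequest("POST", base, "application/json", create_body),
        # 2. Upload the CSV data; no batches are defined by the caller.
        PlannedRequest("PUT", f"{base}/{job_id}/batches", "text/csv", csv_data),
        # 3. Signal that the upload is finished so processing can start.
        PlannedRequest("PATCH", f"{base}/{job_id}", "application/json",
                       json.dumps({"state": "UploadComplete"})),
        # 4. Poll the job for its overall state.
        PlannedRequest("GET", f"{base}/{job_id}", None, None),
    ]


if __name__ == "__main__":
    for req in plan_ingest_job("Account", "insert", "Name\nAcme\n"):
        print(req.method, req.path)
```

A real client would send these with an HTTP library and an OAuth bearer token, then substitute the returned job id into the later paths.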
Key Differences Between Bulk API v1 & v2:
- Batch Handling:
- v1: Requires explicit batch creation.
- v2: Salesforce handles batch creation automatically.
- Data Format:
- v1: Supports CSV, XML, and JSON.
- v2: Supports only CSV for job data.
- Governor Limits:
- v1: 15,000 batches per rolling 24-hour period (with exceptions for free Developer Editions). Each batch is limited to 10,000 records and 10 MB.
- v2: No batches to manage. Ingest jobs can upload up to 150 million records per rolling 24-hour period, and each raw file should stay under 100 MB (150 MB after base64 encoding).
- Status and Monitoring:
- v1: Requires checking status for each batch.
- v2: Provides a simplified status API endpoint that returns the state of the entire job.
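The monitoring difference can be illustrated with a small sketch: in v1 the client aggregates the state of every batch, while in v2 a single job-level state is enough. The state strings below are the values these APIs are understood to return, but treat them as assumptions and confirm them against the API responses for your version. The function names are hypothetical.

```python
from typing import Iterable

# Assumed terminal states for a v2 job (job-level "state" field).
V2_TERMINAL_STATES = {"JobComplete", "Failed", "Aborted"}

# Assumed terminal states for an individual v1 batch.
V1_BATCH_TERMINAL_STATES = {"Completed", "Failed", "NotProcessed"}


def v2_job_finished(job_state: str) -> bool:
    """v2: one status call returns one state for the whole job."""
    return job_state in V2_TERMINAL_STATES


def v1_job_finished(batch_states: Iterable[str]) -> bool:
    """v1: the job is done only when every batch has reached a terminal state."""
    states = list(batch_states)
    # An empty list means no batches were reported yet, so nothing is finished.
    return bool(states) and all(s in V1_BATCH_TERMINAL_STATES for s in states)


if __name__ == "__main__":
    print(v2_job_finished("InProgress"))                  # False
    print(v1_job_finished(["Completed", "InProgress"]))   # False
    print(v1_job_finished(["Completed", "Failed"]))       # True
```

The v1 helper needs one state per batch as input, which is the extra polling work that the v2 job-level endpoint removes.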
When to Use Which?
Advantages of Bulk API v1:
- Data Format Flexibility: If you need to load XML or JSON data, v1 is your only option, because v2 accepts only CSV job data.
- Granular Control: Because you create the batches yourself, you decide how records are grouped and when each batch is submitted.
Advantages of Bulk API v2:
- Simplicity: Auto-batch creation removes a layer of complexity.
- Volume: Suited to very large datasets, with an ingest limit of 150 million records per rolling 24-hour period.
- Monitoring: Easier and more straightforward job monitoring.
Use Cases for Bulk API v1:
- If your system or integration middleware produces XML or JSON rather than CSV.
- When you want explicit control over the batch processing.
Use Cases for Bulk API v2:
- Processing massive datasets.
- When you prefer a hands-off approach to batch management.
- Integrations where monitoring job status with minimal overhead is critical.
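The selection criteria above can be condensed into a small decision helper. This is only a sketch of the guide's reasoning, not an official rule: choose_bulk_api and both parameters are hypothetical names, and needs_v1_only_format should be true when your payload is in a format that only v1 accepts (XML, for example).

```python
def choose_bulk_api(needs_v1_only_format: bool, wants_batch_control: bool) -> str:
    """Return "v1" or "v2" based on the two criteria discussed in this guide."""
    # v1 is the fallback when the data format or batching needs rule out v2.
    if needs_v1_only_format or wants_batch_control:
        return "v1"
    # Otherwise prefer v2 for its automatic batching and simpler monitoring.
    return "v2"


if __name__ == "__main__":
    print(choose_bulk_api(needs_v1_only_format=True, wants_batch_control=False))   # v1
    print(choose_bulk_api(needs_v1_only_format=False, wants_batch_control=False))  # v2
```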
Conclusion
Both Bulk API v1 and v2 offer robust mechanisms for processing large datasets in Salesforce, and the right choice depends on your data formats and how much control your integration needs. For most modern applications, v2 offers a simpler, more streamlined approach, but v1 still has its place when you must load XML or JSON data or need granular control over batches.