Data loading and configuration strategy

When defining your data loading and configuration strategy, consider using the following:

Pre-Processing

Set up the following options to defer non-critical processing and speed up loading.
  • Use "Public Read/Write" organization-wide sharing defaults – When you load data with a Private sharing model, the system calculates sharing as the records are being added. If you load with a Public Read/Write sharing model, you can defer this processing until after cutover.
  • Avoid complex object relationships – The more lookups you have defined on an object, the more checks the system has to perform during data loading. If you can establish some of these relationships in a later phase, loading will be quicker.
  • Avoid sharing rules – If you have ownership-based sharing rules configured before loading data, each record you insert requires sharing calculations if the owner of the record belongs to a role or group that defines the data to be shared. If you have criteria-based sharing rules configured before loading data, each record with fields that match the rule selection criteria also requires sharing calculations. To enable the parallel recalculation and defer sharing calculation features, contact Salesforce.
  • Avoid workflow rules, validation rules, and triggers – These are powerful tools for making sure data entered during daily operations is clean and includes appropriate relationships between records. Unfortunately, they can also slow down processing if they are enabled during massive data loads. Instead, clean up the data before inserting it, which improves loading performance.
A few things must be in place before you load the data:
  • Parent records with master-detail children – You won’t be able to load child records if the parents don’t already exist. 
  • Record owners (users) – In most cases, your records will be owned by individual users, and the owners need to exist in the system before you can load the data.
  • Role hierarchy – You might think that loading would be faster if the owners of your records were not members of the role hierarchy. But in almost all cases, the performance would be the same, and it would be considerably faster if you were loading portal accounts. So there would be no benefit to deferring this aspect of configuration. 
    • Load users, assigning them to appropriate roles.
    • Keep in mind that concentrating a very large number of records under a single owner who has a role can result in ownership skew.
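
The ownership skew concern above can be checked before loading by counting how many records each owner will receive in the load file. The following is a minimal sketch, not a Salesforce tool: the `OwnerId` column name, the sample rows, and the 10,000-record threshold are assumptions to adjust for your own org and data.

```python
from collections import Counter

# Assumed threshold: adjust to what your org treats as too many records per owner.
SKEW_THRESHOLD = 10_000


def find_skewed_owners(rows, owner_field="OwnerId", threshold=SKEW_THRESHOLD):
    """Return {owner_id: record_count} for owners above the threshold."""
    counts = Counter(row[owner_field] for row in rows)
    return {owner: n for owner, n in counts.items() if n > threshold}


# Hypothetical load-file rows; in practice these would come from csv.DictReader.
rows = [
    {"Name": f"Account {i}", "OwnerId": "005A" if i % 4 else "005B"}
    for i in range(8)
]

# A tiny threshold is used here only so the sample data triggers a result.
print(find_skewed_owners(rows, threshold=5))
```

Owners flagged by a check like this are candidates for redistributing records across several users before the load.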

Ref: https://developer.salesforce.com/blogs/engineering/2013/04/extreme-force-com-data-loading-part-2-load-into-a-lean-salesforce-configuration.html

Processing Data Load

  • Load parent objects before their master-detail children, then extract keys as needed for later loading. 
  • Use the fastest operation possible: insert is faster than upsert, and even insert + update can be faster than upsert alone.
  • When processing updates, only send fields that have changed for existing records.
  • Group child records by ParentId, making sure that separate batches don’t reference the same ParentIds. This practice can greatly reduce or eliminate the risk of record-locking errors. If this cannot be arranged, you also have the option of using the Bulk API in serial execution mode to avoid locking from parallel updates.
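
Two of the points above, sending only changed fields and keeping each ParentId inside a single batch, can be prepared client-side before the data is handed to any loading tool. The sketch below is one possible approach under stated assumptions: the helper names `changed_fields` and `batch_by_parent`, the `Id` and `ParentId` field names, and the batch size of 200 are illustrative and not part of any Salesforce API.

```python
from collections import defaultdict


def changed_fields(existing, incoming, key_field="Id"):
    """Return the key plus only the fields that differ, or None if nothing changed."""
    delta = {
        k: v
        for k, v in incoming.items()
        if k != key_field and existing.get(k) != v
    }
    return {key_field: incoming[key_field], **delta} if delta else None


def batch_by_parent(children, parent_field="ParentId", batch_size=200):
    """Pack child records into batches so that no ParentId spans two batches.

    A parent with more children than batch_size still stays in one
    (oversized) batch, because splitting it would reintroduce lock contention.
    """
    groups = defaultdict(list)
    for child in children:
        groups[child[parent_field]].append(child)

    batches, current = [], []
    for siblings in groups.values():
        # Start a new batch rather than splitting one parent's children.
        if current and len(current) + len(siblings) > batch_size:
            batches.append(current)
            current = []
        current.extend(siblings)
    if current:
        batches.append(current)
    return batches


# Hypothetical sample data.
children = [
    {"Name": "c1", "ParentId": "A"},
    {"Name": "c2", "ParentId": "B"},
    {"Name": "c3", "ParentId": "A"},
    {"Name": "c4", "ParentId": "C"},
]
for batch in batch_by_parent(children, batch_size=2):
    print([c["Name"] for c in batch])

old = {"Id": "001", "Name": "Acme", "Phone": "111"}
new = {"Id": "001", "Name": "Acme", "Phone": "222"}
print(changed_fields(old, new))
```

Because every child of a given parent lands in the same batch, parallel batches never compete for a lock on the same parent record. If the data cannot be grouped this way, serial execution mode remains the fallback described above.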

Post-Processing

Once you have finished loading your data, it is time to complete the data enrichment and configuration tasks you have deferred until this point:
  • Set organization-wide defaults to Public Read Only or Private. Create public groups and queues. Create sharing rules.
  • Add lookup relationships between objects, roll-up summary fields to parent records, and other data relationships between records.
  • Enhance records in Salesforce with foreign keys or other data to facilitate integration with your other systems.
  • Batch Apex and the Force.com Bulk API are both efficient ways to apply the deferred aggregation and trigger-logic updates to a very large number of records.
  • Reset the fields on the custom settings you created for triggers, so that they will fire appropriately on record creation and updates.
  • Turn validation, workflow, and assignment rules back on so they will trigger the appropriate actions as users enter and edit records.

Ref: https://developer.salesforce.com/blogs/engineering/2013/05/extreme-force-com-data-loading-part-3-suspending-events-that-fire-on-insert.html
