Wednesday, 27 November 2013

Salesforce Account Record Merging With Deleting the Duplicate Records

Whenever we are working with bulk data, its a hard thing to check the duplicate records and then delete them without loosing data.

So I thought why shouldn't we go for a bulk process that helps us in Merging the duplicate records and them clearing them from database without any loss of data.

So here I have done something for my Account object, I have written a batch, whenever we execute it, it will start processing on Account records created in "LAST 15 MINUTES", takes those Account's Name and Billing Address, and then it will check in database that is there any record having same Name AND Billing Address or not.

This batch won't work if there is any slight change in between name or billing address.

for example I have created two Account records having same Name and Billing Address only the first record Phone was an extra field filled.

1st Account created

2nd Account without Phone but same Billing Address and Name

Now when I execute my batch in console

Now my batch deletes duplicate Account that have empty phone

Now the best thing is, no matter how many field you have in Account it will save your field values in the unique record, that means your data will never loose your data.

Here is the batch

* Description : Batch class to merge duplicate transaction Account records.
* Created Date : 11-26-2013 
* Revision Logs : V1.0 - Created
global class Batch_MergeDuplicateTransactionAccounts implements Database.Batchable<sObject>, Database.Stateful {
    //Set to hold the account records name with billing address field values
    global Set<String> setAccountNameWithBillingAddress;
    //String to hold the Account object fields names
    String accountFieldsNamesString = '';
    //Set to hold the Account fields API name strings
    Set<String> setAccountFieldsNameString;
    //Calling Constructor 
    global Batch_MergeDuplicateTransactionAccounts() {
        //Memory Allocation to collections
        setAccountFieldsNameString = new Set<String>();
        setAccountNameWithBillingAddress = new Set<String>();
        //Describe Account object and get all the fields
        Map<String, Schema.SObjectField> accountFieldsMap = Account.sObjectType.getDescribe().fields.getMap();
        //Loop through Account fields Names through Schema Methods
        for(String fieldName : accountFieldsMap.keySet()) {
            //Describe field
            Schema.DescribeFieldResult field = accountFieldsMap.get(fieldName).getDescribe();
            //Filtering out the fields for getting only updatable non system fields
            if(!field.isCalculated() && field.isCreateable() && field.isUpdateable()
               && !field.getLocalName().equalsIgnoreCase(Constants.PARENTID)) {
                   //Populate set with the Fields Names string values
                   //Account object fields name string
                   if(accountFieldsNamesString == '')
                       accountFieldsNamesString = field.getLocalName(); 
                       accountFieldsNamesString += ',' + field.getLocalName();
    //Start method
    global Database.QueryLocator start(Database.BatchableContext BC) {
        //Varibale to hold the current date time value
        DateTime currentDateTime =;
        //Varibale to hold the 15 minute ago date time value
        DateTime fifteenMinuteAgoDateTime =;
        //String variable to hold the Account records those were created in last 15 minute and so.
        String sOQLQuery = 'SELECT ID, Name FROM Account WHERE'
            + ' CreatedDate >: fifteenMinuteAgoDateTime AND CreatedDate <=: currentDateTime'
            + ' AND Name != null ORDER By CreatedDate ASC';
        //Fetching all the Account records from the database
        return Database.getQueryLocator(sOQLQuery);
    //Exectue Method having logic for duplicate accounts finding's
    global void execute(Database.BatchableContext BC, List<Account> scope) {
        //Set to hold the account name with billing address strings
        Set<String> setAccountNameWithBillingAddressStrings = new Set<String>();
        //Map to hold the Account records corresponding the account name and billing address value as key
        Map<String, List<Account>> mapNameBillingAddressKeyWithAccounts = new Map<String, List<Account>>();
        //This map is to hold the Parent Id and List of List of Accounts. One list will have maximum 2 child records.
        //Standard merge statement allows to merge 3 records at a time, so List of list will hold list of 2 child records
        Map<Id, List<List<Id>>> mapParentAccountWithListOfChildrenAccounts = new Map<Id, List<List<Id>>>();
        //This mapis to hold the updated Parent data always
        Map<Id, Account> mapParentAccount = new Map<Id, Account>();
        //Loop through account records in scope
        for(Account acc : scope) {
            //Checking value in set
            if(!(setAccountNameWithBillingAddress.contains(acc.Name.trim().toLowerCase()))) {
                //Populate set with values
        //Check set for size
        if(setAccountNameWithBillingAddressStrings != null) {
            //Loop through eligible account records
            for(Account account : Database.query('SELECT ' + accountFieldsNamesString + ' FROM Account WHERE Name IN : setAccountNameWithBillingAddressStrings AND Name != null ORDER By CreatedDate ASC')) {
                //Key String
                String keyString = '';
                String billingAddressString = '';
                //Appending account billing address field values in key string after performing validation on them
                if(account.BillingStreet != null)
                    billingAddressString += account.BillingStreet.trim().toLowerCase(); 
                    billingAddressString += null;
                if(account.BillingCity != null)
                    billingAddressString += account.BillingCity.trim().toLowerCase(); 
                    billingAddressString += null;
                if(account.BillingState != null)
                    billingAddressString += account.BillingState.trim().toLowerCase(); 
                    billingAddressString += null;
                if(account.BillingCountry != null)
                    billingAddressString += account.BillingCountry.trim().toLowerCase(); 
                    billingAddressString += null;
                if(account.BillingPostalCode != null)
                    billingAddressString += account.BillingPostalCode.trim().toLowerCase(); 
                    billingAddressString += null;
                //Formation of key string with the help of account name and billing address string
                keyString = account.Name.trim().toLowerCase() + Constants.SEPERATOR + billingAddressString;
                System.debug('@@@@@ keyString ' + keyString);
                //Check for key value in map
                if(mapNameBillingAddressKeyWithAccounts.containsKey(keyString)) {
                    //Get the Values of the Map and add Id to it.
                } else {
                    //Creat a new Set at values and add Id to it.
                    mapNameBillingAddressKeyWithAccounts.put(keyString, new List<Account>{account}); 
                System.debug('@@@@@ mapNameBillingAddressKeyWithAccounts ' + mapNameBillingAddressKeyWithAccounts);
                //Loop through map keys
                for(String key : mapNameBillingAddressKeyWithAccounts.keySet()) {
                    //Checking if we have more than one account record in the list corresponding to the account name, billingaddress combined string key
                    if(mapNameBillingAddressKeyWithAccounts.get(key) != null && mapNameBillingAddressKeyWithAccounts.get(key).size() >= 1) {
                        //Account record having oldest created date stamped on it will become parent of other dup recods
                        Account parentAccount = mapNameBillingAddressKeyWithAccounts.get(key)[0];
                        //Set Parent in Map with latest Values
                        mapParentAccount.put(parentAccount.Id, parentAccount);
                        //Add a default list
                        List<List<Id>> lstOfLst = new List<List<Id>>();
                        lstOfLst.add(new List<Id>());
                        mapParentAccountWithListOfChildrenAccounts.put(parentAccount.Id, lstOfLst);
                        //Lopp through the child records
                        //Set all the null field in Parent with child data if child have not null value
                        for(Integer i=1; i<mapNameBillingAddressKeyWithAccounts.get(key).size(); i++) {
                            //Dup Child Account record
                            Account childAccount = mapNameBillingAddressKeyWithAccounts.get(key)[i];
                            //Loop through set having Account object fields API Name with it
                            for(String accountFieldAPIName : setAccountFieldsNameString) {
                                //Checking for value in child with respect to Parent
                                if(parentAccount.get(accountFieldAPIName) == null && childAccount.get(accountFieldAPIName) != null) {
                                    //Populating Instance with value
                                    parentAccount.put(accountFieldAPIName, childAccount.get(accountFieldAPIName)); 
                            //Put the latest innstance of Parent Account in Map
                            mapParentAccount.put(parentAccount.Id, parentAccount);
                            //Get List from Marging Map
                            List<List<Id>> mergingAccounts = mapParentAccountWithListOfChildrenAccounts.get(parentAccount.Id);
                            //Chekcif list size has been reached to 2, add a new List and add account in that
                            if(mergingAccounts[mergingAccounts.size() - 1].size() == 2) {
                                //Add a new List
                                mergingAccounts.add(new List<Id>()); 
                            //Add Child record in List
                            mergingAccounts[mergingAccounts.size() - 1].add(childAccount.Id);
                            //Put this list back in original map
                            mapParentAccountWithListOfChildrenAccounts.put(parentAccount.Id, mergingAccounts);
            System.debug('@@@@@@ value in mapParentAccount ' + mapParentAccount);
            System.debug('@@@@@@ value in mapParentAccountWithListOfChildrenAccounts ' + mapParentAccountWithListOfChildrenAccounts);
            //Start Merging Process
            for(Account pAccount : mapParentAccount.values()) {
                //Get merging list and start merging process
                if(mapParentAccountWithListOfChildrenAccounts.containsKey(pAccount.Id)) {
                    //Loop through the merging list
                    for(List<Id> accounts : mapParentAccountWithListOfChildrenAccounts.get(pAccount.Id)) {
                        if(accounts != null && accounts.size() > 0) {
                            System.debug('###### accounts ' + accounts);
                            //Merge statement for merging of the child records with respect to Parent Account record
                            merge pAccount accounts;
    //Finish Method
    global void finish(Database.BatchableContext BC) {

Here is the Test Class

* Description : Test Class for Batch_MergeDuplicateTransactionAccounts.
* Created Date : 11-27-2013
* Revisiion Logs : V_1.0 - Created
* Code Coverage : 100%
private class Test_Batch_MergeDuplicateTxnAccounts {
    //Test method
    static testMethod void myUnitTest() {
        //List to hold account records
        List listAccounts = new List();
        //Create Account with iteration of count
        for(integer i = 1 ; i <= 100 ; i++) { //Populating the list of Account records listAccounts.add(new Account(Name = 'Test1' , BillingCity = 'TestCity' , BillingState = 'TestState' , BillingPostalCode = '85004' , BillingStreet = 'TestStreet' , BillingCountry = 'US')); } listAccounts.add(new Account(Name = 'Test2' , BillingCity = 'TestCity' , BillingState = 'TestState' , BillingPostalCode = '85005' , BillingStreet = 'TestStreet' , BillingCountry = 'US')); listAccounts.add(new Account(Name = 'Test2'));                  //Insert accounts insert listAccounts; //List to hold contact records List listContacts = new List();
            //Populate the list with contact records
            listContacts.add(new Contact(FirstName = 'Test' , LastName = 'Contact' , AccountId = listAccounts[1].Id));
            listContacts.add(new Contact(FirstName = 'Test1' , LastName = 'Contact1' , AccountId = listAccounts[2].Id));
            //Insert contacts
            insert listContacts;
            //Test start from here
            //Batch Initializing
            Batch_MergeDuplicateTransactionAccounts controller = new Batch_MergeDuplicateTransactionAccounts();
            //Execute Batch
            Database.executeBatch(controller , 200);
            //Test stop here
            //Query to get account records
            listAccounts = [SELECT ID , (SELECT ID From Contacts) FROM Account];
            //Assert for results
            System.assertEquals(listAccounts.size() , 3);
            System.assertEquals(listAccounts[0].contacts.size() , 2);

Thanks & Cheers,
Hope helped someone.

No comments:

Post a Comment