In Part 1 I gave an overview of what the upsert operation was and how it compares to insert and update. I also described what an external id field is and why it makes sense to store common unique identifiers from external systems in Salesforce to simplify data integrations so you don’t have to know the Salesforce IDs to update the records.
In Part 2 I will show you how we can extend our use case of external ids to assigning lookup field values on the records we are upserting. For example, when upserting contacts we can also assign the accounts by referencing an external id field on the account rather than the Salesforce record id. No more Excel vlookup necessary!
I will show you how to use the power of upsert and external ids to assign lookup field values three different ways:
- Salesforce’s Data Loader
- MuleSoft’s dataloader.io
- Apex
The same concepts shown here also apply to any other tool you may be using to perform upserts to Salesforce, such as directly with their SOAP API, REST API, or other products like Talend, MuleSoft, Jitterbit, Informatica, etc.
The Data
In our example let’s assume we are loading contacts and assigning their account. We have an external id text field defined on the Account object aptly named “External ID” (External_ID__c). We don’t know the account’s Salesforce ids but we do know their external system id. Our data may look like this .csv file:
Salesforce’s Data Loader
Launch the program then click the Upsert button then choose the object whose records you’re loading. In this example, we’re upserting contacts.
On the next screen, the program will ask which field on the contact to use as the identifier. You can choose between the standard id field or any external id field that may exist on the contact object. In this example we will simply choose ID field knowing that our upsert operation will insert a new contact record.
On the next screen, the program will walk through each lookup field defined on the contact object and ask which identifier on those objects you have in your .csv file to map with. In this example, we’re assigning the account lookup field by referencing the Account.External_ID__c field.
On the field mapping screen, note that normally you would assign a contact’s account by mapping the AccountId to a column in your file that included actual Salesforce account ids. However, since we don’t have the actual Salesforce account id we will choose and drag the Account:External_ID__c field reference to map to our file data. When the upsert operation is performed, Salesforce will automatically look up the related account id by matching on a record whose External ID field equals the value in your file. The Excel VLOOKUP magic is done automatically for you by Salesforce!
MuleSoft’s dataloader.io
Visit https://dataloader.io and login with your Salesforce account. Once logged in, click on New Task then choose Import.
Choose the upsert operation and the object we’re loading records to.
On the field mapping page, select Account ID (this is the Contact object’s AccountId field) as the field that our Account.External_ID__c column maps to then check the Lookup via option and select the External ID field (or whatever you’ve named yours). I also recommend choosing the option to Mark record with an error if more than one match is found to avoid unintended data assignments.
Apex
To perform the upsert lookup field assignment in Apex we need to use a special method on the SObject class putSObject(..). This method takes two parameters. The first lets us specify the relationship field on the record we’re upserting and the second specifies an SObject with one of its external id fields populated so Salesforce can do the lookup matching. Note that for custom lookup fields the relationship name usually ends with __r suffix, but if you’re referencing a standard field then there is no suffix.
Thanks for reading! I hope this two part series has inspired you to begin using the Upsert operation and External IDs in your data loading endeavors.
Hi Doug! Lovely post.
I wanted to investigate a problem I have had sometimes with concurrent upserts. In some cases I was receiving the error “DUPLICATE_VALUE, duplicate value found: SetupOwnerId duplicates value on record with id:…” when two upserts to the same record arrive exactly at the same time. I think this happens because there’s concurrency in the race condition, but I wanted to investigate the issue further more in the future.
Any ideas about this?
Best regards
LikeLiked by 1 person
Hi Alba, thanks for the kind words!
Regarding the duplicate value issue, I believe you understand the issue well that it’s a race condition between the two concurrent upserts. Both upsert attempts if occuring at the same time will do their check to see if a record already exists, both will see that none already exists and then issue an insert operation. The first insert wins and the second insert gets the error.
Curious, what is your use case that you would have concurrent upsert operations occuring?
To avoid the race condition I imagine you would need to queue the requests up and do them synchronously or somehow identify that two or more requets were being fired off very close together and only send one upsert operation.
I look forward to hearing what you discover!
LikeLike
It’s happening because I have a record holding information which is highly probably to be concurrently updated by different users. Even though I retrieve the record with a for update locking select query, that happened sometimes when performing concurrency tests (not very frequently)!!
I will inform you if I find out anything else.
Kind regards!
LikeLiked by 1 person
I thinking that in the situation you explain above, the for update hasn’t got any effect, as when the two upserts are performed the record doesn’t exist yet. That should be the specific situation then. Thanks!
LikeLiked by 1 person
If i have more than one lookup, how do i neglect the other lookups and concentrate on only one. will the dataloader allow me to do with only one object and neglect the rest??
LikeLiked by 1 person
Hi Rohit,
In the field-mapping step of the Data Loader tool you’re using, you assign which column value in your file maps to which External ID of which lookup field you want to populate. You can definitely distinguish between which lookup field you’re wantinig to map/assign to.
Doug
LikeLiked by 1 person
Hi Doug,
Thanks for the informative post. I’m running into this issue with Jitterbit:
I have a custom object (called Event_Result) that I want to upsert to. That object in SF has 2 lookup fields: Member_ID (an external ID on the Contact object) and Event_ID (an external ID on another custom object called Event).
When I upsert, I need to specify both external IDs, as a Member can be registered for multiple Events. The data in the Event_Result object can look like:
Event / Member / Result
101 / 12345 / 75-75
101 / 34555 / 81-82
…
102 / 12345 / 76-78
I only seem to be allowed to select a single External ID as I step through the wizard.
– If I choose Member_ID, Jitterbit fails trying to overwrite 101 with 102.
– If I choose Event_ID, Jitterbit fails trying to overwrite 12345 with 34555.
– If I choose ID, it creates a bunch of duplicate records.
LikeLiked by 1 person
Hi Pierre,
Thanks for the kind feedback!
I’m not familiar with Jitterbit, you may need to consult one of their forums for support on how that programs wants you to specify the external id mappings. Each ETL tool behaves a bit differently.
From a Salesforce upsert API perspective, what you want to do is supported. Upsert an object and specify multiple lookup fields to match via external IDs. Now, to do an upsert you’ll also be asked which field on the Event_Result__c object is the single external id field to know which of those records the API will insert vs update. You can specify a third column in your file which holds the concatenated value “memberId + eventId” and map that to an external id on the Event_Result__c object.
Good luck!
LikeLike
Great post. Any ideas how it can be used with record types?
LikeLiked by 1 person
Hi Hugo, glad you like the post! Not sure that I have ever tried assigning record types not by their ID. You might try dataloader.io that supports record type assignment by developer name.
LikeLike
Hi Doug, Very Helpful post! However I am getting an error : Cannot specify both an external ID reference Account and a salesforce id, AccountId
This is how I am doing it (opty is a reference to opportunity Object) :
Account accountReference = new Account(External_ID__c = (String)customerMap.get(‘id’));
opty.putSObject(‘account’,accountReference);
Any suggestions will be really helpful.
LikeLiked by 1 person
Hello, per the error then my only guess is the Opportunity.AccountId field is populated. Does your code create the opty variable with new Opportunity() constructor or does it come from somewhere else? If somewhere else, then that may be where the AccountId is populated from.
LikeLike
Hi Doug, thank you for your helpful post. I am having trouble upserting to a custom junction object. I have a .csv with Accounts (and their external IDs). What I want to do is to create junction object records to associate the Accounts (in my case, these are Forest Accounts, a custom Record Type) with what County they are located in. I have a custom County object in Salesforce that holds all of the Counties in the U.S. Then I have a custom junction object called “Forest County Relationship.” Each record in Forest County Relationship associates a Forest Account record and a County record (actually, it’s created to allow for a many-to-many relationship so that a forest can relate to many counties and a county can hold many forests, as happens in real life). My .csv file has column headers like “Account External ID”, “Forest Account”, and “County.” For example, “N0105”, “Briarlake Forest”, “DeKalb County”. I am using the SF Data Loader, but I am new at this and am getting all kinds of errors. Do I need to know the County IDs and match them to the counties shown in my .csv? Do I need to create a new custom External ID field in the County object, in the Forest County Relationship object? I am using Salesforce NPSP Lightning Experience. This may be complicated by the fact that there are many identical county names in the United States, so there might be a DeKalb County, GA, and a DeKalb County, IA…I want to be sure to associate to the proper county AND state, but the state field in my Forest County Relationship object is a cross-object formula field…not sure how this plays in, but I think it might. Thanks for any guidance you can provide!
LikeLike