(How the Nonprofit Starter Pack handles unhandled Apex error messages)
Introduction
If you’re like me, somewhere in your phone’s contact list is a trusty old friend, ‘ApexApplication’. And if you’re even MORE like me, this friend emails you a lot, perhaps overwhelmingly often. Sure, the message syntax varies slightly, “Dear Kevin, UnexpectedException” or “Dear Kevin, FIELD_CUSTOM_VALIDATION_EXCEPTION”, but the meaning is always the same: your code failed to the point of no recourse.
When I was a consultant, we might get a couple of these a day for a current or former client, and the remedy was easy. You’d look at the message, try to distill the behavior or function that caused the error, and pick up the phone and call the client. Maybe we would patch the code, maybe we would ask the client to further explain what they were doing, or maybe we’d throw our hands up and write it off as a one-time issue.
When I came to the Salesforce.com Foundation, I quickly realized personally calling each and every client with an error message was probably not going to work. We have over 53,000 License records in our License Management Application (LMA) on seven different packages (the five core NPSP packages plus the old template and the template converter), and 150 different package versions in the wild.
While a sobering reminder of your own fallibility, these messages can also serve as an early warning system, and we needed a way to view errors in aggregate. Instead of spotting individual messages, we wanted to look for trends that pointed to a client in serious trouble (maybe some code or customizations on top of the NPSP were conflicting with our package), or a package with a serious flaw.
The Model
How to do this, though? The ‘ah-ha!’ moment was the realization that error messages are just emails, and Salesforce knows how to handle emails! Our basic architecture looks like this:
(1. User instance generates an Apex Application Error. 2. Message is sent to the email listed in the package license. 3. Package license email automatically forwards to an Apex Email Service address in the LMO. 4. LMO Apex Class picks up error email, parses the email, and creates an Error Message custom object, attaching it to the appropriate License record based on Org ID)
When a user generates an error in any of the five Nonprofit Starter Pack packages, that error is automatically sent to the email address associated with the DOT creator. That email address runs a rule on its incoming emails, and forwards Apex Application errors to the email address associated with our email service. (More info on this can be found here, here, here and here.) When our email service receives a valid Apex Application email, it calls the associated class to process that email.
The Code
The NPSPErrorProcessor class that lives in our License Management Organization (LMO) is the meat of this design. The original version of this had delightfully long and inefficient looping statements to manually parse the email for as much information as possible to record to our new Error Message record. This worked, but was verbose, inefficient and pretty ugly. Fortunately, our Foundation fellow at the time, Akhilesh Gupta, plays around with regex for fun (seriously), and replaced my hideous FOR loops with this:
'005[A-Za-z0-9]{12}+/(00D[A-Za-z0-9]{12}+)[\\r\\n]+(.+?)[\\r\\n]+caused by:[\\s]*System\\.([^:]*)[\\s]*:(.+)[\\r\\n]+(?:(?:Class)|(?:Trigger))\\.([\\w]*)\\.([\\w]*)\\.?([\\w]*):'
Breaking Down the Regex
Loosely translated, this regex breaks the email out into some easily parsed groups. Let’s check it out:
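One way to see the groups is to run the pattern against a sample message. The sketch below is plain Java rather than Apex (Apex's Pattern and Matcher classes follow Java's regular-expression syntax, so the pattern carries over unchanged). The class name ErrorEmailRegexDemo, the parse helper, and the sample body (modeled on the email built in the test method) are illustrative assumptions, not part of the NPSP code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ErrorEmailRegexDemo {

    // The pattern from the post, unchanged, written as a Java string literal.
    static final Pattern EMAIL_PATTERN = Pattern.compile(
        "005[A-Za-z0-9]{12}+/(00D[A-Za-z0-9]{12}+)[\\r\\n]+(.+?)[\\r\\n]+"
        + "caused by:[\\s]*System\\.([^:]*)[\\s]*:(.+)[\\r\\n]+"
        + "(?:(?:Class)|(?:Trigger))\\.([\\w]*)\\.([\\w]*)\\.?([\\w]*):");

    // Sample body: an assumption, modeled on the email in the test method.
    static final String SAMPLE =
        "Apex script unhandled exception by user/organization: "
        + "005400000017nGe/00D123456789987\n"
        + "Scheduled job Nightly Opportunity Rollup threw unhandled exception.\n"
        + "caused by: System.TestException: Attempted to schedule too many "
        + "concurrent batch jobs in this org (limit is 5).\n"
        + "Class.test1.OpportunityRollups.rollupAllContacts: line 839, column 23\n"
        + "External entry point";

    // Returns the trimmed groups indexed by group number (index 0 is the
    // whole match), or null when the body does not match the pattern.
    static String[] parse(String body) {
        Matcher m = EMAIL_PATTERN.matcher(body);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i).trim();
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] labels = {"(whole match)", "org ID", "description",
                           "exception type", "short message",
                           "namespace", "class", "method"};
        String[] g = parse(SAMPLE);
        for (int i = 1; i < g.length; i++) {
            System.out.println(i + " " + labels[i] + ": " + g[i]);
        }
    }
}
```

Read against the sample, group 1 is the subscriber org ID, group 2 the one-line description that follows the IDs, group 3 the exception type, group 4 the short message, and groups 5 to 7 the namespace, class, and method from the first stack line. Because the third dotted segment is optional, a stack line with no namespace prefix (for example Class.MyClass.myMethod:) would still match, but the class name would land in group 5 and the method in group 6.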
Let’s see it in context with the rest of the class:
[java]
global class NPSPErrorProcessor implements Messaging.InboundEmailHandler {

    global Messaging.InboundEmailResult handleInboundEmail(Messaging.InboundEmail email, Messaging.InboundEnvelope envelope) {
        Messaging.InboundEmailResult result = new Messaging.InboundEmailResult();
        Error_Message__c em = new Error_Message__c(Message__c = email.plainTextBody);

        //convert email body to lowercase to avoid case mismatches
        string lcEmailBody = email.plainTextBody.toLowerCase().trim();

        //First, determine the error type - we'll search the
        //body for some generic clues to determine the type
        string emErrorContext = '';
        if (lcEmailBody.contains('custom_validation'))
            emErrorContext = emErrorContext + 'Validation Error;';
        if (lcEmailBody.contains('apex script unhandled trigger exception'))
            emErrorContext = emErrorContext + 'Apex Trigger;';
        if (lcEmailBody.contains('batch'))
            emErrorContext = emErrorContext + 'Batch Apex;';
        if (lcEmailBody.contains('visualforce'))
            emErrorContext = emErrorContext + 'Visualforce;';
        if (lcEmailBody.contains('apex script unhandled exception'))
            emErrorContext = emErrorContext + 'Apex Class;';

        //Finally, if we haven't found anything, dump it in 'other'
        if (emErrorContext.length() < 2)
            emErrorContext = 'Other;';
        em.Error_Context__c = emErrorContext;

        string regex = '005[A-Za-z0-9]{12}+/(00D[A-Za-z0-9]{12}+)[\\r\\n]+(.+?)[\\r\\n]+caused by:[\\s]*System\\.([^:]*)[\\s]*:(.+)[\\r\\n]+(?:(?:Class)|(?:Trigger))\\.([\\w]*)\\.([\\w]*)\\.?([\\w]*):';
        Pattern emailPattern = Pattern.compile(regex);
        Matcher emailMatcher = emailPattern.matcher(email.plainTextBody);

        if (emailMatcher.find()){
            //These will fail if no ID is found - however, both are parents and required for em
            sfLma__License__c errorLicense = [
                select id, sfLma__Package_Version__r.sfLma__Package__r.id, sfLma__Package_Version__r.id
                from sflma__License__c
                where sfLma__Subscriber_Org_ID__c = :emailMatcher.group(1).trim()
                and sflma__Status__c = 'Active'
                and sfLma__Package_Version__r.sfLma__Package__r.Namespace__c = :emailMatcher.group(5).trim()];
            em.License__c = errorLicense.id;
            em.Package__c = errorLicense.sfLma__Package_Version__r.sfLma__Package__r.id; // [select id from sfLma__Package__c where Namespace__c = :emailMatcher.group(5).trim()].id;
            em.Package_Version__c = errorLicense.sfLma__Package_Version__r.id;
            //The Account is matched on the same org ID as the License
            em.Account__c = [select id from Account where Organization_ID__c = :emailMatcher.group(1).trim()].id;
            em.Exception_Type__c = emailMatcher.group(3).trim();
            em.Short_Message__c = emailMatcher.group(4).trim();
            em.Class_Name__c = emailMatcher.group(6).trim();
            em.Method_Name__c = emailMatcher.group(7).trim();
            insert em;
        }
        return result;
    }//close handler method
[/java]
And, our test method:
[java]
    static testMethod void NPSPErrorProcessorTEST(){
        // Create a new email and envelope object
        Messaging.InboundEmail email = new Messaging.InboundEmail();
        Messaging.InboundEnvelope env = new Messaging.InboundEnvelope();

        //Create a dummy account
        string idstring = '00D123456789987';
        Account a = new Account(Name = 'TestAccount', Organization_ID__c = idstring);
        insert a;

        //Create new LMA package instance so we can check namespace matching
        sfLma__Package__c p = new sfLma__Package__c(Namespace__c = 'test1', Name = 'TestPackage');
        insert p;

        //Create a new Version
        sfLma__Package_Version__c v = new sfLma__Package_Version__c(Name = 'VersionTest', sfLma__Package__c = p.id);
        insert v;

        //Create a new License
        sfLma__License__c l = new sfLma__License__c(sfLma__Seats__c = 1, sfLma__Status__c = 'Active', sfLma__Subscriber_Org_ID__c = idstring, sfLma__Package_Version__c = v.id);
        insert l;

        //Create a dummy error message using the test org and package.
        //The context keywords share one line with the first sentence because the
        //regex allows only a single line between the IDs and 'caused by:'.
        email.plainTextBody = 'Apex script unhandled exception by user/organization: 005400000017nGe/' + idstring + '\n' +
            'Scheduled job Nightly Opportunity Rollup threw unhandled exception.' +
            'apex script unhandled trigger exception ' + 'visualforce ' + 'custom_validation\n' +
            'caused by: System.TestException: Attempted to schedule too many concurrent batch jobs in this org (limit is 5).\n' +
            'Class.test1.OpportunityRollups.rollupAllContacts: line 839, column 23\n' +
            'Class.test1.OpportunityRollups.rollupAll: line 794, column 3\n' +
            'Class.test1.SCHED_OppRollup.execute: line 5, column 9\n' +
            'External entry point';
        email.fromAddress ='test@test.com';
        email.subject = 'Fwd: Developer script exception from TestAccount: Nightly Opportunity Rollup : Attempted to schedule too many concurrent batch jobs in this org (limit is 5).';

        NPSPErrorProcessor npspep = new NPSPErrorProcessor();
        Test.startTest();
        Messaging.InboundEmailResult result = npspep.handleInboundEmail(email, env);
        Test.stopTest();

        Error_Message__c em = [
            select e.Short_Message__c, e.Package_Version__c, e.Class_Name__c, e.Method_Name__c,
                e.Package__c, e.Org_Name__c, e.Message__c, e.Exception_Type__c,
                e.Error_Context__c, e.Account__c
            From Error_Message__c e
            where e.Exception_Type__c = 'TestException'];
        system.assertEquals(em.Package__c, p.id);
        system.assertEquals(em.Exception_Type__c, 'TestException');
        system.assertEquals(em.Account__c, a.id);
        system.assert(em.Error_Context__c.contains('Batch Apex'));
        system.assert(em.Error_Context__c.contains('Apex Class'));
        system.assert(em.Error_Context__c.contains('Visualforce'));
        system.assert(em.Error_Context__c.contains('Validation Error'));
        system.assertEquals(em.Class_Name__c, 'OpportunityRollups');
        system.assertEquals(em.Method_Name__c, 'rollupAllContacts');
        system.assertEquals(em.Package_Version__c, v.id);
    }
}//Close class
[/java]
Breaking Down the Class Code
We use the String instance method contains() to look for the error context (Validation, Batch Apex, VF, etc.), and then use our regex to parse the rest of the email. We find the License by matching the org ID and the package namespace from the email against our License records. The Package and Package Version come from the License record, while the Account is looked up separately by the same org ID. The exception type is parsed to provide metadata about the Error Message, as are the short message and the class and method causing the error. Our newly generated Error Message looks like this (note, this error message is not from the email above, but rather from the email listed in its ‘Message’ field, which contains the original full-text email):
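To make the contains() step concrete, here is a minimal sketch of the context detection on its own, in plain Java rather than Apex. The keyword checks and labels are copied from the class above; the names ErrorContextDemo and classify are hypothetical, chosen only for this example.

```java
import java.util.Locale;

final class ErrorContextDemo {

    // Mirrors the keyword checks in handleInboundEmail: every matching
    // keyword appends its label, so one email can carry several contexts.
    static String classify(String plainTextBody) {
        String lc = plainTextBody.toLowerCase(Locale.ROOT).trim();
        StringBuilder context = new StringBuilder();
        if (lc.contains("custom_validation")) {
            context.append("Validation Error;");
        }
        if (lc.contains("apex script unhandled trigger exception")) {
            context.append("Apex Trigger;");
        }
        if (lc.contains("batch")) {
            context.append("Batch Apex;");
        }
        if (lc.contains("visualforce")) {
            context.append("Visualforce;");
        }
        if (lc.contains("apex script unhandled exception")) {
            context.append("Apex Class;");
        }
        // Nothing recognized: fall back to the catch-all label.
        return context.length() == 0 ? "Other;" : context.toString();
    }

    public static void main(String[] args) {
        System.out.println(classify(
            "Apex script unhandled exception: too many concurrent batch jobs"));
        System.out.println(classify("FIELD_CUSTOM_VALIDATION_EXCEPTION"));
        System.out.println(classify("something unrecognized"));
    }
}
```

The checks are independent rather than an if/else chain, which is why a single scheduled-job failure can be tagged both 'Batch Apex' and 'Apex Class'. The semicolon-separated result matches the format Salesforce uses for multi-select picklist values, so Error_Context__c is presumably a field of that type, though the post does not say so.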
Analyzing Our Data
These Error Message records now allow us to generate useful dashboards for near-real-time monitoring of the health of our package eco-system. Each morning, I check for spikes in Error Messages from the day/night before and, if I see any, attempt to address the situation as appropriate. Our Error Message dashboard provides me with all the information I need to quickly assess our package eco-system health.
If we look at Error Messages by Date we see a few interesting things. The initial spike on 1/21 was the load of all existing Apex Application emails that had been piling up in the inbox of our DOT user. The real win came with the spike on 2/10. We noticed a sudden surge in errors in our opportunityRollup class (a particularly complex piece of code that aggregates opportunity/donation totals to various non-parent objects, like contacts and households). We were able to reach out to the two organizations that were generating 90%+ of these errors, help them identify the root cause, and provide solutions where needed. (Custom validations + bulk data import = Lots of Errors).
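The 2/10 spike is a good example of what a grouped count surfaces: a couple of orgs producing most of the errors. The dashboards do this grouping inside Salesforce; purely to illustrate the arithmetic, here is a minimal sketch in plain Java (not Apex or a report definition). The class ErrorTrendDemo, its two helpers, and the org labels in main are all made up for the example and are not real NPSP data.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ErrorTrendDemo {

    // Counts errors per key (for example, one subscriber org ID per error),
    // keeping keys in first-seen order.
    static Map<String, Integer> countBy(List<String> keys) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Fraction of all errors (0.0 to 1.0) produced by the n busiest keys.
    static double topShare(Map<String, Integer> counts, int n) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        if (total == 0) {
            return 0.0;
        }
        int top = counts.values().stream()
            .sorted(Comparator.reverseOrder())
            .limit(n)
            .mapToInt(Integer::intValue)
            .sum();
        return (double) top / total;
    }

    public static void main(String[] args) {
        // Hypothetical org labels, one entry per logged error.
        List<String> orgPerError =
            Arrays.asList("orgA", "orgB", "orgA", "orgC", "orgA", "orgB");
        Map<String, Integer> counts = countBy(orgPerError);
        System.out.println(counts);
        System.out.println("share from top 2 orgs: " + topShare(counts, 2));
    }
}
```

A high share from a small number of orgs suggests a problem local to those orgs (such as a conflicting customization), while errors spread evenly across many orgs point more toward a flaw in the package itself.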
We can also look at most frequent offending classes and packages to identify areas of our code that could use a little love, better documentation, or better usage instructions. It’s also easy to see the most common problems our users have. Some interesting facts:
Conclusion
As you can see, there’s lots of interesting information to discern from the data we now have using Inbound Email Services. Being able to automatically accept and parse this information in near real-time and have Salesforce drive our visualizations using simple dashboards provides a level of visibility into our eco-system that was previously near-impossible without serious custom charting and coding. Email handlers are remarkably flexible, providing a common endpoint into your system for a variety of automated information from any standardized source (not to mention the SOAP and REST APIs!).
In addition, being able to bind the results into an existing data structure, in this case our LMA, gives us the opportunity to cross-reference data we wouldn’t otherwise have. The power of being able to quickly spot organizations that may be in trouble and identify a fix proactively, without them filing a support case or picking up the phone, is a testament to the power of the Cloud. Imagine calling your users and saying: “It seems like you might be having a problem, I can help you fix that right now.”
There’s an array of emails you receive on a daily basis that would be useful to view in aggregate. I’d love to hear about your creative uses for the Inbound Email Services, or other interesting Force.com hacks to help you manage your own instance, LMA/LMO, or entire eco-system.