Troubleshooting Provisioning Failures
Provisioning failures fall into a few predictable categories. This deep-dive covers the most common error patterns, how to diagnose them systematically, and how to recover from quarantine and escrow situations.
Diagnostic Approach
Before diving into specific error types, establish a consistent diagnostic workflow:
- Check the provisioning progress bar for quarantine status and overall health.
- Read the provisioning logs, filtered by status = Failure or Skipped.
- Inspect the provisioning steps in the log entry for the failing user. The step detail shows exactly where the process broke down.
- Test with on-demand provisioning to reproduce the failure in isolation and get detailed step-by-step output.
- Check the target application directly. Sometimes the root cause is on the app side (API down, rate limiting, schema changes).
The provisioning logs are the single most important diagnostic tool. Every operation, successful or failed, is recorded with full detail about what the engine attempted and what response it received.
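As a rough illustration of the log-filtering step, the sketch below groups failed and skipped entries by error code so the most common pattern surfaces first. It assumes the entries have already been exported as dictionaries shaped like Graph provisioning log records (a provisioningStatusInfo object with status and errorInformation.errorCode). The sample data and its error codes are made up for illustration.

```python
from collections import Counter


def summarize_failures(entries):
    """Count failed and skipped entries, grouped by (status, error code)."""
    counts = Counter()
    for entry in entries:
        info = entry.get("provisioningStatusInfo") or {}
        status = str(info.get("status", "")).lower()
        if status not in ("failure", "skipped"):
            continue
        error = info.get("errorInformation") or {}
        counts[(status, error.get("errorCode", "unknown"))] += 1
    # Most frequent pattern first, so the biggest problem is at the top.
    return counts.most_common()


# Hypothetical sample data; the error codes are invented for this example.
sample_entries = [
    {"provisioningStatusInfo": {"status": "success"}},
    {"provisioningStatusInfo": {"status": "failure",
                                "errorInformation": {"errorCode": "MissingAttribute"}}},
    {"provisioningStatusInfo": {"status": "failure",
                                "errorInformation": {"errorCode": "MissingAttribute"}}},
    {"provisioningStatusInfo": {"status": "skipped"}},
]

for (status, code), count in summarize_failures(sample_entries):
    print(status, code, count)
```

Sorting by frequency is a design choice: fixing the most common failure pattern first is usually the fastest way to bring the overall failure rate down.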
Common Error Patterns
Missing Required Attributes
Symptom: Provisioning log shows a failure with a message like “Required attribute [attributeName] is missing” or the target API returns a 400 error about a missing field.
Cause: The source user does not have a value for an attribute that the target application requires, and no default value or expression is configured in the mapping.
Fix:
- Check the failing user’s attributes in Entra ID. Is the required attribute populated?
- If the attribute is commonly empty, add a default value in the attribute mapping. For example, set a default department of “Unassigned” for users without a department.
- If the attribute should come from a different source attribute, update the mapping to use an expression with a fallback: IIF(IsNullOrEmpty([department]), "Unassigned", [department]).
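To make the fallback behavior concrete, here is a minimal Python emulation of the IIF(IsNullOrEmpty(...)) pattern. The helper names (is_null_or_empty, iif, map_department) are hypothetical stand-ins for the mapping functions, not the provisioning engine's implementation.

```python
def is_null_or_empty(value):
    """Mimic IsNullOrEmpty: true for a missing value or an empty string."""
    return value is None or value == ""


def iif(condition, if_true, if_false):
    """Mimic IIF: pick one of two values based on a condition."""
    return if_true if condition else if_false


def map_department(user):
    """Emulate IIF(IsNullOrEmpty([department]), "Unassigned", [department])."""
    department = user.get("department")
    return iif(is_null_or_empty(department), "Unassigned", department)


print(map_department({"department": "Finance"}))
print(map_department({"department": ""}))
print(map_department({}))
```

With this mapping the target always receives a value, so a required-attribute check on the application side no longer fails for users without a department.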
Unique Constraint Violations (Duplicate Values)
Symptom: The target application returns a 409 Conflict or a message about a duplicate value for an attribute like userName or email.
Cause: Another user in the target system already has the same value for a uniqueness-enforced attribute. This commonly happens when:
- A user was previously provisioned manually and a duplicate entry exists.
- Two source users map to the same target attribute value due to an expression issue.
- The user was soft-deleted and re-created in Entra ID but the old entry persists in the target system.
Fix:
- Check the target application for the conflicting entry. Remove or update it if it is stale.
- Review the attribute mapping expression to ensure it produces unique values. For example, if Join(".", [givenName], [surname]) creates duplicates for people with the same name, add a disambiguator like Join(".", [givenName], [surname], Left([objectId], 4)).
- Use on-demand provisioning to test the updated mapping before restarting the service.
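Before changing a mapping, it can help to check offline whether the expression would collide for your user population. The sketch below emulates the two Join expressions in Python and reports values produced by more than one user. The function names and sample users are hypothetical.

```python
from collections import defaultdict


def build_username(user, disambiguate=False):
    """Emulate Join(".", [givenName], [surname]) with an optional Left([objectId], 4)."""
    parts = [user["givenName"], user["surname"]]
    if disambiguate:
        parts.append(user["objectId"][:4])
    return ".".join(parts)


def find_collisions(users, disambiguate=False):
    """Return each generated value that maps to more than one objectId."""
    by_value = defaultdict(list)
    for user in users:
        by_value[build_username(user, disambiguate)].append(user["objectId"])
    return {value: ids for value, ids in by_value.items() if len(ids) > 1}


# Hypothetical users who share a name.
sample_users = [
    {"givenName": "Ana", "surname": "Silva", "objectId": "1a2b3c4d-0000"},
    {"givenName": "Ana", "surname": "Silva", "objectId": "9f8e7d6c-0000"},
    {"givenName": "Bo", "surname": "Lee", "objectId": "5555aaaa-0000"},
]

print(find_collisions(sample_users))
print(find_collisions(sample_users, disambiguate=True))
```

Note that four characters of an object ID reduce collisions but do not guarantee uniqueness, so rerunning a check like this after each mapping change is worthwhile.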
Matching Failures
Symptom: The provisioning engine either creates a new user in the target system even though the user already exists there, or fails to match and reports a matching error.
Cause: The matching attribute values do not align between the source and target systems. For example, the matching attribute is userPrincipalName to userName, but the existing user in the target app was created with a different username format.
Fix:
- Review the matching attribute configuration in the attribute mappings. Ensure the matching attribute actually contains the same value in both systems.
- Consider adding a second matching attribute with a lower precedence (e.g., match on email if UPN matching fails).
- If users were pre-provisioned manually, you may need to update the target user records to include the correct matching attribute value before enabling automatic provisioning.
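One way to find mismatches before enabling automatic provisioning is to compare exported source and target records on the matching attribute. This is a minimal sketch, assuming both sides are available as lists of dictionaries. The function name is hypothetical, and the case-insensitive comparison is an assumption that may not mirror how a given target application compares values.

```python
def find_unmatched(source_users, target_users,
                   source_attr="userPrincipalName", target_attr="userName"):
    """Return source matching values that have no counterpart in the target."""
    # Assumption: compare case-insensitively; adjust if the target is case-sensitive.
    target_values = {str(t.get(target_attr, "")).lower() for t in target_users}
    unmatched = []
    for user in source_users:
        value = str(user.get(source_attr, ""))
        if value.lower() not in target_values:
            unmatched.append(value)
    return unmatched


# Hypothetical exports from Entra ID and from the target application.
source_users = [
    {"userPrincipalName": "ana@contoso.com"},
    {"userPrincipalName": "bo@contoso.com"},
]
target_users = [
    {"userName": "Ana@contoso.com"},
    {"userName": "bo.lee"},  # created manually with a different username format
]

print(find_unmatched(source_users, target_users))
```

Any value in the result would be treated as a new user by the provisioning engine, which is how duplicates or conflicts arise for accounts that were created manually.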
SCIM Compliance Errors
Symptom: The target application returns unexpected HTTP status codes or malformed responses. The provisioning log shows “SCIM compliance issue” or unexpected response formats.
Cause: The target application’s SCIM endpoint does not fully comply with the SCIM 2.0 specification. Common violations include:
- Returning 404 instead of an empty result set for user search queries.
- Not supporting PATCH operations.
- Returning the wrong content-type header.
- Not handling filter parameters correctly.
Fix:
- If this is a gallery app, check Microsoft’s known issues list for the application.
- If this is a custom SCIM endpoint, validate it against Microsoft’s SCIM validator tool.
- Contact the application vendor if their endpoint is non-compliant.
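For a custom endpoint, the first violation in the list above is easy to check by hand. The sketch below inspects a response to a user search that matched nothing and reports deviations from what SCIM 2.0 expects: HTTP 200, the application/scim+json media type, and a ListResponse with totalResults of 0. The function name is hypothetical, the response is passed in as plain values rather than fetched, and the content-type check is deliberately strict.

```python
LIST_RESPONSE_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:ListResponse"


def check_empty_search_response(status_code, headers, body):
    """Return a list of problems found in a response to a no-match user search."""
    problems = []
    body = body or {}
    if status_code != 200:
        problems.append(f"expected HTTP 200 for an empty result set, got {status_code}")
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value for name, value in headers.items()}
    media_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    if media_type != "application/scim+json":
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    if LIST_RESPONSE_SCHEMA not in body.get("schemas", []):
        problems.append("body is not a SCIM ListResponse")
    if body.get("totalResults") != 0:
        problems.append("totalResults should be 0 when nothing matches")
    return problems


compliant = check_empty_search_response(
    200,
    {"Content-Type": "application/scim+json; charset=utf-8"},
    {"schemas": [LIST_RESPONSE_SCHEMA], "totalResults": 0, "Resources": []},
)
non_compliant = check_empty_search_response(404, {"Content-Type": "text/html"}, None)
print(compliant)
print(non_compliant)
```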
Invalid Admin Credentials
Symptom: All provisioning operations fail. The provisioning job enters quarantine with reason “EncounteredQuarantineException” or “Invalid credentials.”
Cause: The admin credentials (API token, OAuth token, or username/password) for the target application are expired, revoked, or incorrect.
Fix:
- Navigate to Provisioning > Admin Credentials.
- Re-enter valid credentials.
- Click Test Connection to verify.
- Restart provisioning.
For OAuth-based gallery apps, you may need to re-authorize by clicking the authorization button, which redirects to the app’s OAuth flow.
Reference Attribute Failures
Symptom: User provisioning succeeds, but manager assignments or group memberships fail with reference resolution errors.
Cause: The referenced object (the manager or the group) does not exist yet in the target system, or its ID cannot be resolved. This is common during initial cycles when a manager has not been provisioned before their reports.
Fix:
- Reference failures are typically self-healing. The provisioning engine retries reference assignments in subsequent cycles. After the referenced user is created, the reference is resolved.
- These failures do not count toward the quarantine escrow threshold.
- If reference failures persist, ensure the referenced user is in scope for provisioning and has been successfully created.
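The self-healing behavior can be pictured with a toy simulation. This is a simplified model, not the engine's actual algorithm: each cycle retries references left over from earlier cycles, then creates users in order and records a reference as unresolved if the manager does not exist yet. All names are hypothetical.

```python
def run_cycle(batch, provisioned, pending):
    """Run one simplified cycle and return the references still unresolved."""
    unresolved = []
    # Retry references that could not be resolved in earlier cycles.
    for user_id, manager_id in pending:
        if manager_id not in provisioned:
            unresolved.append((user_id, manager_id))
    # Create new users in order; a reference fails if the manager is not there yet.
    for user in batch:
        provisioned.add(user["id"])
        manager_id = user.get("managerId")
        if manager_id and manager_id not in provisioned:
            unresolved.append((user["id"], manager_id))
    return unresolved


provisioned = set()
# The report is processed before the manager, so the reference fails at first.
first = run_cycle([{"id": "report", "managerId": "boss"}, {"id": "boss"}], provisioned, [])
# The next cycle retries the pending reference and resolves it.
second = run_cycle([], provisioned, first)
print(first)
print(second)
```

If the manager is never created, for example because they are out of scope, the reference stays in the unresolved list on every cycle, which corresponds to the persistent case described in the last bullet.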
Quarantine Recovery
When a provisioning job enters quarantine, follow this recovery procedure:
Step 1: Identify the Quarantine Reason
Check the provisioning progress bar or query the Graph API:
GET /servicePrincipals/{id}/synchronization/jobs/{jobId}
The status.quarantine.reason field tells you why:
- EncounteredQuarantineException: Credential or connectivity failure.
- EncounteredEscrowProportionThreshold: Too many individual operation failures.
- QuarantinedOnDemand: Manual quarantine by Microsoft support.
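A small helper can turn the job payload into a next action. The sketch below assumes the JSON returned by the request above has already been parsed into a dictionary, and it maps only the two reasons that this guide gives a fix for. The function names and the wording of the suggested actions are illustrative.

```python
NEXT_ACTION = {
    "EncounteredQuarantineException":
        "Update the admin credentials and test the connection.",
    "EncounteredEscrowProportionThreshold":
        "Review the provisioning logs for the most common failure pattern.",
}


def quarantine_reason(job):
    """Read status.quarantine.reason, tolerating missing or null fields."""
    quarantine = (job.get("status") or {}).get("quarantine") or {}
    return quarantine.get("reason")


def next_action(job):
    """Suggest a follow-up step based on the quarantine reason."""
    reason = quarantine_reason(job)
    if reason is None:
        return "Job is not in quarantine."
    return NEXT_ACTION.get(reason, f"No scripted action for reason: {reason}")


# Hypothetical, trimmed-down job payloads.
quarantined_job = {"status": {"quarantine": {"reason": "EncounteredQuarantineException"}}}
healthy_job = {"status": {"quarantine": None}}

print(next_action(quarantined_job))
print(next_action(healthy_job))
```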
Step 2: Fix the Root Cause
- For credential issues: update credentials and test connection.
- For escrow threshold: review the provisioning logs to identify the most common failure pattern. Fix the underlying issue (mapping errors, missing attributes, target API problems).
Step 3: Restart Provisioning
Click Restart provisioning in the portal, or use the Graph API:
POST /servicePrincipals/{id}/synchronization/jobs/{jobId}/restart
{
  "criteria": {
    "resetScope": "Full"
  }
}
Use "resetScope": "QuarantineState" to clear only the quarantine state without forcing a full initial cycle. Use "resetScope": "Full" when you want a clean re-evaluation of all users.
Step 4: Monitor the Recovery
Watch the progress bar and provisioning logs as the initial or incremental cycle runs. Ensure the failure rate drops and the job does not re-enter quarantine.
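To put a number on "the failure rate drops", you can compare the share of failed operations before and after the restart. This is a minimal sketch, assuming log entries exported as dictionaries with a provisioningStatusInfo.status field. The function name and sample data are hypothetical.

```python
def failure_rate(entries):
    """Return the fraction of log entries whose status is failure."""
    if not entries:
        return 0.0
    failures = sum(
        1 for entry in entries
        if str((entry.get("provisioningStatusInfo") or {}).get("status", "")).lower() == "failure"
    )
    return failures / len(entries)


def make_entries(statuses):
    """Build minimal log entries from a list of status strings (test helper)."""
    return [{"provisioningStatusInfo": {"status": status}} for status in statuses]


before_restart = make_entries(["failure", "failure", "failure", "success"])
after_restart = make_entries(["success", "success", "failure", "success"])

print(failure_rate(before_restart), failure_rate(after_restart))
```

If the rate after the restart is not clearly lower, the root cause from Step 2 is probably still present and the job is likely to re-enter quarantine.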
Accidental Deletion Protection
The provisioning service includes built-in protection against accidental mass deletions. If a single cycle would delete more users than the configured threshold (default: 500), the service pauses and sends a notification.
This protects against scenarios like:
- A scoping filter change that accidentally removes all users from scope.
- A source system data issue that makes it appear as if all users were removed.
When deletion protection triggers:
- Review the pending deletions in the provisioning logs.
- If the deletions are intentional (e.g., you changed the scoping filter on purpose), approve them.
- If the deletions are unintentional, fix the scoping configuration and restart provisioning.
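The threshold rule itself is simple to state in code. The sketch below follows the description above, pausing when a cycle would delete more users than the threshold. The function name and return values are hypothetical, and the exact boundary behavior of the service is an assumption taken from that wording.

```python
def evaluate_deletions(pending_deletions, threshold=500):
    """Decide whether a cycle's deletions proceed or need review."""
    count = len(pending_deletions)
    if count > threshold:
        # More deletions than the threshold allows: hold them for review.
        return "paused"
    return "proceed"


# Hypothetical cycle where a scoping change dropped 501 users out of scope.
pending = [f"user-{n}" for n in range(501)]

print(evaluate_deletions(pending))
print(evaluate_deletions(pending[:500]))
print(evaluate_deletions(pending, threshold=1000))
```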
You can configure the deletion threshold via the Graph API:
PATCH /servicePrincipals/{id}/synchronization/jobs/{jobId}
{
  "schedule": {
    "accidentalDeletionThreshold": 100
  }
}
Debugging Attribute Mappings
When attribute values are not flowing correctly (the right user is created but with wrong attribute values), use this process:
- On-demand provision the user and examine the “Attribute Mapping” step. It shows the source value, the mapping expression, and the resulting target value for every mapped attribute.
- Check expression syntax in the expression builder. Paste your expression and test it against a specific user’s attributes.
- Watch for null handling. If a source attribute is null and no default value is configured, the target attribute may be skipped or set to empty. Use IIF(IsNullOrEmpty([attr]), "default", [attr]) patterns for critical attributes.
- Check mapping scope. Some mappings are configured to apply only during creation, not during updates. If an attribute is correct at creation but not updated later, check the “Apply this mapping” setting.
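The mapping-scope point is easier to see with a small model. In this sketch each mapping carries an apply setting of either "always" or "creation_only". These values, the mapping list, and the function name are hypothetical simplifications of the “Apply this mapping” setting, not the service's configuration format.

```python
# Hypothetical mappings: target attribute, source attribute, and when to apply.
MAPPINGS = [
    {"target": "userName", "source": "userPrincipalName", "apply": "always"},
    {"target": "department", "source": "department", "apply": "always"},
    {"target": "nickName", "source": "mailNickname", "apply": "creation_only"},
]


def attributes_to_write(mappings, user, is_creation):
    """Return the target attributes that would flow for a create or an update."""
    result = {}
    for mapping in mappings:
        if mapping["apply"] == "creation_only" and not is_creation:
            continue  # skipped on updates, so later source changes never flow
        result[mapping["target"]] = user.get(mapping["source"])
    return result


user = {
    "userPrincipalName": "ana@contoso.com",
    "department": "Finance",
    "mailNickname": "ana",
}

print(attributes_to_write(MAPPINGS, user, is_creation=True))
print(attributes_to_write(MAPPINGS, user, is_creation=False))
```

The second call omits nickName entirely, which matches the symptom described above: the attribute is correct at creation but never changes afterwards.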
Troubleshooting Checklist
| Symptom | First thing to check |
|---|---|
| No users provisioned | Scoping (assigned users? scoping filters?) |
| Users skipped in logs | Scoping filter evaluation or “not effectively entitled” status |
| Users created but with wrong attributes | Attribute mapping expressions |
| Duplicate users in target | Matching attribute configuration |
| All operations failing | Admin credentials (test connection) |
| Job in quarantine | Quarantine reason in progress bar or Graph API |
| Manager assignments failing | Referenced user not yet provisioned (self-healing) |
| Mass deletions pending | Accidental deletion protection threshold |
Next Steps
- How Provisioning Works covers the engine internals that drive these behaviors.
- Monitoring and Logs explains the monitoring surfaces used in troubleshooting.