SCIM Directory Sync
HatiData supports SCIM (System for Cross-domain Identity Management) directory sync through Clerk. When configured, user accounts are automatically created, updated, and deactivated in HatiData based on changes in your identity provider's directory (Okta, Azure AD, Google Workspace, etc.).
How Directory Sync Works
Identity Provider (Okta, Azure AD)
│
├── User created in IdP ──▶ Clerk webhook ──▶ HatiData: create user
├── User updated in IdP ──▶ Clerk webhook ──▶ HatiData: update user
├── User deactivated ──▶ Clerk webhook ──▶ HatiData: deactivate user
└── Group membership changed ──▶ Clerk webhook ──▶ HatiData: update roles
Clerk normalizes SCIM events from all supported identity providers into a consistent webhook format. HatiData processes these webhooks to keep its user directory in sync.
Prerequisites
- Clerk account with Directory Sync enabled
- HatiData Enterprise tier
- SSO already configured (see Clerk SSO Setup)
- Admin access to your identity provider
Step 1: Enable Directory Sync in Clerk
Create a Directory Connection
- In the Clerk dashboard, navigate to Directory Sync
- Select the organization
- Choose the identity provider type (Okta SCIM, Azure AD SCIM, Google Workspace)
- Follow the provider-specific setup instructions
Configure the SCIM Endpoint (Okta Example)
In Okta:
- Open your SAML application
- Navigate to Provisioning > Configure API Integration
- Enter the SCIM Connector Base URL provided by Clerk
- Enter the API Token provided by Clerk
- Enable the provisioning features:
- Create Users
- Update User Attributes
- Deactivate Users
- Sync Groups (Push Groups)
Step 2: Configure Webhook Delivery
Clerk sends directory events to the HatiData control plane. Set the webhook signing secret so the control plane can verify incoming events:
HATIDATA_CLERK_WEBHOOK_SECRET=whsec_your_clerk_webhook_secret
The control plane automatically registers a webhook endpoint at:
POST https://api.hatidata.com/v1/webhooks/clerk
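To illustrate what the signing secret is used for, here is a minimal sketch of signature verification. It assumes directory events are signed like Clerk's standard webhooks, which are delivered through Svix and carry `svix-id`, `svix-timestamp`, and `svix-signature` headers. The function name `verify_svix_signature` and the 5-minute tolerance are illustrative assumptions, not part of HatiData's API.

```python
import base64
import hashlib
import hmac
import time


def verify_svix_signature(secret, msg_id, timestamp, signature_header, body,
                          tolerance_s=300, now=None):
    """Return True if any v1 signature in the header matches the raw body.

    Assumption: the webhook follows the Svix signing scheme used by Clerk.
    `body` must be the raw request bytes, not re-serialized JSON.
    """
    now = time.time() if now is None else now

    # Reject stale or future-dated deliveries to limit replay attacks.
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    if abs(now - sent_at) > tolerance_s:
        return False

    # The signing key is the base64 payload after the "whsec_" prefix.
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.".encode() + body
    digest = hmac.new(key, signed_content, hashlib.sha256).digest()
    expected = base64.b64encode(digest).decode()

    # The header may carry several space-separated "version,signature" pairs.
    for candidate in signature_header.split():
        version, _, signature = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(signature, expected):
            return True
    return False
```

The HatiData control plane performs this check for you once HATIDATA_CLERK_WEBHOOK_SECRET is set. If you proxy or inspect the webhooks yourself, the official `svix` library is a safer choice than hand-rolled verification.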
Webhook Event Types
| Event | Action in HatiData |
|---|---|
| dsync.user.created | Create user account with default role |
| dsync.user.updated | Update user name, email, and attributes |
| dsync.user.deleted | Deactivate user (see deactivation flow below) |
| dsync.group.created | Create role mapping |
| dsync.group.updated | Update role mapping |
| dsync.group.deleted | Remove role mapping |
| dsync.group.user_added | Assign role to user |
| dsync.group.user_removed | Remove role from user |
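The table above can be read as a dispatch map from event type to action. The following sketch shows that idea only. The action names and the `{"type": ..., "data": ...}` payload shape are assumptions made for illustration, not HatiData's internal implementation.

```python
from typing import Optional

# Event type -> action name, mirroring the table above.
# The action names are hypothetical labels, not real HatiData functions.
EVENT_ACTIONS = {
    "dsync.user.created": "create_user",
    "dsync.user.updated": "update_user",
    "dsync.user.deleted": "deactivate_user",
    "dsync.group.created": "create_role_mapping",
    "dsync.group.updated": "update_role_mapping",
    "dsync.group.deleted": "remove_role_mapping",
    "dsync.group.user_added": "assign_role",
    "dsync.group.user_removed": "remove_role",
}


def route_event(event: dict) -> Optional[str]:
    """Return the action for a webhook event, or None for unknown types.

    Assumption: the payload carries its event name under the "type" key.
    """
    return EVENT_ACTIONS.get(event.get("type", ""))
```

Returning None for unrecognized types lets a receiver acknowledge and ignore events it does not handle instead of failing the delivery.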
Step 3: Configure User Provisioning
Map directory attributes to HatiData user fields:
from hatidata import HatiDataClient
admin = HatiDataClient(
host="localhost",
port=5439,
api_key="hd_live_admin_key",
)
admin.organizations.configure_directory_sync(
org_id="org_acme",
clerk_directory_id="directory_01HXYZ...",
attribute_mapping={
"email": "emails[0].value",
"first_name": "name.givenName",
"last_name": "name.familyName",
"department": "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:department",
},
default_role="viewer",
auto_activate=True,
)
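The values in attribute_mapping are paths into the SCIM user resource. As a rough illustration of how such paths could be resolved, the sketch below handles dotted paths, list indexes such as `emails[0]`, and URN-qualified enterprise extension attributes. The helper names `resolve_path` and `map_attributes` are hypothetical; HatiData's actual resolution logic may differ.

```python
import re

# One path segment: an attribute name with an optional list index, e.g. "emails[0]".
_SEGMENT = re.compile(r"^(\w+)(?:\[(\d+)\])?$")


def resolve_path(resource: dict, path: str):
    """Resolve a mapping path against a SCIM resource; return None if absent."""
    if path.startswith("urn:"):
        # Extension attributes are namespaced as "<schema URN>:<attribute>",
        # and the resource stores them under a key equal to the schema URN.
        schema, _, attribute = path.rpartition(":")
        return (resource.get(schema) or {}).get(attribute)

    current = resource
    for segment in path.split("."):
        match = _SEGMENT.match(segment)
        if match is None or not isinstance(current, dict):
            return None
        name, index = match.group(1), match.group(2)
        current = current.get(name)
        if index is not None:
            if not isinstance(current, list) or int(index) >= len(current):
                return None
            current = current[int(index)]
    return current


def map_attributes(resource: dict, attribute_mapping: dict) -> dict:
    """Apply every path in the mapping and collect the HatiData fields."""
    return {field: resolve_path(resource, path)
            for field, path in attribute_mapping.items()}
```

Missing attributes resolve to None rather than raising, so a user without a department in the IdP would still be provisioned under this sketch.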
Provisioning Options
| Option | Default | Description |
|---|---|---|
| default_role | viewer | Role assigned to new users |
| auto_activate | true | Activate users immediately on creation |
| send_welcome_email | true | Send onboarding email to new users |
| sync_interval | real_time | real_time (webhook) or hourly (polling) |
Step 4: Configure Group-to-Role Mapping
Map IdP groups to HatiData roles for automatic RBAC:
admin.organizations.configure_group_mapping(
org_id="org_acme",
mappings=[
{
"idp_group_name": "HatiData Admins",
"hatidata_role": "admin",
},
{
"idp_group_name": "HatiData Editors",
"hatidata_role": "editor",
},
{
"idp_group_name": "HatiData Viewers",
"hatidata_role": "viewer",
},
{
"idp_group_name": "Data Scientists",
"hatidata_role": "editor",
"additional_permissions": ["branch_create", "memory_write"],
},
],
)
When a user is added to or removed from an IdP group, Clerk sends a dsync.group.user_added or dsync.group.user_removed webhook and HatiData updates the corresponding role automatically.
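A user can belong to several mapped groups at once. The sketch below shows one plausible way to combine them, assuming the highest-privilege role wins and additional permissions are merged. The source does not state how HatiData resolves overlapping groups, so both the precedence rule and the `ROLE_RANK` ordering are assumptions.

```python
# Assumed privilege order; not confirmed by HatiData documentation.
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}


def effective_access(user_groups, mappings, default_role="viewer"):
    """Return (role, sorted extra permissions) for a user's IdP groups.

    Group names are compared exactly, matching the troubleshooting note
    that mapped names must match the IdP names exactly.
    """
    matched = [m for m in mappings if m["idp_group_name"] in user_groups]
    if not matched:
        return default_role, []

    # Assumption: the highest-ranked role among matched groups wins.
    role = max((m["hatidata_role"] for m in matched), key=ROLE_RANK.__getitem__)
    permissions = sorted(
        {p for m in matched for p in m.get("additional_permissions", [])}
    )
    return role, permissions
```

For example, a user in both "HatiData Viewers" and "Data Scientists" would resolve to the editor role with the two additional permissions under these assumptions.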
Step 5: Deactivation Flows
When a user is removed from the directory (for example, during offboarding), HatiData deactivates the account in stages:
Deactivation Steps
- Immediate: Dashboard access is revoked. Active sessions are invalidated.
- 24-hour grace: Agent API keys created by the user continue to work for 24 hours to prevent pipeline disruptions.
- After 24 hours: Agent API keys are rotated. The key names are preserved but secrets are regenerated.
- Data retention: The user's audit trail and CoT logs are preserved indefinitely for compliance.
# Configure deactivation behavior
admin.organizations.configure_deactivation(
org_id="org_acme",
api_key_grace_period_hours=24, # Grace period for agent API keys
preserve_audit_data=True, # Keep audit logs after deactivation
notify_admin_on_deactivation=True,
transfer_agent_keys_to="admin@acme.com", # Reassign orphaned keys
)
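To make the grace-period timing concrete, this small sketch computes when a deactivated user's agent API keys are due for rotation. The function `key_status` and its return labels are hypothetical, and the default mirrors the api_key_grace_period_hours=24 setting shown above.

```python
from datetime import datetime, timedelta, timezone


def key_status(deactivated_at: datetime, now: datetime, grace_period_hours: int = 24):
    """Return (state, rotation_due) for a deactivated user's agent keys.

    state is "grace" while keys still work and "rotated" once the grace
    period has elapsed. Both datetimes should be timezone-aware.
    """
    rotation_due = deactivated_at + timedelta(hours=grace_period_hours)
    state = "grace" if now < rotation_due else "rotated"
    return state, rotation_due


# Usage: a user deactivated at 09:00 UTC keeps working keys until 09:00 the next day.
deactivated = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
state, due = key_status(deactivated, datetime(2025, 1, 1, 20, 0, tzinfo=timezone.utc))
```

A check like this can help pipeline owners decide whether a key must be replaced before the rotation deadline.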
Monitoring Deactivations
-- Recently deactivated users
SELECT
user_id,
email,
deactivated_at,
deactivation_source,
api_keys_affected
FROM _hatidata_user_events
WHERE event_type = 'user_deactivated'
AND deactivated_at > NOW() - INTERVAL '30 days'
ORDER BY deactivated_at DESC;
Step 6: Verify Directory Sync
Check the sync status and recent events:
# Check directory sync status
curl https://api.hatidata.com/v1/organizations/org_acme/directory-sync/status \
-H "Authorization: Bearer <admin_jwt>"
Example response:
{
"directory_sync_enabled": true,
"provider": "okta",
"connection_state": "active",
"last_sync": "2025-12-15T10:30:00Z",
"total_synced_users": 142,
"total_synced_groups": 8,
"pending_events": 0
}
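A status response like the one above can be turned into a simple health check. The sketch below uses only the fields shown in the example response. The function `sync_problems` and the one-hour staleness threshold are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def sync_problems(status: dict, now: datetime, max_lag: timedelta = timedelta(hours=1)):
    """Return a list of human-readable problems found in a status response."""
    problems = []
    if not status.get("directory_sync_enabled"):
        problems.append("directory sync is disabled")
    if status.get("connection_state") != "active":
        problems.append(f"connection state is {status.get('connection_state')!r}")

    last_sync = status.get("last_sync")
    if last_sync is None:
        problems.append("no sync has been recorded")
    else:
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
        synced_at = datetime.fromisoformat(last_sync.replace("Z", "+00:00"))
        if now - synced_at > max_lag:
            problems.append(f"last sync is older than {max_lag}")

    if status.get("pending_events", 0) > 0:
        problems.append(f"{status['pending_events']} events are still pending")
    return problems
```

An empty list means none of the assumed checks failed. Tune max_lag to how often your directory actually changes.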
Audit Directory Sync Events
SELECT
event_type,
user_email,
group_name,
action_taken,
created_at
FROM _hatidata_directory_sync_events
WHERE org_id = 'org_acme'
ORDER BY created_at DESC
LIMIT 20;
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
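| Users not syncing | Webhook not delivered or signature verification failing | Verify the webhook endpoint is reachable from Clerk and that HATIDATA_CLERK_WEBHOOK_SECRET matches the signing secret in Clerk |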
| Group membership not updating | Group push not enabled in IdP | Enable "Push Groups" in Okta/Azure AD |
| Deactivated user's agent API keys still work | API key grace period is active | Wait for the grace period to end (24 hours by default) or revoke the keys manually |
| Role not assigned | Group name mismatch | Check configure_group_mapping group names match IdP exactly |
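Group name mismatches are often differences in case or stray whitespace. The sketch below is a hypothetical helper, not part of the HatiData SDK, that compares the names used in configure_group_mapping with the group names exported from the IdP and reports near-misses.

```python
def find_near_misses(mapped_names, idp_names):
    """Map each mis-typed configured name to the IdP name it probably meant.

    A near-miss is a configured name that has no exact match in the IdP
    but matches after trimming whitespace and ignoring case.
    """
    exact = set(idp_names)
    normalized = {name.strip().casefold(): name for name in idp_names}

    near_misses = {}
    for name in mapped_names:
        if name in exact:
            continue  # exact match, nothing to fix
        candidate = normalized.get(name.strip().casefold())
        if candidate is not None:
            near_misses[name] = candidate
    return near_misses
```

Names that appear in neither form are not reported, since they are more likely missing groups than typos.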
Related Concepts
- Clerk SSO Setup -- SSO configuration
- Agent Identity Model -- Agent keys vs user auth
- Security Model -- Authentication architecture
- Audit Guarantees -- Audit trail for user events
- Agent Identity Recipes -- Key management patterns