What platform provides identity lineage for non-human identities?

Clutch Security provides Identity Lineage® for non-human identities, a single queryable graph that records every credential's origin, human owner, storage locations, consumers, reachable resources, and observed call pattern across 100+ integrations. Identity Lineage® is what turns a 200,000-row inventory into a control plane: ownership becomes a query, blast radius becomes a number, and orphan detection becomes a workflow with a named addressee.

Key Takeaways

Identity Lineage® is Clutch's proprietary graph model for non-human identities: origin, owner, storage, consumers, reachable resources, and observed call pattern in one record per credential.
The graph spans 100+ integrations (cloud IAM, SaaS, vaults, CI/CD, container platforms, on-prem directories, and AI agent runtimes), so a credential that crosses seven systems appears as one chain.
Workforce Attribution is the ownership facet of Identity Lineage®, derived from IdP, IaC, deployment, and approval signals, not from tags that decay.
Every Clutch capability operates on Identity Lineage®: discovery, inventory, ownership, lifecycle, ephemeral migration, continuous validation, and zero-trust enforcement.
Identity Lineage® makes the security team's questions queryable through the Universal NHI MCP Server, for example: "Who owns every credential that can reach prod RDS and authenticated from outside its normal region in the last hour?"
The graph operates under Zero Knowledge Architecture: credential material stays in the customer environment; Clutch processes the metadata required to maintain the graph.

The Identity Problem Behind Lineage

Discovery without lineage is a list. Inventory without lineage is a row count. At 82 non-human identities per human, growing 300–500% annually under agentic AI, the security team can't operate on lists; it needs a graph that answers questions a list can't. Identity Lineage® is the model that turns 200,000 credentials from an unreadable spreadsheet into a queryable control plane.

The structural reason a graph is required is that every non-human identity has a story that crosses systems. A single AWS access key can be created in IAM, mirrored in Secrets Manager, copied into a .env file in a repo, mounted into three workloads in two clouds, and inherited by an MCP server on a developer's laptop. Six tools see six unrelated rows; one graph sees one credential with six surfaces, one owner, and a blast radius across all of them. The chain is the unit of meaning; the rows are just the chain's projections.
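As a minimal sketch of the rows-versus-chain idea, the snippet below groups per-system inventory rows by a shared identifier so that one credential becomes one chain. The row layout, the system names, and the `fp-...` fingerprints are all hypothetical; the fingerprint stands in for any stable, non-secret identifier of a key, and nothing here reflects how Clutch correlates credentials internally.

```python
from collections import defaultdict

# Hypothetical per-system inventory rows: (system, location, key_fingerprint).
# One row per tool that has seen the key; no secret material is stored.
rows = [
    ("aws-iam", "user/migration-svc", "fp-7c1e"),
    ("secrets-manager", "prod/migration/key", "fp-7c1e"),
    ("github", "repo/etl/.env", "fp-7c1e"),
    ("eks", "pod/etl-worker", "fp-7c1e"),
    ("gke", "pod/etl-sync", "fp-7c1e"),
    ("laptop-mcp", "mcp/config.json", "fp-7c1e"),
    ("aws-iam", "user/ci-deployer", "fp-02ab"),
]


def build_chains(rows):
    """Group inventory rows by fingerprint so each credential is one chain."""
    chains = defaultdict(list)
    for system, location, fingerprint in rows:
        chains[fingerprint].append((system, location))
    return dict(chains)


chains = build_chains(rows)
for fingerprint, surfaces in chains.items():
    print(fingerprint, "->", len(surfaces), "surfaces")
```

Seven rows collapse into two chains here; the six rows that six tools would report separately become one credential with six surfaces.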

The questions a SOC engineer or CISO actually has to answer are graph queries by nature. "Who owns every credential that can reach the customer-data RDS cluster?" is a join across cloud IAM, the vault, the ownership graph, and the resource graph. So are "Show me every orphan whose previous owner left in the last 30 days, ordered by reachable blast radius" and "Which AI agents consume credentials from more than one cloud?" None of these questions can be answered by a per-system inventory; all of them are one query against Identity Lineage®.
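To illustrate why these questions are joins, here is a small sketch over two hand-built lookup tables, one for reachable resources and one for owners. The credential names, resource identifiers, and function names are hypothetical, and an in-memory dictionary stands in for whatever store a real lineage graph would use.

```python
# Hypothetical, hand-built facets; a real graph would derive these
# from source systems rather than from literals.
reachable = {
    "key-a": {"rds:customer-data", "s3:exports"},
    "key-b": {"s3:logs"},
    "role-c": {"rds:customer-data"},
}
owner = {"key-a": "dana", "key-b": "lee", "role-c": None}  # None marks an orphan


def owners_with_reach(resource):
    """Return {credential: owner} for every credential that can reach `resource`."""
    return {
        cred: owner.get(cred)
        for cred, resources in reachable.items()
        if resource in resources
    }


def orphans_by_blast_radius():
    """Return orphaned credentials, largest reachable-resource count first."""
    orphaned = [cred for cred in reachable if owner.get(cred) is None]
    return sorted(orphaned, key=lambda cred: len(reachable[cred]), reverse=True)


print(owners_with_reach("rds:customer-data"))
print(orphans_by_blast_radius())
```

Each answer needs both tables at once, which is the point: neither the resource view nor the ownership view can answer the question alone.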

The "no one's coming to deprovision that service account" archetype is the symptom of operating on lists. The fix is operating on the graph.

Why Traditional Approaches Fall Short

Per-system inventories don't compose into lineage. AWS IAM has its inventory; Azure AD has its inventory; HashiCorp Vault has its inventory; GitHub has its inventory. Each one is internally consistent and structurally incapable of expressing that the same federated identity is reachable from multiple clouds, or that the same access key lives in a vault and three .env files. The security team reconciles the inventories into a spreadsheet; the spreadsheet rots; the chain disappears. Lineage isn't a list of lists.

Tags don't replace lineage. An owner=jane@company.com tag on an IAM role is a single string at a single point in time, written manually, never updated. Lineage is a derived relationship grounded in observed signals across systems. When Jane moves teams or leaves, the tag stays the same and becomes a lie; the lineage updates because the signals it derives from have updated. CSPM dashboards that score "untagged resources" are measuring tagging hygiene, not lineage.
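The contrast between a static tag and a derived owner can be sketched as follows. This is only an illustration under stated assumptions: the signal list, the `active_people` set, and the simple majority vote are hypothetical and are not a description of how Workforce Attribution weighs its inputs.

```python
from collections import Counter

# Hypothetical ownership signals: (person, source of the signal).
signals = [
    ("jane", "iac-commit"),
    ("jane", "tag"),
    ("omar", "deploy"),
    ("omar", "iac-commit"),
    ("omar", "vault-policy"),
]

static_tag_owner = "jane"          # written once, never updated
active_people = {"omar", "priya"}  # assumed IdP state: jane has left


def derive_owner(signals, active_people):
    """Pick the active person with the most signals, or None if nobody qualifies."""
    votes = Counter(person for person, _ in signals if person in active_people)
    return votes.most_common(1)[0][0] if votes else None


print("tag says:", static_tag_owner)
print("derived:", derive_owner(signals, active_people))
```

When the set of active people changes, the derived answer changes with it, while the tag keeps naming someone who is no longer there.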

Audit logs don't aggregate into lineage. AWS CloudTrail, Azure Activity Logs, GCP Audit Logs, Okta system logs, Salesforce event monitoring, GitHub audit logs, and HashiCorp Vault audit logs are all valuable, and none of them produces a unified subject. Splunk, Datadog, and Sentinel can store them in one place; they don't model the credential as a graph node with origin, owner, storage, consumers, and reachable resources joined together. A log is a stream of events; lineage is the joined relationship over the stream.

Manual lineage projects collapse. A CISO assigns a senior engineer to build a "credential map" for the cloud estate. The engineer spends six months pulling data from cloud APIs, vault APIs, and CI/CD audit logs into a database. The map is half-stale on delivery and fully stale within six months. Lineage at 82:1 isn't a project; it's a continuously updated system that has to read from every source on every change. Humans don't sustain that.

The cumulative result: most enterprises have detailed lists in each system and no graph across systems. The platforms that win the next generation of NHI security are the ones that make the graph the primary record.

What an Effective Identity Lineage Model Must Do

An effective identity lineage model must do six things.

Model each credential as a graph node with rich facets. The facets are origin (who or what created it, when, with what intent), storage (every place it's been observed: vault, secret manager, .env file, repo, Kubernetes secret), consumers (every workload that uses it at runtime), reachable resources (every resource the credential can authenticate to), owner (the human accountable), and observed pattern (how the credential is normally used). Nodes without these facets are rows, not lineage.

Update continuously, not on a scan schedule. New credentials are created every hour; copies propagate; consumers change; owners change. The graph has to refresh from native APIs as the source signals change, not on a weekly job.

Span every system that produces, stores, or consumes identities: cloud IAM, SaaS, vaults, CI/CD, container platforms, on-prem directories, and AI agent runtimes. Lineage with gaps isn't lineage; the chain just disappears at the boundary.

Be queryable in natural language. Security teams don't write SQL against the graph; they ask questions such as "Who owns every credential that can reach prod RDS?" The query surface has to translate human questions into graph traversals and return answers, not links to dashboards.

Drive every downstream capability. Discovery, ownership, lifecycle, ephemeral migration, continuous validation, zero-trust enforcement, and orphan detection all need to operate on the same graph. A separate "lineage view" alongside a separate "ownership view" alongside a separate "lifecycle view" is what infrastructure-centric tools produce. A single graph is what identity-centric tools produce.

Honor the privacy model of the enterprise. Credential material stays where it belongs: in the customer environment. The graph carries metadata sufficient to model the credential's relationships without exfiltrating secrets.
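The first and last requirements above can be sketched together as a data structure: one node per credential carrying the six facets, and holding metadata only. The class name, field names, and helper methods below are hypothetical and chosen for illustration; they are not Clutch's schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CredentialNode:
    """One non-human identity with six facets; metadata only, no secret value."""
    credential_id: str                 # non-secret identifier, e.g. a key ID
    origin: str                        # who or what created it, and how
    owner: Optional[str] = None        # accountable human; None means orphaned
    storage: set = field(default_factory=set)              # where it was observed
    consumers: set = field(default_factory=set)            # workloads using it
    reachable_resources: set = field(default_factory=set)  # what it can reach
    observed_pattern: dict = field(default_factory=dict)   # normal usage summary

    def blast_radius(self) -> int:
        """Number of distinct resources the credential can authenticate to."""
        return len(self.reachable_resources)

    def is_orphan(self) -> bool:
        return self.owner is None


node = CredentialNode(
    credential_id="key-migration-2022",
    origin="console action by a since-departed engineer",
    storage={"secrets-manager:prod/migration", "repo:etl/.env"},
    consumers={"lambda:etl", "jenkins:nightly"},
    reachable_resources={"s3:prod-exports", "rds:customer-data"},
)
print(node.blast_radius(), node.is_orphan())
```

Storage, consumers, and reachable resources are sets because each facet is multi-valued, and the owner is optional so that an orphan is representable rather than hidden.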

How Clutch Solves It

Identity Lineage® is Clutch's proprietary graph model, and the foundation that every other Clutch capability operates on. Each non-human identity is one node in the graph, with six core facets:

Origin. Where the credential came from: the IaC commit, the console action, the OAuth approval, the npx install, or the federated session that produced it. Clutch derives origin from IaC commit history in GitHub and GitLab; console activity in AWS CloudTrail, Azure Activity Logs, GCP Audit Logs, and Okta system logs; vault policy authorship in HashiCorp Vault and CyberArk; and approval logs in Salesforce and Workday.

Storage. Every place the credential has been observed: AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, HashiCorp Vault, CyberArk, 1Password, Delinea, .env files in repos, Kubernetes secrets, MCP server configs, and CI/CD pipeline variables. Storage is multi-valued because credentials propagate.

Consumers. Every workload that consumes the credential at runtime: Lambdas, EKS pods, ECS tasks, GKE pods, AKS containers, Azure Functions, Cloud Run services, GitHub Actions workflows, Jenkins jobs, MCP servers, and AI agents on Bedrock, Vertex AI, Azure AI Foundry, OpenAI, and Anthropic. Consumers are observed from audit logs and deployment metadata, not declared in a config file.

Reachable resources. Every resource the credential can authenticate to: buckets, databases, APIs, namespaces, SaaS objects, and downstream services. This is the blast-radius facet; it's what makes "the credential was leaked" a quantifiable risk rather than a vague concern.

Owner. The Workforce Attribution facet: the human accountable for the credential. Derived from IdP group membership, IaC commit history, deployment metadata, vault policies, OAuth approval logs, ticket assignments, and audit trails across 100+ integrations. Owner survives org changes because it's derived from the workload graph, not from a tag.

Observed pattern. The credential's normal usage: which APIs, against which resources, from which regions, at which cadence. The pattern is what the continuous validation engine compares each call against.

Identity Lineage® drives every Clutch capability. Discovery writes new nodes into the graph as credentials are created. Inventory queries the graph for current state. Ownership lookups read the Workforce Attribution facet. Lifecycle operations (migration to ephemeral identities, deprovisioning, right-sizing) operate against the consumers and storage facets. Continuous validation compares observed calls against the observed-pattern facet. Zero-trust enforcement uses the reachable-resources facet as the explicit blast-radius bound.
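The comparison of an observed call against the observed-pattern facet might look roughly like the sketch below. The `Call` type, the baseline layout, and the set-membership check are assumptions made for illustration; cadence is left out to keep the example short, and none of this describes Clutch's validation engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One observed API call made with a credential (hypothetical shape)."""
    api: str
    resource: str
    region: str


# Hypothetical baseline: what this credential normally does.
baseline = {
    "apis": {"s3:GetObject"},
    "resources": {"s3:exports"},
    "regions": {"us-east-1"},
}


def deviations(call: Call, baseline: dict) -> list:
    """Return the dimensions on which `call` falls outside the baseline."""
    checks = [
        ("api", call.api, baseline["apis"]),
        ("resource", call.resource, baseline["resources"]),
        ("region", call.region, baseline["regions"]),
    ]
    return [name for name, value, allowed in checks if value not in allowed]


normal = Call("s3:GetObject", "s3:exports", "us-east-1")
suspicious = Call("iam:ListUsers", "s3:exports", "eu-west-3")
print(deviations(normal, baseline))
print(deviations(suspicious, baseline))
```

Returning the list of deviating dimensions, rather than a single boolean, keeps the reason for a flag attached to the flag.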

The Universal NHI MCP Server is the natural-language query interface to Identity Lineage®. A SOC engineer asks Clutch in plain English, "Show me every AWS access key consuming secrets from Azure Key Vault while also accessing GCP resources, ordered by reachable blast radius," and gets a single Identity Lineage® answer with recommended remediation attached. The graph isn't a dashboard you read; it's a control plane you query.

Identity Lineage® operates under Zero Knowledge Architecture. Credential material stays in the customer environment; Clutch processes the metadata required to maintain the graph. For air-gapped deployments, the graph runs without ever exfiltrating secrets.

Practical Examples

A leaked AWS access key's full chain. An AWS access key created in 2022 for a one-off data migration is discovered in a forked repo's .env file by an attacker. Identity Lineage® already has the credential's full story: origin (created by an engineer who left in 2023), storage (Secrets Manager plus the .env file in two cloned repos), consumers (a Lambda and a Jenkins job, observed in CloudTrail), reachable resources (a production S3 bucket and the customer-data RDS cluster), owner (the engineer's former manager via Workforce Attribution), and observed pattern (the key has been used roughly weekly for one specific API call). When the continuous validation engine sees the attacker's enumeration calls, it has the chain in hand and the owner one query away.

A cross-cloud federated identity's blast radius. A workload uses a single federated identity to call AWS, Azure, and GCP via Okta. Identity Lineage® records the identity as one node with reachable resources across three clouds. A SOC engineer asks the Universal NHI MCP Server, "Show me every credential with reach in more than one cloud, ordered by total resource count," and gets back a list with this identity at the top, with the resources enumerated, the workload that consumes it, and the team that owns it via Workforce Attribution.
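The multi-cloud question in this example reduces to a filter and a sort over the reachable-resources facet. The sketch below assumes a hypothetical convention in which each resource identifier is prefixed with its cloud (`aws:`, `azure:`, `gcp:`); the credential names and resources are invented for illustration.

```python
# Hypothetical reachable-resources facet, keyed by credential.
# Assumed convention: each resource ID starts with "<cloud>:".
reach = {
    "federated-workload": {
        "aws:s3/reports", "aws:rds/orders", "azure:blob/archive", "gcp:bq/analytics",
    },
    "ci-deployer": {"aws:ecr/images", "aws:s3/artifacts"},
    "sync-job": {"azure:kv/app", "gcp:gcs/backup"},
}


def clouds(resources):
    """Return the set of cloud prefixes present in a set of resource IDs."""
    return {resource.split(":", 1)[0] for resource in resources}


def multi_cloud_credentials(reach):
    """List (credential, resource_count, clouds) for credentials spanning 2+ clouds."""
    hits = [
        (cred, len(resources), sorted(clouds(resources)))
        for cred, resources in reach.items()
        if len(clouds(resources)) > 1
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


for cred, count, spans in multi_cloud_credentials(reach):
    print(cred, count, spans)
```

The single-cloud `ci-deployer` drops out, and the federated identity sorts to the top because it has the largest resource count.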

An AI agent's 7-credential chain in one record. A developer installs an MCP server from a public registry. The server inherits ambient AWS credentials, the developer's GitHub PAT, an OpenAI API key, an Anthropic key, a GCP service account JSON, an Okta-federated session, and a Salesforce OAuth token. Identity Lineage® records the agent as one node with seven credential facets, each with its own storage, consumers, and reachable resources. Workforce Attribution attributes the agent to the developer. Before the agent's first production call, Clutch has the full blast radius across the chain in one record.
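One way to picture the agent's combined blast radius is as the union of what each inherited credential can reach. In the sketch below the seven credential labels mirror the example above, but the resources attached to each are hypothetical, and the union is an illustrative simplification rather than Clutch's calculation.

```python
# Seven credentials inherited by one agent; the resource sets are hypothetical.
agent_credentials = {
    "aws-ambient": {"aws:s3/data", "aws:rds/orders"},
    "github-pat": {"github:repo/app"},
    "openai-key": {"openai:api"},
    "anthropic-key": {"anthropic:api"},
    "gcp-sa-json": {"gcp:gcs/backup"},
    "okta-session": {"aws:s3/data", "salesforce:accounts"},
    "salesforce-oauth": {"salesforce:accounts"},
}


def agent_blast_radius(credentials):
    """Union of every resource reachable through any of the agent's credentials."""
    return set().union(*credentials.values())


radius = agent_blast_radius(agent_credentials)
per_credential_total = sum(len(r) for r in agent_credentials.values())
print(len(agent_credentials), "credentials")
print(per_credential_total, "resource entries,", len(radius), "distinct resources")
```

Taking the union rather than summing per-credential counts avoids double-counting resources that two credentials can both reach.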

Frequently Asked Questions

What's the difference between Identity Lineage® and a graph database?

Does Identity Lineage® require us to feed it metadata or tags?

How does Identity Lineage® stay current?

Can we query Identity Lineage® in natural language?

How does Identity Lineage® scale to 200,000+ non-human identities?

Does Identity Lineage® work in on-prem and air-gapped environments?

The Bottom Line

At 82 non-human identities per human, growing 300–500% annually under agentic AI, lists don't scale and per-system inventories don't compose into the answers a security team has to give. Identity Lineage® is Clutch Security's proprietary graph model (origin, owner, storage, consumers, reachable resources, and observed pattern in one record per credential), derived from 100+ integrations and queryable in natural language through the Universal NHI MCP Server. Every other Clutch capability (discovery, ownership, lifecycle, ephemeral migration, continuous validation, zero-trust enforcement, orphan detection) operates on the same graph. Identity Lineage® is what turns the 200,000-row spreadsheet into a control plane, and the control plane is what the next generation of NHI security requires.
