Ingestion is the path every record takes from a connected source into Adapter’s knowledge graph. Whether it’s a Slack message, a GitHub pull request, a Notion page, or a row from a custom source, the record moves through the same pipeline.

The unified pipeline

There is one ingestion pipeline, not one per source. Each provider has a thin adapter layer that pulls records and emits them in a normalized shape; everything downstream — versioning, indexing, linking, comprehension — is shared. A Gmail email and an Outlook email arrive as the same email evidence, with the same schema and the same downstream guarantees. Normalization happens once, at the edge.
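As a concrete illustration, here is a minimal sketch of what a thin adapter layer can look like. The class, field names, and payload keys are assumptions for illustration, not Adapter’s actual schema:

```python
from dataclasses import dataclass


@dataclass
class EmailEvidence:
    # Normalized shape shared by every email provider (fields are illustrative).
    evidence_type: str
    source: str      # e.g. "gmail" or "outlook"
    source_id: str   # provider-native identifier
    subject: str
    sender: str
    body: str


def normalize_gmail_message(raw: dict) -> EmailEvidence:
    """Thin adapter: map a raw provider payload into the shared email shape."""
    return EmailEvidence(
        evidence_type="email",
        source="gmail",
        source_id=raw["id"],
        subject=raw["subject"],
        sender=raw["from"],
        body=raw["body"],
    )
```

An Outlook adapter would map its own payload into the same shape, so everything downstream sees one email record type regardless of provider.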

Evidence as the atomic unit

The pipeline emits evidence — typed, schema’d records that are the unit of everything Adapter does. Every evidence record has an evidence_type; see the full list in Evidence types. Some evidence types are general (email, file, calendar_event) and exist regardless of source. Others are provider-scoped (github_issue, linear_cycle) because they only make sense in one tool.
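To make the distinction concrete, here are two hypothetical records, one general and one provider-scoped. Field names are illustrative, not the published schema:

```python
# Illustrative only; field names are assumptions, not Adapter's published schema.
general_evidence = {
    "evidence_type": "email",         # general: exists regardless of source
    "source": "outlook",
    "subject": "Q3 planning",
}

provider_scoped_evidence = {
    "evidence_type": "github_issue",  # provider-scoped: only meaningful in GitHub
    "source": "github",
    "title": "Auth migration rollout",
    "state": "closed",
}
```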

Idempotency and unique indexing

Every evidence record carries a unique key derived from its source identity. Re-ingesting the same record — whether from a backfill, a retry, or a webhook replay — never produces duplicates. The pipeline upserts on the unique index. You can re-run a sync without worrying about double-counting, and you can safely overlap a backfill with an ongoing sync.
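A minimal sketch of the idea, assuming a key derived by hashing the source identity (the real derivation is internal to Adapter):

```python
import hashlib


def evidence_key(source: str, source_id: str) -> str:
    """Derive a stable unique key from the record's source identity
    (hypothetical scheme, shown only to illustrate idempotency)."""
    return hashlib.sha256(f"{source}:{source_id}".encode()).hexdigest()


def upsert_evidence(store: dict, record: dict) -> None:
    """Insert-or-update keyed on the unique index. A backfill, a retry, or a
    webhook replay of the same record maps to the same key, so no duplicates."""
    key = evidence_key(record["source"], record["source_id"])
    store[key] = record
```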

Versioning

When a record changes upstream — an edited Slack message, a renamed Notion page, a closed GitHub issue — ingestion writes a new version rather than overwriting. Queries return the latest version by default; the history is retained.
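A sketch of this behavior, using an in-memory list of versions per key purely for illustration:

```python
from collections import defaultdict

versions = defaultdict(list)  # evidence key -> versions, oldest first


def ingest_version(key: str, record: dict) -> None:
    """Append a new version instead of overwriting the previous one."""
    versions[key].append(record)


def latest(key: str) -> dict:
    """Queries return the most recent version by default; history stays available."""
    return versions[key][-1]
```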

Graph linking

Once a record lands, the pipeline looks for explicit links to other evidence:
  • A GitHub pull request that references an issue → edge from PR to issue
  • A Slack message containing a URL to a Notion page → edge from message to page
  • A calendar event with attendees → edges from event to each person
These links are deterministic — pulled from structured fields and URLs, not inferred — and they form the skeleton of the knowledge graph before any LLMs run.
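A sketch of how deterministic linking can work, reading only structured fields and URL matches. The evidence type names, field names, and edge shape are illustrative:

```python
import re

NOTION_URL = re.compile(r"https://www\.notion\.so/\S+")


def deterministic_edges(record: dict) -> list[tuple[str, str, str]]:
    """Extract explicit links from structured fields and URLs.
    Edges are (source_key, relation, target) tuples, an illustrative shape."""
    edges = []
    if record["evidence_type"] == "github_pull_request":
        for issue in record.get("linked_issues", []):
            edges.append((record["key"], "references", issue))
    if record["evidence_type"] == "slack_message":
        for url in NOTION_URL.findall(record.get("text", "")):
            edges.append((record["key"], "links_to", url))
    if record["evidence_type"] == "calendar_event":
        for person in record.get("attendees", []):
            edges.append((record["key"], "attended_by", person))
    return edges
```

Because these edges come straight from structured data, they can be written synchronously at ingestion time, before any LLM work runs.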

Comprehension

The same pipeline then runs LLM-orchestrated comprehension over each evidence record. Adapter operates the orchestration end-to-end. Comprehension extracts:
  • Entities — people, projects, products, decisions
  • Concepts — themes and topics that recur across evidence
  • Context — what a record is about and how it relates to others
These are written back into the graph as additional nodes and edges, so a query like “what did the team decide about the auth migration?” can traverse comprehension-derived edges, not just deterministic ones.
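A sketch of the shape this can take, with hypothetical field and relation names, showing comprehension output folded back into the graph alongside the deterministic edges:

```python
from dataclasses import dataclass, field


@dataclass
class Comprehension:
    """Illustrative shape of comprehension output for one evidence record."""
    entities: list[str] = field(default_factory=list)  # people, projects, decisions
    concepts: list[str] = field(default_factory=list)  # recurring themes and topics
    summary: str = ""                                   # what the record is about


def write_back(graph_edges: list, evidence_key: str, result: Comprehension) -> None:
    """Add comprehension-derived nodes and edges next to the deterministic ones."""
    for entity in result.entities:
        graph_edges.append((evidence_key, "mentions", entity))
    for concept in result.concepts:
        graph_edges.append((evidence_key, "about", concept))
```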

Sync modes

Ingestion runs in two modes, configured per connector:
  • One-shot sync — pull everything currently in the source, then stop.
  • Ongoing sync — keep pulling as the source changes.
Both modes use the same pipeline; ongoing sync is one-shot sync that never finishes.
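A sketch of the two modes sharing one pipeline; the connector methods and the polling loop are hypothetical, shown only to illustrate the relationship:

```python
import time


def one_shot_sync(connector, pipeline) -> None:
    """Pull everything currently in the source, then stop."""
    for record in connector.backfill():
        pipeline.ingest(record)


def ongoing_sync(connector, pipeline, poll_seconds: int = 60) -> None:
    """Same pipeline, but keep pulling as the source changes."""
    one_shot_sync(connector, pipeline)  # start from a full backfill
    while True:
        for record in connector.changes_since_last_poll():
            pipeline.ingest(record)
        time.sleep(poll_seconds)
```

Because ingestion is idempotent, overlapping the initial backfill with ongoing changes is safe: replayed records upsert rather than duplicate.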

What’s next

Evidence types

Browse the full glossary of evidence types.

Providers

See which sources you can connect today.