Source: https://subscription.packtpub.com/video/data/9781806675555/p1/video1_2/building-knowledge-graphs-to-enable-agentic-ai
# **Building Knowledge Graphs to Enable Agentic AI**
In this session, build a knowledge graph one data product at a time to provide the right context to AI agents and fully exploit the potential of agentic AI.
The talk explains why robust enterprise knowledge graphs and a shared information architecture are essential to make agentic systems reliable in production, and how to build them in a federated “knowledge mesh” way.
---
## 1. Why context engineering matters
- Agentic systems are now built around **large language models (LLMs)** instead of purely symbolic rule engines, which makes impressive demos easy but production reliability hard.
- The main production gap is often **not model intelligence**, but **poor communication of context**: the agent is not given the right information about its environment, tasks, and state.
- LLM context windows are rapidly expanding (up to millions of tokens), but simply stuffing more data in the prompt **reduces accuracy**, increases **cost**, and increases **latency**, especially beyond ~80–100K tokens.
### Definition of context engineering
- **Context engineering** = the art and science of filling the context window with **just the right information** at each step, no more and no less.
- It extends beyond prompt engineering to include (see the sketch after this list):
  - **Procedural instructions** (prompt).
  - **State** (current plan, workflow step, recent actions).
  - **Knowledge** (domain information, environment description, task-specific facts).
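A minimal sketch of how these pieces might be combined under a token budget; the function names, the budget value, and the 4-characters-per-token heuristic are illustrative assumptions, not from the talk:

```python
# Hypothetical sketch: assembling an agent's context window from
# procedural instructions, state, and retrieved knowledge under a token budget.
from dataclasses import dataclass


@dataclass
class ContextBudget:
    max_tokens: int = 8_000  # illustrative budget, well below the model's hard limit

    def fits(self, used: int, text: str) -> bool:
        # Rough token estimate: ~4 characters per token (assumption).
        return used + len(text) // 4 <= self.max_tokens


def build_context(instructions: str, state: list[str], knowledge: list[str],
                  budget: ContextBudget) -> str:
    """Fill the context with just enough information: instructions first,
    then current state, then the most relevant knowledge snippets."""
    parts, used = [], 0
    for text in [instructions, *state, *knowledge]:
        if not budget.fits(used, text):
            break  # stop before stuffing the prompt and hurting accuracy, cost, latency
        parts.append(text)
        used += len(text) // 4
    return "\n\n".join(parts)
```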
---
## 2. Memory models for agents (human-inspired)
- Context engineering requires **memory**: not all information can be kept in the context window, so it must be stored and selectively retrieved.
- Agent memory frameworks increasingly mimic **human cognitive architecture** (see the sketch after this list), with:
  - **Short-term / working memory**: the current context window.
  - **Long-term memory**, split into:
    - **Episodic memory**: transactional trail of recent actions and experiences (task progress, interaction history).
    - **Semantic memory**: conceptual knowledge of the world and domain.
    - **Procedural memory**: skills and know-how (how to execute a recipe, follow a procedure, etc.).
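A minimal sketch of this human-inspired memory layout; the class and field names are assumptions for illustration only:

```python
# Hypothetical sketch of a human-inspired agent memory layout.
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    # Short-term / working memory: what currently sits in the context window.
    working: list[str] = field(default_factory=list)
    # Episodic memory: transactional trail of recent actions and outcomes.
    episodic: list[dict] = field(default_factory=list)
    # Semantic memory: conceptual knowledge of the domain (e.g. graph facts per concept).
    semantic: dict[str, list[str]] = field(default_factory=dict)
    # Procedural memory: skills and know-how, e.g. named step-by-step recipes.
    procedural: dict[str, list[str]] = field(default_factory=dict)

    def recall(self, concept: str) -> list[str]:
        """Selectively retrieve semantic facts about a concept into working memory."""
        facts = self.semantic.get(concept, [])
        self.working.extend(facts)
        return facts
```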
### Layered information architecture
- Information is organized into layers:
  - **Data**: raw observations (e.g., sensor readings) in short-term memory.
  - **Information**: data enriched with **metadata** that adds meaning and context.
  - **Knowledge**: abstracted, generalized concepts and models that form a mental model of the world.
- For an agent, the **information architecture** (data → information → knowledge) must be explicitly designed and populated with what is relevant to the agent’s tasks (a toy example follows).
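As a toy illustration of the layering (the sensor example and field names are assumptions), the same reading can be viewed at all three levels:

```python
# Toy illustration of the data -> information -> knowledge layering.
raw_data = 42.7                                    # data: a bare sensor reading

information = {                                    # information: data + metadata
    "value": 42.7,
    "unit": "celsius",
    "sensor_id": "boiler-3",
    "recorded_at": "2025-01-15T10:02:00Z",
}

knowledge = {                                      # knowledge: abstracted concept
    "concept": "OverheatingEvent",
    "definition": "Boiler temperature above the 40 °C safety threshold",
    "evidence": [information],
}
```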
---
## 3. Enterprise information architecture vs per-agent silos
- If each agent builds its own data/information/knowledge stack as a byproduct of its implementation, **every agent will model key concepts differently**.
- Example: two teams separately explain “customer” to two different agents; each encodes this concept differently, leading to **semantic mismatches** when agents must coordinate.
- To avoid this, **enterprise information architecture** should be a **corporate asset**, shared and incrementally built, from which individual agents load their relevant slice of memory.
### Benefits of a shared architecture
- When two agents need context on “customer,” they **load from the same central model and datasets**, so they:
  - Speak the **same language**.
  - Refer to the **same datasets/data products**.
- This shared architecture enables **semantic interoperability** among agents instead of a “big agentic mess” of incompatible components.
---
## 4. What is an enterprise knowledge graph?
- An **enterprise knowledge graph (EKG)** is presented as the best model to formalize the **unified enterprise information architecture**.
- It integrates three layers into one machine-readable model:
  - **Data layer** (actual business data).
  - **Metadata / information layer** (descriptions, data contracts).
  - **Knowledge / conceptual layer** (ontologies and concepts).
- The EKG is both **human-readable** and **machine-readable**, usable by LLMs, formal reasoners, and other systems; a minimal sketch of the layered graph follows.
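A minimal sketch of all three layers held in one graph, using rdflib; the namespace, class names, dataset URI, and example record are illustrative assumptions:

```python
# Minimal sketch: one graph holding concept, metadata, and data triples.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("https://example.org/ekg/")  # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# Knowledge layer: the ontology concept.
g.add((EX.Customer, RDF.type, RDFS.Class))
g.add((EX.Customer, RDFS.label, Literal("Customer")))

# Information layer: metadata describing a data product.
g.add((EX.crmCustomers, RDFS.label, Literal("CRM customer dataset")))
g.add((EX.crmCustomers, EX.containsInstancesOf, EX.Customer))

# Data layer: an actual business record linked to the concept.
g.add((EX.customer42, RDF.type, EX.Customer))
g.add((EX.customer42, EX.shipmentAddress, Literal("221B Baker Street")))

print(g.serialize(format="turtle"))  # human- and machine-readable
```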
---
## 5. Data and information layers: data products
### Data layer
- Base of the graph: data generated by **transactional systems**, which the agent ultimately needs to act correctly.
- Modern data management has shifted from monolithic warehouses to **modular architectures** composed of **data products**.
### Data products
- A **data product** is a **software artifact that exposes data** as its primary function.
- It bundles:
  - The **data** itself.
  - The **code** that transforms/serves it.
  - The **infrastructure** to run that code.
  - The **interfaces** to consume the data.
- Crucially, data products include **data product descriptors or contracts** with rich **metadata**, so data is understandable and reusable (a hypothetical descriptor is sketched after this list).
- Managing data as products covers the first two layers of the knowledge graph:
  - Data plane (raw data).
  - Information plane (metadata that turns data into information).
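A hypothetical data product descriptor, sketched as a plain Python dict; the field names are assumptions and do not follow any specific descriptor specification:

```python
# Hypothetical data product descriptor: metadata that turns raw data into information.
customer_orders_descriptor = {
    "name": "customer-orders",
    "owner": "sales-domain-team",
    "description": "Daily snapshot of confirmed customer orders",
    "output_ports": [
        {
            "interface": "rest",                     # how consumers read the data
            "endpoint": "https://example.org/dp/customer-orders/v1",
            "format": "json",
        }
    ],
    "schema": [
        {"column": "order_id", "type": "string"},
        {"column": "customer_id", "type": "string"},
        {"column": "order_total", "type": "decimal", "unit": "EUR"},
    ],
    "quality": {"freshness_sla_hours": 24},
    "version": "1.2.0",
}
```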
---
## 6. The knowledge layer: ontologies and semantics
### Problem: technical vs semantic interoperability
- Data products are often **technically interoperable** (APIs, formats) but not **semantically interoperable**.
- Different domains (e.g., sales vs marketing) model the same concept (customer) differently; consumers must repeatedly do **translation work** across domains.
### Central ontology
- To solve this, the organization needs a **central ontology**:
  - A **unified conceptual model** of key business concepts (customer, product, order, etc.).
  - Used to **normalize and relate** data products across domains.
- The ontology is encoded as **triples** (subject–predicate–object), forming a graph in which:
  - Nodes are **concepts**.
  - Edges (predicates) encode relationships.
- The **meaning of a concept** comes from its position and relations in the graph (what it is connected to or not), not from a textual definition (see the sketch below).
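As a minimal sketch of that relational meaning (the concept and predicate names are illustrative assumptions, not the talk's actual ontology):

```python
# Subject–predicate–object triples: "Customer" gets its meaning from what it
# relates to, not from a prose definition.
ontology_triples = [
    ("Customer", "places", "Order"),
    ("Customer", "hasShipmentAddress", "Address"),
    ("Order", "contains", "Product"),
    ("Product", "belongsTo", "ProductCategory"),
]
```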
### Linking data products to the ontology
- Extend data product contracts to include **links to ontology concepts**:
  - E.g., “this dataset contains instances of the concept `Customer` defined in the central ontology,” or “this column is the `shipment address` property of `Customer`.”
- This makes data products **semantically interoperable**: if two products link to the same concept, their data can be reliably combined by agents and humans (sketched below as an extension of the earlier descriptor).
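A hypothetical extension of the descriptor sketch from above; the `semantic_links` field name and concept URIs are assumptions:

```python
# Hypothetical extension: link the data product's schema to central ontology concepts.
customer_orders_descriptor = {
    "name": "customer-orders",
    "schema": ["order_id", "customer_id", "order_total"],
}

customer_orders_descriptor["semantic_links"] = {
    "dataset_concept": "https://example.org/ontology/Order",
    "columns": {
        "customer_id": "https://example.org/ontology/Customer",
        "order_total": "https://example.org/ontology/orderTotal",
    },
}
# Two data products whose columns point at the same Customer concept can now be
# combined reliably by agents and humans alike.
```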
---
## 7. The scaling problem and “knowledge mesh”
### Central ontology bottleneck
- Building a comprehensive central ontology is **complex**; a central team quickly becomes a **bottleneck**.
- When needed concepts are stuck in the ontology team’s backlog, data product teams may stop linking to the ontology, and consumers bypass it, undermining semantic interoperability.
### Applying data mesh principles to knowledge
- The talk proposes a **“knowledge mesh”**: applying the four **data mesh** principles to **knowledge management**.
The four principles, adapted to knowledge:
1. **Knowledge as a product**
   - The enterprise ontology is **not monolithic** but composed of **smaller sub-ontologies**, each managed as a **product**.
   - Each knowledge product (sub-ontology) has clear ownership and evolves over time with use cases.
2. **Domain orientation (knowledge domains)**
   - Ownership is by **knowledge domains** (e.g., Customer, Product, Order-to-Cash), not strictly by business departments.
   - Knowledge domains cut across business domains, bringing together representatives from all areas that use a concept like “customer” to agree on a shared model.
3. **Federated computational governance**
   - A **federated governance team** defines global modeling rules so sub-ontologies can be composed into a coherent whole.
   - Requires a platform to support:
     - Creation, evolution, and versioning of ontologies.
     - Composition of sub-ontologies into the enterprise ontology (see the sketch after this list).
     - Interfaces for searching concepts and discovering linked data products.
4. **Self-serve infrastructure (EIOP platform)**
   - The talk references an **Enterprise Information Operation Platform (EIOP)**:
     - Manages data products, knowledge products, and their links.
     - Supports building and maintaining the enterprise knowledge graph in a federated, scalable way.
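A minimal sketch of composing sub-ontologies into one enterprise ontology with rdflib; the inline Turtle snippets and their contents are illustrative assumptions (in practice each sub-ontology would come from the platform):

```python
# Sketch: compose independently owned sub-ontologies into the enterprise ontology.
from rdflib import Graph

# Each knowledge domain publishes its sub-ontology as a small Turtle document.
sub_ontologies = [
    """@prefix ont: <https://example.org/ontology/> .
       ont:Customer a ont:Concept ; ont:places ont:Order .""",
    """@prefix ont: <https://example.org/ontology/> .
       ont:Order a ont:Concept ; ont:contains ont:Product .""",
]

enterprise_ontology = Graph()
for ttl in sub_ontologies:
    sub = Graph().parse(data=ttl, format="turtle")  # one knowledge product
    for triple in sub:
        enterprise_ontology.add(triple)             # compose into the whole

print(f"Enterprise ontology now holds {len(enterprise_ontology)} triples")
```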
---
## 8. Use-case–driven incremental growth of the knowledge graph
- The knowledge graph should grow **incrementally, driven by concrete use cases**, not by modeling for its own sake.
- For each prioritized use case (often an AI/agent use case):
  - Identify **data products** needed or to be extended.
  - Identify **concepts/knowledge products** needed to describe and semantically align those data.
  - Run two parallel streams:
    - Build/extend the data products.
    - Build/extend the ontology concepts (knowledge products).
  - Link the data products to the concepts; this **adds nodes and edges** to the enterprise knowledge graph.
- Over time, use case after use case, the graph **expands organically**, always anchored in business value.
---
## 9. Interaction with LLMs and agents
### Using the knowledge graph with LLMs
- When a user asks a question, the agent can:
  - Perform **concept extraction** from the query.
  - Search the **knowledge graph** for relevant concepts and relations.
- The retrieved graph structure can be passed to the LLM as:
  - Formal triples (RDF, Turtle-style).
  - Or **natural language sentences** representing those triples.
- The speaker prefers the **natural text representation**, since LLMs are trained more heavily on text than on formal ontology formats (a minimal sketch of this step follows).
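A minimal sketch of that retrieve-and-verbalize step; the keyword-match concept extraction, the triples, and the example question are all illustrative assumptions:

```python
# Sketch: retrieve graph facts about concepts found in the user query and
# verbalize them as natural-language sentences for the LLM's context window.
def extract_concepts(question: str, known_concepts: list[str]) -> list[str]:
    """Naive concept extraction: keyword match against known ontology concepts."""
    return [c for c in known_concepts if c.lower() in question.lower()]


def verbalize(triples: list[tuple[str, str, str]]) -> str:
    """Turn (subject, predicate, object) triples into plain sentences."""
    return " ".join(f"{s} {p} {o}." for s, p, o in triples)


ontology_triples = [
    ("Customer", "places", "Order"),
    ("Order", "is fulfilled from", "Warehouse"),
]

question = "Which warehouse fulfils an order placed by a customer?"
concepts = extract_concepts(question, ["Customer", "Order", "Warehouse"])
relevant = [t for t in ontology_triples if t[0] in concepts or t[2] in concepts]
context = verbalize(relevant)
# -> "Customer places Order. Order is fulfilled from Warehouse."
```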
### Agents learning from experience
- Context goes both ways:
  - From memory to context window (**retrieval**).
  - From context window back to memory (**writing**), capturing feedback and outcomes.
- When agents write back to memory, their internal state changes, so they **learn from experience**, similar to humans (a write-back sketch follows this list).
- User feedback and environmental signals influence the agent’s memory and, over time, its behavior.
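A minimal write-back sketch; the episode fields and the boiler example are assumptions for illustration:

```python
# Sketch: write feedback and outcomes from the context window back to episodic memory.
episodic_memory: list[dict] = []   # stand-in for the agent's long-term episodic store


def record_episode(action: str, outcome: str, feedback: str) -> None:
    """Append the latest action, its outcome, and user feedback, so future
    retrieval (and therefore behavior) reflects past experience."""
    episodic_memory.append({
        "action": action,
        "outcome": outcome,
        "user_feedback": feedback,
    })


record_episode(action="reorder stock for boiler-3",
               outcome="purchase order created",
               feedback="approved by operator")
```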
### Generative AI vs agentic AI
- **Generative AI** (GenAI) = LLMs used primarily to **generate content** (text, images) in response to prompts.
- **Agentic AI** = LLM-based systems that **also interact with the external environment** through sensors and actuators, taking actions autonomously.
- Agentic AI = GenAI **plus** the ability to act in the world, guided by context and goals.
---
## 10. Final takeaways
- There will be **no reliable AI or agentic systems** without a reliable **enterprise information architecture**.
- Intelligence (LLMs, orchestration frameworks) will increasingly be provided by external vendors, but **your knowledge is unique and must be structured by you**.
- The best way to **future-proof AI investments** is to continuously improve the enterprise knowledge graph and information architecture so agents always receive high-quality context.