From Architecture to Production
https://subscription.packtpub.com/video/data/9781806675555/p1/video1_1/engineering-graph-rag-agents#2303
---
# **Engineering Graph RAG Agents**
In this session, Anthony Alcaraz will explore how to build intelligent systems using Graph RAG (Retrieval-Augmented Generation) by integrating knowledge graphs, autonomous agents, and neural-symbolic reasoning. He'll dive into practical implementation patterns, architectural principles, and real-world case studies that demonstrate how to design, develop, and deploy production-ready agentic systems. Attendees will learn how to overcome the limitations of traditional AI approaches by leveraging graph-native architectures for enhanced reasoning capabilities and enterprise scalability.
---
This first video, **“Engineering Graph RAG Agents”**, sets up a design and implementation mental model for building agentic systems using graphs plus RAG as the backbone of the ML stack.
## High-level purpose and structure
- Present a system-design way of thinking about **agentic AI**, arguing it is not hype but a durable pattern already working in production tools.
- Introduce **context engineering** as the core challenge and explain why **graphs + RAG + structured outputs** should sit at the center of serious agent architectures.
- Organize the talk around **eight pillars** of an “agentic graph RAG system” and the idea of a **horizontal (process) graph** plus **vertical (knowledge) graph**.
---
## What are agents and the context problem?
- Traditional LLM apps are **stateless assistants**: no lasting memory, purely reactive, no notion of ongoing goals or processes.
- An **agent** is defined by three capabilities: it **perceives state**, it **reasons**, and it **acts**; the point is to reduce time-to-value inside real software workflows.
- The current ecosystem has many frameworks and unstable best practices; the missing piece is a robust **system design** around context and knowledge.
## Limitations of traditional RAG
- Standard RAG over **only vector databases** struggles with:
- Connecting **past interactions** and modeling **temporality** (temporal reasoning).
- Distinguishing **similarity vs causality**; semantic proximity does not encode cause–effect or process structure.
- A Microsoft paper is cited showing vector RAG accuracy dropping from roughly 90% on local queries to roughly 20% on global queries (questions about the whole dataset), illustrating that vector-only retrieval is not enough for global reasoning.
---
## Context engineering: core concept
- **Context engineering** = the practice of giving an agent the **right context** at the right time so LLM reasoning can actually work.
- Four core operations on context are defined: **write**, **select**, **compress**, and **isolate** context.
- The claim: **graph-based approaches + structured outputs** are superior for all four operations.
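A minimal sketch of what these four operations might look like as a programming interface; the `ContextStore` protocol and its method names are hypothetical illustrations, not from the talk:

```python
from typing import Protocol, Any

class ContextStore(Protocol):
    """Hypothetical interface for the four context operations named in the talk."""
    def write(self, fact: Any, relations: list[tuple[str, str, str]]) -> None: ...
    def select(self, query: str, k: int = 5) -> list[Any]: ...
    def compress(self, items: list[Any], budget_tokens: int) -> str: ...
    def isolate(self, scope: str) -> "ContextStore": ...
```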
## “Write” operation example
- When writing to memory, the goal is **not** to dump isolated facts into a list but to **create new information while preserving relationships**.
- This is done by grounding extraction in an **ontology**, then writing into a **graph** so the agent maintains a coherent world model instead of unstructured notes.
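To make the "write" idea concrete, here is a hedged Python sketch: a Pydantic schema plays the role of the ontology, so extraction can only emit facts of the allowed types, and writes preserve relationships rather than appending loose notes. The entity/relation types and the `write_to_graph` helper are illustrative assumptions:

```python
from pydantic import BaseModel, Field
from typing import Literal

# Hypothetical mini-ontology: the only entity and relation types allowed.
EntityType = Literal["Company", "Person", "Product"]
RelationType = Literal["ACQUIRED", "EMPLOYS", "SELLS"]

class Triple(BaseModel):
    subject: str
    subject_type: EntityType
    predicate: RelationType
    object: str
    object_type: EntityType

class Extraction(BaseModel):
    triples: list[Triple] = Field(description="Facts grounded in the ontology")

def write_to_graph(graph: dict, extraction: Extraction) -> None:
    """Write facts while preserving relationships, not as isolated notes."""
    for t in extraction.triples:
        graph.setdefault(t.subject, []).append((t.predicate, t.object))
```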
---
## The eight pillars of an agentic graph RAG system
The talk references **eight pillars** of an “agentic graph RAG system,” used to “put the LLM fluid into a bottle” so it’s reliable in production.
- Pillars (grouped as presented conceptually):
- **Knowledge representation**: graphs, ontologies, events, temporal modeling.
- **Memory systems**: graph-based memories, bloat control, update/removal.
- **Tool orchestration**: graphs of tools, dependencies, and capabilities.
- **Structured output**: enforcing schemas to stabilize format and reasoning.
- **Reasoning and graph planning**: symbolic + semantic reasoning over graphs.
- **Optimization**: making graph building, retrieval, and reasoning efficient.
- **Security / policy**: constraints and information flow over tool/knowledge graphs.
- **Self-evolution**: evaluation, fine-tuning, and RL-on-top of the agent workflows.
The speaker’s metaphor: **LLMs are fluid**, and the eight pillars plus graphs and structure are the **container** that makes them usable for real business problems.
---
## Horizontal (process) vs vertical (knowledge) graphs
## Horizontal: process / workflow graph
- Think of agentic design as a **process engineering** problem:
- Work with business experts to map **business processes into steps**.
- Decide at each step **how agents act**, which tools they use, and how state is maintained.
- Agentic systems live on a **spectrum between determinism and autonomy**:
- Some tasks require **deterministic, stateful pipelines** (e.g., payment, safety-critical flows) with structured outputs and strong guardrails.
- Other tasks can tolerate **more autonomy**, including multi-agent parallelism and exploratory planning.
## When to use which pattern
- **Single-threaded / deterministic pipelines**:
- High task dependencies, state integrity critical, reliability over speed.
- A **sequential control pipeline** that must always behave the same way.
- **Parallel / multi-agent systems**:
- Only when **tasks are independent**, otherwise you introduce chaos.
- Works well for research-style workloads (searching multiple domains or sources in parallel) where results can be aggregated safely.
- **Autonomous agents**:
- Not suitable for every task; reserved for domains where autonomous planning is acceptable and can be controlled by constraints.
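A small sketch of the routing logic this list implies, using `asyncio`: dependent steps run strictly in sequence for state integrity, independent ones fan out in parallel and are aggregated. The step names are placeholders:

```python
import asyncio

async def run_step(name: str) -> str:
    # Placeholder for a real agent or tool call.
    await asyncio.sleep(0.1)
    return f"{name}: done"

async def run_workflow(steps: list[str], dependent: bool) -> list[str]:
    """Dependent tasks run sequentially; independent tasks run in parallel."""
    if dependent:
        results = []
        for s in steps:
            results.append(await run_step(s))  # strict order, state stays consistent
        return results
    return list(await asyncio.gather(*(run_step(s) for s in steps)))

# Example: research-style fan-out over independent sources.
# asyncio.run(run_workflow(["search_arxiv", "search_news"], dependent=False))
```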
## Orchestration runtimes
- The speaker stresses the importance of a **durable runtime** for agent workflows:
- Needs to be **resilient to interruptions**, durable, and asynchronous by design.
- Mentions durable-execution runtimes such as **Temporal** and **Restate**, and notes that some libraries modeled as graphs (e.g., LangGraph) lack strong runtime durability.
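For illustration, a minimal sketch using the Temporal Python SDK (assuming a Temporal server and worker are set up separately; the activity body is a stand-in for a real LLM or tool call):

```python
from datetime import timedelta
from temporalio import activity, workflow

@activity.defn
async def call_agent_step(step: str) -> str:
    # The LLM/tool call lives in an activity so it can be retried independently.
    return f"{step}: done"

@workflow.defn
class AgentPipeline:
    @workflow.run
    async def run(self, steps: list[str]) -> list[str]:
        results = []
        for step in steps:
            # Durable by design: if the worker dies here, Temporal replays
            # history and resumes from the last completed activity.
            results.append(
                await workflow.execute_activity(
                    call_agent_step,
                    step,
                    start_to_close_timeout=timedelta(minutes=5),
                )
            )
        return results
```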
## Vertical: knowledge / data graph
- On the vertical axis is **knowledge and tools** the agent can use.
- Layers include:
- A **probabilistic layer** (vector similarity search) as the fuzzy entry point.
- **Graph databases** (property graphs or RDF graphs) that encode relationships and constraints.
- **Ontologies** and schemas that constrain what the agent can assert and how it can reason.
- The vertical graph should represent:
- Domain **knowledge** and **events**.
- **Tool dependencies** and capabilities (which tools require which inputs/outputs).
- **Agent capabilities and constraints**.
- **Evaluation results** and feedback, forming a data flywheel.
---
## Structured output as a core enforcement layer
- A central theme: **structured output** is more than JSON formatting; it is a **reasoning constraint**.
- Example: the **Outlines** library (by **.txt**, also written dottxt) enforces token-level constraints so the decoder only emits tokens that fit a given schema.
- Tokens outside the schema have probability zero, which improves **accuracy** and **inference speed** (e.g., via coalescing methods).
- Structured output is used to:
- Enforce **domain ontologies** and reasoning steps (force the model to think in pre-defined categories/fields).
- Make pipeline steps **composable and testable**, because each step has a contract.
- Align small language models with large-model performance when combined with strong schemas.
The speaker likens structured output to the **TCP/IP layer of agentic systems**, i.e., the standard that makes communication reliable between steps and tools.
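A short example with Outlines, using the classic pre-1.0 API (`outlines.models.transformers` / `outlines.generate.json`); newer releases changed the API, so check the current docs, and the model name and schema here are arbitrary illustrations:

```python
import outlines
from pydantic import BaseModel

class Diagnosis(BaseModel):
    condition: str
    severity: int        # forcing a field makes the model "think" in this category
    next_action: str

# Classic Outlines API (pre-1.0); any local transformers model works.
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, Diagnosis)

# Tokens that cannot continue a valid Diagnosis get probability zero.
result = generator("Patient reports chest pain after exercise. Assess:")
print(result.condition, result.severity, result.next_action)
```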
---
## Knowledge representation and graph choices
## What to model in the graph
- For agent workloads, focus on modeling **events**, not just static entities, to enable **temporal reasoning**.
- Ensure **contextual boundaries** in the graph to prevent “context mixing” and distraction, a major failure mode where irrelevant retrieved snippets mislead the agent.
- NodeRAG-like approaches are highlighted:
- Nodes for entities, relation edges, **summary nodes**, and contextualized subgraphs to enable targeted retrieval.
## Practical graph strategies
- **Postgres + extensions**:
- Example from Writer: triples stored directly in Postgres, vectorized with **pgvector** for retrieval.
- **Apache AGE** is mentioned as a module that layers a graph model on top of existing Postgres tables for quick experimentation.
- **Dedicated graph databases**:
- **Property graphs** (Neo4j, TigerGraph-like systems) for relationship-centric, Pythonic analysis, with fewer constraints.
- **RDF graphs** when strong ontological constraints and semantic reasoning are needed.
- Property graphs can integrate RDF-style reasoning via modules that overlay RDF semantics.
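A hedged sketch of the Writer-style Postgres pattern described above: triples in a plain table, embeddings of the verbalized triples in a pgvector column, retrieval by nearest neighbour. Table and column names are assumptions:

```python
import psycopg  # requires the pgvector extension installed in Postgres

DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS triples (
    subject TEXT, predicate TEXT, object TEXT,
    embedding vector(384)   -- embedding of the verbalized triple
);
"""

def search_triples(conn: psycopg.Connection, query_vec: list[float], k: int = 5):
    # Nearest-neighbour search over verbalized triples (L2 distance).
    with conn.cursor() as cur:
        cur.execute(
            "SELECT subject, predicate, object FROM triples "
            "ORDER BY embedding <-> %s::vector LIMIT %s",
            (str(query_vec), k),
        )
        return cur.fetchall()
```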
## Hypergraphs and complex events
- For events involving more than two entities (e.g., a prescription with doctor, patient, drug, date), **hypergraphs** can be useful.
- Hypergraphs can better model such n-ary relationships but have a less mature ecosystem and fewer off-the-shelf algorithms.
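To see why n-ary events are awkward in a binary graph, here is a sketch of the standard property-graph workaround, reifying the event as its own node, using networkx; a true hypergraph would instead hold all participants in one hyperedge:

```python
import networkx as nx

# An n-ary "Prescription" event touches several entities at once.
G = nx.DiGraph()
G.add_node("rx_001", kind="Prescription", date="2025-03-01")
for role, entity in [("doctor", "Dr. Chen"), ("patient", "A. Jones"),
                     ("drug", "metformin")]:
    G.add_node(entity)
    G.add_edge("rx_001", entity, role=role)

# All participants of the event are one hop from the reified event node.
print([(e, G.edges["rx_001", e]["role"]) for e in G.successors("rx_001")])
```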
---
## Using agents to build the graph
- Historically, building knowledge graphs was **manual and expensive**.
- Now, **agents + structured outputs** can automate graph population:
- LLMs extract entities and relations from text while respecting a predefined schema.
- The talk mentions a four-module framework (likely **iText2KG**) that distills documents, **extracts entities**, **extracts relations**, and **integrates** them into the graph to automate ingestion; a sketch follows below.
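A compact sketch of that staged ingestion idea (distill, then entities, then relations, then integrate); `llm` is a stand-in assumed to return schema-validated structured output at each stage, which is where the structured-output layer discussed earlier comes in:

```python
def build_graph(docs: list[str], llm) -> dict:
    """Staged ingestion in the spirit of the four-module framework."""
    graph: dict = {}
    for doc in docs:
        summary = llm(f"Distill the key facts:\n{doc}")
        entities = llm(f"List entities (schema-constrained):\n{summary}")
        relations = llm(f"List (subject, predicate, object) triples "
                        f"among {entities}:\n{summary}")
        for s, p, o in relations:  # assumes structured output parsed upstream
            graph.setdefault(s, []).append((p, o))
    return graph
```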
---
## Memory systems and graph-based memory
- Modern memory providers (**Cognee**, **Mem0**, **Zep**, **Letta**) are moving to **graph-based representations** in their memory systems.
- Core challenges:
- **Memory bloat**: not everything should be remembered; the agent must decide what is important to store.
- **Removal and update**: neutralize or update obsolete information.
- **Complexity management**: large graphs must remain queryable and navigable.
- There is a trend toward using **reinforcement learning** to teach an LLM when to write to memory and when to read from memory, treating memory access as a tool.
---
## Reasoning on graphs
## Using ontologies for reasoning control
- Ontologies act like **domain rules**, analogous to traffic laws governing navigation.
- Combine **structured output** with **ontologies** so the model’s reasoning is constrained to valid concepts and relationships.
- If the database is built on an ontology, agents can **derive new knowledge**:
- Example: if “acquires” is transitive and A acquires B, B acquires C, the agent can infer A controls C.
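That inference is just a transitive closure. A tiny self-contained version for illustration; RDF reasoners do this declaratively from the ontology, and this naive fixpoint loop only shows the mechanics:

```python
def infer_transitive(edges: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Transitive closure of a relation the ontology declares transitive:
    A->B and B->C let the agent infer the A->C link."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

print(infer_transitive({("A", "B"), ("B", "C")}))
# {('A', 'B'), ('B', 'C'), ('A', 'C')}
```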
## Operators and algorithms
- Reasoning uses different operator types:
- **Node operators**: find relevant nodes.
- **Relationship operators**: inspect or filter edges.
- **Chunk operators**: link back to original text sources for grounding.
- Graph algorithms (e.g., **Louvain/Leiden-style community detection**) can cluster related nodes for **GraphRAG**-type retrieval; see the example below.
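For example, with networkx (Louvain is built in from version 2.8 onward); GraphRAG-style pipelines then summarize each cluster and use it as a retrieval unit:

```python
import networkx as nx

G = nx.karate_club_graph()  # stand-in for an entity graph
# Louvain community detection clusters related nodes.
communities = nx.community.louvain_communities(G, seed=42)
for i, nodes in enumerate(communities):
    print(f"community {i}: {sorted(nodes)[:5]}...")
```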
## Flow: semantic + symbolic reasoning
- Typical pattern:
- Start with **semantic search** (vector) to find entry-point nodes.
- Traverse the graph with symbolic operators, compute aggregates (counts, totals), and call specialized tools for complex reasoning.
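A sketch of that hybrid flow: vector search supplies entry nodes, then symbolic traversal and aggregation do the reasoning. The `vector_search` callable and the `Customer` node kind are assumptions for illustration:

```python
import networkx as nx

def hybrid_answer(G: nx.Graph, vector_search, question_vec, hops: int = 2):
    """Semantic entry, symbolic follow-through: vector search picks entry
    nodes; graph traversal plus aggregation answers the actual question.
    vector_search is a stand-in for any embedding index over node text."""
    entry_nodes = vector_search(question_vec, k=3)            # fuzzy entry point
    neighborhood = set()
    for n in entry_nodes:
        neighborhood |= set(nx.ego_graph(G, n, radius=hops))  # symbolic traversal
    customers = [n for n in neighborhood
                 if G.nodes[n].get("kind") == "Customer"]
    return {"entry": entry_nodes, "customer_count": len(customers)}
```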
## Getting LLMs to understand graphs
- LLMs do **not natively reason on graphs**; naive methods serialize a graph into JSON or lists of triples.
- Studies show significant accuracy gaps between different serialization strategies.
- A more promising direction is to **train models with graph-aware attention** so they understand graph relationships natively; the talk references MIT-style work and experiments planned in the author’s book.
---
## Planning and tool orchestration via graphs
## Planning with causal graphs
- For **autonomous agents** that must decide “what to do next,” exposing a **causal graph** of the domain helps planning.
- The agent uses the causal graph in a planning phase before acting, grounding its plan in explicit cause–effect structure.
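One way to ground this: if the causal graph is a DAG of "enables/causes" edges, a plan for a goal is simply its ancestors taken in topological order. A hypothetical sketch:

```python
import networkx as nx

# Hypothetical causal graph: an edge means "enables/causes".
causes = nx.DiGraph([
    ("collect_data", "build_graph"),
    ("define_ontology", "build_graph"),
    ("build_graph", "answer_query"),
])

def plan(goal: str) -> list[str]:
    """Before acting, derive an ordered plan from explicit cause-effect
    structure: every ancestor of the goal, in topological order."""
    needed = nx.ancestors(causes, goal) | {goal}
    return [n for n in nx.topological_sort(causes) if n in needed]

print(plan("answer_query"))
# ['collect_data', 'define_ontology', 'build_graph', 'answer_query']
```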
## Graph of tools
- A major issue: **prompt-tool bloat**. Giving an agent too many tools degrades performance (the talk cites an MCP tool-selection study showing tool-count sensitivity).
- Solution: build a **graph of tools**, where:
- Nodes are tools with descriptions, inputs/outputs, and constraints.
- Edges represent **dependencies** and valid compositions (A → B → C pipelines).
- The graph is updated from **usage and evaluation**, turning tooling into **data**, not hardcoded logic.
- Security: attach **constraints and policies** (e.g., information-flow control) to this tool graph to restrict which tools may be used when and on what data.
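A minimal sketch of a tool graph with policy gating, following the ideas in this list; the tool names and policy labels are invented:

```python
import networkx as nx

tools = nx.DiGraph()
tools.add_node("fetch_invoice", output="invoice_json", policy="finance_only")
tools.add_node("summarize", inputs=["invoice_json"], policy="any")
tools.add_edge("fetch_invoice", "summarize")  # valid composition A -> B

def allowed_pipeline(start: str, end: str, clearance: set[str]) -> list[str] | None:
    """Select a tool chain from the graph instead of dumping every tool into
    the prompt; policies gate which nodes this agent may traverse."""
    try:
        path = nx.shortest_path(tools, start, end)
    except nx.NetworkXNoPath:
        return None
    ok = all(tools.nodes[t]["policy"] in clearance | {"any"} for t in path)
    return path if ok else None

print(allowed_pipeline("fetch_invoice", "summarize", {"finance_only"}))
```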
---
## Optimization: efficiency of graph + RAG agents
## Building the graph
- Use **small specialized models** (e.g., “Triplex” small LM trained to extract triples) and/or classic NLP (NER, relation extraction) to build graphs efficiently.
- Libraries like **LightRAG** (lightweight GraphRAG) are cited as practical tooling for efficient extraction and retrieval.
## Retrieval strategy
- Employ **dual-level search**:
- Decide when a query needs **graph retrieval**, when vector-only is enough, or when to combine them.
- Microsoft’s **LazyGraphRAG** (closed-source) is mentioned as an example of dynamic graph-aware retrieval.
- Use **GPU-accelerated graph libraries** to speed graph traversals and reasoning.
## Reasoning efficiency
- Because graphs + structured outputs control the problem well, you can often rely on **small models** to do most reasoning, with large models reserved for hard cases.
---
## Evaluation and self-evolution
## Multi-layer evaluation
- Agentic systems need **multi-layer evaluation**, enabled by decomposed workflows:
- First check: **Is context sufficient?** If not, it’s a retrieval or data problem.
- Then check: knowledge conflicts (LLM pretraining vs retrieved data).
- Evaluate each step for correctness, robustness, and safety.
- The talk mentions systems using another LLM to judge **context sufficiency** and to attribute failures to retrieval vs reasoning.
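A hedged sketch of the two-layer judge described above; `judge_llm` is a stand-in for any LLM-as-judge call returning "yes" or "no":

```python
def evaluate_step(question: str, context: str, answer: str, judge_llm) -> dict:
    """Two-layer check that attributes failures to retrieval vs reasoning."""
    sufficient = judge_llm(
        f"Can the question be fully answered from this context alone? "
        f"Answer yes or no.\nQuestion: {question}\nContext: {context}"
    ).strip().lower().startswith("yes")
    if not sufficient:
        return {"verdict": "retrieval_failure"}  # fix the data/retrieval layer
    grounded = judge_llm(
        f"Is the answer supported by the context, with no conflicts against "
        f"it? Answer yes or no.\nContext: {context}\nAnswer: {answer}"
    ).strip().lower().startswith("yes")
    return {"verdict": "ok" if grounded else "reasoning_or_conflict_failure"}
```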
## Self-evolution pipeline
- After evaluation, apply **self-evolution** mechanisms:
- **Prompt tuning** and **fine-tuning** of models.
- **Reinforcement learning** to optimize policies, including when to call tools or memory.
- The agentic workflow graph effectively defines the **RL environment**: states, actions, and rewards.
## RL-as-a-service stack
- Libraries like **Verifiers** illustrate an emerging **RL-as-a-service** stack abstracting away RL algorithm complexity (e.g., PPO) so developers can focus on success metrics.
- Ground truth signals come from the multi-layer evaluation plus implicit feedback (e.g., user reformulations, usage patterns).
---
## Vertical optimization of KG–LLM interfaces
- A highlighted example: **Cognee’s** work on optimizing **KG–LLM interfaces** for complex reasoning.
- The idea: treat the **knowledge-graph + LLM interface as a parametric model** and optimize parameters such as:
- Chunk size for text segments used to build the graph.
- How many vertices to retrieve per query.
- Which graph templates or views to expose.
- The exact instructions/prompts given to the LLM for KG-related tasks.
- The challenge: optimizing one step can degrade downstream steps; some systems (e.g., **Optimas**-style approaches) are starting to tackle **global optimization** over the whole pipeline, as sketched below.
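A sketch of treating those knobs as a searchable parameter space, scored end-to-end so that one step's gain cannot silently hurt downstream steps; the search space and the `evaluate` callable are illustrative assumptions:

```python
import random

SEARCH_SPACE = {
    "chunk_size": [256, 512, 1024],       # text segments used to build the graph
    "top_k_vertices": [3, 5, 10],         # vertices retrieved per query
    "prompt_template": ["terse", "cot"],  # instructions given to the LLM
}

def optimize(evaluate, trials: int = 20) -> dict:
    """Treat the KG-LLM interface as a parametric model and random-search it.
    evaluate(params) is a stand-in returning an end-to-end success score."""
    best, best_score = None, float("-inf")
    for _ in range(trials):
        params = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
        score = evaluate(params)  # scores the whole pipeline, not one node
        if score > best_score:
            best, best_score = params, score
    return best
```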
---
## Q&A highlights
- **What vertex/edge types work well?**
- There is no universal best; start from the **business problem**, derive an **ontology**, and let that drive the graph data model, sources, and agent pipelines.
- **Dependency on ontology quality?**
- Yes: around **80% of the work** in serious agent systems is **data engineering and ontology design**, not LLM prompting.
- Vector-only retrieval cannot model relationships and reasoning at scale; combining **graphs and vectors** is essential.
This video, as the first in “ML Stack in Action,” establishes the mental model: an **agentic graph + RAG system** with clear separation of process vs knowledge, rigorous context engineering, structured outputs as the enforcement layer, and evaluation-driven self-evolution wrapping the entire stack.
---
## Transcript

Welcome. Our next session is with Anthony Alcaraz, with his deep expertise in agentic systems, knowledge graphs, and enterprise ML platforms. Anthony will walk us through engineering Graph RAG agents, blending symbolic reasoning, autonomous agents, and practical deployment patterns, so expect a lot of real-world case studies and actionable insights from the field. With that, I will hand it over to you, Anthony.

Hello everyone, it is very nice to be here with you. Today I will present what I would call system-design thinking around agentic AI. I think agentic is not hype and not a bubble; in my daily work I see agentic systems that work. Everyone here must have heard of Lovable or Perplexity, even their new browser, so we all have in mind products that work this way. But the system-design thinking behind those tools is still emerging. I am writing a book on what I call agentic Graph RAG, which emphasizes a certain way to build agents, because I think that at the center of agents there should be a certain type of data structure: graphs. I will explain why today.

My talk will be in three parts. I will talk about how to solve the context-engineering challenge in agentic systems: I will explain what context engineering is, why graphs, and then detail the pillars for building this effectively, making it reliable and secure, and building a real data flywheel around LLM systems.

So first: what is context engineering, and what are agents? Currently, most GenAI applications are like assistants: they are stateless, they don't remember between interactions, they forget context, they are mostly reactive. What you want to build today is something that considerably decreases time-to-value, and that is an agent. An agent is basically three things: it perceives state, it reasons, and it acts. You want to build these tools because you want every piece of software to go from a traditional tool to something agent-based that can deliver value in a few seconds. Basically, you want to build agents that are stateful, that remember, that have memory. That is the subject of this talk. There are currently many frameworks and the stack is not stabilized yet; the best practices are not fully there, but they are emerging.

So what is the problem with traditional RAG? To build agents, you want to give them proper context, your enterprise context. The problem with traditional RAG, which is based mostly on vector databases, is that it is hard to connect past interactions,
to build relationships within your data, or to do temporal reasoning; vector databases are not suited for this, and similarity is not causality. Also, when you do vector search you are missing information, because not everything is a question of similarity. What you want to build is symbolic reasoning: you want to reason on the connections within your data and create symbolic tools, and I will detail this during the talk. I have lots of resources in my presentation, but recently there was a Microsoft research paper showing that vector RAG goes from about 90% accuracy on local queries to about 20% on global queries, meaning queries that concern your entire dataset. You cannot reason over your entire dataset with vector search alone; vector search has a more supporting role, as I will detail.

So what is context engineering? It is the practice of giving the proper context to the agent. You want to give only information that is relevant, because LLMs are statistical, and even the latest reasoning models only reason well if they are given the right context. Context engineering is basically four operations: write the proper context, select the proper context, compress it, and isolate it. I will demonstrate that for these four operations, graph approaches combined with structured output do better. Take the write operation, for example: when writing to memory, you don't dump facts into a list. What you want is to create new information while preserving the vital connections between facts, creating a coherent world view. For this you will ground your data gathering in an ontology that helps you structure your graph. That is just to illustrate the write operation.

My talk is about the pillars of what I call an agentic Graph RAG system. There are eight (one component is missing from my image): structured output, memory systems, tool orchestration, optimization (because there is a lot of optimization to do on such a system), reasoning with graph planning, knowledge representation, and the last one, self-evolution. The book I am writing is about these eight components. What I like about this image is that, for me, LLMs are like a fluid: thanks to in-context learning and their adaptability, that is their strength. They are adaptable, like a fluid. But to make them reliable in production, to solve real business problems, you need to put this fluid into a bottle, and these eight pillars help you do exactly that.

So why graphs? I am not saying vector search is useless; on the contrary, what you want is to combine both. Vector search will be the entry point in most of your systems: you will mostly start with vector search to find the entry points on the graph.
Once you have found those entry points on the graph, you apply dynamic reasoning with your agent's tools, predefined on the graph, which adds reasoning on top of the initial nodes you found; I will show real use cases in a few slides that illustrate this. Thanks to the graph you find the relevant relationships, and you solve the problem of distracting context, meaning context that is irrelevant to the query or the reasoning step. You can answer questions like "why", or "how much", or "how many relationships does this node have". You get a reasoning layer on top of the graph that is not possible with vector search alone. So mostly, what you want is to combine both in the data-engineering part of your agentic system.

For me, and my book emphasizes this, agentic must be thought of in two dimensions. You have the process layer, which is like process engineering: you map your business subject into steps, working with business experts, and you see where agents can be useful and which steps they need to respect. And you have the knowledge vertical. So you have a horizontal graph and a vertical graph. On the process layer, the question you need to ask is how the agent acts. On the vertical axis, the question is what the agent needs to know: there you will have, for example, your tools, the databases the agents will use, and, mostly for complex reasoning, the relationships within your data. This is the kind of mental model I think we should have when we design agentic systems.

But agentic is also a spectrum: a spectrum between determinism and autonomy. In many cases you want something like a sequential pipeline. For payment, for example, you want your agent to produce structured, reliable output, automatic and always the same; you don't want autonomy in every step of your workflow. What is agentic is finding the proper equilibrium, the proper cursor, between full autonomy and full determinism; agentic will be a mix of these two categories. Before LLMs we had software that was purely deterministic; with GenAI capabilities we can move the cursor along the autonomy spectrum, but it won't be full autonomy, because you want guardrails and so on. Graphs, plus other technologies like structured output, let you place that cursor, mixing the neural components of the LLM with the symbolic ones.

What you want from your agents is autonomy in the right places; state, because you want to keep track of what is happening so agents can learn from interactions; and reactivity, meaning that on the process layer the agent can adapt, react, and perhaps choose what needs to be done next.

On the horizontal axis, following this logic of determinism and autonomy: for a medical agent that must process a controlled task that is always the same, you will have deterministic workflows.
Within those deterministic workflows you may still have some components of autonomy, for example retrieving an ontology so the agent can adapt its reasoning. Moving along the spectrum, you have what we today call multi-agent systems: an orchestrator agent that chooses which agents to call in parallel to perform a task. And at the extreme you have autonomous agents. What I will demonstrate is that each of these agentic workflow styles has conditions: you can't parallelize every task, and autonomous agents are not reliable for every task; they must be used for certain kinds of tasks.

When you have high task dependencies, state integrity is critical, and reliability matters more than speed, you want sequential processing, a sequential control pipeline: a single thread, or multiple LLMs or SLMs, running sequentially and always doing the same thing. When you want parallelization, you need to make sure the parallelized tasks are independent, with no dependencies between them; if tasks that depend on each other are performed in parallel, you create problems and chaos. There was a paper, which I put in the resources (I don't remember the company that wrote it), saying "don't build multi-agent systems". I think that is too extreme. Multi-agent systems will be built: if you look at real production, the deep-research agents in Claude, Google Gemini, or ChatGPT use agents in parallel, because the tasks of researching different domains or sources are independent in themselves and the results can be combined. This is the thinking behind the horizontal graph, and it is what companies like Dust (very successful in France, across Europe, and I think in the US) use, or the agents library from AWS.

So you want to design your workflow as a graph, whether sequential, parallel, or even autonomous, but you want your runtime to be durable. For this there are libraries like Restate; you want your runtime to be resistant to interruptions and asynchronous by design. This is opinionated, but take LangGraph: it works as a graph, yet the runtime is not great; it is not resistant to interruption, not what I would call a durable runtime. So be cautious about what you use in your code to orchestrate these agentic workflows.

Now, the vertical graph. You will have multiple layers, and I will show the different types of databases you can use, plus a strategy to be cheap and efficient with this, notably with Postgres. You will have a probabilistic layer, which is your vectorization: you perform vector similarity search first, and it is the fuzziest layer.
On top of this you will have different types of databases, notably graph databases. You can have property graphs or RDF graphs; the difference is the constraints you put on your data. Property graphs are much looser, with fewer constraints, while RDF graphs are more constrained. For this vertical axis you must design and think about what your agents need, and the ideal is to have an ontology of what the agent needs, because with graph technologies you can put constraints in place and make sure that when the database is updated, or when the agent reasons with it, the ontology is respected. You will see examples down the road.

On this vertical axis, you want to represent not only knowledge but several other things. You want to represent tools and tool dependencies, because, as you will see, LLMs are very bad at chaining functions: they are bad at respecting when a function needs an input or output from another function. So you want to create a graph of your tools for tool orchestration. You also want to represent the capabilities of the agents and their constraints. And you want this vertical axis to learn from errors and from evaluations, to create a data flywheel from it. Graphs, of course, are the best representation for doing this and making it evolve.

One component that will be key in my demonstration is structured output. There is a company called .txt (dottxt); they have a library called Outlines, which you can find on GitHub, and it is the most-starred in the field. When you think about structured output, it is not only about respecting a format, as in "I want to generate JSON". It constrains the format of each step of the agent's workflow, but it also constrains the reasoning. I will give you an example: if you have an ontology, a way of thinking about a domain, and you want to make sure the model systematically reasons with that ontology and its steps, you combine structured output with the ontology. So structured output is also a tool for forcing the LLM to respect a certain way of thinking, a certain way of reasoning. And what is interesting: there is a blog post by dottxt, the company behind Outlines, showing that SLMs, small language models, can be competitive with much larger models when used in combination with structured output.
So structural output must be think about this way.That is, it's not only a tool for format, but it's actually a tool for reasoning.Andwhat they are creatingis thatcomplex behavior emergesfrom simple components.They're creating reliability on each stepand the max step composable and also testable because you will use tooutput not only to generateOK, but also to, to evaluate and I will detail this down theroad.So it's like theTCP/IP of agent TK I, what what's made successful ciscoit, I won't tell this butit's, it's, it's the same asI put it.It's like the the reliable standards and reliable contracts for, for agentsin terms of communication between the different step of your agentic workflows.In terms of I saw on the interaction with tools with API etcetera.So it's, it's,it's a really important component that is I thinkwill gain traction in the months to come.And especially with, with what is being outlines that is I think by having testing,testing, test, test it in production much more superior to the entropic and openedsolution.So it's,it's it is the enforcement layer in boththe horizontal graph and the vertical graph,it ensures reliable process.And it's it's important in terms of orchestration, routing tool, calls, etcetera.Sothis is the basis of I would say why graph now let's dig deep into the pillar of agentgraph in the remaining minutes that I have.Soif you take those the knowledge representation,so the semantic representationwhat you want to modelumuhyou want your graph in most cases for your agent workload to mobile events.That is to saynot only static things, butyou want to allow a real temporalreasoning with your agent. And it's only possible with a graphmostly with with graph graphstructure. What what you wantto dois to have a contextual boundaries.That is to say you want to define specific scope for knowledgeto prevent context mixing.OK? Because one of the most case of failure in your agent,it will be what is called the distracting effect.That is to say you retrieve some snippets orsome sources of information to inform your generation.And you will have like distracting,distracting words or distracting things that are not relevant for your reasoning.So you want you want to prevent thatyou will see down the road. Butthere is for example,this approach that is called noderagthat builds anousgraph.That is to say you will have nodes for entities,you will have relationships, you will havesummary nodeswhat she wants to target.What she wants to model is the possibilityto do targeted and efficient information retrieval.And you will see you will have real examples just aftertheafterand so how to choose yourknowledgetoor how to build this.So I have a strategy,for example, if you take the exa mle of writer, so it's,it's a huge platform company that is currently building agentic system.They're moving in production for marketing.They don't use per se agraph database for basic retrieval.What they did is that they have graph data model. 
inside their Postgres. On top of that, they extract triples from chunked text based on ontologies, based on their data model, and they use pgvector: they vectorize those triples to perform the retrieval. It is simple and powerful. Another thing I have seen in several startups I have helped in my work is to use a module called Apache AGE, a Postgres extension that creates a graph on top of your existing Postgres tables. This is a simple strategy to build your graph. The problem is that if your graph is too big, say a billion edges, this solution won't scale. So you use it to experiment quickly and leverage your Postgres data quickly, but if you want a properly scalable graph database you will use either a property graph database, like Neo4j or one of the newer graph startups, or an RDF graph. Property graphs are best for relationship-centric, Pythonic analysis; and for property graphs, you should know there are modules that let you integrate RDF reasoning on top. You won't have exactly the same properties as an RDF graph, but you can use a module to project your property graph into RDF.

And then hypergraphs. I have seen cases where you want to model complex events that involve more than two entities at once, what is called an n-ary relationship. Take a prescription: it has a doctor, a patient, a drug, a date. It is like a graph in 3D. For this you might want to use a hypergraph, and some graph providers support it. With hypergraphs you can model complex events, but the ecosystem is less mature, so you have access to fewer algorithms with this kind of graph.

Another thing I want to emphasize is that you can use agents to build the graph, because previously what was difficult about building your graph database was that it was mostly manual, and very costly. There is a trend, and the technology is gaining more and more capability, toward using agents to build your graph databases. Structured output, as you have understood, is key here, because it makes sure the LLM or SLM extracts information from your data sources while respecting the schema of the extraction data model. There is a good example, not a library but a GitHub project, iText2KG, which I put in the resources: a framework with four modules (distill documents, extract entities, extract relations, and integrate) that can be used to automate this graph population, this graph building.

Next: what you want your agents to have is memory, a memory system. Currently there are four main providers for this: a company called Cognee, based in Munich; Mem0; Zep; and the last one, Letta. All four of these companies are moving toward graph-based representations for their memory systems.
How to choose between them? I think the differences are marginal, and most of them are open source; they have proprietary solutions on top (Cognee, for example, around optimization of retrieval), but the core is mostly open source. So the challenge is less which one to choose than making your retrieval efficient and optimizing for that; I will talk about it in my last slide if I have time. One problem is memory bloat: with a memory system, you don't want to memorize everything, so you need a reasoning layer that makes sure what is memorized is pertinent for the agent in question. You also want a removal system, to remove, neutralize, or update information, and you want to manage the complexity of this graph. One important point here: many approaches right now use reinforcement learning to train an LLM to know when to write to a memory system and when to leverage its memory tools, because memory will be one tool among others. So memory plus graph is really important to make your agent stateful.

Now that we have talked about knowledge representation, let's talk about reasoning. You will have reasoning on both the horizontal graph and the vertical graph. One thing that is important in your reasoning is to use and leverage existing ontologies. An LLM without one is like a person with a map but no understanding of traffic laws, or of the difference between a hospital and a warehouse. Ontologies are the rules of your domain. To control the outputs, the token prediction of the LLM, its reasoning, you want to constrain it by leveraging ontologies together with structured output. That enforces logical consistency, on your horizontal graph but also on your vertical graph. On the vertical side, in your data engineering and retrieval, you will mostly have enforced logic, predefined queries. But if the agent has the schema of the ontology, it can also deduce new queries that are relevant to the step or query in question. And if your database is built on an ontology, your agents can derive new knowledge from existing facts. For example, if the ontology defines the "acquires" relationship as transitive, and the agent knows that A acquires B and B acquires C, it can infer that A controls C. So you can have transitive reasoning if your graph is, for example, an RDF graph. What is also powerful, and I put an example in the resources, is to retrieve the ontology in question and enforce it with structured output.

What is the toolkit for reasoning on graphs? You have multiple operator types.
You have node operators: you want to find the proper nodes. You have relationship operators: you want to find the relationships that are relevant. And you have chunk operators: you want to go back to the source material that was used to extract the information. So you have many operators to build your reasoning layer on top of your graph. Of course, you can also use graph algorithms: for example, Microsoft's approach in the GraphRAG paper uses community detection, finding nodes and relationships that belong to the same community. All of this gives you multiple layers: you mix semantic search with symbolic reasoning. Let me give an example. Imagine you have a graph of companies with information on, say, their media coverage. Because the graph gives you relationships, you can build symbolic queries that let you perform reasoning on top of the nodes in question. The agent can also select which tools to use to perform reliably, and, based on the ontology and the schema of the graph, create new queries that can be used for reasoning. So you might start with semantic search to find some nodes, and from those nodes do basic math operations, like "how many customers do these companies have, based on the graph", and on top of that you can have predefined tools that allow much more complex reasoning. It is very flexible: your graph database becomes a reasoning layer that the agent can leverage as a tool.

One challenge, though, is that LLMs don't natively understand or reason on graphs, so you need methods to make the LLM understand the graph in question.
I won't go long on this, but there is a training approach, developed at MIT, which I will be experimenting with in a notebook in my book, that makes the attention mechanism of a model graph-aware, so the LLM natively understands the graph relationships. In terms of reasoning, you want the LLM used as the core engine of your agents to understand the graph natively. Most current methods translate the graph into JSON, YAML, or a list of triples; there is a study, which I put in the resources, showing the accuracy gap between these different translation layers. I think the most promising approach will be to train your LLM to understand your graph natively, and there are several approaches doing exactly this.

On the planning layer, and I'll be brief because I am running long: consider an autonomous agent that needs to decide autonomously what to do. There is no sequential control workflow and no orchestrated parallelism; it is an agent that, given a problem or a query, must decide by itself. Here, one of the best approaches is to give the agent access to a causal graph, a graph that represents your domain knowledge with causality, and the LLM can leverage this graph in its planning phase, before acting, to build its plan.

In terms of tool orchestration, graphs are powerful because of a problem called prompt-tool bloat: if you give the model too many tools (there is an MCP-related study showing this), say a hundred tools, the LLM won't be consistent; it won't know which tool to call. What you want is to create dependencies between the tools: you want tool A feeding tool B feeding tool C to perform a specific task, and you want to update what worked and what didn't. There is a Neo4j episode on this. You don't want to hardcode your tools; you want to make data out of them, tools as data that evolve and can be updated. So one of the best approaches is to create a graph of tools. And a graph also helps because, if you think about the security question for agents, which is a huge subject, you can attach constraints and policies on top of this graph to secure your agents; there is an approach called information flow control that does exactly this. So you want a graph of tools that evolves with evaluation and with usage.

Also, the most common critique I hear of graph approaches is that they are not efficient. So you want to optimize all of this:
you want to optimize the building of the graph, you want to optimize the retrieval, and you want to optimize the reasoning of your agents. For building, there is for example a small language model called Triplex, trained specifically to extract relationships and entities to build a knowledge graph. You want approaches that are efficient for building, maybe even basic NLP approaches, not neural language models, to build your graph. There is a library called LightRAG that does exactly this. For retrieval, you want the LLM to run graph queries only when necessary, so you have a system that triages: it knows when a query needs graph retrieval and when it doesn't. You want a dual-level search, and so on. There are approaches for this, such as Microsoft's LazyGraphRAG, which is not yet open source. And you want to use GPU-accelerated graph libraries to speed up retrieval and use of your graph. For reasoning: because you are using graphs and structured output, you want to use a small language model as much as possible, for cost efficiency and accelerated inference.

The last layer I want to talk about, and I think it is the most important dimension for agentic, is evaluation, and I associate evaluation with self-evolution. You want your agentic system to self-evolve, to improve over time, as automatically as possible; you will of course have a human in the loop in the process. The workflow approach, separating your agentic workflow into steps, helps you isolate, and from this isolation you can perform multi-layer evaluation. What you want to evaluate is many things. First, you want to make sure that your agent, before reasoning, has the proper context: is the context sufficient? There is an approach on this, I put the GitHub in the resources, that designs a system to determine whether the context the agent will use is sufficient, judged by another LLM, an LLM-as-judge. If the context is not sufficient, it is a retrieval problem: if the agent doesn't have sufficient context to answer the query or the reasoning step, the problem comes from your databases, from your retrieval layer. Either the retrieval method is not good or the database itself is missing information. Once you have sufficient context, other problems can appear, for example knowledge conflicts, where the data inside your LLM conflicts with the data that is retrieved. That is a real problem.
So you need multi-layer evaluation; I detail this in my book and I put resources to make sure you evaluate your system properly. Once you have this evaluation, what you want to do is targeted self-evolution: prompt tuning, fine-tuning, and fine-tuning with reinforcement learning. There is a library called Verifiers, and what is emerging is a stack of reinforcement-learning-as-a-service. If you think about it, an agentic workflow is like defining an environment for a reinforcement-learning agent, with rewards; this way of thinking, the workflow graph plus the vertical graph, is the environment you need to optimize over. I would advise you to check out Verifiers, because in the months to come it will become easier and easier to abstract away the complexity of algorithms like PPO and simply define the success criteria for reinforcement-learning fine-tuning. One difficulty, of course, is having ground truth, but the ground truth will be provided by the evaluation I talked about earlier. You also want to capture implicit evaluation: does the user reformulate their initial query, and why, etcetera. You will need all of this; getting to 100% reliability is, I think, the next frontier, together with what I said earlier about retrieval.

My last slide: that was the horizontal graph, so now, in a few short steps, self-evolution on the vertical graph. What you want here is to optimize the interfaces between LLMs and the knowledge graph. One of the best approaches I have seen and analyzed is Cognee's; they released a paper called "Optimizing the KG-LLM interface for complex reasoning", and it is what is behind the Cognee approach. Basically, they treat it like machine learning: there are parameters you want to optimize, such as the chunk size of the text segments used to build the graph, whether to retrieve vertex chunks or structured graph templates, the number of retrieval attempts used to provide context, and the specific instructions given to the LLM. All of these are parameters you want to optimize, and for each node of your agentic system, if you run a real production system, you want to make sure the parameters chosen for each step are the right ones. What is complex is that if you optimize one step, you must make sure you don't degrade the subsequent steps; in most approaches this problem is not yet solved. I saw a recent approach, Optimas, that deals with it, but it is not, I would say, widespread or easily accessible to developers. So for now the Cognee approach is a good way to optimize this vertical axis, the retrieval of sources by an LLM, and I will develop it in my book.

That's it. I hope I am on time for some questions. Thank you for listening, and sorry for the pace; I thought I would manage it better.

Thank you, Anthony.
That was a fantastic dive into building intelligent and scalable agentic architectures. We have received some good questions, so I'll start bringing them up on screen for you to respond to. We have one coming in from Lorry; it should be on screen now: what are some of your favorite examples of graph vertex and edge types that worked well?

What do you mean by that? Good examples of real production use cases that worked? It depends on the domain you are modeling. For the model you will be building, you need to start from the business problem you are trying to solve and from what will lead the agents to proper reasoning. The best way to do it is to build an ontology of your domain. For example, there is a recent paper from Microsoft on a medical agent for diagnosing and treating disease, and the steps to diagnose a disease form a medical ontology. That is the kind of reasoning you need, depending on your domain. The ontology will determine the data model you use to build the graph, the data sources you will need, and what you will need to implement in terms of agentic pipelines to populate it, etcetera.

Next question: how does Outlines compare to other approaches? A disclaimer first: .txt is a company I work with; it is a French company, so it is in my territory. They have a moat: they make sure the decoder of the LLM respects the schema in question, so tokens outside the schema get probability zero, and the tokens within the schema are generated probabilistically. It is a better approach than post-processing, as done for example by Anthropic and other libraries and methods I have seen, and I have numbers showing it is much more accurate, close to 100%, and also more efficient: because the tokens outside the schema are zero, they have a method called coalescence that makes inference much faster, since they know which tokens are zero and can generate only the tokens relevant to the schema. It is a really good approach. And you should know that, behind the hood, the open-source library is really good, but they have an even better product for large-scale production.

Anthony, we have one more, from Ma: "I don't seem to find the slide deck in the resources to follow along." I think it will be distributed as a slide deck. And: "If I understand it correctly, this approach is highly limited by the quality of the ontology that is used to extract entities." That's right. 80% of the work for agents will be your data engineering. For most of the startups I coach, 80% of the work is on the quality of the data.
This is why, by the way, SaaS companies can be modernized with agents: they already have the business logic and the data. The most difficult part is to build the data model and the data engineering of your domain, that is, to do context engineering: to give the proper, dynamic context that is relevant for the agent to perform its reasoning and its work. And you want that context to carry the relationships in your data. Currently, if you use only a vector database, you are retrieving vector-similar chunks, and that does not give you any reasoning. Maybe you can put metadata on top, but at scale that has limits. Most production systems, for example, use a reranker, another neural model that reranks the retrieved chunks because retrieval alone is known to be imprecise. It's crazy. So you need to combine both, graph data models and vectors, because you cannot vectorize a relationship. So yes: the quality of your agentic system will depend on the quality of your data models. Ontologies are a good way to do it, and you can combine an ontology with, like I explained, Postgres, as Writer does in production: triples within Postgres, extracted from text.

Thanks, Anthony.