ChatGPT, a form of generative AI, represents just a single manifestation of the broader concept of large language models (LLMs). LLMs are an important technology that’s here to stay, but they’re not a plug-and-play solution for your business processes. Achieving benefits from them requires some work by enterprises because, despite the immense potential of LLMs, they come with a range of challenges. These challenges include issues such as hallucinations, the high costs associated with training and scaling, the complexity of addressing and updating them, their inherent inconsistency, the difficulty of conducting audits and providing explanations, and the predominance of English language content.
There are other factors too, such as the fact that LLMs are poor at reasoning and need careful prompting to produce correct answers. All of these issues can be minimized by supporting an internal corpus-based LLM with a knowledge graph. A knowledge graph is an information-rich structure that provides a view of entities and how they interrelate. For example, Rishi Sunak holds the office of prime minister of the UK. Rishi Sunak and the UK are entities, and holding the office of prime minister is how they relate. We can express these entities and relationships as a network of assertable facts: a graph of what we know.
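As a minimal sketch of the idea, the fact above can be stored as a subject-relationship-object triple. The `Fact` type, the relationship label, and the `objects_of` helper are illustrative assumptions, not part of any particular graph database.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One assertable fact: two entities and how they relate."""
    subject: str
    relationship: str
    obj: str


# The only fact here is the one used as the example in the text.
# The relationship label is a hypothetical naming choice.
knowledge_graph = {
    Fact("Rishi Sunak", "HOLDS_OFFICE_OF_PRIME_MINISTER", "UK"),
}


def objects_of(graph, subject, relationship):
    """Return every entity that `subject` is linked to by `relationship`."""
    return sorted(
        f.obj for f in graph
        if f.subject == subject and f.relationship == relationship
    )


print(objects_of(knowledge_graph, "Rishi Sunak", "HOLDS_OFFICE_OF_PRIME_MINISTER"))
```

A production knowledge graph would normally live in a graph database rather than an in-memory set, but the shape of the data is the same: entities connected by named relationships.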
Having built a knowledge graph, you can not only query it for patterns, such as “Who are the members of Rishi Sunak’s cabinet?”, but also compute over the graph using graph algorithms and graph data science. With this additional tooling, you can ask sophisticated questions about the nature of the whole graph of many billions of elements, not just a subgraph. Now you can ask questions like “Who are the members of the Sunak government not in the cabinet who wield the most influence?”
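The difference between a pattern query and a computation over the graph can be sketched in a few lines. Everything below is an assumption made for illustration: the names are placeholders rather than real people, the relationship labels are invented, and counting `ADVISES` edges is only a crude stand-in for the centrality algorithms a graph data science library would provide.

```python
from collections import Counter

# Hypothetical placeholder data; none of these names refer to real people.
edges = [
    ("Minister A", "MEMBER_OF_CABINET", "Sunak government"),
    ("Minister B", "MEMBER_OF_CABINET", "Sunak government"),
    ("Adviser C", "MEMBER_OF_GOVERNMENT", "Sunak government"),
    ("Adviser D", "MEMBER_OF_GOVERNMENT", "Sunak government"),
    ("Adviser C", "ADVISES", "Minister A"),
    ("Adviser C", "ADVISES", "Minister B"),
    ("Adviser D", "ADVISES", "Minister A"),
]


def cabinet_members(edges):
    """Pattern query: who is linked to the government by MEMBER_OF_CABINET?"""
    return sorted(s for s, rel, _ in edges if rel == "MEMBER_OF_CABINET")


def influence_outside_cabinet(edges):
    """Graph computation: rank non-cabinet people by how many cabinet
    members they advise (a simple degree count, not a full algorithm)."""
    cabinet = set(cabinet_members(edges))
    scores = Counter(
        s for s, rel, o in edges
        if rel == "ADVISES" and s not in cabinet and o in cabinet
    )
    return scores.most_common()


print(cabinet_members(edges))
print(influence_outside_cabinet(edges))
```

The first function answers a question about a local pattern, while the second has to look across all relationships to produce a ranking, which is the kind of whole-graph question the text describes.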
Expressing these relationships as a graph can uncover facts that were previously obscured and lead to valuable insights. You can even generate embeddings from this graph (encompassing both its data and its structure) that can be used in machine learning pipelines or as an integration point to LLMs. But a knowledge graph is only half the story. LLMs are the other half, and we need to understand how to make the two work together. We see four patterns emerging: using an LLM to create a knowledge graph; using a knowledge graph to train an LLM; using a knowledge graph on the interaction path with an LLM to enrich queries and responses; and using knowledge graphs to create better models.
In the first pattern, we use the natural language processing features of LLMs to process a huge corpus of text data (e.g. from the web or journals). We then ask the LLM (which is opaque) to produce a knowledge graph (which is transparent). The knowledge graph can be inspected, QA’d, and curated. Importantly for regulated industries like pharmaceuticals, the knowledge graph is explicit and deterministic about its answers in a way that LLMs are not.
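One way to picture the inspection and curation step is a small gate between the LLM's output and the graph. The sketch below assumes the LLM has been prompted to emit one `subject | relationship | object` triple per line; that output format, the `ALLOWED_RELATIONSHIPS` set, and the `parse_triples` helper are all hypothetical, and the LLM call itself is left out.

```python
# Relationship types a curator has approved for this graph (hypothetical).
ALLOWED_RELATIONSHIPS = {"HOLDS_OFFICE"}


def parse_triples(llm_output):
    """Split raw LLM text into accepted triples and lines needing review.

    A line is accepted only if it has exactly three non-empty fields and
    uses an approved relationship type; everything else is kept aside so
    a human can inspect it.
    """
    accepted, rejected = [], []
    for line in llm_output.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts) and parts[1] in ALLOWED_RELATIONSHIPS:
            accepted.append(tuple(parts))
        else:
            rejected.append(line)
    return accepted, rejected


# Stand-in for text returned by an LLM; not real model output.
sample_output = """
Rishi Sunak | HOLDS_OFFICE | Prime Minister of the UK
this line is not a triple
Entity A | UNAPPROVED_RELATION | Entity B
"""

accepted, rejected = parse_triples(sample_output)
print(accepted)
print(rejected)
```

Because every accepted fact is an explicit record, it can be reviewed, corrected, or removed, which is what makes the resulting graph auditable in a way the LLM's internal state is not.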
In the second pattern, we do the opposite. Instead of training LLMs on a large general corpus, we train them exclusively on our existing knowledge graph. Now we can build chatbots that are very skilled concerning our products and services and that answer without hallucination.
In the third pattern, we intercept messages going to and from the LLM and enrich them with data from our knowledge graph. For example, “Show me the latest five films with actors I like” cannot be answered by the LLM alone, but the request can be answered by first exploring a movie knowledge graph for popular films and their actors, and then using the results to enrich the prompt given to the LLM. Similarly, on the way back from the LLM, we can take embeddings and resolve them against the knowledge graph to provide deeper insight to the caller.
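The outbound half of the third pattern can be sketched as a function that looks up facts in the graph and prepends them to the user's question. The movie data, the user ID, the prompt wording, and both helper functions are assumptions made for illustration; the call to the LLM itself is omitted.

```python
# Hypothetical movie knowledge graph; titles and actors are placeholders.
films = [
    {"title": "Film A", "year": 2023, "actors": {"Actor X", "Actor Y"}},
    {"title": "Film B", "year": 2021, "actors": {"Actor X"}},
    {"title": "Film C", "year": 2022, "actors": {"Actor Z"}},
]
liked_actors = {"user-1": {"Actor X"}}


def latest_films_with_liked_actors(user_id, limit=5):
    """Return the newest films featuring at least one actor the user likes."""
    liked = liked_actors.get(user_id, set())
    matches = [f for f in films if f["actors"] & liked]
    return sorted(matches, key=lambda f: f["year"], reverse=True)[:limit]


def enrich_prompt(user_id, question):
    """Intercept the question and add knowledge-graph facts as context."""
    lines = [
        f"- {f['title']} ({f['year']}), starring {', '.join(sorted(f['actors']))}"
        for f in latest_films_with_liked_actors(user_id)
    ]
    context = "\n".join(lines) if lines else "- (no matching films found)"
    return (
        "Answer using only these facts from the knowledge graph:\n"
        f"{context}\n\nQuestion: {question}"
    )


prompt = enrich_prompt("user-1", "Show me the latest five films with actors I like")
print(prompt)
```

The LLM never needs to know the user's preferences or the film catalogue in advance; the graph supplies them at request time, and the model is left to do what it is good at, which is turning the facts into a fluent answer.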
The fourth pattern is about making better AIs with knowledge graphs. Here, interesting research from Yejin Choi at the University of Washington shows a way forward. In her team’s work, an LLM is enriched by a secondary, smaller AI called a “critic.” This AI looks for reasoning errors in the responses of the LLM, and in doing so creates a knowledge graph for downstream consumption by another training process that creates a “student” model. The student model is smaller and more accurate than the original LLM on many benchmarks because it never learns factual inaccuracies or inconsistent answers to questions.
This article was originally published by InfoWorld.