How to utilize AI hallucination
How to utilize AI hallucination with RAG
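AI hallucination seems to be unavoidable, but it can be put to use with RAG (Retrieval-Augmented Generation): the model's draft, which may be hallucinated, serves as a search query, and the closest real text stored in a vector database is returned instead. The example below shows how to ask ChatGPT for a legal provision and then retrieve the matching provision from a collection.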
Components: OpenAI API, Zilliz Cloud Collection
1. Setup
from openai import OpenAI
from pymilvus import MilvusClient
import numpy as np
import re
2. ChatGPT
# Configuration for OpenAI
openai_client = OpenAI(
    api_key="your_api_key"  # Replace with your OpenAI API key
)
# Call the model to generate your target content
completion = openai_client.chat.completions.create(
    model="gpt-4o",  # Pick a model
    messages=[
        {"role": "system", "content": "your_system_content"},  # Set the system prompt
        {"role": "user", "content": "your_user_content"}  # Ask your question
    ]
)
raw_content = completion.choices[0].message.content
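As an illustration of what the two placeholder strings might hold for the legal-provision use case, here is a minimal sketch. The helper name `build_messages` and the prompt wording are hypothetical and not part of the original setup; the idea is only that the system message asks for a draft provision and nothing else, so the reply can be embedded as is.

```python
def build_messages(question: str) -> list[dict]:
    """Build a chat message list asking the model to draft a legal provision.

    Hypothetical helper: the prompt wording is an assumption for illustration.
    """
    system_content = (
        "You are a legal assistant. Reply with the text of the single legal "
        "provision that best answers the question, and nothing else."
    )
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": question},
    ]


messages = build_messages("What is the statutory notice period for ending a lease?")
```

The resulting list can be passed as the `messages` argument of `chat.completions.create` in place of the placeholder list.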
3. Vector Search in Zilliz Cloud Collection
# Embed the raw_content
def get_embedding(text, model="text-embedding-3-small"):
    return openai_client.embeddings.create(input=[text], model=model).data[0].embedding
raw_vector = get_embedding(raw_content)
vector = np.array(raw_vector)  # Convert it to a NumPy array of shape (1536,)
# Configuration for Zilliz
milvus_client = MilvusClient(
    uri="your_url",  # Replace with your Zilliz Cloud endpoint
    token="your_token"  # Replace with your token
)
# Vector Search in Collection
milvus_search = milvus_client.search(
    collection_name="your_collection",  # Your Collection name
    data=[vector],
    limit=1,  # The number of results to return
    search_params={"metric_type": "COSINE", "params": {"nprobe": 10}}  # Customize your search parameters
)
milvus_search = str(milvus_search)  # Convert the ExtraList result to a string for regex parsing
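To make the COSINE metric and `limit=1` more concrete, the following brute-force sketch computes what such a search returns conceptually: the stored vector with the highest cosine similarity to the query. The toy 3-dimensional vectors and the helper names `cosine_similarity` and `top_1` are made up for illustration; a real collection relies on an index and does not scan every row like this.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_1(query: list[float], stored: list[list[float]]) -> tuple[int, float]:
    """Return (row index, similarity) of the stored vector closest to the query."""
    scores = [cosine_similarity(query, row) for row in stored]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy data: three stored "provision" vectors and one "draft" query vector.
stored_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
]
query_vector = [0.5, 0.9, 0.0]

best_index, best_score = top_1(query_vector, stored_vectors)
print(best_index, round(best_score, 3))
```

Here the third stored vector wins because it points in nearly the same direction as the query, even though no stored vector equals it exactly. That is the property the walkthrough relies on: a draft that is only approximately right can still land on the correct stored text.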
4. Retrieve the target content from Zilliz Cloud Collection
# Retrieve the id from the search result with a regex
id_pattern = r"'id':\s*(\d+)"
id_match = re.search(id_pattern, milvus_search)
id = int(id_match.group(1))
# Get the content with id in Collection
milvus_get = milvus_client.get(
    collection_name="your_collection",
    ids=[id]
)
milvus_get = str(milvus_get)  # Convert the ExtraList result to a string for regex parsing
# Retrieve the law_article from the get result with a regex
law_article_pattern = r"'law_article':\s*'([^']*?)'"
law_article_match = re.search(law_article_pattern, milvus_get)
law_article = law_article_match.group(1)
print(law_article)
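The regex approach above depends on how the result happens to print. For instance, Python's `repr` switches to double quotes when a string contains an apostrophe, so the `'law_article'` pattern may no longer match such text. As an alternative sketch, the results can be indexed directly, assuming that `search` yields one list of hit dicts per query vector and that `get` yields a list of row dicts. The mock data and the helper names `first_hit_id` and `first_row_field` are hypothetical stand-ins so the example runs without a cluster.

```python
import re


def first_hit_id(search_result):
    """Return the id of the top hit for the first (and only) query vector."""
    hits = search_result[0]
    if not hits:
        raise LookupError("search returned no hits")
    return hits[0]["id"]


def first_row_field(get_result, field):
    """Return one field of the first row returned by a get call."""
    if not get_result:
        raise LookupError("no row found for the requested id")
    return get_result[0][field]


# Mock results shaped like the assumed return values (not real Milvus output).
mock_search = [[{"id": 42, "distance": 0.93, "entity": {}}]]
mock_get = [{"id": 42, "law_article": "The tenant's deposit must be returned."}]

hit_id = first_hit_id(mock_search)
law_article = first_row_field(mock_get, "law_article")
print(hit_id, law_article)

# The regex from the walkthrough finds nothing in this row because of the apostrophe.
regex_match = re.search(r"'law_article':\s*'([^']*?)'", str(mock_get))
print(regex_match)
```

Direct indexing also keeps the id in its original type, so it would work for string primary keys, which the `\d+` pattern cannot capture.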
For more details, see the OpenAI documentation and the Zilliz documentation.
Now it is done!