Securing Vector Databases

Vector databases are popular in machine learning (ML) and artificial intelligence (AI) because they can handle high-dimensional vector data, enabling efficient storage, retrieval, and similarity search. As their adoption grows, ensuring the security of vector databases becomes critical. This white paper provides a comprehensive overview of vector database security, outlining potential threats and best practices.

Vector Databases

Vector databases are specialized database systems designed to efficiently manage and query high-dimensional vector data. They are integral to recommendation systems, image and video search, natural language processing, and other applications. 

Vector databases are essential for applications that rely on embeddings. Most vector databases can handle large datasets with millions of vectors. They enable similarity search (for example, nearest neighbor search) as well as clustering and classification of data.

High-dimensional vector data refers to data points that are represented in a space with many dimensions. Each data point is described by a vector, which is an ordered list of features. The number of features corresponds to the dimensions of the vector space. For example, if each data point has 100 features, it is represented in a 100-dimensional space.

A high number of dimensions in data can significantly complicate analysis and visualization, a challenge often referred to as the "curse of dimensionality." As the number of dimensions increases, the volume of the data space grows exponentially, leading to data sparsity and making distance measures less meaningful. This sparsity can obscure patterns and relationships within the data, reducing the effectiveness of traditional analysis techniques. 

There are many techniques for dimensionality reduction, such as Principal Component Analysis (PCA), Truncated Singular Value Decomposition (Truncated SVD), Linear Discriminant Analysis (LDA), t-distributed Stochastic Neighbor Embedding (t-SNE), and others.

Figure 1: Example of Principal Component Analysis

3D graph of a principal component analysis of a sentence

For instance, PCA, which was applied in Figure 1, is widely used in machine learning. It is a statistical technique for dimensionality reduction that transforms high-dimensional data into a lower-dimensional form while preserving as much variance as possible. PCA identifies new axes (principal components) that are linear combinations of the original variables. These new axes are ordered by the amount of variance they capture from the data.

PCA reduces the number of variables (dimensions) in the data while retaining the most significant information. This reduces computational complexity, mitigates overfitting, and makes the data easier to visualize, interpret, and analyze.
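
The following is a minimal sketch of PCA in practice, using scikit-learn on synthetic data; the dataset, dimensions, and component count are illustrative only:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 100))       # 200 data points in a 100-dimensional space

pca = PCA(n_components=3)             # project down to 3 dimensions, as in Figure 1
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                    # (200, 3)
print(pca.explained_variance_ratio_)      # variance captured by each component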

Embeddings and Embedding Models

Embeddings are a type of high-dimensional vector data specifically designed to capture the underlying semantics or properties of the data in a continuous vector space (arrays of numbers). They are often used to transform categorical data (like words or images) into numerical forms that machine learning models can process effectively. Modern vector embeddings can have more than 1,000 dimensions; OpenAI's text-embedding-ada-002 model, for example, produces 1,536-dimensional vectors.

Embeddings often reduce the dimensionality of the original data while preserving its most important features. In natural language processing (NLP) and large language models (LLMs), word embeddings map words to vectors in such a way that similar words have similar vector representations. For example, the words cross-site and scripting might be close to each other in the embedding space.

The following is an example of the numerical representation of embeddings:

"object": "list",
"data": [
  {
    "object": "embedding",
    "index": 0,
    "embedding": [
      0.02785527,
      -0.0033755908,
      0.04646089,
      0.006036859,
      0.04550403,
      0.0009369258,
      -0.009429062,
<output omitted for brevity>

Generally, there are two approaches to creating embeddings from data: one based on the semantic meaning and the other based on the syntactic structure of the documents. Popular methods for syntactic embeddings include CountVectorizer, term frequency-inverse document frequency (TF-IDF), and Normalized TF-IDF. For text data, popular methods for semantic embeddings include Word2Vec, GloVe, and BERT, while convolutional neural networks (CNNs) are used for image data. In addition, there are many embedding models available from OpenAI and many open-source embedding models available on Hugging Face.
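
The following brief sketch builds syntactic TF-IDF embeddings with scikit-learn's TfidfVectorizer; the sample documents are placeholders:

from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "vector databases enable similarity search",
    "searchable encryption protects vector data",
]

vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(docs)   # sparse document-term matrix

print(vectorizer.get_feature_names_out())       # vocabulary learned from the corpus
print(tfidf_matrix.toarray())                   # TF-IDF weights, shown as a dense array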

Figure 2: Hugging Face’s Massive Text Embedding Benchmark (MTEB) Leaderboard

screenshot of the Hugging Face Massive Text Embedding Benchmark leaderboard

Resource: Learn more about the MTEB at https://arxiv.org/abs/2210.07316.

Maintaining a Balance

Using very high-dimensional embeddings can lead to overfitting, where the model performs well on training data but poorly on unseen data. This reduces the generalization capability of the model, impacting its effectiveness in real-world scenarios.

Using a large number of embedding dimensions increases computational complexity, requiring more memory and processing power. This can slow down training and inference times, reducing overall efficiency. High-dimensional embeddings can make it harder to effectively apply dimensionality reduction techniques, which are often used to enhance model performance and interpretability.

Finding a balance between size, embedding dimensions, and token count[1] is necessary to achieve optimal performance, efficiency, and security. Instead of maximizing these parameters, focus on finetuning them to meet specific use-case requirements. Prioritize the quality of data and embeddings over the quantity. High-quality, relevant data and well-trained, efficient embeddings can provide better performance and security than merely increasing size or dimensions.

[1] In the context of AI, especially in natural language processing (NLP) and language models, tokens are the basic units of data that models work with. Tokens can be words, subwords, or characters in text, or patches of an image, depending on the specific tokenization method used.

Embedding Models and Techniques

With the advent of transformer models (such as GPT, BERT, and others), generating representations of large, multimodal objects for use in machine learning tasks became much easier. The representations are more accurate, and computations can now be performed with GPUs and TPUs to speed up parallel operations.

Two ways to build an embedding model are:

  • Train an embedding model from scratch using BERT or some variation. BERT requires an enormous amount of training data, so this approach is advantageous only if you have access to a large number of GPUs. It may also make sense if the goal is to better understand the internals of the model.
  • Finetune a pretrained embedding model. Variations of BERT have been used to generate embeddings as downstream input for many recommender and information retrieval systems. One of the greatest benefits of the transformer architecture is the ability to perform transfer learning. SBERT.net provides a detailed explanation, and a minimal example follows this list.
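
The following is a minimal sketch of the second approach, generating embeddings with a pretrained Sentence-BERT model from the open-source sentence-transformers library; the model name and sentences are illustrative:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # a compact pretrained model

sentences = ["cross-site scripting", "XSS vulnerability", "dimensionality reduction"]
embeddings = model.encode(sentences)              # one dense vector per sentence

# Semantically similar sentences yield similar vectors.
print(util.cos_sim(embeddings[0], embeddings[1]))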

In pre-transformer architectures, the representation of a dataset was fixed; the weights of the words in TF-IDF could not be changed without regenerating the entire dataset. Now, the output of the layers of BERT can be treated as input for the next neural network layer of a custom model. In addition to transfer learning, there are numerous variants of BERT, such as the compact DistilBERT and the robustly optimized RoBERTa, as well as larger models in places like the Hugging Face Model Hub.

Given their flexibility as data structures, there are several use cases for embeddings. Table 1 provides a high-level summary of different embedding techniques.

Table 1: Embedding Techniques

Embedding Technique | Statistical Approach | Pros | Cons
Bag of Words (BoW) | Count-based | Simple to understand and implement. Good for small datasets. | Ignores word order and context. Sparse representation.
Term Frequency-Inverse Document Frequency (TF-IDF) | Count-based | Considers word importance. Less sensitive to common words. | Ignores word order and semantics. Still a sparse representation.
Word2Vec | Predictive | Captures semantic meaning. Dense representation. | Requires large datasets. Fixed-size vectors don't capture sentence-level semantics.
GloVe (Global Vectors) | Predictive/count-based hybrid | Combines advantages of Word2Vec and matrix factorization. | Requires large corpora. Fixed-size vectors.
FastText | Predictive | Captures subword information. Good for morphologically rich languages. | Larger model size due to subword information.
Doc2Vec (Paragraph Vectors) | Predictive | Extends Word2Vec to document level. Captures document context. | More complex. Requires tuning.
BERT (Bidirectional Encoder Representations from Transformers) | Predictive | Deep contextualized embeddings. State-of-the-art performance. | Computationally intensive. Requires finetuning for specific tasks.
ELMo (Embeddings from Language Models) | Predictive | Deep contextualized word representations. Can be added to existing models. | Computationally expensive. Requires pre-trained models.
Transformer-based models (e.g., GPT, T5) | Predictive | Very powerful contextual embeddings. Can generate text. | Very large models. Require significant computational resources and finetuning.

Another example is NV-Embed-v1, a top-performing generalist embedding model that ranks highly on the MTEB. It excels in tasks such as retrieval, reranking, classification, clustering, and semantic textual similarity. The model uses Mistral-7B-v0.1 as its base and features a 4096-dimensional latent-attention pooling embedding.

Graph Embedding

Graph embedding is a technique that produces latent vector representations of graphs. Graph embeddings encode the structural and semantic information of graphs, nodes, and edges into low-dimensional vectors while preserving the relationships and properties inherent in the graph. By transforming graph data into a format suitable for machine learning models, graph embeddings enable the application of standard algorithms to complex graph-structured data.

Graph embedding can be performed at different levels of the graph. The two predominant levels are node embedding, which maps each node in a graph to a vector, and graph embedding, which maps the whole graph to a vector. Each vector yielded from this process is also referred to as an embedding or representation.

The following are a few key best practices:

  • Use a graph attention network (GAT) to encode the graph (with graph embedding models).
  • Store this vector data in a vector database.
  • Select important neighbors and relations through the attention mechanism.
  • Cross-fuse the encoded encrypted feature vectors using the cross-matrix method.

These practices help protect the vectors stored in the vector database against inverse-lookup (embedding inversion) attacks.
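
As a rough sketch of the first practice, the following encodes node features with a two-layer GAT, assuming the open-source PyTorch Geometric library; the layer sizes and toy graph are illustrative:

import torch
from torch_geometric.nn import GATConv

class GATEncoder(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, heads=4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hidden_dim, heads=heads)
        self.conv2 = GATConv(hidden_dim * heads, out_dim, heads=1)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()   # attention selects important neighbors
        return self.conv2(x, edge_index)       # one embedding per node

x = torch.randn(5, 16)                                    # 5 nodes, 16 features each
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])   # 4 directed edges
node_embeddings = GATEncoder(16, 8, 32)(x, edge_index)
print(node_embeddings.shape)                              # torch.Size([5, 32])

The resulting node embeddings are what would then be stored in the vector database, per the second practice.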

Vector Databases in Retrieval-Augmented Generation (RAG) Implementations

Vector databases are a key component of retrieval-augmented generation (RAG) implementations and semantic similarity-based search engines. They can enhance the retrieval performance and scalability of RAG implementations.

Resource: Comparing RAG, RAG Fusion, with RAPTOR: Different AI Retrieval-Augmented Implementations offers additional information about retrieval technologies.

Figure 3 illustrates the architecture of a system designed for RAG using an LLM stack.

Figure 3: A RAG Implementation Using an LLM Stack

flow chart of RAG implementation

Data Pipelines

Data pipelines, at the top of the architecture, ingest data sources (such as data lakes, PDFs, JSON, and network and security logs) into the system. This provides the raw data that will be processed and stored in the vector database for retrieval purposes.

Embedding Models

Embedding models convert raw data into vector embeddings using models from providers like Cohere and OpenAI and open-source models from platforms such as Hugging Face. Embeddings are crucial for efficient and accurate information retrieval because they transform documents into a format that the vector database can index and search against.

Vector Database

The vector database stores vector embeddings and supports fast similarity search. Examples include MongoDB Vector Search, Facebook AI Similarity Search (Faiss), Pinecone, ChromaDB, and pgvector. When a query is made to a RAG implementation, the vector database is searched for relevant embeddings, which helps in retrieving the most relevant documents or data points.
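
For example, the following minimal sketch stores and queries documents with ChromaDB, using its default built-in embedding function; the collection name and documents are placeholders:

import chromadb

client = chromadb.Client()                        # in-memory client for experimentation
collection = client.create_collection("security_docs")

collection.add(
    documents=["Rotate encryption keys regularly.",
               "Enable RBAC on the vector database."],
    ids=["doc1", "doc2"],
)

results = collection.query(query_texts=["access control"], n_results=1)
print(results["documents"])                       # the most similar stored document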

Orchestration

Orchestration libraries like LlamaIndex and LangChain are extremely popular because they help with the seamless integration and management of various components within an AI system.

In the context of RAG, LlamaIndex and LangChain can handle the ingestion and indexing of large datasets, ensuring that data can be efficiently retrieved when needed. Both libraries support the integration of LLMs with the data processing pipeline, ensuring that the models have access to the relevant information to generate accurate and contextually appropriate responses. The data ingestion libraries of these orchestrators may vary, depending on the nature of the data.

LlamaIndex

LlamaIndex (formerly known as GPT-Index) is an orchestration library designed to help with the integration of data indexing and retrieval processes with LLMs. It helps in managing and streamlining the workflow of extracting relevant information from large datasets to provide coherent and accurate responses.

LlamaIndex handles the ingestion of data from various sources such as documents, databases, and APIs. It creates and manages indexes that facilitate efficient retrieval of relevant information when a query is made.

When a query is received, LlamaIndex searches the indexed data to find the most relevant information. This involves converting the query into a suitable format and matching it against the indexed data using embeddings and vector similarity search.

LlamaIndex integrates with LLMs to combine the retrieved data with the query context. It allows the LLM to have access to relevant information to generate accurate and contextually appropriate responses.
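
The following is a minimal sketch of that workflow; the imports follow the llama-index-core package layout, which varies across versions, an LLM/embedding provider (OpenAI by default) is assumed to be configured, and the data directory and query are placeholders:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # ingest local files
index = VectorStoreIndex.from_documents(documents)      # embed and index the documents

query_engine = index.as_query_engine()
response = query_engine.query("Summarize the key security recommendations.")
print(response)                                         # answer grounded in the indexed data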

LangChain

LangChain is another orchestration library designed to streamline the integration and management of LLMs with many data sources and processing pipelines. It helps to define and manage complex workflows that involve multiple steps, such as data retrieval, processing, and generation.

LangChain provides tools for chaining together different operations, ensuring smooth and efficient execution. It supports a wide range of connectors and adapters, making it easy to incorporate different data pipelines and services into the workflow.

LangChain provides capabilities for transforming and processing data as it moves through the workflow. This includes tasks like data cleaning, enrichment, and formatting to ensure that the data is in the optimal state for processing by the LLM. LangChain is designed to be highly customizable and extensible, allowing developers to define custom components and workflows tailored to their specific needs.
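
The following is a minimal sketch of a LangChain retrieval step backed by a FAISS vector store; LangChain's APIs change frequently across versions, and this example assumes the langchain-community and langchain-openai packages plus an OpenAI API key:

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

texts = ["Vector databases store embeddings.",
         "Searchable encryption allows queries over ciphertext."]

vectorstore = FAISS.from_texts(texts, OpenAIEmbeddings())   # embed and index the texts
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})

docs = retriever.invoke("How can encrypted data be queried?")
print(docs[0].page_content)                                 # the most relevant text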

The article LangChain is Everywhere includes many resources for getting started with LangChain, including tutorials and links to related tools and frameworks.

APIs and Plug-Ins

APIs and external plug-ins provide integrations with external services. They enhance the capabilities of the RAG system by integrating with third-party services or extending its functionality.

Apps and Front Ends

The app or front-end interface allows users to input their queries and receive the responses generated by the RAG system. Nowadays, proof-of-concept and production-ready front ends can be built using platforms like Streamlit, Vercel, or Steamship.

LLM Cache

LLM cache instances can store recent LLM outputs for faster retrieval in future queries using tools like Redis and GPTCache. This reduces the computational load by caching results of frequently asked questions or common queries.
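
A minimal caching sketch follows, keyed by a hash of the prompt and assuming a locally running Redis instance; call_llm() is a hypothetical placeholder for the real model call:

import hashlib
import redis

r = redis.Redis(host="localhost", port=6379)

def cached_completion(prompt: str, ttl_seconds: int = 3600) -> str:
    key = "llm:" + hashlib.sha256(prompt.encode()).hexdigest()
    cached = r.get(key)
    if cached is not None:
        return cached.decode()            # cache hit: skip the expensive LLM call
    response = call_llm(prompt)           # hypothetical placeholder for the LLM call
    r.set(key, response, ex=ttl_seconds)  # cache the result with an expiration time
    return response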

Policy and Guardrails

The policy and guardrails module enforces safe and responsible use of the system, for example by filtering harmful prompts and responses and preventing the disclosure of sensitive data.

Logging

These modules manage the operational aspects and logging of the LLM system using tools like MLflow, PromptLayer, Helicone, and Weights & Biases.

LLM Hosting and APIs

These are the APIs and models from providers like OpenAI and Anthropic. Models can be deployed on cloud platforms like Azure, AWS, and GCP, or open-source models from Hugging Face can be used. “Opinionated clouds” are specialized cloud services, such as Runpod and Modal, that run custom models on top of their GPU infrastructure.

Encryption for Vector Databases

As illustrated in Figure 3, the process of searching for information in a vector database involves many steps and systems, each of which could be an opportunity for a bad actor to intercept the data.

Sensitive data needs to be protected, but it also needs to be usable. There are several key cryptographic methods that have been used to safeguard sensitive data, including searchable/queryable encryption, homomorphic encryption, and multiparty computation.

Searchable/Queryable Encryption

Searchable encryption (SE), also called queryable encryption, allows users to search encrypted data without decrypting it. This capability is especially useful when data privacy is a concern, such as in cloud storage, vector databases, or shared environments.

Figure 4 illustrates a high-level example of a secure search using searchable/queryable encryption in a vector database.

Figure 4: High-Level Example of a Search Using Searchable/Queryable Encryption

The following are the steps illustrated in Figure 4:

  1. The user (labeled as “omar”) initiates a search query in the system.
  2. The SE AI System receives the search query and analyzes it to determine if it involves any encrypted fields.
  3. Upon identifying that the query involves encrypted fields, the SE AI System requests the necessary encryption keys from the implementor-provisioned key provider (not shown in the figure but implied as part of the process).
  4. The SE AI System submits the query to the Vector Database Infrastructure with the encrypted fields rendered as ciphertext. This ensures that the sensitive data remains encrypted during the entire process.
  5. The Vector Database Infrastructure receives the encrypted query and, using a secure index, searches for the required information without decrypting the data.
  6. The loop labeled “Search index” in the figure represents the process of searching through the secure index using the encrypted query.
  7. After processing the query, the Vector Database Infrastructure returns the encrypted results to the SE AI System.
  8. The SE AI System decrypts the query results using the retrieved encryption keys.
  9. The decrypted results are sent back to the user and presented as plaintext.

This process ensures that data privacy is maintained at all stages, with the data and queries remaining encrypted throughout the transaction with the vector database infrastructure.
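
To make the idea concrete, the following toy sketch implements equality search over an encrypted field with a “blind index” (a keyed HMAC of the plaintext stored next to the ciphertext). This is a deliberate simplification for illustration, not the scheme used by MongoDB or any production system:

import hashlib
import hmac
from cryptography.fernet import Fernet

enc_key = Fernet.generate_key()                   # key for encrypting the field value
index_key = b"separate-secret-key-for-indexing"   # key for the search token only
fernet = Fernet(enc_key)

def blind_index(value: str) -> str:
    return hmac.new(index_key, value.encode(), hashlib.sha256).hexdigest()

# Store: ciphertext plus a deterministic, keyed search token.
record = {"ssn_ct": fernet.encrypt(b"123-45-6789"),
          "ssn_idx": blind_index("123-45-6789")}

# Query: the server matches tokens without ever seeing the plaintext.
if record["ssn_idx"] == blind_index("123-45-6789"):
    print(fernet.decrypt(record["ssn_ct"]))       # decrypted client-side

In a real deployment, the search token would live in the database's secure index, and the keys would come from a provisioned key provider, as described above.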

Resource: A practical example of searchable encryption is the Queryable Encryption feature of MongoDB. This video by Cynthia Braund from MongoDB explains the concept.

Let’s take a look at the process in more detail. Figure 5 explains how SE allows for secure querying of encrypted data through a multi-step process. When an application submits a query, the Vector Database drivers analyze it and recognize if it involves an encrypted field. The system then requests the necessary encryption keys from a customer-provisioned key provider, such as AWS KMS, Google Cloud KMS, Azure Key Vault, or any key provider that complies with the Key Management Interoperability Protocol (KMIP). The query is submitted to the MongoDB server with the encrypted fields rendered as ciphertext.

Figure 5: Searchable/Queryable Encryption Process

The server processes these queries using a fast, searchable scheme that allows for querying fully encrypted data without knowing the data content. Both the data and the query remain encrypted at all times on the server. The Vector Database then returns the encrypted results to the system, where the results are decrypted with the keys held by the system and returned to the client in plaintext.

In RAG, SE can protect the documents and data used to augment responses, ensuring that sensitive information remains secure during the retrieval process. SE allows secure search operations over vector embeddings, which is critical for maintaining privacy in AI models that use these databases for similarity search, recommendation systems, and many other AI applications.

Homomorphic Encryption

Homomorphic encryption (HE) is a powerful cryptographic technique that allows computations to be performed on encrypted data without needing to decrypt it first. This means that data can remain encrypted throughout the entire processing lifecycle, significantly enhancing privacy and security.

Homomorphic encryption allows mathematical operations to be performed directly on ciphertext. The result of such operations, when decrypted, matches the result of operations that are performed on the plaintext. This is a promising solution for applications that involve sensitive data that is processed by third-party services, such as cloud services.

Here's how HE works (a minimal example follows the list):

  • Data is encrypted using a homomorphic encryption scheme.
  • Mathematical operations (addition, multiplication, etc.) are performed directly on the encrypted data.
  • The result of the computation is decrypted to reveal the final output.
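
The following minimal sketch demonstrates these steps with an additively homomorphic scheme, assuming the open-source python-paillier library (installed as phe):

from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

enc_a = public_key.encrypt(5)            # step 1: encrypt the inputs
enc_b = public_key.encrypt(12)

enc_sum = enc_a + enc_b                  # step 2: addition performed on ciphertexts

print(private_key.decrypt(enc_sum))      # step 3: decrypting yields 17, the plaintext result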

HE can be used to train machine learning models on encrypted datasets, ensuring that the raw data remains confidential. AI models can make predictions on encrypted data, protecting sensitive input data during the inference process. HE enables secure analytics on encrypted data, useful for sectors like healthcare and finance where data privacy is crucial.

HE is computationally intensive and can significantly slow down operations compared to unencrypted computations. Developing and deploying HE systems requires expertise in cryptography and significant computational resources.

Resource: A technical report by Brian Kishiyama and Izzat Alsmadi reviews the functionality of searchable encryption implementations and provides an evaluation of homomorphic encryption.

HE and secure multiparty computation (SMPC) are two advanced cryptographic techniques that can be used together to enhance security and privacy in data processing.

Figure 6: A Search Using Homomorphic Encryption

 

This figure illustrates the following process:

  1. The user (Omar) encrypts the message: Enc(m).
  2. Omar sends the encrypted message Enc(m) to the server.
  3. The server stores the encrypted message.
  4. Omar queries the encrypted message Enc(m) from the server.
  5. The server performs a homomorphic evaluation f() on the encrypted message Enc(m), resulting in Enc(f(m)).
  6. The server returns the result of the homomorphic operation Enc(f(m)) to Omar.
  7. Omar decrypts Enc(f(m)) to obtain f(m).
  8. Omar computes the decryption Dec(Enc(f(m))) = f(m).
  9. Omar recovers the result f(m).

In short, the server performs computations on encrypted data without decrypting it, and the user can decrypt the result to obtain the final output.

Resource: A Survey on Homomorphic Encryption Schemes: Theory and Implementation provides a great overview of the HE process.

There are three main types of homomorphic encryption.

Partially homomorphic encryption (PHE)

PHE supports only a single type of operation (either addition or multiplication) on encrypted data. PHE schemes are efficient but limited in functionality. Examples include the RSA cryptosystem, which supports multiplication, and the Paillier cryptosystem, which supports addition. Despite their efficiency, the application scope of PHE schemes is limited because they can support only one type of operation.

Somewhat homomorphic encryption (SHE)

SHE allows both addition and multiplication operations but only up to a certain complexity or number of operations. SHE schemes introduce noise during encryption and decryption processes. As operations increase, noise accumulates, making the ciphertext undecryptable once the noise exceeds a certain threshold. SHE schemes are more flexible than PHE but are still limited by the number of operations they can support before noise becomes problematic.

Fully homomorphic encryption (FHE)

FHE is the most advanced type of homomorphic encryption, supporting unlimited addition and multiplication operations on ciphertexts. FHE schemes can perform any computable function on encrypted data, which makes them suitable for a wide range of applications.

FHE was first proposed by Craig Gentry in 2009, introducing a breakthrough technique called bootstrapping. Bootstrapping allows the ciphertext to be refreshed by reducing noise, enabling unlimited operations. However, FHE schemes are computationally intensive and slow, making practical implementations challenging.

HE schemes, especially FHE, are computationally intensive and significantly slower than operations on plaintext. Several open-source HE libraries and frameworks are available, including Microsoft SEAL, OpenFHE, HElib, and TFHE.

Research in homomorphic encryption is ongoing, with a focus on improving efficiency and practical applicability.

Secure Multiparty Computation

Secure multiparty computation (SMPC) allows multiple parties to jointly compute a function over their inputs while keeping those inputs private. Each party learns only the result of the computation, not the individual inputs of the other parties.
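
The following toy sketch shows additive secret sharing, a building block of many SMPC protocols; the values, modulus, and party count are illustrative:

import secrets

PRIME = 2**61 - 1   # arithmetic is performed modulo a large prime

def share(secret: int, n_parties: int = 3) -> list:
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)   # the shares sum to the secret
    return shares

alice, bob = share(100), share(250)
# Each party locally adds the shares it holds; no party ever sees 100 or 250.
sum_shares = [(a + b) % PRIME for a, b in zip(alice, bob)]
print(sum(sum_shares) % PRIME)   # 350: only the joint result is revealed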

Figure 7 explains a typical multiparty computation workflow in an AI context, showing how tasks are distributed, computed in parallel, and then aggregated.

Figure 7: Multiparty Computation in AI

SMPC can be used in federated learning scenarios, where data from different sources is combined to improve model performance without compromising privacy, ensuring that individual data components remain confidential.

Resource: Federated Learning in AI: How It Works, Benefits and Challenges by Muhammad Raza from Splunk explains the concept of federated learning.

SMPC protocols often require significant communication between parties, which can impact performance. Implementing efficient and scalable SMPC solutions can be challenging and resource intensive. Using HE can reduce the communication overhead typically associated with SMPC. For example, homomorphically encrypted data can be processed by a single party (or a subset of parties) without constant interaction with other parties, streamlining the computation process.

AI implementations are evolving at a very fast pace. The importance of securing these systems cannot be overstated. The techniques discussed here (searchable encryption, homomorphic encryption, and secure multiparty computation) are promising solutions for protecting sensitive data in AI implementations. Implementing these methods requires careful planning and expertise. However, the benefits in terms of data security and privacy are well worth the effort. As faster hardware is developed, these techniques will become easier to implement.

Common Security Threats to Vector Databases

The increasing reliance on vector databases in AI implementations creates security challenges that must be addressed to protect sensitive information and ensure the integrity of the data. This section looks at the most common security threats against vector databases, including unauthorized access, insider threats, lack of encryption, and malicious vector injections.

Unauthorized Access to Sensitive Vector Data

Unauthorized individuals gaining access to sensitive vector data can lead to data breaches, misuse, and exploitation of confidential information. Some implementations lack robust authentication and access control mechanisms, such as multi-factor authentication (MFA) and role-based access control (RBAC). These are fundamental security techniques and best practices that, while well-known, are often overlooked during the experimentation phase. As a result, when the system transitions to production, these overlooked practices can introduce significant security threats.

Figure 8: Access Control of Vector Embeddings

Just like in traditional deployments, RBAC should be used to protect sensitive vector data in AI implementations. It allows organizations to define roles and assign permissions to ensure that only authorized users can access or manipulate data.

Identify the roles within your organization that will interact with the vector database. Common roles might include the following (a minimal authorization sketch follows the list):

  • Trusted Administrator: Full access to all data and configuration settings.
  • Data Scientist: Access to data for analysis and model training, with restrictions on modifying sensitive data or configurations.
  • Analyst: Read-only access to specific datasets for reporting and analysis.
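
The following minimal sketch maps those roles to permissions and checks each request against them; the role names mirror the list above, and the permission model is hypothetical:

ROLE_PERMISSIONS = {
    "trusted_administrator": {"read", "write", "configure"},
    "data_scientist": {"read", "write"},   # cannot modify configurations
    "analyst": {"read"},                   # read-only access
}

def authorize(role: str, action: str) -> bool:
    return action in ROLE_PERMISSIONS.get(role, set())

assert authorize("analyst", "read")
assert not authorize("analyst", "write")             # denied: read-only role
assert not authorize("data_scientist", "configure")  # denied: restricted action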

Insider Threats and Malicious Actions by Employees or Collaborators

Insiders with legitimate access to the vector database can misuse their privileges to exfiltrate data, introduce vulnerabilities, or cause intentional harm. Always enforce strict user activity monitoring, audit logging, and least-privilege access policies to detect and prevent malicious actions by insiders.

Lack of Encryption Support

As illustrated in the Encryption for Vector Databases section, this is an important security consideration. Many vector databases do not support encryption, which leaves data vulnerable to interception. Always use databases that support encryption for data at rest and in transit. Where encryption is not natively supported, implement application-level encryption.

Injection of Malicious Vectors

Attackers can inject malicious vectors into the database to compromise the integrity of the data and the accuracy of query results. To mitigate this threat, validate and sanitize all input data, implement anomaly detection to identify unusual patterns, and use robust data validation techniques.
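
As one simple illustration, the following sketch flags candidate vectors whose norm or distance from the collection centroid is a statistical outlier before they are inserted; the thresholds are arbitrary and would need tuning for real data:

import numpy as np

def is_suspicious(new_vec, existing, z_threshold=4.0):
    norms = np.linalg.norm(existing, axis=1)
    z = abs(np.linalg.norm(new_vec) - norms.mean()) / (norms.std() + 1e-9)
    centroid = existing.mean(axis=0)
    dists = np.linalg.norm(existing - centroid, axis=1)
    d = np.linalg.norm(new_vec - centroid)
    return z > z_threshold or d > dists.mean() + z_threshold * dists.std()

existing = np.random.default_rng(0).normal(size=(1000, 128))   # vectors already stored
print(is_suspicious(np.ones(128) * 50, existing))              # True: a far-out vector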

Weaknesses in the indexing process can be exploited to manipulate search results, degrade performance, or cause denial of service (DoS) conditions. Secure the indexing process with proper validation, use secure indexing algorithms, and regularly audit the indexing mechanism for vulnerabilities.

Inadequate Data Handling

Inadequate data handling practices can lead to the accidental exposure of sensitive data through API responses, logs, or backups. Implement data masking, secure logging practices, and enforce strict data handling policies to prevent accidental leakage.

Resource Exhaustion

Attackers can overwhelm the vector database with excessive queries, causing it to become slow or unavailable. Implement rate limiting, query throttling, and scalable infrastructure to handle peak loads and mitigate DoS attacks.
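
A minimal token-bucket rate limiter, sketched below, illustrates one way to throttle queries before they reach the vector database; the capacity and refill rate are illustrative:

import time

class TokenBucket:
    def __init__(self, capacity=10, refill_per_sec=5.0):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False          # over the limit: reject or queue the query

bucket = TokenBucket()
print([bucket.allow() for _ in range(12)])   # the final requests are rejected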

Vulnerabilities in Third-Party Software

Integrations with third-party tools and open-source software (such as LangChain, LangGraph, LlamaIndex, pgvector, ChromaDB, Faiss, etc.) can introduce risks if the software is not properly patched when a new vulnerability is found and disclosed. Managing vulnerabilities in vector databases is very important for minimizing cybersecurity risks. Regularly check for updates and security patches released by the database vendor. Subscribe to security advisories and mailing lists.

Conduct thorough security assessments of third-party integrations and monitor third-party components for vulnerabilities. Maintain a good inventory using software bills of materials (SBOMs) and AI BOMs.

Additional Resources

 

Acknowledgments

Authors:

  • Omar Santos, Distinguished Engineer, Cisco Security & Trust, AI Security Research
  • Akram Sheriff, Senior AI Lead Scientist, Outshift
  • Mustafa Kadioglu, Data Scientist, Cisco IT

Editor: Diane Morris, Cisco Security & Trust Content Manager

 

Revision History

Version | Date | Authors | Comments
1.1 | 11-Oct-2024 | Santos, Morris | Updated Policy and Guardrails.
1.0 | 31-Jul-2024 | Santos, Sheriff, Kadioglu, Morris | Initial public release.