Mar 1, 2025

Integrating Semantic Correctness with Generative AI: Introducing Contracts in SymbolicAI

Explores how SymbolicAI uses contracts to ensure semantic correctness in generative AI, reducing hallucinations and building trust.

Leoveanu-Condrei Claudiu

Generative AI has opened up wide-ranging possibilities across multiple domains—healthcare, finance, legal, education, and more. Thanks to large language models (LLMs), organizations can build advanced chatbots, text summarizers, code generators, and countless other sophisticated applications. However, these powerful models frequently exhibit behaviors that can harm trust and reliability. Hallucinations, context drift, incoherent outputs, and factual inaccuracies all loom as major challenges. At ExtensityAI, we address these concerns by introducing contracts, a technique that enforces semantic correctness from the outset rather than attempting to correct issues after the fact. In this article, we explore why contracts matter, how they are integrated into SymbolicAI, and how they can reshape the broader AI landscape.

The Challenge: Ensuring Semantic Correctness

Structural vs. Semantic Validity

Software developers typically rely on data typing and schema validation tools (like TypeScript interfaces or Pydantic models) to ensure structural correctness. For instance, if a function expects a string, returning a string satisfies the basic requirement. But generative AI is capable of producing text that meets structural expectations while failing to deliver coherent, relevant, or accurate answers.
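
To make the distinction concrete, here is a minimal, purely illustrative Pydantic sketch: both values below pass structural validation, yet only one is a semantically correct answer to the question "What is the capital of France?"

from pydantic import BaseModel

class Answer(BaseModel):
    city: str

# Both of these pass structural validation, because `city` is a string...
ok = Answer(city="Paris")       # semantically correct
bad = Answer(city="Bordeaux")   # structurally valid, semantically wrong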

With LLMs, the problem becomes more intricate. Even if an LLM respects a JSON schema or completes text in a syntactically correct manner, the output may still stray from the intended meaning or contain factual errors. This phenomenon, sometimes referred to as “hallucination,” impairs user trust and can negatively affect the adoption of AI-driven products.

Probabilistic Nature of LLMs

LLMs are probabilistic systems at their core. They model distributions over sequences of tokens, predicting the next token given the preceding context. This probabilistic design means that results can vary even with identical prompts, especially when the sampling temperature is raised to encourage creative exploration.

Mixing creative freedom with domain-specific constraints can inadvertently produce outputs that appear superficially valid but fail to uphold crucial real-world semantics. Traditional validation checks only partially alleviate this issue, often operating too late in the pipeline.

Contracts as a Mechanism for Semantic Validation

Origins in Design by Contract (DbC)

Contracts in the context of SymbolicAI take their inspiration from Design by Contract (DbC), a methodology introduced by Bertrand Meyer for the Eiffel programming language. In DbC, software modules must fulfill certain conditions:

  • Preconditions: Requirements before a function or method is called.

  • Postconditions: Guarantees about the system state or return data after the function or method finishes.

  • Invariants: Conditions that remain true throughout a program’s lifecycle.

When extended to LLMs, this logic must account for the probabilistic dimension. We can write the probabilistic Hoare triple {Pre} C {Post} [p], where [p] is the probability that the model satisfies the contract. Our goal is not absolute assurance (no AI system can guarantee that) but rather to increase the likelihood that the output is correct and to detect errors sooner.

Why Probabilities Matter

For deterministic software, contracts can function as strict rules. For generative AI, which inherently deals with uncertain outputs, we operate with thresholds or confidence measures. One possible approach is to re-try generation if a certain threshold is not met. Another is to accept partial compliance but log the contract violation for future analysis. The flexibility inherent in generative AI demands that contract designs adapt accordingly.
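
As a rough illustration of the threshold idea (not SymbolicAI's actual implementation; generate and confidence are hypothetical stand-ins for an LLM call and a semantic check):

from typing import Callable

def generate_with_contract(
    generate: Callable[[], str],            # hypothetical LLM call
    confidence: Callable[[str], float],     # hypothetical semantic check in [0, 1]
    threshold: float = 0.8,
    max_retries: int = 3,
) -> str | None:
    for attempt in range(max_retries):
        candidate = generate()
        score = confidence(candidate)
        if score >= threshold:
            return candidate                # contract satisfied with enough confidence
        # Partial compliance: log the violation for later analysis.
        print(f"attempt {attempt}: confidence {score:.2f} below threshold {threshold}")
    return None                             # caller decides on a fallback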

The SymbolicAI Contract Framework

Overview

At a high level, SymbolicAI’s contract framework extends the idea of data validation to the semantic layer. Rather than simply checking that a field is a string or that a list has the right length, we aim to confirm the meaning behind each field. The framework leverages Python decorators (e.g., @contract) and specialized data classes (LLMDataModel, which inherits from Pydantic) to:

  1. Enforce Structural Validity: Checking input and output shapes.

  2. Enforce Semantic Consistency: Ensuring the content produced is meaningful and aligns with domain rules.

  3. Offer Fallback Mechanisms: Providing safe defaults when checks fail.

Below is a basic illustration:

# Import paths may vary across SymbolicAI versions.
from symai.models import LLMDataModel
from symai.strategy import contract

class TripletInput(LLMDataModel):
    text: str
    ontology: OntologySchema  # domain ontology model, defined elsewhere

class TripletOutput(LLMDataModel):
    # None signals that extraction failed the contract.
    triplets: list[Triplet] | None  # Triplet model defined separately

@contract()
class TripletExtractor:
    def forward(self, input: TripletInput) -> TripletOutput:
        if not self.contract_successful:
            # Contract failure can lead to fallback
            return TripletOutput(triplets=None)
        # Normal operation if contract checks succeed
        return self.contract_result
Core Components

  1. Contract Decorator: Binds preconditions, postconditions, and fallback mechanisms to the forward function (a sketch of the optional pre and post hooks follows this list).

  2. LLMDataModel: A specialized data model that integrates domain-specific descriptions or instructions. For example, if ontology states that only certain types of relationships are valid, the model could embed that constraint.

  3. Retry/Remedy Mechanisms: If the LLM fails to adhere to constraints in the first pass, the system can prompt the model again, providing additional clarifications or instructions derived from the error messages.
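
To illustrate how the optional pre and post hooks might attach to the earlier extractor, here is a hedged sketch; the ontology attribute, its relations field, and the exact hook signatures are assumptions for illustration and may differ across SymbolicAI versions:

@contract()
class TripletExtractor:
    def __init__(self, ontology: OntologySchema):
        self.ontology = ontology

    # Precondition: refuse to invoke the LLM on empty input.
    def pre(self, input: TripletInput) -> bool:
        return bool(input.text.strip())

    # Postcondition: every extracted relation must be declared in the ontology.
    def post(self, output: TripletOutput) -> bool:
        if output.triplets is None:
            return False
        return all(t.relation in self.ontology.relations for t in output.triplets)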

Integration with Pydantic

Pydantic is a popular validation library that many Python developers rely on for robust data parsing. By extending it, we allow developers to harness Pydantic’s familiar syntax (Field, validations, default values) while introducing domain-level constraints specific to generative AI. Fields can carry descriptions that serve both as documentation and as prompts for the LLM.
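
A minimal sketch of such dual-purpose descriptions, reusing the hypothetical Triplet model from the extraction example:

from pydantic import Field

class Triplet(LLMDataModel):
    subject: str = Field(description="The entity the statement is about, e.g. 'User'.")
    relation: str = Field(description="Must be one of the relations declared in the ontology.")
    object: str = Field(description="The entity the relation points to, e.g. 'Service'.")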

Why This Matters

Embedding Domain Knowledge

Semantic correctness often demands a deeper layer of knowledge than standard validations provide. For instance, in a medical text summarization application, domain experts might specify that certain treatments should not be recommended together, or that a diagnosis must align with recognized guidelines. Contracts allow you to codify these domain rules.
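
For instance, a postcondition along these lines could encode a co-prescription rule; the output model, field names, and drug pair are hypothetical:

# Hypothetical output model and contraindicated pair, purely for illustration.
class SummaryOutput(LLMDataModel):
    diagnosis: str
    recommended_treatments: list[str]

CONTRAINDICATED = {("warfarin", "aspirin")}

# A postcondition method on a @contract-decorated summarizer might read:
def post(self, output: SummaryOutput) -> bool:
    recommended = {t.lower() for t in output.recommended_treatments}
    return not any(a in recommended and b in recommended
                   for a, b in CONTRAINDICATED)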

Preventing Erroneous or Harmful Outputs

Many organizations worry about harmful outputs, whether they are insensitive content, factually incorrect statements, or brand-damaging text. Contracts give them an automated way to guide model behavior, effectively adding policy constraints to generative workflows.

Lowering Maintenance Overheads

Systems that incorporate LLMs face frequent updates—model improvements, new domain requirements, additional data sources, and so forth. Contracts explicitly outline the assumptions around input and output, making the system easier to maintain. Changes can be localized to the contract layer instead of requiring a re-check of every function call.

Contracts as Interfaces

Formal Definition

Think of a contract as a handshake between a function’s input and its output, specifying the conditions each must satisfy for the execution to be considered valid. In SymbolicAI, the precondition ensures an LLMDataModel instance meets certain criteria. The function itself transforms that input into an output, which is then tested by a postcondition. If the output is invalid, the system can:

  • Retry generation, possibly with more explicit instructions.

  • Return a default or safe result.

  • Raise an error that prompts developer intervention (sketched below).
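
A schematic sketch of these three recovery policies, with hypothetical helper callables:

class ContractViolation(Exception):
    """Raised when no automatic recovery is configured."""

def recover(policy: str, regenerate, fallback):
    # `regenerate` and `fallback` are hypothetical callables supplied by the caller.
    if policy == "retry":
        # Re-prompt with clarifications derived from the validation error.
        return regenerate(hint="The previous output violated the postcondition.")
    if policy == "default":
        return fallback()  # safe, minimal result
    raise ContractViolation("output failed the postcondition; developer review needed")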

Comparison with Classical DbC

Classic DbC halts execution if a contract fails. For many generative AI scenarios, halting can be too extreme, as partial information may still be valuable. SymbolicAI modifies the approach by allowing execution to continue, but returning minimal or fallback data when a contract fails. This ensures that the pipeline remains resilient, and errors are flagged without crippling the entire workflow.

Practical Application: Knowledge Extraction from Text

Case Study: Legal Documents

A prime application for contracts is extracting structured insights from legal texts, such as Terms of Service (ToS). A typical pipeline might:

  1. Segment the ToS: Break down a large document into smaller passages.

  2. Define an Ontology: Specify valid entities (e.g., “User,” “Service,” “Liability”) and valid relationships (e.g., “grants,” “restricts”).

  3. Create Input Models: Use LLMDataModel to detail the expected text segment and the ontology to guide extraction.

  4. Apply the @contract Decorator: The TripletExtractor can check whether the model recognized all mandatory relationships and whether any discovered relationships violate known constraints.

  5. Fallback Mechanics: If the LLM attempts to invent or hallucinate relationships, the contract can catch them and either prompt a re-try or fill in default relationships.

See the full code as a gist at the following link.
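
For orientation, here is a condensed, hypothetical sketch of steps 2 through 4; names and the exact invocation are illustrative and may differ from the gist:

class OntologySchema(LLMDataModel):
    entities: list[str]
    relations: list[str]

tos_ontology = OntologySchema(
    entities=["User", "Service", "Liability"],
    relations=["grants", "restricts"],
)

segment = "The Service grants the User a revocable, non-exclusive license."
extractor = TripletExtractor(ontology=tos_ontology)
result = extractor.forward(TripletInput(text=segment, ontology=tos_ontology))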

Swapping Models

One key benefit is that contracts allow for easy swapping of different LLMs in the same pipeline. The contract layer remains the same, so if a new model outperforms the old one, you simply route the same inputs to that model and measure how well it meets the contract. If both pass the same contract consistently, they are effectively interchangeable, at least for that specific task.
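
A back-of-the-envelope way to compare two engines under the same contract, assuming the contract_successful flag from the earlier example is readable after each call:

def contract_pass_rate(extractor, inputs) -> float:
    """Fraction of inputs for which the contract held."""
    passed = 0
    for inp in inputs:
        extractor.forward(inp)                    # run the contracted component
        passed += extractor.contract_successful   # flag shown in the earlier example
    return passed / len(inputs)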

Key Benefits

  1. Proactive Checking: Contracts provide immediate feedback, letting you catch errors in the moment instead of diagnosing them much later.

  2. Transparent Standards: The system’s expectations are codified, making it simpler for teams to align on what “correct” means.

  3. Enhanced Versatility: By capturing semantic rules, you can manage tasks with intricate domain logic and unusual constraints.

  4. Structured Collaboration: Non-technical domain experts can define or update the contract rules in plain language, bridging the gap between business logic and AI logic.

Future Directions

Formalizing Probability Thresholds

We see potential in systematically managing probability distributions. If an LLM returns a distribution over multiple possible answers, contract logic could weigh these answers by their likelihood. This approach might involve advanced ranking or a “semantic confidence” score.
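
One hypothetical shape such a ranking could take, combining model likelihood with a contract-derived semantic confidence:

def best_candidate(candidates: list[tuple[str, float]], semantic_score) -> str:
    # candidates: (answer, model_likelihood) pairs from the LLM.
    # semantic_score: hypothetical callable mapping an answer to [0, 1].
    return max(candidates, key=lambda c: c[1] * semantic_score(c[0]))[0]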

Security and Trust

Robust contract definitions may mitigate some classes of security vulnerabilities, like prompt injection or adversarial triggers, by establishing baseline trust conditions. Future research could evolve these contracts into a security mechanism that thwarts malicious input.

Contract Serialization and Sharing

A standard format for contracts, akin to GGUF for model weights, could enable easy sharing across different AI projects and communities, potentially fostering an ecosystem of ready-made contract definitions for common domains.

Co-Evolving Contracts with Models

As LLMs evolve, the best practices around contract design might shift as well. More advanced LLMs could handle more nuanced contracts, automatically parsing deeper semantic logic or domain reasoning. We imagine a future in which model improvements go hand-in-hand with more sophisticated contract definitions.

How to Get Started

  1. Define Concrete Data Models: Identify the key data structures in your domain. If you are dealing with triplet extractions, define your fields (subject, relation, object) and any relevant constraints.

  2. Add Descriptions and Constraints: For each field, include natural language instructions or domain-specific rules. This can be done with Pydantic’s Field descriptions, which integrate directly into the LLM prompt.

  3. Implement @contract: Decorate your classes or functions with @contract. Provide optional pre and post methods if you want finer-grained control over how input and output are validated semantically.

  4. Handle Fallbacks: Decide how to respond when a contract is not met. Will you trigger a re-generation, return a simple default, or log an error for later inspection?

  5. Iterate and Measure: Gradually refine your contracts by analyzing the frequency of contract violations. If you see recurring patterns of failure, you can adjust contract logic, update the LLM’s prompt, or even refine your domain’s ontology.

  6. Expand to Multi-Agent Scenarios: If you have a pipeline of multiple LLM-based components, define interlocking contracts: the output of one component must be a valid input for the next (sketched below).
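
A hedged sketch of such interlocking models, reusing the hypothetical extraction types from earlier; the exact wiring is an assumption:

# The extractor's guaranteed output shape is reused as the
# summarizer's required input shape, so the postcondition of one
# stage effectively becomes the precondition of the next.
class SummarizerInput(LLMDataModel):
    triplets: list[Triplet]        # exactly what TripletExtractor's contract guarantees

class SummarizerOutput(LLMDataModel):
    summary: str | None

@contract()
class Summarizer:
    def forward(self, input: SummarizerInput) -> SummarizerOutput:
        if not self.contract_successful:
            return SummarizerOutput(summary=None)
        return self.contract_result

def pipeline(text: str, extractor, summarizer) -> SummarizerOutput:
    extracted = extractor.forward(TripletInput(text=text, ontology=extractor.ontology))
    # Valid output of stage one is, by construction, valid input to stage two.
    return summarizer.forward(SummarizerInput(triplets=extracted.triplets or []))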

Conclusion

Generative AI is powerful but unpredictable, leaving organizations in a difficult spot when ensuring reliability and correctness. By integrating an approach based on contracts, SymbolicAI attempts to bring more rigor and clarity to the generative process. This design-time enforcement of semantic rules not only detects errors earlier but also helps guide LLMs toward more consistent, meaningful outputs.

While contracts do not guarantee perfection, they attempt to reduce the likelihood of misleading or incoherent responses. They also set the stage for a more accountable form of AI development, where each generative step is transparently governed by explicit, domain-relevant constraints.

Leoveanu-Condrei Claudiu is the creator of contracts in SymbolicAI and a co-founder of ExtensityAI. You can find him on LinkedIn or at his personal website.
