📖 OpenHosta.guarded Documentation
OpenHosta.guarded is the type validation and conversion module with configurable tolerance. It transforms arbitrary inputs into native Python types with confidence and traceability metadata.
I. Introduction: Guarded Types
Multiple Inheritance Order and MRO
⚠️ Important rule: When a Guarded type uses multiple inheritance, GuardedPrimitive must be placed first in the explicit base class list.
This rule is checked at class definition time: a subclass that lists its bases in the wrong order raises a TypeError.
from OpenHosta.guarded.primitives import GuardedPrimitive, ProxyWrapper
class MyGuardedProxy(GuardedPrimitive, ProxyWrapper):
    ...
And not:
from OpenHosta.guarded.primitives import GuardedPrimitive, ProxyWrapper
# Raises TypeError at class definition: GuardedPrimitive is not first
class BadProxyOrder(ProxyWrapper, GuardedPrimitive):
    ...
Why:
- GuardedPrimitive.__new__() drives the entire parsing pipeline (attempt, conversion, metadata)
- Other bases may define their own construction logic or interfere with super()/the MRO
- ProxyWrapper is a delegation mixin, not a construction base
- If the order is reversed, the MRO can short-circuit the Guarded pipeline
This rule applies to all multiple-inheritance subclasses, notably proxy types (GuardedBool, GuardedNone, GuardedRange, GuardedEnum, etc.).
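The effect of base-class order can be reproduced in plain Python. The sketch below is only an illustration: `Pipeline` and `Delegate` are hypothetical stand-ins for a construction base and a delegation mixin, and the `__init_subclass__` check is one possible way to enforce the ordering, not necessarily how OpenHosta does it.

```python
class Pipeline:
    """Hypothetical stand-in for a construction base such as GuardedPrimitive."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Reject subclasses that list another base before Pipeline.
        if Pipeline in cls.__bases__ and cls.__bases__[0] is not Pipeline:
            raise TypeError(f"{cls.__name__}: Pipeline must be the first base class")

    def __new__(cls, value):
        instance = super().__new__(cls)
        # Stands in for the parsing pipeline (attempt, conversion, metadata).
        instance.parsed = str(value).strip()
        return instance


class Delegate:
    """Hypothetical delegation mixin: adds behavior, constructs nothing."""

    def unwrap(self):
        return self.parsed


class GoodOrder(Pipeline, Delegate):
    pass


try:
    class BadOrder(Delegate, Pipeline):
        pass
except TypeError as exc:
    error = str(exc)
```

With `Pipeline` first, its `__new__` is the first one found in the MRO, so every instance goes through the parsing step before the mixin's methods are used.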
Guarded types are enriched Python types that:
1. Accept imperfect inputs and clean them automatically
2. Retain metadata about the conversion quality
3. Behave like native types for seamless integration
from OpenHosta.guarded import GuardedInt
# Accepts multiple formats
age = GuardedInt("42") # Numeric string
price = GuardedInt("1,000") # With thousands separator
count = GuardedInt(42.0) # Round float
# Behaves like an int
total = age + 10 # 52
# But retains metadata
print(age.uncertainty) # 0.0 (perfectly certain)
print(age.abstraction_level) # 'native'
print(price.uncertainty) # 0.15 (heuristic parsing)
II. Architecture: The Validation Pipeline
Each Guarded type uses a cascading pipeline to convert inputs:
2.1 The 4 Parsing Levels
Input → Native → Heuristic → Semantic → Knowledge → Output
          ↓          ↓           ↓          ↓
        0.00     0.0-0.15    0.0-0.30      0.0+
      (STRICT)   (PRECISE)  (FLEXIBLE)  (CREATIVE)
- Native (Cost: O(1), Uncertainty: 0.0)
  - Direct type check (isinstance)
  - If the value is already the correct type, immediate validation
- Heuristic (Cost: O(n), Uncertainty: 0.05-0.15)
  - Deterministic cleanup (regex, strip, cast)
  - Removal of spaces, separators, currencies
  - Standard format conversion
- Semantic (Cost: ~1s, Uncertainty: 0.15-0.30)
  - LLM-based conversion (not currently implemented)
  - Context and intent comprehension
- Knowledge (Cost: Variable, Uncertainty: 0.30+)
  - Knowledge base lookup
  - Synonym and variant mapping
2.2 Tolerance and Control
Tolerance defines how far the pipeline can go.
The constructor always uses the class default tolerance (_tolerance).
To force a tolerance, use attempt():
from OpenHosta.guarded import GuardedInt, Tolerance
# Strict tolerance: only the Native level is accepted
strict_result = GuardedInt.attempt("42", tolerance=Tolerance.STRICT)
# Flexible tolerance: Native + Heuristic accepted
flexible_result = GuardedInt.attempt("1,000", tolerance=Tolerance.FLEXIBLE)
# Default tolerance: class _tolerance
value = GuardedInt("1,000")
To permanently change the tolerance, create a subclass:
from OpenHosta.guarded import GuardedInt, Tolerance
class StrictGuardedInt(GuardedInt):
    _tolerance = Tolerance.STRICT
value = StrictGuardedInt(42)
| Level | Constant | Value | Accepted Levels |
|---|---|---|---|
| Strict | Tolerance.STRICT | 0.00 | Native only |
| Precise | Tolerance.PRECISE | 0.05 | Native |
| Flexible | Tolerance.FLEXIBLE | 0.15 | Native + Heuristic |
| Type Compliant | Tolerance.TYPE_COMPLIANT | 0.999 | All (default) |
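To make the interaction between levels and tolerance concrete, here is a minimal sketch of a cascading pipeline in plain Python. It assumes that each level returns an `(uncertainty, value, error)` triple and that the first level whose uncertainty fits within the tolerance wins; the function names and the two-level setup are hypothetical and only mirror the behavior described above.

```python
import re
from typing import Any, Callable, List, Optional, Tuple

# Each parser returns (uncertainty, converted_value, error_message).
ParseResult = Tuple[float, Any, Optional[str]]


def parse_native(value: Any) -> ParseResult:
    # Accept real ints only (bool is excluded on purpose).
    if isinstance(value, int) and not isinstance(value, bool):
        return 0.0, value, None
    return 1.0, value, "not an int"


def parse_heuristic(value: Any) -> ParseResult:
    # Deterministic cleanup: strip spaces and thousands separators.
    if isinstance(value, str):
        cleaned = value.replace(" ", "").replace(",", "")
        if re.fullmatch(r"-?\d+", cleaned):
            return 0.15, int(cleaned), None
    return 1.0, value, "no heuristic match"


LEVELS: List[Tuple[str, Callable[[Any], ParseResult]]] = [
    ("native", parse_native),
    ("heuristic", parse_heuristic),
]


def attempt(value: Any, tolerance: float) -> Tuple[str, float, Any]:
    """Return (level, uncertainty, value) from the first level within tolerance."""
    for name, parser in LEVELS:
        uncertainty, converted, error = parser(value)
        if error is None and uncertainty <= tolerance:
            return name, uncertainty, converted
    raise ValueError(f"cannot convert {value!r} within tolerance {tolerance}")
```

In this sketch a tolerance of 0.0 only lets the native level succeed, while 0.15 also admits the heuristic result.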
2.3 Multi-parameter Constructors
Child classes of GuardedPrimitive can now accept multiple positional and keyword arguments.
- If the constructor receives a single positional argument and no kwargs, the pipeline receives that value directly.
- Otherwise, the inputs are wrapped in a GuardedCallInput(args, kwargs) object and passed as a single value to the pipeline.
This allows a subclass to implement its own logic in _parse_native(), _parse_heuristic(), etc., while preserving the current architecture centered on a single value input.
from OpenHosta.guarded.primitives import GuardedPrimitive, GuardedCallInput
from OpenHosta.guarded import Tolerance
class FullName(GuardedPrimitive, str):
    _type_en = "a full name"
    _type_py = str
    _tolerance = Tolerance.TYPE_COMPLIANT
    @classmethod
    def _parse_heuristic(cls, value):
        if isinstance(value, GuardedCallInput):
            first = value.kwargs.get("first") or value.args[0]
            last = value.kwargs.get("last") or value.args[1]
            return Tolerance.PRECISE, f"{first.strip()} {last.strip()}", None
        return super()._parse_heuristic(value)
name = FullName("Ada", "Lovelace")
name2 = FullName(first="Ada", last="Lovelace")
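The wrapping rule described in this section can be sketched independently of the library. `CallInput` and `pack_call` below are hypothetical names used for illustration; they only model the "single value passes through, anything else is bundled" behavior and are not the actual GuardedCallInput implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class CallInput:
    """Hypothetical stand-in for GuardedCallInput."""
    args: Tuple[Any, ...] = ()
    kwargs: Dict[str, Any] = field(default_factory=dict)


def pack_call(*args: Any, **kwargs: Any) -> Any:
    # One positional argument and no kwargs: pass the value through unchanged.
    if len(args) == 1 and not kwargs:
        return args[0]
    # Anything else is bundled so the pipeline still sees a single value.
    return CallInput(args, kwargs)
```

Keeping the pipeline single-valued means the `_parse_*` methods need only one extra `isinstance` branch to support richer constructors.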
III. Scalar Types
3.1 GuardedInt
Integer with smart parsing.
from OpenHosta.guarded import GuardedInt
# Accepted formats
GuardedInt(42) # Native int
GuardedInt("42") # Numeric string
GuardedInt("1,000") # Thousands separator
GuardedInt("1 000") # Spaces
GuardedInt("-42") # Negatives
GuardedInt(42.0) # Round float
Heuristic Parsing:
- Space removal
- Comma removal (if no decimal point)
- Regex validation: -?\d+
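The three heuristic steps listed above can be approximated in a few lines. This is a sketch under the assumption that the steps run in the listed order; `heuristic_int` is a hypothetical helper, not the method GuardedInt actually uses.

```python
import re
from typing import Optional

_INT_PATTERN = re.compile(r"-?\d+")


def heuristic_int(text: str) -> Optional[int]:
    """Return an int if the cleaned text is a plain integer, else None."""
    cleaned = text.replace(" ", "")
    # Treat commas as thousands separators only when no decimal point is present.
    if "." not in cleaned:
        cleaned = cleaned.replace(",", "")
    # The whole string must match, not just a prefix.
    if _INT_PATTERN.fullmatch(cleaned):
        return int(cleaned)
    return None
```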
3.2 GuardedFloat
Floating-point number with flexible parsing.
from OpenHosta.guarded import GuardedFloat
# Accepted formats
GuardedFloat(3.14) # Native float
GuardedFloat("3.14") # String
GuardedFloat("3,14") # European format
GuardedFloat("1.000,5") # Thousands + decimals
GuardedFloat(42) # Int to float
Heuristic Parsing:
- Replacement , → . for European format
- Multiple thousands separator handling
- Automatic int → float conversion
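One way to handle mixed separators is to treat whichever of `,` or `.` appears last as the decimal mark. The sketch below assumes that rule; `heuristic_float` is a hypothetical helper and the library may resolve ambiguous inputs differently. Note that under this rule a lone comma is always read as a decimal separator, so "1,000" would become 1.0 rather than 1000.0.

```python
from typing import Optional


def heuristic_float(text: str) -> Optional[float]:
    """Parse a float written in either dot-decimal or comma-decimal style."""
    cleaned = text.strip().replace(" ", "")
    if "," in cleaned and "." in cleaned:
        # Both present: whichever appears last is the decimal separator.
        if cleaned.rfind(",") > cleaned.rfind("."):
            cleaned = cleaned.replace(".", "").replace(",", ".")
        else:
            cleaned = cleaned.replace(",", "")
    elif "," in cleaned:
        # A lone comma is read as a European decimal separator.
        cleaned = cleaned.replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None
```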
3.3 GuardedUtf8
String with encoding handling.
from OpenHosta.guarded import GuardedUtf8
# Accepted formats
GuardedUtf8("hello") # Native string
GuardedUtf8(b"hello") # UTF-8 bytes
Heuristic Parsing:
- Automatic bytes → string decoding
- Encoding error handling
III.b Advanced Scalar Types
3.4 GuardedComplex
Complex number with smart parsing.
from OpenHosta.guarded import GuardedComplex
# Accepted formats
GuardedComplex(1+2j) # Native complex
GuardedComplex("1+2j") # Standard string
GuardedComplex("1 + 2j") # With spaces
3.5 GuardedBytes and GuardedByteArray
Binary types with flexible parsing from strings.
from OpenHosta.guarded import GuardedBytes
# Accepted formats
GuardedBytes(b"hello") # Native bytes
GuardedBytes("hello") # String (encoded as UTF-8)
GuardedBytes([104, 101]) # List of integers
IV. Proxy Types (Non-Subclassable)
Certain Python types cannot be subclassed (bool, NoneType, etc.). For these, we use a ProxyWrapper.
4.1 GuardedBool
Boolean with natural language parsing.
from OpenHosta.guarded import GuardedBool
# Accepted formats
b = GuardedBool(True) # Native bool
b = GuardedBool("yes") # English
b = GuardedBool("oui") # French
b = GuardedBool("1") # Numeric
# Retrieve the native value
b.unwrap() # → True (real Python bool)
# Comparison
if b:  # Works thanks to __bool__
    print("True")
Knowledge base:
- True: ["yes", "y", "true", "1", "oui", "vrai", "ok"]
- False: ["no", "n", "false", "0", "non", "faux"]
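A knowledge-base lookup of this kind amounts to a set membership test. The sketch below reuses the two word lists above; the normalization step (strip and lowercase) and the `lookup_bool` name are assumptions made for illustration.

```python
from typing import Optional

TRUE_WORDS = {"yes", "y", "true", "1", "oui", "vrai", "ok"}
FALSE_WORDS = {"no", "n", "false", "0", "non", "faux"}


def lookup_bool(text: str) -> Optional[bool]:
    """Map a word to True/False, or None when it is not in either list."""
    key = text.strip().lower()  # assumed normalization
    if key in TRUE_WORDS:
        return True
    if key in FALSE_WORDS:
        return False
    return None
```

Returning None for unknown words leaves room for a later pipeline level to try again.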
4.2 GuardedNone
None type with flexible parsing.
from OpenHosta.guarded import GuardedNone
# Accepted formats
n = GuardedNone(None) # Native None
n = GuardedNone("None") # String "None"
n = GuardedNone("null") # JSON null
n = GuardedNone("nothing") # Natural language
n.unwrap() # → None
4.3 Other Proxies (Any, Range, MemoryView)
- GuardedAny: Accepts any type, useful as a "pass-through" with metadata.
- GuardedRange: Proxy for range(). Accepts range(10) or string "0:10".
- GuardedMemoryView: Proxy for memoryview.
4.4 ProxyWrapper Behavior
ProxyWrapper is now a mixin:
- It delegates operations to _python_value
- It provides a recursive unwrap()
- It must not implement the main construction logic
- Construction must be handled by GuardedPrimitive via the MRO
⚠️ Important: Proxy types are NOT instances of the native type.
from OpenHosta.guarded import GuardedBool
b = GuardedBool("yes")
isinstance(b, bool) # ❌ False (it's a proxy)
isinstance(b.unwrap(), bool) # ✅ True
# But they behave like the native type
if b:  # ✅ Works
    pass
b == True # ✅ True (thanks to __eq__)
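The behavior above follows from how delegation works in Python. The following sketch shows a minimal proxy that forwards truthiness and equality to a wrapped value; `BoolProxy` is a hypothetical class written for this illustration and is not the library's ProxyWrapper.

```python
class BoolProxy:
    """Hypothetical proxy around a bool; bool itself cannot be subclassed."""

    def __init__(self, value):
        self._python_value = value

    def unwrap(self):
        # Unwrap recursively in case a proxy wraps another proxy.
        inner = self._python_value
        return inner.unwrap() if isinstance(inner, BoolProxy) else inner

    def __bool__(self):
        return bool(self.unwrap())

    def __eq__(self, other):
        return self.unwrap() == other

    def __hash__(self):
        # Defining __eq__ removes the default hash, so restore a consistent one.
        return hash(self.unwrap())


b = BoolProxy(True)
```

Because the proxy only forwards operations, `isinstance(b, bool)` stays False while `if b:` and `b == True` behave as expected.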
V. Collections
5.1 GuardedList
List with string parsing.
from OpenHosta.guarded import GuardedList
# Accepted formats
lst = GuardedList([1, 2, 3]) # Native list
lst = GuardedList((1, 2, 3)) # Tuple → list
lst = GuardedList({1, 2, 3}) # Set → list
lst = GuardedList("[1, 2, 3]") # JSON string
# Normal operations
lst.append(4)
lst[0] # 1
len(lst) # 4
5.2 GuardedDict
Dictionary with JSON parsing.
from OpenHosta.guarded import GuardedDict
# Accepted formats
d = GuardedDict({"a": 1}) # Native dict
d = GuardedDict('{"a": 1}') # JSON string
d = GuardedDict("{'a': 1}") # Python repr
# Normal operations
d["b"] = 2
d.get("c") # None
5.3 GuardedSet and GuardedTuple
These follow the same logic as GuardedList, adapted to set and tuple semantics.
5.4 Advanced Composite Types
GuardedLiteral
Creates a type restricted to a set of values, similar to typing.Literal.
from OpenHosta.guarded import guarded_literal
Color = guarded_literal("red", "green", "blue")
c = Color("red") # OK
c = Color("RED") # OK (heuristic: case-insensitive)
# doc-test: raises ValueError
c = Color("yellow") # Error (outside allowed values)
GuardedUnion
Creates a type that tries multiple successive conversions, similar to typing.Union.
from OpenHosta.guarded import GuardedUnion, guarded_union
from OpenHosta.guarded import GuardedInt, GuardedUtf8
# Tries Int first, otherwise String
IntOrStr = guarded_union(GuardedInt, GuardedUtf8)
v1 = IntOrStr("42") # → 42 (int)
v2 = IntOrStr("hello") # → "hello" (str)
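The "try each type in order" idea can be sketched with ordinary callables. `first_successful` is a hypothetical helper; it assumes a failed conversion raises TypeError or ValueError, which holds for built-ins such as `int` but may not match how guarded_union reports failures.

```python
from typing import Any, Callable, Sequence


def first_successful(value: Any, converters: Sequence[Callable[[Any], Any]]) -> Any:
    """Return the result of the first converter that does not raise."""
    errors = []
    for convert in converters:
        try:
            return convert(value)
        except (TypeError, ValueError) as exc:
            errors.append(f"{getattr(convert, '__name__', convert)}: {exc}")
    raise ValueError("no converter accepted the value: " + "; ".join(errors))
```

Order matters: placing `str` first would accept every input and the `int` branch would never run.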
VI. GuardedEnum: Validated Enums
Create enums with case-insensitive and value-based parsing.
from OpenHosta.guarded import GuardedEnum
class Status(GuardedEnum):
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"
# Flexible parsing
s1 = Status("active") # By value
s2 = Status("ACTIVE") # By name (uppercase)
s3 = Status("pending") # By value
s4 = Status("Status.ACTIVE") # EnumName.MEMBER format
s5 = Status(".ACTIVE") # Short format (.MEMBER)
# enum.Enum compatible API
s1.name # "ACTIVE"
s1.value # "active"
repr(s1) # "<Status.ACTIVE: 'active'>"
# Comparison
s1 == s2 # True
s1 == "ACTIVE" # True
# Round-trip with repr()
s = Status("ACTIVE")
s_copy = Status(repr(s)) # ✅ Works!
Accepted formats:
- ✅ By name: Status("ACTIVE")
- ✅ By value: Status("active"), Status("pending")
- ✅ Qualified format: Status("Status.ACTIVE")
- ✅ Short format: Status(".ACTIVE")
- ✅ Full representation: Status("<Status.ACTIVE: 'active'>")
Advantages:
- ✅ Case-insensitive parsing
- ✅ Name or value lookup
- ✅ Compatible with repr() for serialization/deserialization
- ✅ enum.Enum compatible API
- ✅ Confidence metadata
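For comparison, the accepted formats can be handled on top of the standard `enum.Enum` with one regular expression. This is a sketch only: `parse_member` is a hypothetical function, it assumes member names are uppercase, and it does not attach any confidence metadata.

```python
import enum
import re
from typing import Type, TypeVar

E = TypeVar("E", bound=enum.Enum)

# Matches "<Status.ACTIVE: 'active'>", "Status.ACTIVE", ".ACTIVE" and "ACTIVE".
_MEMBER = re.compile(r"^<?(?:\w*\.)?(\w+)(?::.*>)?$")


def parse_member(enum_cls: Type[E], text: str) -> E:
    stripped = text.strip()
    # 1. Lookup by value, ignoring case.
    for member in enum_cls:
        if str(member.value).lower() == stripped.lower():
            return member
    # 2. Lookup by name, after removing any qualifier or repr() decoration.
    match = _MEMBER.match(stripped)
    if match:
        name = match.group(1).upper()  # assumes uppercase member names
        if name in enum_cls.__members__:
            return enum_cls[name]
    raise ValueError(f"{text!r} is not a valid {enum_cls.__name__}")


class Status(enum.Enum):
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"
```

Accepting the `repr()` form is what makes the round trip `parse_member(Status, repr(member))` possible.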
VII. Guarded Dataclasses
Transform your classes into validated dataclasses with a single decorator.
Note: @guarded_dataclass automatically applies @dataclass, so you don't need both!
7.1 Simple Usage (Recommended)
from OpenHosta.guarded import guarded_dataclass
@guarded_dataclass
class Person:
    name: str
    age: int
# Standard creation with kwargs
p1 = Person(name="Alice", age=30)
# Standard creation with positional args
p1b = Person("Alice", 30)
# Creation from dict with automatic conversion
p2 = Person({"name": "Bob", "age": "25"}) # age converted from str → int
# Fields are validated and converted
assert p2.age == 25
assert isinstance(p2.age, int) # Automatically converted
7.2 Args and kwargs in Guarded Dataclasses
Classes decorated with @guarded_dataclass now support:
- Classic kwargs
- Positional args
- Inputs as dict
- Text representations in constructor or dictionary format
from OpenHosta.guarded import guarded_dataclass
@guarded_dataclass
class Point:
    x: int
    y: int
p1 = Point(10, 20)
p2 = Point(x=10, y=20)
p3 = Point({"x": "10", "y": "20"})
p4 = Point("Point(x=10, y=20)")
Internally, if multiple arguments are provided to the constructor, they are wrapped as GuardedCallInput(args, kwargs) then parsed by the pipeline.
7.3 With Dataclass Options
You can pass options directly to @guarded_dataclass:
from OpenHosta.guarded import guarded_dataclass
@guarded_dataclass(frozen=True, order=True)
class Point:
    x: int
    y: int
pt = Point(x=10, y=20)
# pt.x = 100 # ❌ Error: frozen=True
7.4 With Default Values
from OpenHosta.guarded import guarded_dataclass
@guarded_dataclass
class Config:
    host: str = "localhost"
    port: int = 8080
    debug: bool = False
# Use defaults
c1 = Config()
assert c1.host == "localhost"
# Partial override
c2 = Config(host="example.com")
assert c2.port == 8080 # Keeps default value
# From dict
c3 = Config({"host": "api.example.com", "port": "3000"})
assert c3.port == 3000 # Converted from "3000" (str) → 3000 (int)
7.5 Automatic Type Conversion
The decorator uses TypeResolver to automatically convert values:
from OpenHosta.guarded import guarded_dataclass
@guarded_dataclass
class User:
    username: str
    age: int
    active: bool
    tags: list
# All conversions are automatic
user = User({
    "username": "alice",
    "age": "25",  # str → int
    "active": "yes",  # str → bool
    "tags": "python,ai"  # str → list (CSV input: see Known Bugs, 14.1)
})
assert user.age == 25
assert user.active == True
assert isinstance(user.tags, list)
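Annotation-driven conversion can be sketched with the standard `dataclasses.fields` helper. Everything here is illustrative: `from_mapping`, `to_bool`, and the `CONVERTERS` table are hypothetical, and the real decorator delegates to TypeResolver rather than to a fixed lookup table.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, Mapping, Type, TypeVar

T = TypeVar("T")


def to_bool(value: Any) -> bool:
    # bool("no") would be True, so strings need an explicit word list.
    if isinstance(value, str):
        return value.strip().lower() in {"yes", "y", "true", "1"}
    return bool(value)


# Hypothetical converter table keyed by field annotation.
CONVERTERS: Dict[type, Callable[[Any], Any]] = {
    int: int,
    float: float,
    str: str,
    bool: to_bool,
}


def from_mapping(cls: Type[T], data: Mapping[str, Any]) -> T:
    """Build a dataclass instance, converting each field by its annotation."""
    kwargs = {}
    for f in fields(cls):
        if f.name in data:
            convert = CONVERTERS.get(f.type, lambda v: v)
            kwargs[f.name] = convert(data[f.name])
    return cls(**kwargs)


@dataclass
class Account:
    username: str
    age: int
    active: bool
```

Fields missing from the mapping are left to the dataclass defaults, which mirrors the partial-override behavior shown in 7.4.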
7.6 Guarded Metadata
Like all Guarded types, dataclasses retain metadata:
from OpenHosta.guarded import guarded_dataclass
@guarded_dataclass
class Product:
    name: str
    price: float
# Creation from dict
p = Product({"name": "Widget", "price": "9.99"})
# Metadata available
print(p.uncertainty) # 0.15 (FLEXIBLE - heuristic parsing)
print(p.abstraction_level) # 'heuristic'
# Normal creation
p2 = Product(name="Gadget", price=19.99)
print(p2.uncertainty) # 0.0 (STRICT - native values)
print(p2.abstraction_level) # 'native'
7.7 Legacy Usage (With Explicit @dataclass)
If you already have @dataclass, @guarded_dataclass detects it and won't reapply it:
from dataclasses import dataclass
from OpenHosta.guarded import guarded_dataclass
@guarded_dataclass
@dataclass
class Person:
    name: str
    age: int
# Works exactly the same
Recommendation: Use only @guarded_dataclass for cleaner code.
VIII. TypeResolver: Automatic Resolution
The TypeResolver converts Python annotations into Guarded types.
from typing import List, Dict, Optional
from OpenHosta.guarded.resolver import TypeResolver, type_returned_data
# Simple type resolution
TypeResolver.resolve(int) # → GuardedInt
TypeResolver.resolve(str) # → GuardedUtf8
TypeResolver.resolve(bool) # → GuardedBool
# Complex type resolution
TypeResolver.resolve(List[int]) # → GuardedList
TypeResolver.resolve(Dict[str, int]) # → GuardedDict
TypeResolver.resolve(Optional[int]) # → GuardedInt
# Usage in the pipeline
result = type_returned_data("42", int) # Converts "42" → 42
Used by: The OpenHosta pipeline to automatically type LLM responses.
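Resolving annotations like these generally relies on `typing.get_origin` and `typing.get_args`. The sketch below shows how `List[int]`, `Dict[str, int]` and `Optional[int]` reduce to a base runtime type; `resolve_base` is a hypothetical function and returns plain Python types rather than Guarded classes.

```python
from typing import Any, Dict, List, Optional, Union, get_args, get_origin


def resolve_base(annotation: Any) -> Any:
    """Reduce an annotation to the runtime type that would drive conversion."""
    origin = get_origin(annotation)
    if origin is Union:
        # Optional[X] is Union[X, None]: resolve X when it is the only alternative.
        non_none = [arg for arg in get_args(annotation) if arg is not type(None)]
        if len(non_none) == 1:
            return resolve_base(non_none[0])
        return Union
    # List[int] has origin list, Dict[str, int] has origin dict; plain types have none.
    return origin or annotation
```

A resolver built this way would then map the base type (int, list, dict, ...) to the matching Guarded class.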
IX. Creating Your Own Types
Inherit from GuardedPrimitive to create custom business types.
9.1 Complete Example: CorporateEmail with 4-Level Pipeline
This example shows all pipeline levels with LLM validation and a corporate directory.
import re
from typing import Tuple, Optional, Any
from OpenHosta.guarded import GuardedUtf8, Tolerance
from OpenHosta.guarded.primitives import UncertaintyLevel
from OpenHosta import emulate
# Corporate directory (simulated)
CORPORATE_DIRECTORY = {
    "marie.dupont@mycorp.com",
    "jean.martin@mycorp.com",
    "sophie.bernard@mycorp.com",
    "pierre.dubois@mycorp.com",
}
class CorporateEmail(GuardedUtf8):
    """Corporate email validated against the mycorp.com directory"""
    _type_en = (
        "a corporate email address in the format firstname.lastname@mycorp.com "
        "where firstname and lastname are lowercase letters only"
    )
    _type_py = str
    _type_json = {
        "type": "string",
        "format": "email",
        "pattern": r"^[a-z]+\.[a-z]+@mycorp\.com$"
    }
    _type_knowledge = {
        "directory": CORPORATE_DIRECTORY,
        "domain": "mycorp.com"
    }
    @classmethod
    def _parse_native(cls, value: Any) -> Tuple[UncertaintyLevel, Any, Optional[str]]:
        """Level 1: Perfect email in the directory (0.00 - STRICT)."""
        if not isinstance(value, str):
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Not a string"
        # NO cleanup - only accept perfect emails
        if not re.match(r"^[a-z]+\.[a-z]+@mycorp\.com$", value):
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Invalid format"
        if value in cls._type_knowledge["directory"]:
            return UncertaintyLevel(Tolerance.STRICT), value, None
        return UncertaintyLevel(Tolerance.ANYTHING), value, "Not in directory"
    @classmethod
    def _parse_heuristic(cls, value: Any) -> Tuple[UncertaintyLevel, Any, Optional[str]]:
        """Level 2: Deterministic cleanup (0.05 - PRECISE)."""
        if not isinstance(value, str):
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Not a string"
        original = value
        cleaned = value.strip().lower()
        cleaned = cleaned.replace("mailto:", "").replace("<", "").replace(">", "")
        cleaned = cleaned.replace(" ", "").replace("\t", "")
        # Check format after cleanup
        if re.match(r"^[a-z]+\.[a-z]+@mycorp\.com$", cleaned):
            if cleaned in cls._type_knowledge["directory"]:
                # Cleanup performed and in directory → PRECISE
                if cleaned != original:
                    return UncertaintyLevel(Tolerance.PRECISE), cleaned, None
                # No cleanup but in directory (already handled by native, but for robustness)
                return UncertaintyLevel(Tolerance.STRICT), cleaned, None
            else:
                # Valid format but not in directory → FLEXIBLE
                return UncertaintyLevel(Tolerance.FLEXIBLE), cleaned, "Valid format, not in directory"
        return UncertaintyLevel(Tolerance.ANYTHING), value, "Invalid format after heuristic cleaning"
    @classmethod
    def _llm_cast_email(cls, text: str) -> str | None:
        """
        Convert the input text to a valid corporate email: firstname.lastname@mycorp.com
        Rules:
        - Only lowercase letters for firstname and lastname
        - Examples:
          * "marie dot dupont at mycorp dor com" → "marie.dupont@mycorp.com"
          * "jean martin mycorp" → "jean.martin@mycorp.com"
        If the input cannot be converted, return None.
        """
        return emulate()
    @classmethod
    def _parse_semantic(cls, value: Any) -> Tuple[UncertaintyLevel, Any, Optional[str]]:
        """Level 3: LLM correction (0.15 - FLEXIBLE)."""
        if not isinstance(value, str):
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Not a string"
        corrected = cls._llm_cast_email(value)
        if corrected and re.match(r"^[a-z]+\.[a-z]+@mycorp\.com$", corrected):
            # Corrected email exists in directory → FLEXIBLE
            if corrected in cls._type_knowledge["directory"]:
                return UncertaintyLevel(Tolerance.FLEXIBLE), corrected, None
            # Otherwise, pass to knowledge level
            return UncertaintyLevel(Tolerance.ANYTHING), corrected, "LLM corrected but not in directory"
        return UncertaintyLevel(Tolerance.ANYTHING), value, "Cannot parse semantically"
    @classmethod
    def _parse_knowledge(cls, value: Any) -> Tuple[UncertaintyLevel, Any, Optional[str]]:
        """Level 4: Fuzzy matching (0.30 - CREATIVE)."""
        if not isinstance(value, str):
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Not a string"
        # Try to normalize the input for fuzzy matching
        cleaned = value.strip().lower()
        cleaned = cleaned.replace("mailto:", "").replace("<", "").replace(">", "")
        cleaned = cleaned.replace(" ", "").replace("\t", "")
        # Extract first and last name if the format is close
        match = re.match(r"^([a-z]+)\.?([a-z]+)?@?([a-z\.]*)?$", cleaned)
        if not match:
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Cannot extract name parts for fuzzy matching"
        firstname_raw, lastname_raw, domain_raw = match.groups()
        # The last-name group is optional in the regex; without it there is nothing to match on
        if not lastname_raw:
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Missing last name for fuzzy matching"
        # If the domain is present and incorrect, we can't fuzzy match
        if domain_raw and not domain_raw.startswith(cls._type_knowledge["domain"].split('.')[0]):
            return UncertaintyLevel(Tolerance.ANYTHING), value, "Incorrect domain for fuzzy matching"
        # Try to find the best match in the directory
        best_match = cls._find_closest_email(firstname_raw, lastname_raw)
        if best_match:
            return UncertaintyLevel(Tolerance.CREATIVE), best_match, "Fuzzy matched to directory"
        return UncertaintyLevel(Tolerance.ANYTHING), value, "No fuzzy match found in directory"
    @classmethod
    def _find_closest_email(cls, firstname: str, lastname: str) -> Optional[str]:
        """Find the closest email using difflib's similarity ratio (not Levenshtein distance)."""
        from difflib import get_close_matches
        directory_names = []
        for email in cls._type_knowledge["directory"]:
            m = re.match(r"^([a-z]+)\.([a-z]+)@", email)
            if m:
                directory_names.append((m.group(1), m.group(2), email))
        # Find close first names
        firstnames_in_dir = [n[0] for n in directory_names]
        close_first = get_close_matches(firstname, firstnames_in_dir, n=1, cutoff=0.7)
        if not close_first:
            return None
        # Filter candidates by close first name
        candidates_for_lastname = [(n[1], n[2]) for n in directory_names if n[0] == close_first[0]]
        if not candidates_for_lastname:
            return None
        lastnames_in_candidates = [c[0] for c in candidates_for_lastname]
        close_last = get_close_matches(lastname, lastnames_in_candidates, n=1, cutoff=0.7)
        if close_last:
            for cand_last, email in candidates_for_lastname:
                if cand_last == close_last[0]:
                    return email
        return None
# ============================================================================
# DEMONSTRATION
# ============================================================================
# NATIVE level (0.00)
email1 = CorporateEmail("marie.dupont@mycorp.com")
print(f"{email1} - {email1.abstraction_level} - {email1.uncertainty}")
# → marie.dupont@mycorp.com - native - 0.0
# HEURISTIC level (0.05)
email2 = CorporateEmail(" MARIE.DUPONT@MYCORP.COM ")
print(f"{email2} - {email2.abstraction_level} - {email2.uncertainty}")
# → marie.dupont@mycorp.com - heuristic - 0.05
# SEMANTIC level (0.15) - LLM Correction
email3 = CorporateEmail("marie dot dupont at mycorp dor com")
print(f"{email3} - {email3.abstraction_level} - {email3.uncertainty}")
# → marie.dupont@mycorp.com - semantic - 0.15
# KNOWLEDGE level (0.30) - Fuzzy matching
email4 = CorporateEmail("m.dupond@mycorp.com") # dupond -> dupont
print(f"{email4} - {email4.abstraction_level} - {email4.uncertainty}")
# → marie.dupont@mycorp.com - knowledge - 0.30
# HEURISTIC with valid format but not in directory (0.15 - FLEXIBLE, as returned by _parse_heuristic)
email5 = CorporateEmail("john.doe@mycorp.com") # Valid format, but not in CORPORATE_DIRECTORY
print(f"{email5} - {email5.abstraction_level} - {email5.uncertainty}")
# → john.doe@mycorp.com - heuristic - 0.15
# Confidence-based decision
if email3.uncertainty <= Tolerance.PRECISE:
    print("✅ Automatic validation")
elif email3.uncertainty <= Tolerance.FLEXIBLE:
    print("⚠️ Ask for confirmation")  # ← This case
else:
    print("❌ Reject")
Key points:
- _parse_native: No cleanup, strict validation only
- _parse_heuristic: Detects if cleanup was performed (cleaned != original)
- _parse_semantic: Uses emulate() for a real LLM call
- _parse_knowledge: Fuzzy matching with difflib.get_close_matches()
- Directory validation: If corrected email exists → FLEXIBLE confidence, otherwise passes to next level
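To see how the 0.7 cutoff used in the fuzzy-matching step behaves, the standard-library calls can be run on their own. The surname list below is taken from the simulated directory; the rest is a standalone sketch that does not involve OpenHosta.

```python
from difflib import SequenceMatcher, get_close_matches

surnames = ["dupont", "martin", "bernard", "dubois"]

# "dupond" differs from "dupont" by one character, so it clears the 0.7 cutoff.
match = get_close_matches("dupond", surnames, n=1, cutoff=0.7)

# A single initial is too short to resemble a full first name at this cutoff.
initial_match = get_close_matches("m", ["marie", "jean"], n=1, cutoff=0.7)

# ratio() is 2 * matches / total length: 2 * 5 / 12 for these two words.
ratio = SequenceMatcher(None, "dupond", "dupont").ratio()
```

The similarity ratio penalizes short inputs heavily, so abbreviated names such as a lone initial need a separate rule (for example prefix matching) if they should be accepted.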
Complete tests: See tests/guarded/test_corporate_email.py (13/13 tests pass ✅)
9.2 Methods to Implement
| Method | Required | Description |
|---|---|---|
| _parse_native | ✅ Yes | Strict type validation |
| _parse_heuristic | ⚠️ Recommended | Deterministic cleanup |
| _parse_semantic | ❌ No | LLM-based conversion |
| _parse_knowledge | ❌ No | Knowledge base |
9.3 Configuration Attributes
| Attribute | Type | Description |
|---|---|---|
| _type_en | str | English description for the LLM |
| _type_py | type | Target Python type |
| _type_json | dict | JSON Schema for validation |
| _type_knowledge | dict\|Any | Knowledge base (optional) |
| _tolerance | Tolerance | Default tolerance |
X. Metadata and Traceability
Each Guarded instance retains metadata about its creation.
10.1 Available Attributes
from OpenHosta.guarded import GuardedInt
age = GuardedInt("1,000")
# Metadata
age.uncertainty # 0.15 (confidence level)
age.abstraction_level # 'heuristic' (method used)
age.unwrap() # Native value (for ProxyWrapper)
# Normal operations
age + 10 # 1010
age > 500 # True
10.2 Abstraction Levels
| Level | Description | Typical Uncertainty |
|---|---|---|
| native | Value already the correct type | 0.0 |
| heuristic | Deterministic cleanup | 0.05-0.15 |
| semantic | LLM conversion | 0.15-0.30 |
| knowledge | Knowledge base | 0.30+ |
XI. Pydantic Integration
Guarded types integrate with Pydantic V2.
from pydantic import BaseModel
from OpenHosta.guarded import GuardedInt, GuardedUtf8
class User(BaseModel):
    name: GuardedUtf8
    age: GuardedInt
# Pydantic + Guarded = Tolerant validation
user = User(name="Alice", age="25") # age converted automatically
XII. Best Practices
✅ Do
- Use Guarded types for user inputs
- Force tolerance via attempt() for critical cases
- Check uncertainty for important decisions
❌ Don't
- Don't use as native dict keys
- Don't ignore metadata
XIII. Comparison with Pydantic
| Aspect | Pydantic | OpenHosta Guarded |
|---|---|---|
| Philosophy | Strict validation | Tolerant conversion |
| Input | Structured data | Anything |
| Failure | Raises ValidationError | Attempts to repair |
| Performance | ⚡ Very fast | 🐢 Slower (if parsing) |
| Metadata | No | Yes (uncertainty, level) |
| Use case | APIs, Config | AI agents, Scraping |
Use Pydantic to validate technical APIs.
Use Guarded to parse human or LLM inputs.
XIV. Known Limitations
14.1 Known Bugs
- GuardedList - CSV Parsing
  - GuardedList("1,2,3") converts to a character list
  - Workaround: Use JSON format "[1,2,3]"
- ProxyWrapper - isinstance()
  - isinstance(GuardedBool("yes"), bool) returns False
  - Workaround: Use .unwrap() for the native type
14.2 Unimplemented Types
- GuardedRange (partial)
- GuardedMemoryView (partial)
- GuardedCode (Supported since v4.1, see callables.md)
- Semantic parsing (LLM) not connected
XV. Complete Examples
15.1 Form Validation
from OpenHosta.guarded import GuardedInt, GuardedUtf8, GuardedBool
# Dirty data from a web form
form_data = {
    "name": " Alice ",
    # "age": "twenty-five",  # Would require LLM (not implemented)
    "age": "25 years",  # Works with heuristic
    "newsletter": "yes"
}
name = GuardedUtf8(form_data["name"]) # "Alice"
age = GuardedInt(form_data["age"]) # 25
newsletter = GuardedBool(form_data["newsletter"]) # True
print(f"{name}, {age} years old, newsletter: {newsletter.unwrap()}")
15.2 Configuration Parsing
from OpenHosta.guarded import GuardedInt, GuardedDict
config_str = '{"port": "8080", "debug": "true"}'
config = GuardedDict(config_str)
port = GuardedInt(config["port"]) # 8080
XVI. Resources
- Source code: src/OpenHosta/guarded/
- Tests: tests/guarded/
- Examples: test_guarded_basic.py
Modules:
- primitives.py - Base class GuardedPrimitive
- constants.py - Tolerance enumerations
- subclassablescalars.py - Scalar types
- subclassablewithproxy.py - Proxy types
- subclassablecollections.py - Collections and dataclasses
- subclassableclasses.py - GuardedEnum
- resolver.py - Type resolution