Error Messages
Pavise provides detailed error messages to help you quickly identify and fix validation issues.
Type Errors
When a column has the wrong type, Pavise shows:
Expected type and actual type
Sample invalid values (at most the first 5)
Row numbers for each invalid value
Actual type of each invalid value
Example
from typing import Protocol
from pavise.pandas import DataFrame
from pavise.exceptions import ValidationError
import pandas as pd
class Schema(Protocol):
    age: int

df = pd.DataFrame({"age": [25, "invalid", 30, None, 35, "bad", 40]})

try:
    validated_df = DataFrame[Schema](df)
except ValidationError as e:
    print(e)
Output:
Column 'age': expected int, got object
Sample invalid values (showing first 3 of 3):
Row 1: 'invalid' (str)
Row 3: None (NoneType)
Row 5: 'bad' (str)
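The sampling behavior described above can be approximated outside Pavise. The following is a minimal sketch, not Pavise's implementation: `sample_type_errors` and `format_type_errors` are hypothetical helpers, the message wording is illustrative, and a plain list stands in for the column (with pandas, `df["age"].tolist()` would supply it).

```python
def sample_type_errors(values, expected, max_samples=5):
    """Collect (row, value) pairs whose value is not an instance of `expected`.

    Hypothetical helper for illustration; not part of Pavise.
    Returns the first `max_samples` offenders and the total number found.
    """
    invalid = [
        (row, value)
        for row, value in enumerate(values)
        if not isinstance(value, expected)
    ]
    return invalid[:max_samples], len(invalid)


def format_type_errors(column, values, expected, max_samples=5):
    """Render the offenders one per line, with row number and actual type."""
    shown, total = sample_type_errors(values, expected, max_samples)
    lines = [f"Column {column!r}: {total} value(s) are not {expected.__name__}"]
    for row, value in shown:
        lines.append(f"  Row {row}: {value!r} ({type(value).__name__})")
    return "\n".join(lines)


ages = [25, "invalid", 30, None, 35, "bad", 40]
print(format_type_errors("age", ages, int))
```

Capping the sample keeps the message readable for large columns, while the total count still tells you how widespread the problem is.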
Missing Columns
When a required column is missing:
class Schema(Protocol):
    user_id: int
    name: str

df = pd.DataFrame({"user_id": [1, 2, 3]})  # Missing 'name'

try:
    validated_df = DataFrame[Schema](df)
except ValidationError as e:
    print(e)
Output:
Column 'name': missing
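If you want to check for missing columns yourself before validating, the schema's annotations can be read with the standard library. This is a rough sketch under that assumption; `missing_columns` is a hypothetical helper, not a Pavise function, and a list of names stands in for `df.columns`.

```python
from typing import Protocol, get_type_hints


class Schema(Protocol):
    user_id: int
    name: str


def missing_columns(schema, columns):
    """Return schema fields absent from `columns`, in declaration order.

    Hypothetical helper for illustration; not part of Pavise.
    """
    present = set(columns)
    return [field for field in get_type_hints(schema) if field not in present]


# With pandas, `columns` would be `df.columns`.
print(missing_columns(Schema, ["user_id"]))
```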
Validator Errors
Each built-in validator names the rule that was violated and lists sample offending values with their row numbers.
Range Validator
from typing import Annotated
from pavise.validators import Range
class Schema(Protocol):
    age: Annotated[int, Range(0, 150)]

df = pd.DataFrame({"age": [25, 200, 30, -5, 35, 300]})

try:
    validated_df = DataFrame[Schema](df)
except ValidationError as e:
    print(e)
Output:
Column 'age': values must be in range [0, 150]
Sample invalid values (showing first 3 of 3):
Row 1: 200
Row 3: -5
Row 5: 300
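`Annotated` attaches the validator to the type hint as metadata. As a rough illustration of how such metadata can be read back with the standard library, the sketch below uses a stand-in `Range` dataclass (hypothetical; it is not Pavise's `Range` class) and assumes inclusive bounds, as the `[0, 150]` notation in the message suggests.

```python
from dataclasses import dataclass
from typing import Annotated, Protocol, get_args, get_type_hints


@dataclass(frozen=True)
class Range:
    """Stand-in for a range validator (hypothetical, not Pavise's class)."""

    low: float
    high: float

    def invalid_rows(self, values):
        # Inclusive bounds: low <= value <= high passes.
        return [
            (row, value)
            for row, value in enumerate(values)
            if not self.low <= value <= self.high
        ]


class Schema(Protocol):
    age: Annotated[int, Range(0, 150)]


# include_extras=True keeps the Annotated metadata instead of stripping it.
hints = get_type_hints(Schema, include_extras=True)
base_type, *validators = get_args(hints["age"])

for validator in validators:
    for row, value in validator.invalid_rows([25, 200, 30, -5, 35, 300]):
        print(f"Row {row}: {value}")
```

Keeping the base type and the validators separate means the type check and the value checks can each report their own errors.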
Unique Validator
from pavise.validators import Unique
class Schema(Protocol):
    user_id: Annotated[int, Unique()]

df = pd.DataFrame({"user_id": [1, 2, 2, 3, 5, 5, 5]})

try:
    validated_df = DataFrame[Schema](df)
except ValidationError as e:
    print(e)
Output:
Column 'user_id': contains duplicate values
Sample duplicate values (showing first 2):
Value 2 at rows: [1, 2]
Value 5 at rows: [4, 5, 6]
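The grouping shown in this output (each duplicated value with the rows where it occurs) can be reproduced in a few lines of plain Python. This is a minimal sketch, not Pavise's implementation; `duplicate_groups` is a hypothetical helper and a list stands in for the column.

```python
from collections import defaultdict


def duplicate_groups(values, max_groups=5):
    """Map each duplicated value to the rows where it occurs.

    Hypothetical helper for illustration; not part of Pavise.
    Only the first `max_groups` duplicated values are returned.
    """
    rows_by_value = defaultdict(list)
    for row, value in enumerate(values):
        rows_by_value[value].append(row)
    duplicated = [
        (value, rows) for value, rows in rows_by_value.items() if len(rows) > 1
    ]
    return dict(duplicated[:max_groups])


for value, rows in duplicate_groups([1, 2, 2, 3, 5, 5, 5]).items():
    print(f"Value {value} at rows: {rows}")
```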
In Validator
from pavise.validators import In
class Schema(Protocol):
    status: Annotated[str, In(["pending", "approved", "rejected"])]

df = pd.DataFrame({"status": ["pending", "invalid", "approved", "bad"]})

try:
    validated_df = DataFrame[Schema](df)
except ValidationError as e:
    print(e)
Output:
Column 'status': contains values not in allowed values
Sample invalid values (showing first 2 of 2):
Row 1: 'invalid'
Row 3: 'bad'
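To inspect the offending rows directly in pandas, a membership mask gives the same set of values. This is a sketch of the equivalent check in plain pandas, shown for debugging purposes; it does not describe how Pavise performs the check internally.

```python
import pandas as pd

status = pd.Series(["pending", "invalid", "approved", "bad"], name="status")
allowed = ["pending", "approved", "rejected"]

# isin() yields a boolean mask; negating it selects the disallowed values.
not_allowed = status[~status.isin(allowed)]

for row, value in not_allowed.items():
    print(f"Row {row}: {value!r}")
```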
Regex Validator
from pavise.validators import Regex
class Schema(Protocol):
    email: Annotated[str, Regex(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')]

df = pd.DataFrame({"email": ["alice@example.com", "invalid", "bob@test.com", "bad@"]})

try:
    validated_df = DataFrame[Schema](df)
except ValidationError as e:
    print(e)
Output:
Column 'email': contains values that don't match the pattern
Sample invalid values (showing first 2 of 2):
Row 1: 'invalid'
Row 3: 'bad@'
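When a pattern rejects more values than expected, it can help to test it in isolation with the standard `re` module. The sketch below applies the same pattern to the same values; which matching function Pavise uses is not stated here, but because the pattern is anchored with `^` and `$`, `re.match` is a reasonable stand-in for this example.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

emails = ["alice@example.com", "invalid", "bob@test.com", "bad@"]

# Keep the row number alongside each value that fails to match.
non_matching = [
    (row, value) for row, value in enumerate(emails) if EMAIL.match(value) is None
]

for row, value in non_matching:
    print(f"Row {row}: {value!r}")
```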
MinLen/MaxLen Validators
from pavise.validators import MinLen
class Schema(Protocol):
    username: Annotated[str, MinLen(3)]

df = pd.DataFrame({"username": ["alice", "ab", "bob", "x"]})

try:
    validated_df = DataFrame[Schema](df)
except ValidationError as e:
    print(e)
Output:
Column 'username': contains strings shorter than minimum length
Sample invalid values (showing first 2 of 2):
Row 1: 'ab' (length: 2)
Row 3: 'x' (length: 1)
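The example above covers only `MinLen`. As an illustration of the underlying length check in both directions, here is a plain-Python sketch; `length_violations` is a hypothetical helper, and it does not reproduce the message Pavise emits for `MaxLen`, which is not shown in this section.

```python
def length_violations(values, min_len=None, max_len=None):
    """Return (row, value, length) for strings outside the allowed length.

    Hypothetical helper for illustration; not part of Pavise.
    A bound left as None is not checked.
    """
    violations = []
    for row, value in enumerate(values):
        length = len(value)
        too_short = min_len is not None and length < min_len
        too_long = max_len is not None and length > max_len
        if too_short or too_long:
            violations.append((row, value, length))
    return violations


usernames = ["alice", "ab", "bob", "x"]
for row, value, length in length_violations(usernames, min_len=3):
    print(f"Row {row}: {value!r} (length: {length})")
```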
Strict Mode Errors
When strict mode is enabled and extra columns are present:
class Schema(Protocol):
    user_id: int
    name: str

df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],  # Extra column
    "email": ["a@test.com", "b@test.com", "c@test.com"],  # Extra column
})

try:
    validated_df = DataFrame[Schema](df, strict=True)
except ValidationError as e:
    print(e)
Output:
Strict mode: unexpected columns ['age', 'email']
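The strict-mode check amounts to comparing the DataFrame's columns against the schema's fields. The sketch below shows that comparison with the standard library only; `unexpected_columns` is a hypothetical helper, not a Pavise function, and a list of names stands in for `df.columns`.

```python
from typing import Protocol, get_type_hints


class Schema(Protocol):
    user_id: int
    name: str


def unexpected_columns(schema, columns):
    """Return columns that the schema does not declare, in column order.

    Hypothetical helper for illustration; not part of Pavise.
    """
    declared = get_type_hints(schema)
    return [column for column in columns if column not in declared]


extra = unexpected_columns(Schema, ["user_id", "name", "age", "email"])
if extra:
    print(f"Strict mode: unexpected columns {extra}")
```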
Performance Notes
To avoid overwhelming output and maintain performance:
Type error checking examines only the first 100 rows
Error messages show at most 5 sample invalid values
Duplicate detection shows at most 5 duplicate value groups
For large DataFrames, consider sampling before validation during development.
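Two common ways to take such a sample with pandas are sketched below. The frame and the sample size of 1,000 are arbitrary choices for illustration; either result could then be passed to the validator in place of the full DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"age": range(10_000)})

# Option 1: the leading rows, cheap and deterministic.
head_sample = df.head(1_000)

# Option 2: a random sample; random_state makes it reproducible.
random_sample = df.sample(n=1_000, random_state=0)

print(len(head_sample), len(random_sample))
```

A sample can miss invalid rows that exist elsewhere in the data, so it is worth validating the full DataFrame before relying on the result.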