Error Messages
==============

Pavise provides detailed error messages to help you quickly identify and fix validation issues.

Type Errors
-----------

When a column has the wrong type, Pavise shows:

* Expected type and actual type
* Up to 5 sample invalid values
* Row numbers for each invalid value
* Actual type of each invalid value

Example
~~~~~~~

.. code-block:: python

   from typing import Protocol
   from pavise.pandas import DataFrame
   from pavise.exceptions import ValidationError
   import pandas as pd

   class Schema(Protocol):
       age: int

   df = pd.DataFrame({"age": [25, "invalid", 30, None, 35, "bad", 40]})
   try:
       validated_df = DataFrame[Schema](df)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Column 'age': expected int, got object

   Sample invalid values (showing first 3 of 3):
     Row 1: 'invalid' (str)
     Row 3: None (NoneType)
     Row 5: 'bad' (str)
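
The row listing can also be reproduced by hand, which is useful when checking a column outside of Pavise. The sketch below is an illustration only, not Pavise's implementation; ``find_non_int`` and its ``limit`` parameter are hypothetical names, and rows are assumed to be numbered from 0 as in the output above.

.. code-block:: python

   values = [25, "invalid", 30, None, 35, "bad", 40]

   def find_non_int(values, limit=5):
       """Return (samples, total) for entries that are not plain ints.

       Each sample is a (row, value, type name) tuple. bool is excluded
       explicitly because it is a subclass of int in Python.
       """
       bad = [
           (row, value, type(value).__name__)
           for row, value in enumerate(values)
           if not isinstance(value, int) or isinstance(value, bool)
       ]
       return bad[:limit], len(bad)

   samples, total = find_non_int(values)
   for row, value, type_name in samples:
       print(f"Row {row}: {value!r} ({type_name})")

For a pandas column, the same helper could be called as ``find_non_int(df["age"].tolist())``.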

Missing Columns
---------------

When a required column is missing, the error names it. This example reuses the imports from the Type Errors example:

.. code-block:: python

   class Schema(Protocol):
       user_id: int
       name: str

   df = pd.DataFrame({"user_id": [1, 2, 3]})  # Missing 'name'
   try:
       validated_df = DataFrame[Schema](df)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Column 'name': missing
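
One way to compute the missing columns yourself is to compare the schema's annotations with the DataFrame's column labels. This is a minimal sketch using only the standard library; ``missing_columns`` is a hypothetical helper, and Pavise's own check may work differently.

.. code-block:: python

   from typing import Protocol, get_type_hints

   class Schema(Protocol):
       user_id: int
       name: str

   def missing_columns(schema, columns):
       """Return schema field names that are absent from columns."""
       expected = get_type_hints(schema)  # {'user_id': int, 'name': str}
       return [name for name in expected if name not in columns]

   # In practice the second argument would be df.columns.
   missing = missing_columns(Schema, ["user_id"])
   print(missing)

Running such a check before validation lets a pipeline report every absent column at once.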

Validator Errors
----------------

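Each validator reports its own message, followed by sample offending values. The examples below reuse the imports and the ``pd`` alias from the Type Errors example.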

Range Validator
~~~~~~~~~~~~~~~

.. code-block:: python

   from typing import Annotated
   from pavise.validators import Range

   class Schema(Protocol):
       age: Annotated[int, Range(0, 150)]

   df = pd.DataFrame({"age": [25, 200, 30, -5, 35, 300]})
   try:
       validated_df = DataFrame[Schema](df)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Column 'age': values must be in range [0, 150]

   Sample invalid values (showing first 3 of 3):
     Row 1: 200
     Row 3: -5
     Row 5: 300

Unique Validator
~~~~~~~~~~~~~~~~

.. code-block:: python

   from pavise.validators import Unique

   class Schema(Protocol):
       user_id: Annotated[int, Unique()]

   df = pd.DataFrame({"user_id": [1, 2, 2, 3, 5, 5, 5]})
   try:
       validated_df = DataFrame[Schema](df)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Column 'user_id': contains duplicate values

   Sample duplicate values (showing first 2):
     Value 2 at rows: [1, 2]
     Value 5 at rows: [4, 5, 6]
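
The grouping shown above (each duplicated value with the rows where it occurs) can be sketched in plain Python. This is an illustration under the assumption that rows are numbered from 0; ``duplicate_groups`` and ``limit`` are hypothetical names, not part of Pavise.

.. code-block:: python

   from collections import defaultdict

   def duplicate_groups(values, limit=5):
       """Map each duplicated value to the rows where it appears."""
       rows_by_value = defaultdict(list)
       for row, value in enumerate(values):
           rows_by_value[value].append(row)
       # Keep only values seen more than once, in first-seen order.
       groups = {v: rows for v, rows in rows_by_value.items() if len(rows) > 1}
       return dict(list(groups.items())[:limit])

   groups = duplicate_groups([1, 2, 2, 3, 5, 5, 5])
   for value, rows in groups.items():
       print(f"Value {value} at rows: {rows}")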

In Validator
~~~~~~~~~~~~

.. code-block:: python

   from pavise.validators import In

   class Schema(Protocol):
       status: Annotated[str, In(["pending", "approved", "rejected"])]

   df = pd.DataFrame({"status": ["pending", "invalid", "approved", "bad"]})
   try:
       validated_df = DataFrame[Schema](df)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Column 'status': contains values not in allowed values

   Sample invalid values (showing first 2 of 2):
     Row 1: 'invalid'
     Row 3: 'bad'

Regex Validator
~~~~~~~~~~~~~~~

.. code-block:: python

   from pavise.validators import Regex

   class Schema(Protocol):
       email: Annotated[str, Regex(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')]

   df = pd.DataFrame({"email": ["alice@example.com", "invalid", "bob@test.com", "bad@"]})
   try:
       validated_df = DataFrame[Schema](df)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Column 'email': contains values that don't match the pattern

   Sample invalid values (showing first 2 of 2):
     Row 1: 'invalid'
     Row 3: 'bad@'
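
Before attaching a pattern to a schema, it can help to try it against a few values with the standard ``re`` module. The sketch below assumes the pattern must match the whole string, which the ``^`` and ``$`` anchors enforce here; how ``Regex`` treats unanchored patterns is not covered in this section. ``non_matching`` is a hypothetical helper.

.. code-block:: python

   import re

   EMAIL = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

   def non_matching(pattern, values):
       """Return (row, value) pairs whose value does not match pattern."""
       return [
           (row, value)
           for row, value in enumerate(values)
           if pattern.match(value) is None
       ]

   emails = ["alice@example.com", "invalid", "bob@test.com", "bad@"]
   failures = non_matching(EMAIL, emails)
   for row, value in failures:
       print(f"Row {row}: {value!r}")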

MinLen/MaxLen Validators
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from pavise.validators import MinLen

   class Schema(Protocol):
       username: Annotated[str, MinLen(3)]

   df = pd.DataFrame({"username": ["alice", "ab", "bob", "x"]})
   try:
       validated_df = DataFrame[Schema](df)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Column 'username': contains strings shorter than minimum length

   Sample invalid values (showing first 2 of 2):
     Row 1: 'ab' (length: 2)
     Row 3: 'x' (length: 1)

Strict Mode Errors
------------------

When strict mode is enabled (``strict=True``) and the DataFrame contains columns that the schema does not declare, validation fails and lists them:

.. code-block:: python

   class Schema(Protocol):
       user_id: int
       name: str

   df = pd.DataFrame({
       "user_id": [1, 2, 3],
       "name": ["Alice", "Bob", "Charlie"],
       "age": [25, 30, 35],  # Extra column
       "email": ["a@test.com", "b@test.com", "c@test.com"]  # Extra column
   })

   try:
       validated_df = DataFrame[Schema](df, strict=True)
   except ValidationError as e:
       print(e)

Output:

.. code-block:: text

   Strict mode: unexpected columns ['age', 'email']

Performance Notes
-----------------

To keep error output readable and validation fast, Pavise applies these limits:

* Type error checking examines only the first 100 rows
* Error messages show at most 5 sample invalid values
* Duplicate detection shows at most 5 duplicate value groups

During development, consider validating a sample of a large DataFrame rather than the whole frame.
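
A minimal sketch of such sampling with plain pandas follows. ``development_sample``, ``max_rows`` and ``seed`` are hypothetical names chosen for illustration; only ``DataFrame.sample`` and ``DataFrame.sort_index`` are real pandas calls.

.. code-block:: python

   import pandas as pd

   def development_sample(df, max_rows=1_000, seed=0):
       """Return at most max_rows randomly chosen rows in original order."""
       if len(df) <= max_rows:
           return df
       # A fixed seed keeps the sample reproducible between runs.
       return df.sample(n=max_rows, random_state=seed).sort_index()

   big = pd.DataFrame({"age": range(100_000)})
   small = development_sample(big)
   print(len(small))

The sample keeps the original index labels, so a label can still be traced back to the full frame. Whether the row numbers in Pavise's messages are labels or positions is not stated here, so treat that mapping as an assumption. The sampled frame can then be passed to ``DataFrame[Schema](...)`` in the usual way.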