Strict Mode
===========

By default, Pavise ignores extra columns not defined in the schema. This follows the philosophy of structural subtyping: a DataFrame with more columns can be used where fewer are expected.

However, in some cases you may want to reject DataFrames with unexpected columns. This is where strict mode comes in.

Enabling Strict Mode
--------------------

Pass ``strict=True`` to the DataFrame constructor:

.. code-block:: python

   from typing import Protocol
   from pavise.pandas import DataFrame

   class UserSchema(Protocol):
       user_id: int
       name: str

   # This will fail if df has columns other than user_id and name
   validated_df = DataFrame[UserSchema](df, strict=True)

Error Message
-------------

If the DataFrame contains extra columns, validation raises a ``ValueError`` that lists them:

.. code-block:: text

   ValueError: Strict mode: unexpected columns ['age', 'email', 'address']
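
To make the behavior concrete, the sketch below mimics a strict column check in plain Python and shows how a caller might handle the failure. ``check_strict`` is a hypothetical stand-in written for illustration, not Pavise's implementation; only the message format is taken from the example above.

.. code-block:: python

   from typing import Protocol, get_type_hints

   class UserSchema(Protocol):
       user_id: int
       name: str

   def check_strict(columns, schema):
       """Raise ValueError if any column is not declared in the schema.

       Hypothetical helper: it only compares column names and does not
       look at dtypes or validators.
       """
       declared = set(get_type_hints(schema))
       unexpected = [col for col in columns if col not in declared]
       if unexpected:
           raise ValueError(f"Strict mode: unexpected columns {unexpected}")

   try:
       check_strict(["user_id", "name", "age", "email"], UserSchema)
   except ValueError as exc:
       # The message names the offending columns, which the caller can
       # log or surface to whoever produced the data.
       message = str(exc)

In real code the ``try`` block would wrap the ``DataFrame[UserSchema](df, strict=True)`` call instead of the stand-in.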

Use Cases
---------

Strict mode is useful when:

1. **Enforcing exact schemas**: You want to ensure the DataFrame has exactly the columns you expect
2. **Detecting typos**: Extra columns might indicate typos in column names
3. **API contracts**: You're receiving data from an external source and want to enforce a strict contract
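
The typo case can be made concrete. Once strict mode has reported an unexpected column, a close-match lookup against the schema's column names often reveals which name was intended. The sketch below uses the standard library's ``difflib``; ``suggest_intended_columns`` is a hypothetical helper and is not part of Pavise, and the ``0.8`` cutoff is an arbitrary choice.

.. code-block:: python

   import difflib

   def suggest_intended_columns(unexpected, declared):
       """Map each unexpected column to the declared name it most resembles.

       Columns with no sufficiently similar declared name are left out.
       """
       suggestions = {}
       for col in unexpected:
           matches = difflib.get_close_matches(col, declared, n=1, cutoff=0.8)
           if matches:
               suggestions[col] = matches[0]
       return suggestions

   declared = ["user_id", "name"]
   suggestions = suggest_intended_columns(["user_idd", "email"], declared)

Here ``suggestions`` maps ``user_idd`` to ``user_id`` and leaves ``email`` unmatched, which separates a likely typo from a column that is simply new.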

Example: API Data Validation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   import datetime
   from typing import Protocol

   import pandas as pd
   from pavise.pandas import DataFrame

   class APIResponseSchema(Protocol):
       id: int
       timestamp: datetime.datetime
       value: float

   # Validate the API response strictly.
   # fetch_from_api() is a placeholder for your own client code.
   response_data = fetch_from_api()
   df = pd.DataFrame(response_data)

   # Fail if API returns unexpected columns
   validated_df = DataFrame[APIResponseSchema](df, strict=True)

When Not to Use Strict Mode
----------------------------

Avoid strict mode when:

1. **Internal processing**: For internal functions, extra columns are usually harmless
2. **Data pipelines**: Intermediate steps may add temporary columns
3. **Flexibility needed**: You want to allow DataFrames to have additional context

Example: Flexible Processing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from typing import Protocol
   from pavise.pandas import DataFrame

   class MinimalSchema(Protocol):
       user_id: int
       name: str

   # Allow extra columns for flexibility
   def process_users(df: DataFrame[MinimalSchema]) -> DataFrame[MinimalSchema]:
       # df might have age, email, etc. - that's OK
       # We only care about user_id and name
       return df

   # Don't use strict=True here
   validated_df = DataFrame[MinimalSchema](df)  # Extra columns are ignored
   result = process_users(validated_df)

Combining with Validators
--------------------------

Strict mode works with validators:

.. code-block:: python

   from typing import Annotated, Protocol
   from pavise.pandas import DataFrame
   from pavise.validators import Range

   class StrictSchema(Protocol):
       age: Annotated[int, Range(0, 150)]
       score: Annotated[float, Range(0.0, 100.0)]

   # Type validation, validators, and column strictness are all enforced
   validated_df = DataFrame[StrictSchema](df, strict=True)