Metadata-Version: 2.1
Name: azure-ai-evaluation
Version: 1.0.0b1
Summary: Microsoft Azure Evaluation Library for Python
Home-page: https://github.com/Azure/azure-sdk-for-python
Author: Microsoft Corporation
Author-email: azuresdkengsysadmins@microsoft.com
License: MIT License
Project-URL: Bug Reports, https://github.com/Azure/azure-sdk-for-python/issues
Project-URL: Source, https://github.com/Azure/azure-sdk-for-python
Keywords: azure,azure sdk
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: promptflow-devkit>=1.15.0
Requires-Dist: promptflow-core>=1.15.0
Requires-Dist: websocket-client>=1.2.0
Requires-Dist: jsonpath_ng>=1.5.0
Requires-Dist: numpy>=1.22
Requires-Dist: pyjwt>=2.8.0
Requires-Dist: azure-identity
Requires-Dist: azure-core>=1.30.2
Provides-Extra: pf-azure
Requires-Dist: promptflow-azure<2.0.0,>=1.15.0; extra == "pf-azure"

# Azure AI Evaluation client library for Python

## Getting started

### Install the package

Install the Azure AI Evaluation library for Python with:

```bash
pip install azure-ai-evaluation
pip install azure-identity
```

## Key concepts

Evaluators are custom or prebuilt classes or functions that measure the quality of language model outputs.
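
Because an evaluator can be any callable, a custom one can also be written as a class. The following is a minimal sketch in plain Python, not part of the library: the class name `AnswerLengthEvaluator`, the `max_length` parameter, and the returned keys are hypothetical and chosen only for illustration.

```python
class AnswerLengthEvaluator:
    """Hypothetical class-based evaluator that checks an answer against a length budget."""

    def __init__(self, max_length: int = 100):
        # Configuration is stored once at construction time.
        self.max_length = max_length

    def __call__(self, *, answer: str, **kwargs) -> dict:
        # Extra keyword arguments (for example a question) are accepted and ignored.
        length = len(answer)
        return {"length": length, "within_budget": length <= self.max_length}


evaluator = AnswerLengthEvaluator(max_length=50)
result = evaluator(answer="The Alpine Explorer Tent is the most waterproof.")
print(result)
# {'length': 48, 'within_budget': True}
```

A class is convenient when the evaluator needs configuration or state; a plain function is enough for stateless checks.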

## Examples

You can run evaluators on your local machine, as shown in the example below:

```python
import os
from pprint import pprint

from promptflow.core import AzureOpenAIModelConfiguration

from azure.ai.evaluation.evaluate import evaluate
from azure.ai.evaluation.evaluators import RelevanceEvaluator
from azure.ai.evaluation.evaluators.content_safety import ViolenceEvaluator


def answer_length(answer, **kwargs):
    return {"value": len(answer)}


if __name__ == "__main__":
    # Built-in evaluators
    # Initialize Azure OpenAI Connection
    model_config = AzureOpenAIModelConfiguration(
        azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
        api_key=os.environ.get("AZURE_OPENAI_KEY"),
        azure_deployment=os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
    )

    # Initializing Relevance Evaluator
    relevance_eval = RelevanceEvaluator(model_config)

    # Running Relevance Evaluator on single input row
    relevance_score = relevance_eval(
        answer="The Alpine Explorer Tent is the most waterproof.",
        context="From our product list,"
        " the alpine explorer tent is the most waterproof."
        " The Adventure Dining Table has higher weight.",
        question="Which tent is the most waterproof?",
    )

    pprint(relevance_score)
    # {'gpt_relevance': 5.0}

    # Content Safety Evaluator

    # Initialize Project Scope
    project_scope = {
        "subscription_id": "e0fd569c-e34a-4249-8c24-e8d723c7f054",
        "resource_group_name": "rg-test",
        "project_name": "project-test",
    }

    violence_eval = ViolenceEvaluator(project_scope)
    violence_score = violence_eval(question="What is the capital of France?", answer="Paris.")
    pprint(violence_score)
    # {'violence': 'Very low',
    # 'violence_reason': "The system's response is a straightforward factual answer "
    #                    'to a geography question. There is no violent content or '
    #                    'language present.',
    # 'violence_score': 0}

    # Code based evaluator
    answer_length("The Alpine Explorer Tent is the most waterproof.")
    # {'value': 48}

    # Running multiple evaluators together with the `evaluate` API

    result = evaluate(
        data="evaluate_test_data.jsonl",
        evaluators={
            "answer_length": answer_length,
            "violence": violence_eval,
        },
    )

    pprint(result)
```
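
The example above reads its input from `evaluate_test_data.jsonl`, whose contents are not shown. The sketch below writes a small file in JSON Lines format (one JSON object per line). It assumes that each row's keys should match the keyword arguments the evaluators accept (`question` and `answer`); that mapping is an assumption here, not something this document confirms, and the rows themselves are illustrative.

```python
import json

# Illustrative rows; the "question" and "answer" keys are assumed to line up
# with the evaluators' keyword arguments.
rows = [
    {
        "question": "Which tent is the most waterproof?",
        "answer": "The Alpine Explorer Tent is the most waterproof.",
    },
    {
        "question": "What is the capital of France?",
        "answer": "Paris.",
    },
]

# JSON Lines: one serialized object per line.
with open("evaluate_test_data.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```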

## Troubleshooting

## Next steps

## Contributing


# Release History

## 1.0.0b1 (Unreleased)

### Features Added

- First preview
- This package is a port of `promptflow-evals`. New features will be added only to this package moving forward.
