Evaluation
Measure and analyze the performance and quality of your AI model deployments.
Overview
Evaluation in Centrify provides tools and metrics to measure and analyze the performance and quality of your AI model deployments. This helps you understand how well your models are performing, identify areas for improvement, and make data-driven decisions about model selection and optimization.
Key Metrics
- Accuracy: Measure how often your model produces correct or acceptable outputs.
- Latency: Track response times at different percentiles (p50, p90, p99) to understand performance characteristics; a sketch of how these metrics can be computed from request logs follows this list.
- Error Rates: Monitor how often your model produces errors or fails to generate a response.
- Guardrail Effectiveness: Evaluate how well your safety measures prevent harmful or inappropriate outputs.
- Cost Efficiency: Analyze the cost of model usage relative to performance and business value.
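If you export request logs, the metrics above can also be reproduced outside the dashboard. The Python sketch below is a minimal, platform-agnostic example; the record fields (`latency_ms`, `is_error`, `is_correct`, `cost_usd`) are assumed names for illustration, not Centrify's actual log schema.

```python
"""Illustrative sketch: summarize accuracy, latency percentiles, error rate,
and cost from a list of logged requests. Field names are assumptions."""
from dataclasses import dataclass
from statistics import quantiles
from typing import List


@dataclass
class RequestRecord:
    latency_ms: float   # end-to-end response time
    is_error: bool      # the model failed or returned an error
    is_correct: bool    # the output was judged correct or acceptable
    cost_usd: float     # cost attributed to this request


def summarize(records: List[RequestRecord]) -> dict:
    total = len(records)
    ok = [r for r in records if not r.is_error]
    # quantiles(..., n=100) returns 99 cut points, so index 49 is p50,
    # index 89 is p90, and index 98 is p99.
    cuts = quantiles([r.latency_ms for r in ok], n=100)
    return {
        "accuracy": sum(r.is_correct for r in ok) / len(ok),
        "error_rate": (total - len(ok)) / total,
        "latency_p50_ms": cuts[49],
        "latency_p90_ms": cuts[89],
        "latency_p99_ms": cuts[98],
        "total_cost_usd": sum(r.cost_usd for r in records),
    }
```

Percentiles are computed over successful requests only, so a spike in errors shows up in the error rate rather than hiding inside the latency numbers.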
Evaluation Tools
- Performance Dashboards: Visualize key metrics and trends over time.
- Comparative Analysis: Compare different models or model versions across multiple dimensions.
- Custom Metrics: Define and track domain-specific metrics relevant to your use case.
- Evaluation Datasets: Create and manage benchmark datasets for consistent evaluation; a sketch of a simple dataset paired with a custom metric follows this list.
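To make the dataset and custom-metric ideas concrete, the sketch below pairs a small JSONL benchmark with a scoring function. The JSONL layout, the `exact_match` metric, and the `predict` callable are illustrative assumptions, not Centrify's built-in format or API.

```python
"""Minimal sketch of a benchmark dataset and a custom metric, under
assumed formats (one JSON object per line, exact-match scoring)."""
import json
from pathlib import Path

# One benchmark case per line: a prompt and the output we expect.
DATASET = Path("support_benchmark.jsonl")
DATASET.write_text("\n".join(json.dumps(case) for case in [
    {"prompt": "Reset my password", "expected": "password_reset"},
    {"prompt": "Cancel my subscription", "expected": "cancellation"},
]))


def exact_match(expected: str, actual: str) -> float:
    """Custom metric: 1.0 when the model's label matches exactly."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def evaluate(predict) -> float:
    """Run every case through `predict` and average the metric."""
    cases = [json.loads(line) for line in DATASET.read_text().splitlines()]
    scores = [exact_match(c["expected"], predict(c["prompt"])) for c in cases]
    return sum(scores) / len(scores)


# Example: a stub model that always answers "password_reset" scores 0.5 here.
print(evaluate(lambda prompt: "password_reset"))
```

Keeping the dataset fixed while swapping out `predict` is what makes comparative analysis meaningful: each model or model version is scored against the same cases with the same metric.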
Getting Started
To evaluate your AI model deployments in Centrify, navigate to the Evaluation section in your project dashboard. From there, you can view performance metrics, create custom evaluations, and analyze the results to improve your models.