Welcome to Arrakis Viewer
A comprehensive platform for AI model evaluation, analysis, and quality assessment
System Information
- Version: 0f4ac6d
- Environment: prod
Configuration Status
- Cosmos DB: Configured
- Local Filesystem: Disabled
Available Features
Evaluation Management
Manage, import, and run evaluation test suites.
- Import test definitions from YAML files
- Run evaluations against any server
- Track progress and view results
Traces
View and explore trace data from evaluation runs.
- Browse all available traces
- Filter by device ID
- View detailed trace information
Evaluation Results
View evaluation results from all storage sources.
- Browse results from local files and Cosmos DB
- View unified evaluation run details
- Explore detailed test traces and LLM requests
Spice Registry
Manage spice definitions and tool configurations.
- Register and manage spice definitions
- Configure tool parameters and metadata
- View live spice information
Spice Ensembles
Configure ensemble compositions and inheritance.
- Create ensemble configurations
- Set up inheritance hierarchies
- Override spice versions per environment
Getting Started
For First-Time Users
Welcome to the Arrakis Viewer platform! This tool helps you test and evaluate AI model performance. To get started:
- Check the configuration status above to see which features are available
- For developers: Run evaluations with uv run python -m peanut_eval.bin.run_evals to generate test results in the eval_results/ directory
- Access these results through the Local Results section if filesystem access is enabled
- For stakeholders: Use the Evaluations and Runs sections in the hosted environment (requires Cosmos DB)
Configuration & Workflow
Local development workflow:
- Define test cases in YAML configuration files
- Run evaluations against a local server
- Review results in the Local Results section
- Verify assertions through detailed test reports
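As a rough illustration of the review step above, the following Python sketch lists the files under eval_results/, newest first, so the most recent run is easy to find before opening it in the viewer. It assumes only that the evaluation runner writes its output as files somewhere under that directory. The layout and file format of the results are not specified here, and list_result_files is a hypothetical helper, not part of peanut_eval.

```python
from pathlib import Path


def list_result_files(results_dir: str = "eval_results") -> list[Path]:
    """Return every file under results_dir, newest first.

    Hypothetical helper for illustration only. It makes no assumption
    about file names or formats inside the directory.
    """
    root = Path(results_dir)
    if not root.is_dir():
        # Nothing has been generated yet (or the path is wrong).
        return []
    # Walk the directory tree and keep regular files only.
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sort by last-modified time so the latest run comes first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling list_result_files() from the directory where the evaluations were run returns the paths in that order. An empty list means no results have been written yet.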
Hosted environment:
- Cosmos DB enables evaluation management and trace features
- Import evaluation definitions from YAML files
- Run evaluations against any server
- Track progress and view detailed results
Contact your administrator if you need help configuring these options.