SCALABLE PROMPT EVALUATION
• Use an automated way of scoring prompt output.
• Send samples for human evaluation.
• Create human-validated pre-annotated samples.
• Use a more advanced LLM to evaluate the results.
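
The points above can be combined into one evaluation loop. What follows is a minimal sketch under stated assumptions, not a definitive implementation: the names Sample, exact_match_score, stub_judge, and evaluate are hypothetical, exact match is only one possible automated metric, and stub_judge is a placeholder standing in for a call to a more advanced LLM.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Sample:
    prompt_input: str
    output: str                      # what the prompt under test produced
    reference: Optional[str] = None  # human-validated annotation, if one exists


def exact_match_score(sample: Sample) -> Optional[float]:
    """Automated scoring against a pre-annotated reference (None if there is none)."""
    if sample.reference is None:
        return None
    same = sample.output.strip().lower() == sample.reference.strip().lower()
    return 1.0 if same else 0.0


def stub_judge(sample: Sample) -> float:
    """Placeholder for a stronger LLM acting as grader (assumption: returns 0..1)."""
    return 1.0 if sample.output.strip() else 0.0


def evaluate(
    samples: List[Sample],
    judge: Callable[[Sample], float] = stub_judge,
    human_review_rate: float = 0.1,
    seed: int = 0,
) -> Dict[str, object]:
    rng = random.Random(seed)  # seeded so the human-review sample is reproducible
    scores: List[float] = []
    human_queue: List[Sample] = []
    for sample in samples:
        score = exact_match_score(sample)
        if score is None:
            # No human-validated reference: fall back to the judge model.
            score = judge(sample)
        scores.append(score)
        # Route a random fraction of samples to human evaluation.
        if rng.random() < human_review_rate:
            human_queue.append(sample)
    mean = sum(scores) / len(scores) if scores else 0.0
    return {"mean_score": mean, "scores": scores, "human_queue": human_queue}


if __name__ == "__main__":
    demo = [
        Sample("2+2", "4", "4"),
        Sample("Summarize the text", "A short summary."),
    ]
    report = evaluate(demo, human_review_rate=0.5)
    print(report["mean_score"], len(report["human_queue"]))
```

In this sketch the pre-annotated samples are scored automatically, samples without a reference fall back to the judge, and the human review queue is a random spot check that can be used to verify both kinds of score.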