SCALABLE PROMPT EVALUATION
Score prompt output automatically.
Send samples for human evaluation.
Create human-validated pre-annotated samples.
Use a more advanced LLM to evaluate the results.
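One way the approaches above might fit together is sketched below: prompt output is scored automatically against a small set of human-validated, pre-annotated samples, and the mismatches are collected so they can be sent for human evaluation. This is a minimal illustration, not a prescribed method. The sample data, the function names (`normalize`, `evaluate`, `fake_model`), and exact-match scoring are all assumptions; `fake_model` is a stand-in for a real LLM call.

```python
from typing import Callable, List, Tuple

# Hypothetical human-validated samples: (input text, expected label).
GOLD_SAMPLES: List[Tuple[str, str]] = [
    ("The battery died after two days.", "negative"),
    ("Setup took five minutes and it just works.", "positive"),
    ("Arrived on time.", "neutral"),
]


def normalize(text: str) -> str:
    """Ignore case, surrounding whitespace, and trailing periods."""
    return text.strip().lower().rstrip(".")


def evaluate(
    generate: Callable[[str], str],
    samples: List[Tuple[str, str]],
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Score a prompt-backed function by exact match against gold labels.

    Returns the accuracy and a list of (input, expected, actual) failures
    that could be forwarded for human review.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    failures = []
    for text, expected in samples:
        actual = generate(text)
        if normalize(actual) != normalize(expected):
            failures.append((text, expected, actual))
    accuracy = 1 - len(failures) / len(samples)
    return accuracy, failures


def fake_model(text: str) -> str:
    """Stand-in for a real LLM call (assumption); swap in your own client."""
    return "Negative." if "died" in text else "positive"


accuracy, failures = evaluate(fake_model, GOLD_SAMPLES)
print(f"accuracy: {accuracy:.2f}")
for text, expected, actual in failures:
    print(f"needs review: {text!r} expected={expected!r} got={actual!r}")
```

Exact match only suits tasks with a closed set of answers. For free-form output, the comparison inside `evaluate` could be replaced by a call to a more advanced LLM acting as a judge, with the same failure list feeding human review.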