Artificial Intelligence · February 23, 2024

Download: White Paper on the “弈衡” General-Purpose Large Model Evaluation System

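This white paper is the industry's first devoted to large model evaluation. Built on three principles (objective and comprehensive coverage, fairness and impartiality, and a user perspective), it introduces the “2-4-6” framework of the “弈衡” general-purpose large model evaluation system.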

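The system divides evaluation scenarios into two types, basic tasks and application tasks; specifies four main evaluation elements; and defines more than 50 evaluation metrics across six dimensions.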

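The “弈衡” evaluation system can be used to evaluate and analyze large models from China and abroad. It exposes the inherent problems these models show in real applications and objectively reflects how they differ in accuracy, reliability, and safety, providing guidance for both evaluation practice and industrial deployment of large models.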

White Paper on the “弈衡” General-Purpose Large Model Evaluation System (download included)
