A new community-driven initiative evaluates large language models using Italian-native tasks, with AI translation among the ...
The testing sparked internal frustration about the progress of the Llama models. Yann LeCun, Meta’s outgoing chief AI ...
Joining the ranks of a growing number of smaller, powerful reasoning models is MiroThinker 1.5 from MiroMind, with just 30 ...
Large language models frequently misrepresent verbal risk terms used in medicine, potentially amplifying patient misunderstandings and diverging from established clinical definitions, according to a ...
This study introduces MathEval, a comprehensive benchmarking framework designed to systematically evaluate the mathematical reasoning capabilities of large language models (LLMs). Addressing key ...
Beijing-based Ubiquant launches code-focused systems claiming benchmark wins over US peers despite using far fewer parameters ...
Z.ai released GLM-4.7 ahead of Christmas, marking the latest iteration of its GLM large language model family. As open-source models move beyond chat-based applications and into production ...
Open-weight LLMs can unlock significant strategic advantages, delivering customization and independence in an increasingly AI ...