Long Context Language Model

New Llama 4 AI Model 10 Million Token Context Window

Meta has unveiled Llama 4, its latest artificial intelligence model, designed to redefine the boundaries of AI technology. This advanced model comes in two distinct variants—Maverick and Scout—each ...

TechCrunch

DeepSeek releases ‘sparse attention’ model that cuts API costs in half

Researchers at DeepSeek on Monday released a new experimental model called V3.2-exp, designed to have dramatically lower inference costs when used in long-context operations. DeepSeek announced the ...

ZDNet

What does a long context window mean for an AI model, like Gemini?

Imagine binge-watching a TV series, but you can only remember one episode at a time. When you move on to the next episode, you instantly forget everything you just watched. Now, imagine you can ...

DeepSeek rolls out DeepSeek-V4 with 1M token context, agentic AI, and coding capabilities

DeepSeek-V4 is available through web access and API, with support for standard developer integrations. DeepSeek has also confirmed that the following models will be retired: These will become ...

9to5Mac

Apple trained a large language model to efficiently understand long-form video

Apple researchers have developed an adapted version of the SlowFast-LLaVA model that beats larger models at long-form video analysis and understanding. Here’s what that means. Very basically, when an ...

4d on MSN

China’s DeepSeek releases new AI model V4. Here’s everything to know as the AI race speeds up

China’s AI startup is back a year after it stirred up the AI industry with ‘world-leading’ processing power at a fraction of ...

InfoQ

Gemma 3 Supports Vision-Language Understanding, Long Context Handling, and Improved Multilinguality

Semiconductor Engineering

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)

A new technical paper titled “Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention” was published by DeepSeek, Peking University and University of Washington.

Hackaday

How Anthropic’s Model Context Protocol Allows For Easy Remote Execution

... AI’ into more and more places, Anthropic’s Model Context Protocol (MCP) has been adopted as the standard to connect LLMs ...
