Context Engineering
Wiki

58 articles from arXiv, OpenAI, Anthropic, Google AI, and built-in terms. Auto-fetched and searchable.

arXiv · general

Engineering Tagging Languages for DSLs

To keep a DSL clean, readable and reusable in different contexts, it is useful to define a separate tagging language. A tag model logically adds information to the tagged DSL model while technically...

2016-06-16 · Timo Greifenberg, Markus Look +2
cs.SE
arXiv · token optimization

Data Engineering for Scaling Language Models to 128K Context

We study the continual pretraining recipe for scaling language models' context lengths to 128K, with a focus on data engineering. We hypothesize that long context modeling, in particular the...

2024-02-15 · Yao Fu, Rameswar Panda +3
cs.CL, cs.AI
arXiv · tool use

Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents

Large language models (LLMs) have achieved success in acting as agents, which interact with environments through tools such as search engines. However, LLMs are optimized for language generation...

2024-02-18 · Renxi Wang, Haonan Li +3
cs.CL
arXiv · token optimization

How Important Is Tokenization in French Medical Masked Language Models?

Subword tokenization has become the prevailing standard in the field of natural language processing (NLP) over recent years, primarily due to the widespread utilization of pre-trained language...

2024-02-22 · Yanis Labrak, Adrien Bazoge +3
cs.CL, cs.AI, cs.LG
arXiv · general

Token Weighting for Long-Range Language Modeling

Many applications of large language models (LLMs) require long-context understanding, but models continue to struggle with such tasks. We hypothesize that conventional next-token prediction training...

2025-03-12 · Falko Helm, Nico Daheim +1
cs.CL
arXiv · general

On the solution existence and stability of polynomial optimization problems

This paper introduces and investigates a regularity condition in the asymptotic sense for optimization problems whose objective functions are polynomial. Under this regularity condition, the...

2018-08-18 · Vu Trung Hieu
math.OC
arXiv · caching

Caching with rental cost and zapping

The \emph{file caching} problem is defined as follows. Given a cache of size $k$ (a positive integer), the goal is to minimize the total retrieval cost for the given sequence of requests to files. A...

2012-08-13 · Monik Khare, Neal E. Young
cs.DS
arXiv · prompt engineering

StruQ: Defending Against Prompt Injection with Structured Queries

Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications, which perform text-based tasks by utilizing their advanced language understanding capabilities. However,...

2024-02-09 · Sizhe Chen, Julien Piet +2
cs.CR
arXiv · rag

How and Where to Translate? The Impact of Translation Strategies in Cross-lingual LLM Prompting

Despite advances in the multilingual capabilities of Large Language Models (LLMs), their performance varies substantially across different languages and tasks. In multilingual retrieval-augmented...

2025-07-21 · Aman Gupta, Yingying Zhuang +3
cs.CL, cs.AI
arXiv · general

Exploiting Context to Identify Lexical Atoms -- A Statistical View of Linguistic Context

Interpretation of natural language is inherently context-sensitive. Most words in natural language are ambiguous and their meanings are heavily dependent on the linguistic context in which they are...

1997-01-02 · Chengxiang Zhai
cs.CL
arXiv · token optimization

Towards Adaptive Context Management for Intelligent Conversational Question Answering

This paper introduces an Adaptive Context Management (ACM) framework for Conversational Question Answering (ConvQA) systems. The key objective of the ACM framework is to optimize the...

2025-09-22 · Manoj Madushanka Perera, Adnan Mahmood +2
cs.CL
arXiv · rag

Impacts of National Cultures on Managerial Decisions of Engaging in Core Earnings Management

This study investigates the impact of Hofstede's cultural dimensions on abnormal core earnings management in multiple national cultural contexts. We employ an Ordinary Least Squares (OLS) regression...

2024-07-23 · Muhammad Rofiqul Islam, Abdullah Al Mehdi
econ.GN
arXiv · rag

EVOR: Evolving Retrieval for Code Generation

Retrieval-augmented generation (RAG) has recently been applied successfully to code generation. However, existing pipelines for retrieval-augmented code generation (RACG) employ static knowledge...

2024-02-19 · Hongjin Su, Shuyang Jiang +3
cs.CL, cs.AI
arXiv · rag

AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

We introduce Autoregressive Retrieval Augmentation (AR-RAG), a novel paradigm that enhances image generation by autoregressively incorporating k-nearest neighbor retrievals at the patch level. Unlike...

2025-06-08 · Jingyuan Qi, Zhiyang Xu +2
cs.CV
arXiv · general

Intelligent Interaction Strategies for Context-Aware Cognitive Augmentation

Human cognition is constrained by processing limitations, leading to cognitive overload and inefficiencies in knowledge synthesis and decision-making. Large Language Models (LLMs) present an...

2025-04-18 · Xiangrong Zhu +3
cs.HC
OpenAI Cookbook · prompt engineering

How to work with large language models

Large language models are functions that map text to text. Given an input string of text, a large language model predicts the text that should come next.

tokens, prompts
OpenAI Cookbook · prompt engineering

Techniques to improve reliability

When GPT-3 fails on a task, what should you do?

tokens, prompts
OpenAI Cookbook · prompt engineering

Related resources from around the web

People are writing great tools and papers for improving outputs from GPT. Here are some cool ones we've seen:

prompts
OpenAI Cookbook · token optimization

How to count tokens with tiktoken

How to count tokens with tiktoken. tiktoken ...

tokens, prompts, embeddings +2
OpenAI Cookbook · prompt engineering

How to stream completions

How to stream completions. By default, when you request a completion...

tokens, prompts, streaming
Anthropic · caching

Prompt Caching

Claude API Documentation

caching, optimization, tokens
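The feature above can be sketched as a request payload. A minimal sketch, assuming the `cache_control` breakpoint format from Anthropic's prompt-caching documentation; the model name and prompt text are placeholders:

```python
# Sketch of a Messages API payload using prompt caching: a large, stable
# system prompt is marked with cache_control so later requests can reuse
# the cached prefix instead of reprocessing it.
LARGE_SYSTEM_PROMPT = "You are a code-review assistant. ..."  # stable text worth caching

payload = {
    "model": "claude-sonnet-4-5",  # model name is illustrative
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LARGE_SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},  # cache everything up to here
        }
    ],
    "messages": [
        {"role": "user", "content": "Review this diff: ..."}  # varies per request
    ],
}
```

Only the part before the breakpoint is cached, so the varying user turn stays cheap to change between requests.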
Anthropic · prompt engineering

Prompt Engineering Overview

Claude API Documentation

prompts, engineering, best-practices
Anthropic · prompt engineering

Chain of Thought Prompting

Comprehensive guide to prompt engineering techniques for Claude's latest models, covering clarity, examples, XML structuring, thinking, and agentic systems.

prompts, reasoning, chain-of-thought
Anthropic · context management

Context Windows

Claude API Documentation

context, windows, tokens
Anthropic · context management

Long Context Window Tips

Comprehensive guide to prompt engineering techniques for Claude's latest models, covering clarity, examples, XML structuring, thinking, and agentic systems.

context, long-context, optimization
Anthropic · token optimization

Token Counting

Claude API Documentation

tokens, counting, usage
Anthropic · prompt engineering

Use XML Tags in Prompts

Comprehensive guide to prompt engineering techniques for Claude's latest models, covering clarity, examples, XML structuring, thinking, and agentic systems.

prompts, xml, structure
Anthropic · prompt engineering

Extended Thinking

Claude API Documentation

reasoning, thinking, chain-of-thought
Google AI · caching

Context Caching

Learn how to use Context Caching in the Gemini API

caching, context, optimization
Google AI · context management

Long Context

Learn about how to get started building with long context (1 million context window) on Gemini.

context, long-context, tokens
Google AI · token optimization

Tokens

The Gemini 3.1 Flash Lite preview is now available. Try it in AI Studio https://aistudio.google.com/prompts/new_chat?model=gemini 3.1 flash lite preview&hl=es 419 . ...

tokens, counting, usage
Google AI · prompt engineering

Prompting Strategies

The Gemini 3.1 Flash Lite preview is now available. Try it in AI Studio https://aistudio.google.com/prompts/new_chat?model=gemini 3.1 flash lite preview&hl=es 419 . ...

prompts, strategies, engineering
Google AI · prompt engineering

System Instructions

Get started building chat applications and generating text with the Gemini API.

system-prompts, instructions, engineering
Google AI · tool use

Code Execution

Learn how to use the Gemini API code execution feature.

code, execution, tools
Built-in · context management

Progressive Disclosure

Instead of loading an entire codebase—which would immediately overwhelm the attention budget—modern agents use JIT context. The assistant dynamically loads only the necessary data at runtime.

context, jit, optimization
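A minimal sketch of the JIT pattern described above; the `JITContext` class and its method names are illustrative, not a real agent API:

```python
from pathlib import Path

class JITContext:
    """Keep lightweight identifiers in context; load file contents on demand."""

    def __init__(self, root: str):
        # Only the paths enter the context window up front: a few tokens
        # per file instead of the entire codebase.
        self.paths = list(Path(root).rglob("*.py"))

    def load(self, needle: str, max_files: int = 2) -> str:
        """Read full text only for files whose name matches the current task."""
        hits = [p for p in self.paths if needle in p.name][:max_files]
        return "\n\n".join(p.read_text() for p in hits)
```

The attention budget is spent only when a step actually needs a file's contents, and `max_files` bounds the worst case.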
Built-in · context management

Lightweight Identifiers

The assistant maintains references (file paths, stored queries) and dynamically loads only the necessary data at runtime using tools like grep, head, or tail.

context, references, efficiency
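The grep/head-style access described above can be sketched with two small helpers (illustrative, assuming plain-text files):

```python
def head(path: str, n: int = 10) -> str:
    """First n lines: enough to identify a file without paying for all of it."""
    with open(path) as f:
        return "".join(line for _, line in zip(range(n), f))

def grep(path: str, pattern: str) -> list[str]:
    """Only the matching lines, with 1-based line numbers for later reference."""
    with open(path) as f:
        return [f"{i}: {line.rstrip()}" for i, line in enumerate(f, 1) if pattern in line]
```

The returned line numbers themselves become lightweight identifiers: the agent can cite "line 3 of db.py" later without holding the file in context.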
Built-in · context management

Compaction

When a session nears its token limit, the assistant summarizes critical details—such as architectural decisions and unresolved bugs—while discarding redundant tool outputs.

context, compression, long-horizon
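A minimal sketch of the trigger logic, assuming a chat-message list of `{"role", "content"}` dicts; the character-based token estimate and the canned summary stand in for a real tokenizer and a real LLM summarization call:

```python
def estimate_tokens(messages: list[dict]) -> int:
    # Rough heuristic: about 4 characters per token for English text.
    return sum(len(m["content"]) for m in messages) // 4

def compact(messages: list[dict], limit: int, keep_recent: int = 4) -> list[dict]:
    """Replace older turns with a summary once the session nears its limit."""
    if estimate_tokens(messages) < limit:
        return messages
    recent = messages[-keep_recent:]
    # In a real agent this summary comes from an LLM call over the dropped
    # turns; the placeholder names what compaction is meant to preserve.
    summary = {
        "role": "user",
        "content": "Summary of earlier turns: key decisions, open bugs, constraints.",
    }
    return [summary] + recent
```

Keeping the last few turns verbatim matters: the model's next action usually depends on them, while older turns compress well.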
Built-in · context management

Tool Result Clearing

A light-touch form of compaction in which the raw results of previous tool calls (like long terminal outputs) are cleared to save space.

context, tools, optimization
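A sketch of the clearing pass, assuming tool outputs appear as messages with `role == "tool"` (the exact message shape varies by API):

```python
def clear_tool_results(messages: list[dict], keep_last: int = 1) -> list[dict]:
    """Stub out old tool outputs; the agent already extracted what it needed."""
    tool_idxs = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    cleared = []
    for i, m in enumerate(messages):
        if m["role"] == "tool" and i not in tool_idxs[-keep_last:]:
            # Copy rather than mutate, so the full transcript survives elsewhere.
            m = {**m, "content": "[tool output cleared to save context]"}
        cleared.append(m)
    return cleared
```

Unlike full compaction, the conversation's turn structure is untouched, so the model still sees that a tool was called and in what order.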
Built-in · context management

Structured Note-taking

The agent may maintain an external NOTES.md or a to-do list to track dependencies and progress across thousands of steps, which it can read back into its context after a reset.

context, persistence, notes
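A minimal sketch of the pattern, using the NOTES.md convention named above; the `jot`/`recall` helper names are illustrative:

```python
from pathlib import Path

NOTES = Path("NOTES.md")

def jot(line: str) -> None:
    """Append one finding or to-do item; cheap to write at any step."""
    with NOTES.open("a") as f:
        f.write(f"- {line}\n")

def recall() -> str:
    """Read the notes back into context after a compaction or reset."""
    return NOTES.read_text() if NOTES.exists() else ""
```

Because the notes live on disk rather than in the context window, they survive resets that would otherwise discard thousands of steps of progress.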
Built-in · context management

Distractors

Files or code snippets that are topically related to the query but do not contain the answer can cause the model to lose focus or hallucinate.

context, pollution, relevance
Built-in · context management

Context Rot

As more tokens are added, the model's ability to accurately retrieve needles of information from the haystack of the codebase decreases.

context, degradation, tokens
Built-in · prompt engineering

XML Tagging

Use tags like <background_information>, <tool_guidance>, <constraints> to clearly separate different types of instructions in system prompts.

prompts, xml, structure
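The separation described above can be sketched as a small prompt builder (the function name is illustrative; the tag names are the ones from the entry):

```python
def build_system_prompt(background: str, tools: str, constraints: str) -> str:
    """Wrap each instruction type in its own XML tag for unambiguous boundaries."""
    return (
        f"<background_information>\n{background}\n</background_information>\n"
        f"<tool_guidance>\n{tools}\n</tool_guidance>\n"
        f"<constraints>\n{constraints}\n</constraints>"
    )
```

The tags carry no schema; they simply let the model tell reference material apart from hard rules without guessing at paragraph boundaries.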
Built-in · token optimization

High-Signal Tokens

The objective is to provide the smallest possible set of high-signal tokens that maximize the likelihood of the correct code generation.

tokens, optimization, quality
Built-in · context management

Structural Patterns

Research suggests that models often perform better on shuffled or unstructured context than on logically structured haystacks, impacting how they process long files.

context, structure, research
Built-in · architecture

Agent Skills

Reusable packages of domain expertise defined in SKILL.md files that provide specialized AI agent capabilities. Introduced as GA in VS Code 1.109, skills can be invoked as slash commands or loaded...

skills, agents, vscode +1
Built-in · architecture

Agent Hooks

Deterministic shell commands that execute at key lifecycle points during agent sessions. Unlike instructions, hooks run code with guaranteed outcomes for security policies, quality checks, or audit...

hooks, agents, lifecycle +1
Built-in · architecture

Agent Orchestration

A multi-agent pattern where specialized subagents collaborate on complex tasks, each operating in its own dedicated context window. Provides context efficiency, specialization with different models,...

orchestration, multi-agent, subagent +1
Built-in · context management

Message Steering

An agent interaction pattern where follow-up messages redirect a running agent request. The agent yields after the active tool execution and processes the new message. Alternatives include request...

agents, steering, queueing +1
Built-in · architecture

Terminal Sandboxing

A security mechanism restricting file system and network access for agent-executed terminal commands. Sandboxed commands have read/write access only to the workspace directory, and network access can...

security, sandbox, terminal +1
Built-in · token economics

Thinking Tokens

Tokens generated during a model's internal reasoning process before producing a visible response. Thinking tokens consume context budget but improve quality on complex tasks. Anthropic models support...

thinking, reasoning, tokens +1
arXiv · context management

A Survey of Context Engineering for Large Language Models

Context Engineering is a formal discipline that transcends simple prompt design to encompass the systematic optimization of information payloads for LLMs. This survey of 1,400+ papers covers context retrieval, processing, management, RAG, memory systems, tool-integrated reasoning, and multi-agent architectures.

2025-07-17 · Lingrui Mei, Jiayu Yao +13
cs.CL, cs.AI
arXiv · token optimization

Token-Budget-Aware LLM Reasoning

LLM reasoning chains are unnecessarily long and can be compressed by including a token budget in the prompt. This framework dynamically estimates a token budget per problem based on reasoning complexity, reducing token costs with only a slight performance reduction.

2024-12-24 · Tingxu Han, Zhenting Wang +4
cs.CL, cs.AI
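The idea in the abstract above can be sketched as a prompt template; the wording and the complexity-to-budget mapping are illustrative, not the paper's exact estimator:

```python
def budgeted_prompt(question: str, complexity: int) -> str:
    """Prepend the question to an instruction that caps the reasoning length."""
    budget = 25 * max(1, complexity)  # e.g. complexity on a 1-5 scale
    return (
        f"{question}\n"
        f"Let's think step by step and use less than {budget} tokens."
    )
```

Stating a budget in the prompt nudges the model to compress its chain of thought; harder problems get a proportionally larger allowance.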
arXiv · architecture

Agentic Much? Adoption of Coding Agents on GitHub

The first large-scale empirical study of coding agent adoption across 129,134 GitHub projects finds an estimated adoption rate of 15.85–22.60% by late 2025 — very high for a technology only months old. Agentic tools like Cursor, Claude Code, and Codex are rapidly replacing traditional code completion.

2026-01-30 · Romain Robbes, Théo Matricon +3
cs.SE
arXiv · context management

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

ACE treats the system prompt as an evolving playbook that accumulates strategies through generation, reflection, and curation. It achieves +10.6% on agent benchmarks and +8.6% on finance tasks while significantly reducing adaptation latency and rollout cost.

2025-10-06 · Qizheng Zhang, Changran Hu +11
cs.CL, cs.AI, cs.LG
arXiv · context management

Context Branching for LLM Conversations: A Version Control Approach to Exploratory Programming

ContextBranch applies version-control semantics (checkpoint, branch, switch, inject) to LLM conversations, reducing context size by 58.1% in exploratory programming. A 39% average performance drop in multi-turn conversations motivates structured context management.

2025-12-15 · Bhargav Chickmagalur Nanjundappa, Spandan Maaheshwari
cs.SE, cs.HC
arXiv · context management

Codified Context: Infrastructure for AI Agents in a Complex Codebase

A three-component codified context infrastructure — hot-memory constitution, 19 specialist agents, and cold-memory knowledge base — deployed across 283 sessions on a 108,000-line C# codebase, preventing LLMs from forgetting project conventions across sessions.

2026-02-27 · Aristidis Vasilopoulos
cs.SE, cs.AI
Tokalator · context management

SaaS Bridge Session: Context Engineering in Practice — Feedback Report

Summary of the SaaS Bridge developer session (March 2026) where Tokalator was introduced to ~90 developers. Key feedback themes: standalone CLI demand, turn-count visibility, and minor UI bugs.

2026-03-04 · Vahid Faraji
community, feedback, cli +1
Tokalator · context management

TechCareer Community Session: Developer Feedback on Context Engineering Tools

Summary of the TechCareer community session (March 2026) introducing Tokalator to ~80 developers. Feedback covered CLI workflows, turn-budget indicators, and minor bugs. TechCareer is an open developer community.

2026-03-05 · Vahid Faraji
community, feedback, cli +2