Context Engineering
Wiki
58 articles from arXiv, OpenAI, Anthropic, and Google AI, plus built-in terms. Auto-fetched and searchable.
Engineering Tagging Languages for DSLs
To keep a DSL clean, readable and reusable in different contexts, it is useful to define a separate tagging language. A tag model logically adds information to the tagged DSL model while technically...
Data Engineering for Scaling Language Models to 128K Context
We study the continual pretraining recipe for scaling language models' context lengths to 128K, with a focus on data engineering. We hypothesize that long context modeling, in particular the...
Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents
Large language models (LLMs) have achieved success in acting as agents, which interact with environments through tools such as search engines. However, LLMs are optimized for language generation...
How Important Is Tokenization in French Medical Masked Language Models?
Subword tokenization has become the prevailing standard in the field of natural language processing (NLP) over recent years, primarily due to the widespread utilization of pre-trained language...
Token Weighting for Long-Range Language Modeling
Many applications of large language models (LLMs) require long-context understanding, but models continue to struggle with such tasks. We hypothesize that conventional next-token prediction training...
On the solution existence and stability of polynomial optimization problems
This paper introduces and investigates a regularity condition in the asymptotic sense for optimization problems whose objective functions are polynomial. Under this regularity condition, the...
Caching with rental cost and zapping
The \emph{file caching} problem is defined as follows. Given a cache of size $k$ (a positive integer), the goal is to minimize the total retrieval cost for the given sequence of requests to files. A...
StruQ: Defending Against Prompt Injection with Structured Queries
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications, which perform text-based tasks by utilizing their advanced language understanding capabilities. However,...
How and Where to Translate? The Impact of Translation Strategies in Cross-lingual LLM Prompting
Despite advances in the multilingual capabilities of Large Language Models (LLMs), their performance varies substantially across different languages and tasks. In multilingual retrieval-augmented...
Exploiting Context to Identify Lexical Atoms -- A Statistical View of Linguistic Context
Interpretation of natural language is inherently context-sensitive. Most words in natural language are ambiguous and their meanings are heavily dependent on the linguistic context in which they are...
Towards Adaptive Context Management for Intelligent Conversational Question Answering
This paper introduces an Adaptive Context Management (ACM) framework for Conversational Question Answering (ConvQA) systems. The key objective of the ACM framework is to optimize the...
Impacts of National Cultures on Managerial Decisions of Engaging in Core Earnings Management
This study investigates the impact of Hofstede's cultural dimensions on abnormal core earnings management in multiple national cultural contexts. We employ an Ordinary Least Squares (OLS) regression...
EVOR: Evolving Retrieval for Code Generation
Retrieval-augmented generation (RAG) has recently been applied successfully to code generation. However, existing pipelines for retrieval-augmented code generation (RACG) employ static knowledge...
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
We introduce Autoregressive Retrieval Augmentation (AR-RAG), a novel paradigm that enhances image generation by autoregressively incorporating k-nearest neighbor retrievals at the patch level. Unlike...
Intelligent Interaction Strategies for Context-Aware Cognitive Augmentation
Human cognition is constrained by processing limitations, leading to cognitive overload and inefficiencies in knowledge synthesis and decision-making. Large Language Models (LLMs) present an...
How to work with large language models
Large language models are functions that map text to text. Given an input string of text, a large language model predicts the text that should come next.
Techniques to improve reliability
When GPT-3 fails on a task, what should you do?
Related resources from around the web
People are writing great tools and papers for improving outputs from GPT. Here are some cool ones we've seen:
How to count tokens with tiktoken
A cookbook notebook showing how to count the tokens in a text string with tiktoken, OpenAI's open-source tokenizer, before sending it to the API.
How to stream completions
By default, when you request a completion, the entire completion is generated before being sent back in a single response. This cookbook notebook shows how to stream the completion as it is generated instead.
Prompt Caching
Claude API Documentation
Prompt Engineering Overview
Claude API Documentation
Chain of Thought Prompting
Comprehensive guide to prompt engineering techniques for Claude's latest models, covering clarity, examples, XML structuring, thinking, and agentic systems.
Context Windows
Claude API Documentation
Long Context Window Tips
Claude API Documentation
Token Counting
Claude API Documentation
Use XML Tags in Prompts
Claude API Documentation
Extended Thinking
Claude API Documentation
Context Caching
Learn how to use Context Caching in the Gemini API
Long Context
Learn about how to get started building with long context (1 million context window) on Gemini.
Tokens
Gemini API Documentation
Prompting Strategies
Gemini API Documentation
System Instructions
Get started building chat and text-generation applications with the Gemini API.
Code Execution
Learn how to use the Gemini API code execution feature.
Progressive Disclosure
Instead of loading an entire codebase—which would immediately overwhelm the attention budget—modern agents use JIT context. The assistant dynamically loads only the necessary data at runtime.
Lightweight Identifiers
The assistant maintains references (file paths, stored queries) and dynamically loads only the necessary data at runtime using tools like grep, head, or tail.
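A minimal sketch of this just-in-time loading pattern, in pure Python. The function and its parameters are hypothetical; it stands in for the grep-style tools an agent would actually call, returning only matching lines (with their file-path identifiers) instead of whole files.

```python
import re
from pathlib import Path

def load_relevant_snippets(root: str, pattern: str, max_lines: int = 20) -> list[str]:
    """Grep-style JIT loading: scan files under `root` and return only the
    lines matching `pattern`, each tagged with its lightweight identifier
    (path and line number), instead of placing whole files in context."""
    matches = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if re.search(pattern, line):
                matches.append(f"{path}:{lineno}: {line.strip()}")
                if len(matches) >= max_lines:
                    return matches
    return matches
```

The `max_lines` cap matters: even a targeted search must be bounded so one query cannot flood the attention budget the pattern is meant to protect.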
Compaction
When a session nears its token limit, the assistant summarizes critical details—such as architectural decisions and unresolved bugs—while discarding redundant tool outputs.
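A minimal sketch of compaction, assuming a simple chat-message list. In a real agent the summary would come from an LLM call; here the first line of each folded message is a stand-in, and the 4-characters-per-token estimate is a rough heuristic, not an exact count.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def compact(messages: list[dict], limit: int, keep_recent: int = 4) -> list[dict]:
    """If the conversation exceeds `limit` estimated tokens, fold everything
    except the last `keep_recent` messages into one summary message."""
    total = sum(estimate_tokens(m["content"]) for m in messages)
    if total <= limit or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = "; ".join(m["content"].splitlines()[0][:80] for m in old)
    return [{"role": "system", "content": f"Summary of earlier turns: {summary}"}] + recent
```

Keeping the most recent turns verbatim while summarizing the rest preserves the details the model needs next, such as the decisions and open bugs the entry mentions.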
Tool Result Clearing
A light touch form of compaction where the raw results of previous tool calls (like long terminal outputs) are cleared to save space.
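A minimal sketch of this clearing step, assuming a message list where tool results carry `role="tool"` (the role name is an assumption; APIs differ). Only the contents are replaced, so the record that each call happened survives.

```python
def clear_old_tool_results(messages: list[dict], keep_last: int = 2) -> list[dict]:
    """Replace the content of all but the most recent `keep_last` tool-result
    messages with a short placeholder, freeing context without deleting
    the surrounding conversation."""
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    to_clear = set(tool_indices[:-keep_last]) if keep_last else set(tool_indices)
    return [
        {**m, "content": "[tool output cleared]"} if i in to_clear else m
        for i, m in enumerate(messages)
    ]
```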
Structured Note-taking
The agent may maintain an external NOTES.md or a to-do list to track dependencies and progress across thousands of steps, which it can read back into its context after a reset.
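The mechanism can be sketched in a few lines: an append-only notes file that survives context resets. The helper names are hypothetical; a real agent would write and read the file through its file tools.

```python
from pathlib import Path

def note(notes_file: Path, entry: str) -> None:
    """Append one bullet to the agent's external notes file."""
    with notes_file.open("a") as f:
        f.write(f"- {entry}\n")

def restore_notes(notes_file: Path) -> str:
    """Read the notes back into context after a reset; empty if none exist."""
    return notes_file.read_text() if notes_file.exists() else ""
```

Because the file lives outside the context window, its size does not count against the token budget until the agent chooses to read it back.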
Distractors
Files or code snippets that are topically related to the query but do not contain the answer can cause the model to lose focus or hallucinate.
Context Rot
As more tokens are added, the model's ability to accurately retrieve needles of information from the haystack of the codebase decreases.
XML Tagging
Use tags like <background_information>, <tool_guidance>, <constraints> to clearly separate different types of instructions in system prompts.
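A minimal sketch of assembling such a prompt; the three section contents are placeholders, but the tag names are the ones listed above.

```python
def build_system_prompt(background: str, tool_guidance: str, constraints: str) -> str:
    """Separate instruction types with XML tags so the model can tell
    background facts apart from tool guidance and hard constraints."""
    return (
        f"<background_information>\n{background}\n</background_information>\n"
        f"<tool_guidance>\n{tool_guidance}\n</tool_guidance>\n"
        f"<constraints>\n{constraints}\n</constraints>"
    )
```

The payoff is unambiguous boundaries: a constraint can never be misread as optional background because it sits inside its own clearly labeled section.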
High-Signal Tokens
The objective is to provide the smallest possible set of high-signal tokens that maximize the likelihood of the correct code generation.
Structural Patterns
Research suggests that models often perform better on shuffled or unstructured context than on logically structured haystacks, impacting how they process long files.
Agent Skills
Reusable packages of domain expertise defined in SKILL.md files that provide specialized AI agent capabilities. Introduced as GA in VS Code 1.109, skills can be invoked as slash commands or loaded...
Agent Hooks
Deterministic shell commands that execute at key lifecycle points during agent sessions. Unlike instructions, hooks run code with guaranteed outcomes for security policies, quality checks, or audit...
Agent Orchestration
A multi-agent pattern where specialized subagents collaborate on complex tasks, each operating in its own dedicated context window. Provides context efficiency, specialization with different models,...
Message Steering
An agent interaction pattern where follow-up messages redirect a running agent request. The agent yields after the active tool execution and processes the new message. Alternatives include request...
Terminal Sandboxing
A security mechanism restricting file system and network access for agent-executed terminal commands. Sandboxed commands have read/write access only to the workspace directory, and network access can...
Thinking Tokens
Tokens generated during a model's internal reasoning process before producing a visible response. Thinking tokens consume context budget but improve quality on complex tasks. Anthropic models support...
A Survey of Context Engineering for Large Language Models
Context Engineering is a formal discipline that transcends simple prompt design to encompass the systematic optimization of information payloads for LLMs. This survey of 1,400+ papers covers context retrieval, processing, management, RAG, memory systems, tool-integrated reasoning, and multi-agent architectures.
Token-Budget-Aware LLM Reasoning
LLM reasoning chains are unnecessarily long and can be compressed by including a token budget in the prompt. This framework dynamically estimates a token budget per problem based on reasoning complexity, reducing token costs with only a slight performance reduction.
Agentic Much? Adoption of Coding Agents on GitHub
The first large-scale empirical study of coding agent adoption across 129,134 GitHub projects finds an estimated adoption rate of 15.85–22.60% by late 2025 — very high for a technology only months old. Agentic tools like Cursor, Claude Code, and Codex are rapidly replacing traditional code completion.
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
ACE treats the system prompt as an evolving playbook that accumulates strategies through generation, reflection, and curation. It achieves +10.6% on agent benchmarks and +8.6% on finance tasks while significantly reducing adaptation latency and rollout cost.
Context Branching for LLM Conversations: A Version Control Approach to Exploratory Programming
ContextBranch applies version-control semantics (checkpoint, branch, switch, inject) to LLM conversations, reducing context size by 58.1% in exploratory programming. A 39% average performance drop in multi-turn conversations motivates structured context management.
Codified Context: Infrastructure for AI Agents in a Complex Codebase
A three-component codified context infrastructure — hot-memory constitution, 19 specialist agents, and cold-memory knowledge base — deployed across 283 sessions on a 108,000-line C# codebase, preventing LLMs from forgetting project conventions across sessions.
SaaS Bridge Session: Context Engineering in Practice — Feedback Report
Summary of the SaaS Bridge developer session (March 2026) where Tokalator was introduced to ~90 developers. Key feedback themes: standalone CLI demand, turn-count visibility, and minor UI bugs.
TechCareer Community Session: Developer Feedback on Context Engineering Tools
Summary of the TechCareer community session (March 2026) introducing Tokalator to ~80 developers. Feedback covered CLI workflows, turn-budget indicators, and minor bugs. TechCareer is an open developer community.