Token-Budget-Aware LLM Reasoning
Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen
Abstract
Reasoning is critical for large language models (LLMs) to excel in a wide range of tasks. While methods like Chain-of-Thought (CoT) reasoning enhance LLM performance by decomposing problems into intermediate steps, they also incur significant overhead in token usage, leading to increased costs. This paper finds that the reasoning process of current LLMs is unnecessarily lengthy and can be compressed by including a reasonable token budget in the prompt. The authors propose a token-budget-aware LLM reasoning framework (TALE) that dynamically adjusts the number of reasoning tokens based on the reasoning complexity of each problem. Experiments show that the method effectively reduces token costs in CoT reasoning with only a slight performance reduction.
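The idea of placing a token budget in the prompt can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `estimate_budget` is a hypothetical word-count heuristic standing in for whatever complexity estimator a real system would use, its constants are arbitrary, and the prompt wording in `build_budgeted_prompt` is an assumed phrasing of a budget instruction.

```python
def estimate_budget(question: str, min_budget: int = 50, max_budget: int = 500) -> int:
    """Hypothetical heuristic (an assumption, not TALE's estimator):
    allot 10 reasoning tokens per word of the question on top of a floor,
    clamped to the range [min_budget, max_budget]."""
    words = len(question.split())
    budget = min_budget + 10 * words
    return max(min_budget, min(max_budget, budget))


def build_budgeted_prompt(question: str, budget: int) -> str:
    """Append a budget instruction to a chain-of-thought prompt.
    The exact wording is an assumption chosen for illustration."""
    return (
        f"{question}\n"
        f"Let's think step by step and use less than {budget} tokens:"
    )


if __name__ == "__main__":
    q = "What is 2 + 2?"
    b = estimate_budget(q)
    print(build_budgeted_prompt(q, b))
```

The resulting string would be sent to the model in place of a plain chain-of-thought prompt. A budget that adapts per problem, rather than one fixed value, is what lets simple questions spend fewer reasoning tokens than hard ones.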
Relevance to Tokalator
Both TALE and Tokalator treat token expenditure as a controllable variable rather than a side effect. TALE controls tokens at the model reasoning level; Tokalator controls tokens at the IDE session level via per-turn preview and budget decomposition. Published at ACL 2025 Findings.
Related Articles
Data Engineering for Scaling Language Models to 128K Context
We study the continual pretraining recipe for scaling language models' context lengths to 128K, with a focus on data engineering. We hypothesize that long context modeling, in particular the...
How Important Is Tokenization in French Medical Masked Language Models?
Subword tokenization has become the prevailing standard in the field of natural language processing (NLP) over recent years, primarily due to the widespread utilization of pre-trained language...
Towards Adaptive Context Management for Intelligent Conversational Question Answering
This paper introduces an Adaptive Context Management (ACM) framework for Conversational Question Answering (ConvQA) systems. The key objective of the ACM framework is to optimize the...
How to count tokens with tiktoken