Token

The basic unit of text processed by a large language model. A token is roughly equivalent to three to four characters, or about three-quarters of a word in English. AI models measure their inputs and outputs in tokens, and usage is typically billed on a per-token basis. In commercial real estate applications, understanding token limits is important when working with large documents such as leases, appraisal reports, or financial models, as content that exceeds a model’s context window must be chunked or summarized before processing.
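
For a rough sense of scale, token counts can be measured directly before sending text to a model. A minimal sketch, assuming the open-source tiktoken tokenizer and its cl100k_base encoding (other providers ship their own tokenizers, and counts vary by model):

```python
# Count tokens in a snippet of lease language using tiktoken (assumed here;
# swap in your provider's tokenizer for exact counts).
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

text = "The tenant shall pay base rent of $42.50 per square foot, NNN."
tokens = encoding.encode(text)

print(f"{len(text)} characters -> {len(tokens)} tokens")
# In English prose, roughly four characters per token is a common rule of thumb.
```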

Putting Token in Context

When an acquisitions analyst sends a 200-page offering memorandum to an AI model for summarization, the document is first converted into tokens before the model reads it. If the document exceeds the model’s context window, the team must split it into sections and process each separately, which affects how prompts are structured and how outputs are stitched together into a final deliverable.
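
A hypothetical end-to-end sketch of that workflow, assuming tiktoken for counting; the summarize function is a stand-in for the actual model call, and the 128K-token window is an assumed limit:

```python
# Hypothetical sketch: check whether a document fits the context window,
# split it by section if not, process each piece, and stitch the results
# into one deliverable. summarize() is a stand-in for the model call.
import tiktoken

CONTEXT_WINDOW = 128_000      # assumed model limit, in tokens
RESERVED_FOR_OUTPUT = 4_000   # leave headroom for the model's response

enc = tiktoken.get_encoding("cl100k_base")

def summarize(text: str) -> str:
    raise NotImplementedError("replace with your provider's API call")

def summarize_document(document: str) -> str:
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT
    if len(enc.encode(document)) <= budget:
        return summarize(document)          # fits in a single pass
    # Naive section split on blank lines; real pipelines would split on
    # document structure (articles, exhibits, rent roll pages).
    sections = [s for s in document.split("\n\n") if s.strip()]
    partials = [summarize(s) for s in sections]
    # Stitch the per-section summaries into a final summary.
    return summarize("\n\n".join(partials))
```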


Frequently Asked Questions about Token

How does token usage affect the cost of AI workflows?

Most commercial AI APIs charge on a per-token basis for both input and output. In CRE workflows involving large documents such as lease abstracts, appraisal reports, or rent rolls, the volume of tokens processed per task adds up quickly. Teams running high-frequency AI workflows should monitor token consumption to manage costs and optimize prompts to be as concise as possible without sacrificing accuracy.
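
To make that concrete, a back-of-the-envelope estimate for a single lease-abstraction call (all prices and token counts below are assumptions for illustration, not any provider’s actual pricing):

```python
# Illustrative per-document cost estimate. Prices and token counts are
# assumed; check your provider's current rate card.
INPUT_PRICE_PER_1K = 0.0025   # assumed $ per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.0100  # assumed $ per 1,000 output tokens

input_tokens = 45_000   # e.g., a full lease plus the prompt
output_tokens = 1_500   # e.g., a structured abstract in return

cost = (input_tokens / 1_000) * INPUT_PRICE_PER_1K \
     + (output_tokens / 1_000) * OUTPUT_PRICE_PER_1K
print(f"Estimated cost per document: ${cost:.4f}")  # $0.1275 here
```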

What happens when a document exceeds the context window?

When a document is too large to fit within a model’s context window, it must be broken into smaller segments before processing. For CRE use cases, this means a long lease or financial model may need to be chunked by section, with each chunk processed separately and results reassembled afterward. This introduces the risk of losing context across segments, so it is important to design chunking logic carefully and include overlapping context where needed.
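
One common approach is token-based chunking with a fixed overlap between consecutive chunks. A minimal sketch, again assuming tiktoken; the chunk and overlap sizes are illustrative:

```python
# Token-based chunking with overlap, assuming tiktoken. Overlapping a few
# hundred tokens helps carry context (e.g., defined terms) across
# chunk boundaries.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def chunk_with_overlap(text: str, chunk_size: int = 2_000,
                       overlap: int = 200) -> list[str]:
    tokens = enc.encode(text)
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(enc.decode(window))
        if start + chunk_size >= len(tokens):
            break  # final chunk reached the end of the document
    return chunks
```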

Is one token the same as one word?

Tokens do not map directly to words or characters. A single common word may be one token, while a rare or technical term may be split into multiple tokens. In CRE documents, specialized vocabulary like “capitalization rate” or “defeasance” may tokenize differently than general English, which means token counts for industry-specific documents can be higher than a simple word count would suggest.
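
The effect is easy to check empirically. A short sketch, assuming tiktoken (exact splits depend on the encoding and model):

```python
# Compare how common vs. specialized CRE terms tokenize, assuming tiktoken.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for term in ["rent", "capitalization rate", "defeasance", "estoppel"]:
    ids = enc.encode(term)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{term!r} -> {len(ids)} token(s): {pieces}")
# Rare legal and financial terms often split into several sub-word tokens,
# which is why token counts can exceed what a word count suggests.
```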

Is a larger context window always better?

A larger context window allows more content to be processed in a single pass, which reduces the need for chunking and preserves continuity across long documents like multi-tenant lease abstracts or full underwriting packages. However, filling a larger context window increases the cost per call, since billing is per token, and can dilute the model’s attention on the most relevant sections. For many CRE workflows, a targeted prompt with a well-structured excerpt performs better than feeding the entire document at once.
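
Where a targeted excerpt is preferable, the prompt can be built from just the relevant section. A hypothetical sketch; the keyword-based extract_section helper below is an illustration, not a standard API, and a production pipeline would split on the document’s actual structure:

```python
# Hypothetical sketch: prompt with a targeted excerpt rather than the
# whole document. extract_section() is a simple keyword match for
# illustration only.
def extract_section(document: str, heading: str) -> str:
    for section in document.split("\n\n"):
        if heading.lower() in section.lower():
            return section
    return ""

lease_text = open("lease.txt").read()   # stand-in source document
excerpt = extract_section(lease_text, "Renewal Option")

prompt = (
    "You are reviewing a commercial lease. Using only the excerpt below, "
    "summarize the renewal option terms.\n\n"
    f"EXCERPT:\n{excerpt}"
)
```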

Why should teams track token usage?

Tracking token usage is a practical necessity for any team running AI workflows at scale. Without visibility into consumption, costs can escalate quickly, particularly in document-heavy processes like due diligence or lease review. Most API providers expose token counts in their response metadata, and teams can log this data to identify which workflows are consuming the most tokens and where prompt optimization or document preprocessing would reduce spend.
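
A minimal logging sketch follows. The prompt_tokens, completion_tokens, and total_tokens field names mirror a common response-metadata pattern but vary by provider, so treat them as assumptions:

```python
# Minimal usage-logging sketch. The usage field names below are assumed;
# map them to whatever your provider reports in its response metadata.
import csv
import datetime

def log_usage(workflow: str, usage: dict, path: str = "token_usage.csv") -> None:
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.datetime.now().isoformat(timespec="seconds"),
            workflow,
            usage.get("prompt_tokens", 0),
            usage.get("completion_tokens", 0),
            usage.get("total_tokens", 0),
        ])

# Example: after each API call, log the provider-reported usage
# (values here are illustrative).
log_usage("lease_abstract",
          {"prompt_tokens": 45_000, "completion_tokens": 1_500,
           "total_tokens": 46_500})
```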

