Using Generative AI to Outperform Humans at Financial Statement Analysis

Researchers at the University of Chicago recently released the results of an important study that evaluated GPT-4’s financial analysis capabilities by having it predict future earnings from financial statements. By using Chain-of-Thought (CoT) reasoning, the AI model (in this case, a GPT-4-powered custom GPT) was not only capable of analyzing financial data but also outperformed human analysts by a meaningful margin.

Across more than 150,000 predictions dating back to the 1960s, the GPT had a prediction accuracy rate of 60.35%, compared to only 53% to 57% for the human analysts. This level of accuracy was comparable to that of a narrowly trained, state-of-the-art machine learning (ML) model, the type of model sophisticated investment firms invest millions to build and maintain.

This caught our attention here at A.CRE, because the study demonstrated that generative AI has the real potential to change not just how stock investors analyze financial data but how we as real estate investors analyze the financial and credit risk of our tenants. In this post, we’ll explore these findings in depth and discuss possible implications for commercial real estate.

  • Click here to try for yourself the custom GPT (i.e. Financial Statement Analyzer) that the team at the University of Chicago developed.

Real Estate’s Traditional (But Lacking) Approach to Financial Analysis

In commercial real estate, the value of a real estate asset is directly linked to the quantity and certainty of the future cash flow that asset can produce. That cash flow generally comes in the form of lease payments from tenants. And the certainty of those lease cash flows is often a function of the creditworthiness and financial strength of the entity or individual guaranteeing those lease payments.

To assess that creditworthiness, commercial real estate professionals have traditionally either a) relied on 3rd parties to make those judgements for them (e.g. public credit ratings from firms like S&P) or b) relied on a manual examination of the tenant’s financial statements—balance sheets, income statements, and cash flow statements—to determine the likelihood that the tenant will meet its obligations.

However, this traditional method faces several challenges: it is time-consuming, it is prone to human error, and, quite frankly, most real estate professionals are not experts at assessing credit risk. As a result, real estate professionals have long been limited in their ability to analyze credit risk.

The Research Findings – A GPT to Predict Financial Performance

And so, with that in mind, you can understand why we were excited to come across a tool backed by real research that CRE professionals could (potentially) use to better understand the financial health of tenants. Here is how it works:

The team at the University of Chicago set out to test the ability of GPT-4 (i.e., generative AI) to predict future earnings. To do so, they built a custom GPT with ChatGPT that used OpenAI’s GPT-4 large language model. The GPT they built leverages advanced language processing capabilities to mimic the detailed and nuanced approach of human financial analysts.

The GPT model uses a “chain-of-thought” prompt to guide its analysis, breaking down the process into steps that parallel human reasoning. Initially, the AI examines financial statements, identifies significant changes, and calculates key financial ratios. It then interprets these ratios to predict whether a company’s earnings will increase or decrease. The model also provides a rationale for its predictions and assesses the confidence level of its conclusions.
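The step sequence described above can be sketched as a simple prompt template. This is a minimal illustration, not the researchers' actual prompt: the step wording, the `COT_STEPS` list, and the `build_cot_prompt` function are all assumptions made here for clarity.

```python
# Hypothetical chain-of-thought steps, paraphrased from the description above.
# The wording is an assumption, not the prompt used in the study.
COT_STEPS = [
    "Identify notable changes in financial statement items versus the prior year.",
    "Compute key financial ratios (liquidity, leverage, margins) and show the arithmetic.",
    "Interpret what those ratios imply about the company's trajectory.",
    "Predict whether earnings will increase or decrease next year.",
    "Give a short rationale and a confidence level between 0 and 1.",
]


def build_cot_prompt(statements_text: str) -> str:
    """Return a prompt that asks the model to reason step by step."""
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(COT_STEPS, start=1))
    return (
        "You are a financial analyst. Work through the following steps in order.\n"
        f"{steps}\n\n"
        "Financial statements:\n"
        f"{statements_text.strip()}\n"
    )


if __name__ == "__main__":
    print(build_cot_prompt("Revenue: 100\nOperating income: 12"))
```

The resulting string could be pasted into any chat model. The point of the structure is that the prediction comes last, after the model has written out its intermediate analysis.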

The model was evaluated on a comprehensive dataset, including Compustat annual financial data from 1968 to 2021 and analyst forecasts from IBES starting in 1983. When this large language model (LLM) was compared against human analysts, the results were compelling. Using chain-of-thought (CoT) reasoning, the GPT-4 model achieved an accuracy of 60.35% in predicting the direction of future earnings, significantly outperforming the first-month analyst consensus forecast accuracy of 52.71%.
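For readers unfamiliar with the metric, directional accuracy is simply the share of predictions where the predicted direction (increase or decrease) matched what actually happened. A minimal sketch follows; the function name and the toy data are illustrative assumptions and are not taken from the study.

```python
def directional_accuracy(predicted: list[str], actual: list[str]) -> float:
    """Fraction of cases where the predicted direction matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)


# Toy data (made up for illustration): two of the four calls are correct.
predicted = ["up", "down", "up", "up"]
actual = ["up", "up", "up", "down"]
print(directional_accuracy(predicted, actual))  # 0.5
```

Because the outcome is binary, a coin flip would score about 50% on a balanced sample, which is the baseline against which figures like 60.35% and 52.71% should be read.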

This performance showcased the model’s ability to effectively analyze financial statements. More importantly, it highlighted AI’s potential to surpass human analysts in specific tasks, which could include acting as a risk analyst that more accurately assesses the credit risk of real estate tenants.

Implications for Commercial Real Estate

The implementation of AI, particularly advanced language models like GPT-4, to assess the financial strength of tenants is likely to have a meaningful impact on the commercial real estate (CRE) industry. Here are a few thoughts on the potential implications of tech like this for our industry.

1. AI as a Companion to Real Estate Underwriters

AI can serve as a powerful companion to real estate underwriters, enabling faster analysis and more accurate predictions about a tenant’s ability to meet their lease obligations. By leveraging AI’s advanced data processing capabilities, underwriters can swiftly evaluate financial statements, identify key financial strengths and weaknesses, and predict future financial health with a greater degree of accuracy.

2. Short-Term Opportunities for Tech-Forward Owners

In the short run, tech-forward property owners can leverage AI to gain a competitive edge when investing in properties leased to non-credit tenants. Traditionally, such tenants pose a higher risk due to the lack of publicly available credit ratings, so properties leased to them command a higher yield to justify the greater risk (i.e., a higher discount rate).

However, with AI-driven insights, owners can better assess the financial health of these types of tenants, making it feasible to invest in properties that would have otherwise been avoided in a human-only credit analysis context. This capability opens up new avenues to acquire assets on an attractive risk-return basis, as the risk is lowered due to a better understanding of the tenant’s credit.

3. Long-Term Implications: Lower Discount Rates

In the long run, the ability to effectively assess the credit risk of all tenants – even those that are not publicly rated – should lead to lower discount rates for non-credit tenants. As owners gain more confidence in their ability to evaluate the financial stability of these tenants, they can justify lower risk premiums. This reduction in discount rates translates to higher property values, more attractive financing terms, and ultimately more capital flowing into CRE.
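To see why a lower discount rate translates to a higher value, consider a simplified sketch that capitalizes a stabilized net operating income (NOI) at the discount rate minus a constant growth rate. This constant-growth model, the function name, and every number below are assumptions chosen for illustration, not figures from the study or from a real property.

```python
def capitalized_value(noi: float, discount_rate: float, growth_rate: float) -> float:
    """Value = NOI / (discount rate - growth rate), a constant-growth simplification."""
    cap_rate = discount_rate - growth_rate
    if cap_rate <= 0:
        raise ValueError("discount_rate must exceed growth_rate")
    return noi / cap_rate


noi = 500_000   # hypothetical annual NOI in dollars
growth = 0.02   # hypothetical long-run NOI growth

before = capitalized_value(noi, 0.09, growth)  # higher risk premium
after = capitalized_value(noi, 0.08, growth)   # premium reduced by 100 basis points

print(f"Value at 9% discount rate: ${before:,.0f}")
print(f"Value at 8% discount rate: ${after:,.0f}")
```

Under these assumptions, trimming the discount rate by one percentage point lifts the value from roughly $7.1 million to roughly $8.3 million. A full discounted cash flow model would be more nuanced, but the direction of the effect is the same.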


While AI tools like this offer significant potential, they are not a panacea. AI predictions, including those from advanced models like GPT-4, are still often wrong, even if in this case they were shown to be more accurate than human analysts.

AI tools should be viewed as enhancements (assistants, if you will) to the traditional underwriting process, not replacements. AI predictions, while more accurate than the status quo, are not infallible. Real estate professionals must continue to apply their judgment and expertise, using AI as a valuable tool to support, rather than dictate, their decisions.

Practical Application of AI in Real Estate Credit Analysis

So, how might you as a real estate practitioner think about using AI in real estate credit analysis today?

First, the custom GPT that the team at the University of Chicago built is publicly available (as of June 2024). You can find it here. Simply drop in the financial statements for a company and it will walk you through a chain-of-thought reasoning exercise to ultimately predict whether the company’s earnings will increase or decrease over the next 12 months.

Then, with that comprehensive analysis done, you can ask follow-up questions such as “What would you estimate is the probability this company will be able to pay its lease obligations over the next 12 months, 60 months, 120 months, etc.? Explain your reasoning.” The model will then use the context of the comprehensive financial statement analysis to provide a probability that will almost certainly be wrong, but likely less wrong than that of a human.
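One way to sanity-check the probabilities a model returns across horizons is to compare them with a simple compounding baseline. The sketch below assumes a constant annual default probability, which is a strong simplification introduced here and not something the study or the GPT uses. The function name and the 2% input are hypothetical.

```python
def survival_probability(annual_default_prob: float, months: int) -> float:
    """Probability of no default over `months`, assuming a constant annual default rate."""
    if not 0.0 <= annual_default_prob <= 1.0:
        raise ValueError("annual_default_prob must be between 0 and 1")
    if months < 0:
        raise ValueError("months must be non-negative")
    return (1.0 - annual_default_prob) ** (months / 12)


# Hypothetical tenant with a 2% chance of default in any given year.
for months in (12, 60, 120):
    prob = survival_probability(0.02, months)
    print(f"{months:>3} months: {prob:.1%} chance of meeting lease obligations")
```

If the model's 120-month estimate is higher than its 12-month estimate, or the figures otherwise fail to decline with the horizon, that is a signal to question the output.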

Second, the key to the GPT’s success appears to be the chain-of-thought logic process that the team prompted the model to follow. Employing that same logic process, you could build a similar GPT that, instead of leading toward an earnings prediction, leads toward a credit rating prediction: “If this company were rated by a public credit rating agency, what S&P-equivalent rating would it likely receive?”

Again, the output from that custom GPT you built would almost certainly be wrong, but if built right would likely improve on the results of a lay real estate person with novice credit risk analysis skills.

And finally, if relying on the prediction of an AI model is a road too far at this point, you could simply use AI to speed up your own analysis. It is quite adept at summarizing information, extracting key data points, building visuals, and handling many of the other time-consuming tasks that go into manual credit risk analysis. In essence, artificial intelligence can help you analyze more tenants in less time.


As generative AI continues to evolve, its capabilities in real estate underwriting will become more and more obvious and sophisticated. Eventually, we won’t perform the analysis ourselves; instead, we will direct an AI agent that performs the analysis for us. The quality of the output will still be a function of our instructions, training, the data that goes into the analysis, and to some degree our human intuition. Whether that day is five years from now or fifty years from now, we can only hypothesize; but that day is almost certainly coming.

About the Author: Born and raised in the Northwest United States, Spencer Burton has over 20 years of residential and commercial real estate experience. Over his career, he has underwritten $30+ billion of commercial real estate at some of the largest institutional real estate firms in the world. He is currently President and member of the founding team at Stablewood. Spencer holds a BS in International Affairs from Florida State University and a Masters in Real Estate Finance from Cornell University.