In the last quarter of 2024 we’ve seen growing enthusiasm around AI, as a number of UK practices emerge from a period of research and engagement across their teams and, in many cases, go on to launch formal policies and “AI Handbooks” that permit the controlled use of Generative AI within defined guardrails.
The teams who have maintained interest and consistent experimentation over the last year are now starting to see adoption in some specific areas and niches of their project and practice work, but there is so much more ground to cover as we look ahead at the technologies now emerging.
While architects are continuing to explore the potential of image models and generative (computational) design tools, I want to focus for a moment solely on language models.
This far less visual realm is likely to keep growing in significance for the profession, because while we tend to presume that our most meaningful outputs as architects are drawings and graphics, we actually expend enormous energy every day processing and producing text-based information.
Our information is highly detailed, and when we work with text we have to be precise; a wrong word in a specification could leave a practice open to litigation. A misunderstood line in a design brief could produce an entirely ill-conceived competition design and a significant loss on a bid. A failure to retrieve the correct section of the Building Regs could increase the risk of non-compliance.
In short, while we think in spatial and image terms, we often express our work in the form of written text reports that are extremely long, technically complex and detailed in nature. You might think that this renders LLMs (Large Language Models) useless for the purposes of processing such accuracy-dependent documents, but as we begin to improve knowledge and skills around responsible use of these new tools, confidence levels are increasing around what practices would like to use them for.
We’re now seeing practices transition from curiosity about language models to firm strategy about how to implement them across practice departments. Here are some reflections on the use of LLMs in architecture as we close 2024, and what we might expect going forwards into 2025…
The broad category of text models (or “LLMs”) remains the area that I’m most excited about for professional practice.
We can see there is demand at a team level, because there is already very widespread use of “Free GPT” in practice. OpenAI gets around 2.9B impressions a month for ChatGPT, and most of those are for the free product: a tool where your inputs and attachments will be harvested for future model training.
Free GPT also has a context limit of 8k tokens (about 6k words), which greatly impedes the quality of analysis and text processing you can do with it.
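As a rough rule of thumb, the relationship between word counts and token limits can be sketched as follows (the ~0.75 words-per-token ratio is only an approximation for English prose):

```python
def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough heuristic: English prose averages about 0.75 words per token."""
    return round(word_count / words_per_token)

# A 6,000-word report roughly fills an 8k-token context window:
print(estimate_tokens(6000))  # 8000
```

Anything beyond that limit is silently truncated or summarised away, which is one reason long technical documents fare poorly in the free tools.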
The use of these free tools is currently very ad hoc and unstructured, producing mixed results and much confusion: sometimes people get outstanding output, and sometimes they are left nonplussed by what GPT returns.
While there is much room for improvement in how people approach text models in professional settings, we should take this trend towards daily usage as a clear signal that people are indeed finding LLMs genuinely impactful in their everyday work.
A less charitable view would be to conclude that this trend also confirms there is very little structure in most people’s approach to using LLMs, no methodical, professionalised workflows, and frankly, no data protection controls whatsoever. Most practices need to set some basic guidelines and policy around “controlled use of AI”, because their commercially sensitive and confidential data is being passed to OpenAI on an industrial scale and will likely be used for future model training.
While practices have a growing list of ideas they’d like to use LLMs for, the format and quality of their data is a distinct limitation. There is a widespread hope that at some point they’ll be able to turn on a “single simple switch” for a tool like Microsoft Copilot and suddenly have an agent that can instantly retrieve and make use of their file directories, producing new high-quality outputs on that basis.
This hope may unfortunately be misplaced: this kind of approach is unlikely to produce the level of accuracy and performance that businesses expect, because of how the data is handled. I’ve yet to hear a really glowing review of Copilot in those situations.
Microsoft Copilot works on the basis of RAG (Retrieval-Augmented Generation) and offers users a small context window, both of which limit its ability to produce the best quality results: this method of retrieval only finds the ‘most similar’ chunks of a PDF document, when usually you want a response that is aware of the entire contents of many such documents.
I think it’s more likely that over time we will gradually assemble a parallel, purpose-built and well-structured data set, prepared specifically for ingestion by an LLM or agent (using large context windows) and compiled for a particular range of purposes, that can be trusted for automatic retrieval and trustworthy responses. To get the best results we want to feed LLMs markdown versions of our documents at full length, rather than attaching PDFs and relying on RAG.
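A toy sketch of the difference, using hypothetical document snippets: chunk-based retrieval surfaces only the passages most similar to the query, while a long-context approach hands the model every document in full.

```python
from collections import Counter
import math

# Hypothetical chunks from a practice's project documentation.
chunks = [
    "Fire doors on escape routes must achieve 30 minutes resistance.",
    "The atrium design includes a double-height public foyer.",
    "Appendix B amends the fire strategy for the atrium foyer doors.",
]

def words(text: str) -> Counter:
    """Crude bag-of-words tokeniser for the toy similarity measure."""
    return Counter(text.lower().replace(".", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rag_context(query: str, k: int = 1) -> list[str]:
    """RAG-style retrieval: only the top-k 'most similar' chunks survive."""
    q = words(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, words(c)), reverse=True)
    return ranked[:k]

def long_prompt_context() -> str:
    """Long prompting: the model sees every document in full."""
    return "\n\n".join(chunks)

print(rag_context("fire resistance of doors"))  # one chunk; misses Appendix B
print(long_prompt_context().count("\n\n") + 1)  # 3 -- all chunks present
```

Real systems use embedding models rather than word counts, but the failure mode is the same: the amendment in “Appendix B” never reaches the model because it isn’t the single most similar chunk to the question asked.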
If we are going to ask more and more from our language models, and to introduce agents to perform multi-step, multi-tool processes, then I’m also not sure we want to let them loose on our traditional folder structures.
Better to constrain their scope to a purpose-built library of tools and resources that is gradually compiled over time, based on the specific demands of a suite of solutions we implement, one by one.
Before we can start using AI effectively, an incremental step is needed: plan out the highest-impact and easiest-to-build ideas first, and take it from there. Many ideas don’t actually need huge amounts of data; they just need some simple thinking about making good use of existing templates and testing out your instructions.
Here are my current top 10 LLM workflows that work well using (specifically) unabridged “long prompting” techniques:
This January I will be officially launching a new tool with my co-founder Stephen Hunter, called OmniChat.uk. The tool, which has been through testing with a number of large practices in 2024, will offer enterprise-level privacy for professionals by providing API connections to all of the top models in one interface, accessed via a Microsoft 365 or Google SSO login.
Within the app we have the ability to build standardised workflows that can be used again and again by teams, bringing much-needed consistency to the prompting process. This methodology also opens up the potential for far more complex prompting recipes, where you apply a template approach that is maintained consistently, with part of the ‘recipe’ varying each time as a variable.
By promoting a more structured approach to prompt creation, testing and refinement across departments and teams, I think we will see the small ideas that are already being entrusted to LLMs within niches of departments and projects (such as bid-writing and communications), become substantially more widespread.
Based on early results, we expect these methods to expand dramatically to cover many areas of documentation review and report production as people become more accustomed to building structured repeatable prompts with standardised inputs from their project environments that change from task to task.
A curious trend in the world of AI is that models are getting better over time while simultaneously becoming more cost-effective, making enterprise-grade AI tools increasingly accessible across entire practices. Newer generations of AI models typically make use of the latest optimisations and are also being run on newer generations of hardware, leading to reduced energy consumption. A significant proportion of the cost of AI comes down to the energy required to run these models, so lower energy use means lower cost. Unpicking the total energy consumption of language models is complex and will be the subject of a future article, but this much is clear — while training these models requires substantial energy, their ongoing operation is becoming increasingly efficient.
Looking at the data: GPT-4 cost around $30 per million tokens in March 2023. By May the following year, it cost less than $5 for a million tokens. With the introduction of GPT-4o and then a technique called ‘Prompt Caching’, we’re observing substantial reductions in both cost and energy usage compared to the original release.
Prompt Caching is a feature where the AI system remembers and reuses previously processed portions of prompts rather than processing them from scratch each time. This is especially significant for architectural workflows, as when working with large documents we often need to process hundreds of thousands of words in a single conversation. The mechanism is straightforward: maintain a rhythm of responses within the 5-minute cache window, and the cached portions of your prompts are processed at a significantly reduced rate. For example, Claude charges just 10% of the normal cost for cached content.
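As a back-of-envelope illustration of what that 10% cached rate means for a long document review (the per-million-token price below is illustrative, and the model ignores the small cache-write surcharge that real pricing includes):

```python
def conversation_cost(doc_tokens: int, turns: int,
                      price_per_mtok: float = 3.00,
                      cached_rate: float = 0.10) -> tuple[float, float]:
    """Cost in dollars of re-sending one large document over many turns,
    without caching vs. with the document cached after the first turn."""
    uncached = doc_tokens * turns * price_per_mtok / 1_000_000
    # With caching: full price once, then the cached rate on later turns.
    cached = (doc_tokens * price_per_mtok
              + doc_tokens * (turns - 1) * price_per_mtok * cached_rate
              ) / 1_000_000
    return round(uncached, 2), round(cached, 2)

# A 200k-token specification reviewed over 10 conversational turns:
print(conversation_cost(200_000, 10))  # (6.0, 1.14)
```

The gap widens with every additional turn, which is why keeping the conversation inside the cache window matters so much for document-heavy workflows.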
For architectural practices, this evolution in pricing is particularly relevant. I regularly encounter firms where senior staff have premium, private AI access while junior team members use free, public tools — unknowingly sharing potentially sensitive project data.
With current cost reductions, there’s no longer a need for this two-tier approach to AI access within practices. OmniChat’s pay-as-you-go model means these efficiency improvements directly reduce the cost per unit of intelligence for our customers.
Recent developments in AI have shown a fascinating shift in approach. As models become more efficient and cost-effective to run, the industry is focusing not just on making them bigger or training them on higher-quality data, but also on allowing them to spend more time deliberating before they respond, much like a person would when tackling a complex problem.
Instead of generating immediate responses based on pattern matching, these models actively work through problems step by step, trying different strategies and recognising when they need to backtrack or revise their approach.
Many of the complex tasks architects deal with daily, from regulatory compliance to technical specifications, require careful consideration and multi-step reasoning. While earlier AI models might have struggled with these nuanced challenges, these new “thinking” models are better equipped to handle the complexity inherent in architectural work.
One of the key challenges for practices in 2025 will be identifying where these more sophisticated reasoning models can add the most value. With their higher computational costs and longer processing times, we’ll need to carefully map out which tasks truly benefit from this deeper analytical capability, and which are better served by faster, more efficient models. This balance between capability and efficiency will be crucial as practices integrate these new tools into their workflows.
OpenAI have just released a new “thinking model” called “o3”, which has caused waves among AI researchers and academics in the past fortnight. The new model is able to perform maths at PhD level and to pass a number of the ARC-AGI test criteria that no other model has yet managed.
As a general rule I warn people away from using language models for maths, because they use probability distributions to generate outputs, and that’s not how maths works. The exception to this rule is to use them more like a calculator: have them write Python scripts that you can audit and then run with a code interpreter.
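For example, rather than asking the model for an answer in prose, you ask it for a short script like the one below, check the formula yourself, then run it. The load case and figures here are purely illustrative.

```python
def udl_bending_moment(w_kn_per_m: float, span_m: float) -> float:
    """Maximum bending moment for a simply supported beam under a
    uniformly distributed load: M = w * L^2 / 8 (result in kNm)."""
    return w_kn_per_m * span_m ** 2 / 8

# 5 kN/m over a 6 m span:
print(udl_bending_moment(5.0, 6.0))  # 22.5
```

The point is auditability: the arithmetic is done deterministically by the interpreter, and the formula sits in plain sight where an engineer can verify it before trusting the number.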
Having said that, specialised models like OpenAI’s “o3” show reasoning and mathematical abilities that have impressed even Fields Medal-winning mathematicians, and they have demonstrated the ability to solve novel maths problems of enormous complexity, so we will likely see this kind of model used more and more in advanced research areas.
A word you are going to be sick of hearing in 2025 is ‘agents’… but with good reason: they show enormous promise. If you’re already using AI to help with tasks like writing or analysis, think of agents as the next logical step: AI assistants that can actually reach out and use other digital tools on your behalf.
Agents are essentially still LLMs, but with the addition of a system that allows the LLM to actively perform tasks across digital infrastructure. Each agent is bespoke, and the most advanced agents can orchestrate complex workflows across multiple platforms and services. This integration allows them to move from interfacing with APIs to executing transactions and managing digital resources across different systems, in response to a single command from the user. The ability of Agents to chain these interactions together, making decisions about tool selection and execution sequence, means that AI is moving from passive response systems to active digital participants.
We can now create agents that check regulations, standards and policy across multiple documents, perhaps to coordinate with specification databases and libraries and even to interface with cost estimation tools. These agents can generate and validate compliance reports while managing document version control across different platforms, all within defined parameters and workflows.
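A minimal sketch of how such an agent’s tool-dispatch loop might be structured. The ‘plan’ here is a canned stand-in for an LLM choosing tools at each step, and the tool names and payloads are hypothetical.

```python
# Two hypothetical tools an architectural compliance agent might call.
def check_regulations(section: str) -> str:
    return f"Part {section}: requirement retrieved"

def lookup_specification(item: str) -> str:
    return f"Spec for {item}: clause found"

# The agent's scope is constrained to a defined toolset, not the whole
# file system -- mirroring the 'purpose-built library' approach above.
TOOLS = {
    "check_regulations": check_regulations,
    "lookup_specification": lookup_specification,
}

def run_agent(plan: list[dict]) -> list[str]:
    """Execute each tool call the 'model' proposes, in sequence."""
    transcript = []
    for step in plan:
        tool = TOOLS[step["tool"]]  # unknown tools raise, by design
        transcript.append(tool(step["arg"]))
    return transcript

results = run_agent([
    {"tool": "check_regulations", "arg": "B (Fire Safety)"},
    {"tool": "lookup_specification", "arg": "FD30 fire door"},
])
print(results)
```

In a real agent the plan is generated step by step by the LLM, with each tool result fed back into the context before the next decision; the dispatch loop and the restricted tool registry are what keep that process auditable.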
In 2025, we will be exploring the world of AI Agents through the lens of OmniChat.uk and our early users. Stephen and I are now looking into specific “agentic workflows” that will be useful in practice, and we hope to be able to start introducing practices to these in a controlled way. These agents will be configured and accessed within OmniChat in the coming months, with a focus on maintaining security and ensuring reliable, accurate outputs through carefully defined tool descriptions and parameters.
As we enter 2025, architectural practices are moving beyond AI experimentation toward strategic implementation. The practices that thrive will be those that take a methodical approach — starting with proven use cases and then exploring more advanced capabilities as they emerge.
This shift is being driven by three key developments: dramatically reduced costs through innovations like prompt caching, the emergence of “thinking” models that can handle complex technical tasks, and the promise of agent-based workflows, which will begin to show practical results in the coming months.
The challenge now isn’t whether to use AI, but how to use it effectively. Practices need to move away from ad-hoc use of free tools toward structured, secure approaches that protect sensitive data. This means developing clear AI policies, building standardised workflows, and carefully choosing which tasks benefit from different types of AI capabilities.
Even if AI development stalled where it stands today, we could easily spend the next decade unlocking the full potential of existing technology for professional settings like architecture. However, progress isn’t stalling; if anything, it’s speeding up.