Image of reforestation

InferESG: Harnessing agentic AI for due diligence

In response to an innovation challenge set by FinTech Scotland, we created a tool called InferESG that uses Generative AI (GenAI) to monitor and validate data disclosed in Environmental, Social, and Governance (ESG) reports. Importantly, this tool wraps GenAI in a smart, agentic architecture to produce highly precise and reliable responses. InferESG has the potential to transform how analysts approach ESG data analysis in the sustainable investment sector, and it is adaptable to a wide range of other use cases.

Understanding the scope for greenwashing

FinTech Scotland’s innovation challenge – ‘Shaping the Future of ESG in Financial Services’ – sought “data-led solutions and technology-enabled approaches to new ESG regulatory requirements, helping drive responsible outcomes for people and the environment.” Following a successful proposal, Scott Logic was one of twenty companies chosen to advance their solutions in collaboration with universities and industry challenge partners. Our challenge partner for the three-month Innovation Process was global investment company aberdeen.

Through our collaborative discussions with aberdeen, we came to understand a range of critical challenges faced by ESG analysts in the sustainable investment sector. A key challenge involved ‘materiality frameworks’, which relate to assessing what truly matters for a specific industry or company type (for instance, water usage is likely to be highly material for a beverage manufacturer). Another challenge pertained to companies operating in multiple countries, each with a different regulatory environment for ESG reporting and materiality. Exacerbating these challenges was the frequency with which regulations and reporting methodologies changed.

Wind turbines in dawn light rising above low cloud

All of this complexity provides the opportunity for organisations to greenwash, on the assumption that it will be hard to detect. Existing methods for assessing ESG data rely on deterministic rules set by humans, and they are increasingly unsuited to the scale and complexity of the challenge.

From aberdeen, we learned that ESG analysts usually begin their investigation of a company with an extensive review of its sustainability report. This insight planted the seed that led us to develop an AI-powered tool that could rapidly analyse reports and support deeper investigation by human analysts: InferESG.

The InferESG solution will help anyone who doesn't have detailed knowledge of a company understand the key sustainability and material issues, and for anyone who does know a company, investigate them in more detail.

Alexandre Popa, Sustainability Analyst, aberdeen 

A co-pilot for ESG analysis

InferESG leverages task-oriented agentic AI, along with GenAI’s natural language processing capabilities, to provide both a rapid analysis of ESG reports and a conversational interface through which an analyst can dig deeper into areas of interest or concern. To start the process, the tool assigns AI agents to conduct a thorough analysis of a company's sustainability report, including a hugely time-saving materiality assessment to navigate industry-specific factors. This initial review identifies areas of strong performance and potential concerns, giving analysts a clear starting point for their evaluation.

Next, the conversational interface enables analysts to interrogate the report summary in natural language. The system processes these queries through a framework of multiple AI agents, which bring back comprehensive, evidence-based responses. Alongside the free-text interface, a set of suggested questions is offered, and these update dynamically as the analyst progresses through their interrogation of the report.

A scratchpad feature provides a clear evidence trail of how the AI agents arrived at their responses, providing the transparency needed to maintain trust in the tool’s findings. In addition, InferESG was designed to focus on facts, not judgements; in this way, it’s truly a co-pilot for humans in complex analytical tasks. Rather than supplanting the ESG analyst, it performs the role of a junior analyst, producing an assessment on which the human analyst can base their judgements.

Circular power plant of solar panels in Spain

Architecting for reliability and precision

AI agents provide the tools to conduct monotonous tasks, whilst enabling the human to focus on decision-making and solving complex problems. One approach to leveraging AI agents is to allow them to be fully autonomous in carrying out tasks and returning results. However, this approach brings with it significant concerns about transparency and data quality, making it an unsuitable option for InferESG.

Instead, our design wrapped the AI agents in a smart architecture designed to optimise the reliability, precision and trustworthiness of their responses. Furthermore, this architecture was intentionally designed to reduce its environmental impact and financial cost. Rather than using fully autonomous AI, which is computationally intensive and unfettered, we reined in the AI to perform targeted, specialist tasks that focused on the quality and relevance of the results.

The initial review of the sustainability report is a well-defined, repeatable process, so we implemented an orchestrated workflow architecture rather than a conversational, agentic approach. Following the upload of a sustainability report, InferESG uses a combination of a Large Language Model (LLM) and AI agents to sift through the noise and highlight key issues. The Report Agent coordinates the analysis process across multiple ESG dimensions.

Meanwhile, the Materiality Agent works with a carefully curated library of materiality documents from leading organisations in sustainable reporting and finance; using these, the agent identifies the relevant materiality topics for the industry in question and evaluates the organisation’s disclosures with reference to them. The combined results of these assessments are then returned in a summary report presented to the analyst in one half of the user interface. In this way, InferESG effectively primes the AI with the content of the report before the analyst asks their questions.
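To make the distinction between an orchestrated workflow and a free-roaming agent concrete, here is a minimal sketch of the report stage. It is not InferESG's actual code: the `MATERIALITY_LIBRARY` dictionary, the function and class names, and the sample report text are all hypothetical, and a simple keyword match stands in for the LLM-based evaluation of disclosures.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the curated library of materiality documents:
# a mapping from industry to the topics considered material for it.
MATERIALITY_LIBRARY = {
    "beverages": ["water usage", "packaging waste", "carbon emissions"],
    "banking": ["financed emissions", "data privacy", "board governance"],
}


@dataclass
class MaterialityFinding:
    topic: str
    disclosed: bool


@dataclass
class SummaryReport:
    industry: str
    findings: list = field(default_factory=list)

    def gaps(self):
        """Material topics that the report does not appear to address."""
        return [f.topic for f in self.findings if not f.disclosed]


def materiality_step(industry, report_text):
    """Check each material topic for the industry against the report text."""
    topics = MATERIALITY_LIBRARY.get(industry, [])
    text = report_text.lower()
    # Assumption: a keyword match replaces the real LLM-based assessment.
    return [MaterialityFinding(topic, topic in text) for topic in topics]


def run_report_workflow(industry, report_text):
    """A fixed sequence of steps: no agent decides what runs next."""
    report = SummaryReport(industry=industry)
    report.findings = materiality_step(industry, report_text)
    return report


# Made-up report text, purely for illustration.
summary = run_report_workflow(
    "beverages",
    "We cut water usage by 12% and reduced carbon emissions at all sites.",
)
print(summary.gaps())  # ['packaging waste']
```

Because the sequence of steps is fixed in code, the same report always goes through the same checks, which is the property that makes an orchestrated workflow a good fit for this stage.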

Screenshot of the InferESG user interface

Following the report stage, the analyst can interact via a conversational interface with the AI. Again, our team de-risked this approach by constraining the AI agents within a hierarchical structure. In the architecture we designed, a supervisor agent orchestrates the analysis process, coordinating specialist agents that each handle specific aspects of the evaluation.

To give a simple example, an analyst might ask, "What external evidence exists to support or contradict these carbon reduction claims?" The Intent Agent then determines that the Web Agent should be used to search the internet for current news, regulatory filings, and other trusted public sources to provide context and verification.

Importantly, the Web Agent is constrained to run a limited number of queries rather than figuratively boiling the ocean. The Answer Agent digests the results and responds to the analyst in natural language via an LLM.
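The flow described above can be sketched as follows. This is an illustrative outline only, not the InferESG implementation: the query cap of three, the keyword-based intent rule, the `fake_search` function and every name in the code are assumptions, and the real agents call an LLM where this sketch uses plain Python.

```python
from dataclasses import dataclass, field

MAX_WEB_QUERIES = 3  # assumed cap; the article does not state the real limit


@dataclass
class Scratchpad:
    """Records which agent did what, giving an evidence trail."""
    entries: list = field(default_factory=list)

    def note(self, agent, detail):
        self.entries.append((agent, detail))


def intent_agent(question):
    """Choose a specialist. A keyword rule stands in for an LLM classifier."""
    q = question.lower()
    return "web" if "external" in q or "news" in q else "report"


def web_agent(question, search, pad):
    """Run at most MAX_WEB_QUERIES searches, logging each one."""
    suffixes = ("news", "regulatory filing", "press release", "analysis")
    candidate_queries = [f"{question} {s}" for s in suffixes]
    results = []
    for query in candidate_queries[:MAX_WEB_QUERIES]:
        pad.note("web", query)
        results.extend(search(query))
    return results


def report_agent(question, pad):
    """Placeholder for a lookup in the uploaded sustainability report."""
    pad.note("report", "searched uploaded report")
    return ["(passages from the uploaded report)"]


def answer_agent(question, evidence):
    """Stand-in for the LLM call that writes the natural-language reply."""
    return f"{len(evidence)} pieces of evidence found for: {question}"


def supervisor(question, search):
    """Route the question to one specialist, then hand off to the answerer."""
    pad = Scratchpad()
    route = intent_agent(question)
    pad.note("intent", route)
    if route == "web":
        evidence = web_agent(question, search, pad)
    else:
        evidence = report_agent(question, pad)
    return answer_agent(question, evidence), pad


def fake_search(query):
    """Hypothetical search backend returning one result per query."""
    return [f"result for {query!r}"]


answer, pad = supervisor(
    "What external evidence exists to support or contradict "
    "these carbon reduction claims?",
    fake_search,
)
print(answer)
for agent, detail in pad.entries:
    print(agent, "|", detail)
```

The supervisor never lets a specialist choose its own next step, and the slice on `candidate_queries` puts a hard bound on how much searching a single question can trigger.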

Our architecture represented a significant step forward from traditional ESG analysis tools, providing a framework that can adapt and scale as the complexity of ESG reporting continues to grow. The result is a robust, transparent, and effective analysis process – one that keeps the human analyst firmly in the driver's seat, with AI agents serving as powerful analytical tools rather than autonomous decision-makers.

Validation through rigorous testing

To ensure InferESG effectively met aberdeen's challenge requirements, we implemented a testing methodology across three milestone releases. This validation included manual review of reports, document search, online search, and cross-validation using multiple AI systems. Our testing framework borrowed concepts from machine learning evaluation, implementing a modified F-score system to assess each release for precision (the proportion of the model's positive predictions that are correct) and recall (the proportion of actual positives that the model identifies).
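For readers unfamiliar with these metrics, the sketch below computes precision, recall and the standard F-beta score for a set of flagged issues. The article does not describe how its F-score was modified, so this shows only the textbook formula; the function name and the example issue lists are invented for illustration.

```python
def precision_recall_fbeta(predicted, expected, beta=1.0):
    """Return (precision, recall, F-beta) for two collections of items.

    predicted: items the tool flagged.
    expected:  items a human reviewer says should have been flagged.
    beta > 1 weights recall more heavily; beta < 1 favours precision.
    """
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    fbeta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, fbeta


# Invented example data: issues flagged by the tool vs. a manual review.
flagged = {"water usage", "scope 3 emissions", "board diversity", "lobbying"}
reviewed = {
    "water usage",
    "scope 3 emissions",
    "board diversity",
    "supply chain audits",
    "tax transparency",
}

p, r, f1 = precision_recall_fbeta(flagged, reviewed)
print(p, r, round(f1, 3))  # 0.75 0.6 0.667
```

Here three of the four flagged issues are correct (precision 0.75) and three of the five issues found in manual review were flagged (recall 0.6); the F1 score is the harmonic mean of the two.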

InferESG achieved high precision and recall scores, and the results showed that the system became more nuanced and capable with each release, particularly in handling industry-specific contexts.

The testing phase validated not only InferESG's capabilities but also the importance of the agentic architecture in providing guardrails while increasing the reliability of the tool.

InferESG’s transformative potential

Through its comprehensiveness, consistency and transparency, InferESG has the potential to transform how analysts approach ESG data analysis in the sustainable investment sector. Rather than simply automating existing manual processes, the tool provides a solution that tackles the increasingly complex landscape of ESG monitoring and reporting – a challenge which only AI has the capacity to solve.

The productivity and efficiency gains could be substantial: tasks that required hours of manual work could take minutes, allowing the human analyst to focus more time on strategic decisions and deeper investigation of potential issues.

Scott Logic has open sourced InferESG (which builds on the groundwork laid by another tool we open sourced, InferLLM); this will foster continuous improvement through community-driven development. In combination, InferESG’s modular architecture and open source status will enable it to evolve alongside the rapidly changing landscape of sustainable investment.

Beyond ESG monitoring and reporting, our approach to harnessing and targeting agentic AI is adaptable to a wide range of other business use cases – anything which requires close scrutiny and interrogation of written material. We’re excited to see how InferESG evolves and realises its potential in promoting sustainable business operations and ethical practices.