There is no question that machine learning (ML) is big in the back office. In a survey last year of more than 100 banks, credit brokers, e-money institutions, financial market infrastructure firms, investment managers, insurers, non-bank lenders and principal trading firms, the Bank of England (BoE) and Financial Conduct Authority (FCA) learned [pdf] that risk management and compliance are currently the most frequent use cases for ML in the financial services sector.

Large financial market infrastructures tend to incorporate artificial intelligence (AI)-based software components into their existing multi-component platforms, rather than using standalone AI solutions. A substantial and complex clearing system, for instance, may comprise, in addition to the clearing engine itself, components for reference data, risk management, collateral management, payments and other functions, often totalling dozens of components.

Some of these functions can be replaced with AI-based applications, and as the BoE and FCA found, AI/ML is most often deployed in risk calculators, market surveillance and fraud detection systems. When a traditional infrastructure is infused with AI components in this way, the hybrid nature of the resulting software creates challenges in integrating the components smoothly and assuring the quality of the combined system. And because methods of market abuse evolve as quickly as the surveillance systems deployed to fight them, firms should rethink how they validate and verify these AI-based applications.

Complexity

When merging traditional technology with AI, the biggest mistake is to underestimate the original technology. The technology underpinning capital markets is notoriously complex and prone to non-deterministic behaviour.

When AI/ML components are integrated into these existing systems, that complexity and nondeterminism are magnified: the AI-enhanced system inherits both characteristics from its host. The key to mitigating this challenge is to test the market surveillance system with an equally powerful AI-driven testing application.
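As one illustration of what such machine-driven testing can look like in practice, below is a minimal property-based sketch in Python using the hypothesis library. The `evaluate_order` entry point and the message fields are hypothetical stand-ins for a real surveillance API, and determinism across repeated runs is just one invariant worth probing.

```python
# A minimal property-based sketch: generate randomised order flow and
# check that the surveillance engine agrees with itself on identical
# input. Divergence surfaces hidden nondeterminism (threading, unseeded
# models, time-dependent features) for investigation.
from hypothesis import given, settings, strategies as st

from surveillance import evaluate_order  # hypothetical entry point

order = st.fixed_dictionaries({
    "symbol": st.sampled_from(["ABC", "XYZ"]),
    "side": st.sampled_from(["BUY", "SELL"]),
    "qty": st.integers(min_value=1, max_value=1_000_000),
    "price": st.decimals(min_value="0.01", max_value="10000", places=2),
})

@settings(max_examples=500)
@given(order)
def test_alert_decision_is_stable(order):
    # Run the same input twice; a well-behaved pipeline must agree
    # with itself even if its internals are statistical.
    assert evaluate_order(order) == evaluate_order(order)
```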

Specifically, the approach should account for the full set of market surveillance subsystems (sketched in code after this list):

• a gateway subsystem that obtains data from different data sources
• a data enrichment subsystem
• real-time and offline alert engines to detect abusive market behaviour
• a data repository to store structured and unstructured data
• a GUI module that provides drill-down capabilities to investigate the detected alerts.
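One way to make those boundaries concrete, assuming no particular vendor platform, is to model each subsystem behind a small interface so that test doubles can be substituted during integration and end-to-end testing. The names below are illustrative:

```python
# Illustrative interfaces for the five subsystems named above. Keeping
# each behind a Protocol lets a test harness swap in fakes or recorders
# without touching the production wiring.
from typing import Any, Protocol

class Gateway(Protocol):
    def ingest(self, source: str) -> list[dict[str, Any]]: ...

class Enricher(Protocol):
    def enrich(self, message: dict[str, Any]) -> dict[str, Any]: ...

class AlertEngine(Protocol):  # covers both real-time and offline engines
    def detect(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]: ...

class Repository(Protocol):  # structured and unstructured data
    def store(self, record: dict[str, Any]) -> None: ...
    def query(self, criteria: dict[str, Any]) -> list[dict[str, Any]]: ...

class InvestigationUI(Protocol):  # drill-down on detected alerts
    def drill_down(self, alert_id: str) -> dict[str, Any]: ...
```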

Data

Market surveillance systems are built on data analytics and pattern recognition, typically interconnected with numerous trading platforms, market data feeds and news feeds. The introduction or upgrade of any of those systems can severely impact the effectiveness of the monitoring platform, and so testing requires a comprehensive understanding of the surveillance monitoring function.

The best approach is end-to-end testing of the whole message flow. This starts with injecting data into the upstream system via the trading interface (e.g. a FIX gateway), receiving messages via the market surveillance stream gateway, and then comparing the received messages with the expected ones. If a real-time response channel is not available from the market surveillance system, testing by comparison is used: the contents of the system's output data are compared against the expected results.
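A minimal sketch of that flow, with hypothetical `fix_client` and `stream_gateway` objects standing in for a FIX injection harness and the surveillance system's stream gateway (field names are illustrative):

```python
# End-to-end test case: inject upstream, receive downstream, compare.
# fix_client and stream_gateway are hypothetical harness objects.
def run_end_to_end_case(fix_client, stream_gateway, test_order,
                        expected, timeout=5.0):
    # Step 1: inject the test data via the trading interface.
    fix_client.send(test_order)

    # Step 2: receive the corresponding message from the market
    # surveillance stream gateway.
    received = stream_gateway.receive(
        correlation_id=test_order["cl_ord_id"], timeout=timeout)

    # Step 3: if no real-time channel responds, fall back to reading
    # the persisted output and test by comparison.
    if received is None:
        received = stream_gateway.read_from_store(test_order["cl_ord_id"])

    # Step 4: compare received fields with the expected ones.
    mismatches = {k: (received.get(k), v)
                  for k, v in expected.items() if received.get(k) != v}
    assert not mismatches, f"field mismatches (got, expected): {mismatches}"
```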

The data consistency verification test is performed after each operational cycle of the system and compares messages from different end points of the market surveillance system under test and the external (integrated) systems. The end points tested are the downstream gateways of the exchange and trade reporting systems, the input gateway of the market surveillance system, and the market surveillance system's data warehouse.
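A sketch of such a post-cycle check, under the assumption that each end point can export its view of the cycle's messages keyed by a shared identifier (endpoint names mirror those in the text):

```python
# Post-cycle consistency check across end points. `endpoints` maps an
# endpoint name (e.g. 'exchange_downstream', 'trade_reporting',
# 'mss_input_gateway', 'mss_warehouse') to its messages keyed by id.
def verify_data_consistency(endpoints: dict[str, dict[str, dict]]) -> list[str]:
    discrepancies = []
    all_ids = set().union(*(msgs.keys() for msgs in endpoints.values()))
    for msg_id in sorted(all_ids):
        views = {name: msgs.get(msg_id) for name, msgs in endpoints.items()}
        missing = [name for name, view in views.items() if view is None]
        if missing:
            discrepancies.append(f"{msg_id}: missing at {missing}")
            continue
        # Use the exchange's downstream record as the reference view.
        reference = views["exchange_downstream"]
        for name, record in views.items():
            if record != reference:
                discrepancies.append(f"{msg_id}: {name} diverges from exchange")
    return discrepancies
```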

The Oracle problem

Another challenge commonly associated with AI systems is known as the Oracle problem: there are no predefined rules determining which input-output combinations are right and which are wrong. Though in most cases the decision as to whether a test passed or failed is not made by humans, human intervention is useful when the test's outcome is not strictly defined. When the Oracle problem comes into play, testing should be significantly enhanced by automation and, more importantly, equipped for the collection and storage of test execution data. This enables the results to be analysed first by scripts and algorithms, then evaluated and verified by humans.
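One way to structure that two-stage evaluation, with an illustrative schema and thresholds, is to persist every execution and route anything without a crisp pass/fail criterion to a human review queue:

```python
# Persist every test execution, apply whatever checks can be automated,
# and flag ambiguous outcomes for human review. The schema, the score
# input and the thresholds are illustrative assumptions.
import json
import sqlite3
import time

def record_and_triage(db: sqlite3.Connection, case_id: str,
                      inputs: dict, outputs: dict, score: float | None) -> str:
    db.execute("""CREATE TABLE IF NOT EXISTS executions
                  (case_id TEXT, ts REAL, inputs TEXT, outputs TEXT,
                   verdict TEXT)""")
    if score is None:
        verdict = "needs_human_review"   # no oracle available at all
    elif score >= 0.9:                   # illustrative pass threshold
        verdict = "pass"
    elif score <= 0.1:                   # illustrative fail threshold
        verdict = "fail"
    else:
        verdict = "needs_human_review"   # ambiguous region
    db.execute("INSERT INTO executions VALUES (?,?,?,?,?)",
               (case_id, time.time(), json.dumps(inputs),
                json.dumps(outputs), verdict))
    db.commit()
    return verdict
```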

In addition to confirming the popularity of ML and AI applications for market surveillance, the BoE/FCA survey found that of the respondents currently using ML, more than half indicated their applications are governed through their existing model risk management framework or enterprise risk function. Given that the large number of connections and components contributes to the probabilistic behaviour of an ML-empowered surveillance mechanism, an AI-based system is unlikely to be validated successfully through traditional testing alone. An equally sophisticated approach to testing not only ensures the success of a market surveillance system, but also optimises it to increase efficiency, reduce costs, ensure regulatory compliance and better prepare a firm to counter the ongoing evolution of financial fraud.