04:36 AM 14th April 2025 GMT+00:00
Shedding Light on HSBC’s Use of Machine Learning in Transaction Monitoring
Olivier Franses talks to Regulation Asia about HSBC’s use of machine learning in transaction monitoring, the challenges of implementation, and the resulting benefits.
Reporting by Manesh Samtani

Regulation Asia sat down with Olivier Franses, Regional Head of Financial Crime Detection for Asia Pacific at HSBC (Hong Kong), to discuss the bank’s efforts to develop machine learning models for transaction monitoring and implement the solution across Asia Pacific and beyond.
—
When did your organisation start using machine learning in transaction monitoring and how widespread is this in your APAC operations?
Olivier Franses: On the AML side, we are already covering more than 85% of our customer base in Asia through our machine learning solution called the Dynamic Risk Assessment (DRA).
We deployed the DRA in a production environment first in Singapore towards the end of 2022 and then in Hong Kong. We chose Singapore first to experiment and it was part of the pilot we ran at the group level with the UK. Shortly after that in 2023, we went live with Hong Kong, which is by far our biggest market in Asia.
We’ve been blown away by the performance of the model, so much so that we’ve accelerated our rollout plans for the rest of Asia. In 2024, we further deployed the DRA in India, Malaysia, and Australia and continue to accelerate deployment to the rest of our Asia markets.
What have been the benefits of using machine learning in transaction monitoring?
Olivier Franses: Machine learning has been such a large step change for us. In the early days, many stakeholders initially saw it mostly as an efficiency play, with machine learning applied as an overlay to our existing systems. With the DRA, our goal has been to be far more effective than traditional rules-based monitoring. It has become something much bigger: a paradigm shift away from rules-based monitoring, helping us detect more complex patterns faster, greatly enhancing the rate at which we detect financial crime while reducing unnecessary customer friction.
We’ve seen strong results both in terms of the quality of alerts and the accuracy with which we detect suspicious activities. We’re now finding more financial crime and doing that faster, in some cases up to 80% faster than before. In the past, some of the suspicious activity may have only been found through manual processes or other controls that were not necessarily as formalised as transaction monitoring.
For example, we’ve been able to optimise our money mule detection using machine learning and certain network analytics features that allow us to look at all customers’ counterparties and their counterparties as well, i.e. up to two hops from our customers. This ability to look at where the money is being transferred, and other attributes such as whether it involves retail or commercial entities, has been quite useful, especially in markets where we have a large presence like Hong Kong.
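The two-hop counterparty view described above can be illustrated with a minimal sketch. This is not HSBC's implementation; the transfer data and function names are hypothetical, and a production system would work over a graph database rather than an in-memory edge list.

```python
from collections import defaultdict

def counterparties_within_two_hops(transfers, customer):
    """Collect parties reachable within two hops of a customer in a
    transfer graph -- the customer's counterparties, and their
    counterparties in turn."""
    graph = defaultdict(set)
    for payer, payee in transfers:
        graph[payer].add(payee)
        graph[payee].add(payer)  # treat transfers as undirected for reachability

    first_hop = graph[customer]
    second_hop = set()
    for party in first_hop:
        second_hop |= graph[party]
    # exclude the customer and their direct counterparties from the second hop
    return first_hop, second_hop - first_hop - {customer}

# Hypothetical transfers: two of cust_A's counterparties both forward funds
# to the same account -- a pattern a mule-detection feature might pick up.
transfers = [
    ("cust_A", "shop_1"), ("shop_1", "mule_X"),
    ("cust_A", "acct_2"), ("acct_2", "mule_X"),
]
hop1, hop2 = counterparties_within_two_hops(transfers, "cust_A")
```

In this toy example, the second-hop set reveals that funds from two unrelated counterparties converge on a single account, which is the kind of network feature a model can score on.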
As part of the move to machine learning, we also moved away from the traditional investigations process with multiple queries to customers. Having fewer requests for information to customers helped reduce customer friction and has been well received by front-line relationship managers. As a customer-centric trade and commercial bank, this seemingly minor detail is crucial. We thrive on long-term trusting relationships with customers, so any step we can take to make their experience smoother is a big deal for us.
How has moving to machine learning-based transaction monitoring changed your investigations process?
Olivier Franses: We have not yet moved to using, for example, generative AI (GenAI) in the investigation stage, because we believe the human element is crucial, especially as we are already using machine learning in the detection layer. We are conceptually comfortable with the idea as a future evolution, but we want to be careful and ensure that human experts remain accountable for our investigation work, so that our customers can be confident in its accuracy and in their own ability to seek redress.
Machine learning in the detection layer has enabled us to move away from the traditional investigation process, where operations staff would handle Level 1 and Level 2 investigations and riskier alerts would be escalated to a Level 3 investigator.
We’ve managed to reduce the false alert volume and improve alert quality through machine learning so much that we’ve been able to move to what we call a ‘single case handler’, i.e. directly sending all our alerts to skilled investigators, who would investigate each and every case from beginning to end. This is good for the bank as it’s more efficient – and good for our customers as we get fewer false positives.
We find the use of investigators still necessary and useful because it provides a feedback loop that we need for the machine learning model on the detection side. Keeping a human in the loop is the most effective way to gain a holistic understanding of our customers, their behaviours and how bad actors operate within our system, so that we can stop illegal activity quickly and fine-tune the model.
How were you able to address the data challenges commonly associated with machine learning models?
Olivier Franses: Data is a challenge – not just the quality of the data but also its availability. And, like most banks, we face constraints on the compute and storage available on premises. To move to machine learning, we decided to move to the cloud, and that in itself is a challenge.
You have internal and external stakeholders, such as your data risk owner and your regulators, who need convincing on how you can safely move your data to the cloud, especially sensitive data like financial crime risk data. That took time to overcome and is a challenge that is still present in certain APAC markets.
The data quality is important, although to a lesser extent than we initially thought. When you have a rules-based system, you need quite good data quality for the system to work. For example, if you have a missing transaction for whatever reason in your rules-based system, you will not get any alert.
We found that the machine learning-based system was somewhat less sensitive to data completeness or quality issues, because it draws on other features that still enable an alert to be generated. For example, the model considers a longer history of transactions, which can keep a customer's score high enough to trigger an alert even when some data is missing.
The flip side of this is that you then need stronger data controls in your machine learning-based system. In traditional rules-based transaction monitoring systems, for example, when you have missing transaction data, you will quite quickly see your alert volumes go down and know that something has gone wrong.
With machine learning-based systems, data issues don't become obvious, and you won't stumble across them in the same way as you would with rules-based systems, so there is a need to enhance your data controls. To mitigate this, we have implemented additional checks as part of the end-to-end process.
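One simple form such a data control could take is monitoring the volume of the incoming transaction feed itself, since an ML system won't show the tell-tale drop in alerts that a rules-based system would. This sketch is an illustrative assumption, not HSBC's actual control:

```python
def volume_anomalies(daily_counts, window=30, tolerance=0.5):
    """Flag days where transaction feed volume falls well below the
    trailing-window average -- a basic completeness control for
    ML-based monitoring, where missing data would not otherwise
    surface as an obvious drop in alert volumes."""
    flagged = []
    for i in range(window, len(daily_counts)):
        baseline = sum(daily_counts[i - window:i]) / window
        if daily_counts[i] < tolerance * baseline:
            flagged.append(i)  # index of the suspicious day
    return flagged
```

A real control would sit alongside field-level completeness and reconciliation checks; this only catches gross feed outages.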
What other challenges did you encounter when implementing machine learning for transaction monitoring?
Olivier Franses: Another challenge we had was around explainability, i.e. how do we explain how the model makes scoring decisions and how do we provide this information to the investigator. You need to find a way to get your model to speak and communicate the key reasons why the score for a given customer is high. In the DRA, contributing features are rolled up as high-level activity information that is then shared with investigators to support their investigation.
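The idea of rolling contributing features up into high-level activity information for investigators can be sketched as follows. The feature names, theme labels, and the mapping itself are hypothetical; in practice the per-feature contributions might come from an attribution method such as SHAP.

```python
from collections import defaultdict

# Hypothetical mapping from model features to investigator-facing themes.
FEATURE_THEMES = {
    "cash_deposit_velocity": "Rapid cash activity",
    "new_counterparty_ratio": "Unusual counterparties",
    "cross_border_volume": "Cross-border flows",
    "dormancy_break_flag": "Dormant account reactivated",
}

def top_reasons(contributions, k=2):
    """Roll per-feature score contributions up to themes and return
    the k strongest positive reasons behind a customer's high score."""
    theme_scores = defaultdict(float)
    for feature, value in contributions.items():
        theme = FEATURE_THEMES.get(feature, "Other activity")
        theme_scores[theme] += value
    ranked = sorted(theme_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [theme for theme, score in ranked[:k] if score > 0]

reasons = top_reasons({
    "cash_deposit_velocity": 0.4,
    "cross_border_volume": 0.1,
    "new_counterparty_ratio": -0.05,
})
```

Grouping at the theme level, rather than exposing raw feature values, is one way to get the model to "speak" in terms an investigator can act on.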
Data scientists needed to work in an environment where they had to explain and provide transparency as to how the model is working. For the investigators, this involved a change in mindset so that they think about the customer more holistically rather than just looking at a few transactions based on a set of scenarios like in the past. Using machine learning means you need to surround yourself with the right people, upskill everyone, and change the mindset of every partner involved in the process, from the first line to regulators.
We also had to overcome challenges demonstrating coverage of certain risks. In a rules-based system, you can easily link a typology (e.g. rapid movement of funds into and out of an account) to a rule. This is more difficult with machine learning. While you may have trained against historical typologies, it can be difficult to prove to regulators in terms that are familiar to them that those features are working.
In addition, when you have a risk that you haven't seen in your book, but that you know could be present or that the regulator wants you to cover, it can be a lot more difficult to evidence this 'theoretical' coverage in familiar terms. We had to demonstrate to the regulator that performance was so high on known risks that performance on this theoretical coverage could be expected to be high as well.
How do you manage model governance and ensure the models stay fit for purpose?
Olivier Franses: We have a thorough model risk policy that our AML models fall under, following a methodology similar to the one we would apply to a credit risk model. This means we document and monitor the model in the same fashion.
For machine learning models, we have additional monitoring that focuses specifically on aspects like FEAT [Fairness, Ethics, Accountability, Transparency] and whether there is bias being introduced, and whether this is evolving over time. For this we use specific metrics to determine whether the model is drifting and whether it is drifting to a degree that requires a retrain.
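One widely used drift metric of the kind described here is the Population Stability Index (PSI), which compares the score distribution at training time with a recent one. The sketch below is illustrative, not HSBC's specific metric; a PSI above roughly 0.25 is a common rule of thumb for drift that may warrant a retrain.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline score distribution (e.g. at training time)
    and a recent one. Near 0 means stable; large values indicate drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a degenerate range

    def histogram(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # small floor avoids log(0) and division by zero in empty bins
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In practice a bank would track PSI (and similar metrics) per feature and per score over time, and trigger review or retraining when thresholds are breached.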
We are also trying to improve the model by adding new features. Every time you add a new feature, you have to retrain your model, which is expensive and takes time. We are trying to streamline the process and make it simpler, so that we’re not retraining too frequently but are doing so often enough to allow us to add new features while also improving performance.
One challenge with this involves addressing markets that are very small, where we don’t have the scale to get the model to be trained locally. Currently, our approach is to use a model that is pre-trained on a similar looking market.
We are also exploring an approach where we would cluster our data in such a way that allows us to use data from many markets together to train a model on that universe of data. For this process we need to ensure the data quality in each individual market is good enough to link up data from many markets together.
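One simple way to formalise the "similar-looking market" choice above is to compare markets on a few summary statistics and pick the nearest profile. The statistics and values below are hypothetical assumptions for illustration only:

```python
import math

# Hypothetical per-market profiles used to judge similarity:
# (median transaction value in USD, log transaction count, cross-border share)
MARKET_STATS = {
    "SG": (5200.0, 14.1, 0.62),
    "HK": (4800.0, 15.3, 0.58),
    "MY": (1900.0, 12.2, 0.31),
    "IN": (1700.0, 13.0, 0.28),
}

def nearest_market(target, candidates):
    """Pick the candidate market whose profile is closest to the target's,
    e.g. to decide which market's pre-trained model to reuse for a small
    market that lacks the scale to train locally."""
    profile = MARKET_STATS[target]
    return min(
        (m for m in candidates if m != target),
        key=lambda m: math.dist(profile, MARKET_STATS[m]),
    )
```

The same distance measure could seed the clustering of markets into groups whose pooled data trains a shared model, provided each market's data quality is good enough to combine.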
In terms of further development, at this stage, our focus is on increasing coverage across more customers and markets, enhancing the model to add features needed to adapt to new typologies and further reducing noise [false positives].