5 KEY CHALLENGES TO IMPROVE HITS
IN NAME SCREENING

Much is at stake when it comes to name screening technology for sanction and PEP (politically exposed person) lists. To comply with AML (anti-money laundering) regulations, financial institutions have to check these lists before accepting applications to open a new account or subscribe to a new financial product.

Inadequate name screening can be very detrimental and can severely impact the bottom line of financial institutions. On the one hand, they risk hefty fines and reputational damage if they systematically miss sanctioned individuals or organizations. On the other hand, mistaking a customer for a sanctioned individual means lost business and a degraded customer experience.

Now the question is: how do you know whether you are really making good decisions about your business relationships? In other words, is your name screening solution providing you with accurate and reliable information to ensure compliance with KYC regulations?

Let’s take a close look at the major challenges to overcome when searching for name matches in the KYC process.

Sanction and watch lists

Information is your hard currency

Name screening can only be effective when it is fed correct and complete data. Official sanction and watch lists, whether international, local or private, should therefore all be taken into account to satisfy your KYC requirements.

Because multiple sources generally mean a variety of formats and structures, accessing this information is not an easy task. Add to that the varying data quality and the constant updates of these sources.

Redundant, missing, out-of-date and inaccurate data are the most limiting factors for accurate name screening and may pose a compliance risk for financial institutions.

How to cope with it?

To make sure you are using data from sanction and watch lists properly, one ideal solution is to consolidate this data into a centralized source of information. Holding data this way makes it much easier to manage redundancy and detect inaccuracies during the name matching process.

An ideal name screening solution should also provide regular updates, so that useful information is available at the right time. Reliable information is then the basis for better business decisions.
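
As an illustration of the centralization idea, here is a minimal Python sketch that merges entries from several lists into one store and tracks which sources mention each name. The `ListEntry` type, the source labels and the very simple deduplication key are assumptions made for this example; real lists carry many more attributes (dates of birth, aliases, identifiers) that a production system would also use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListEntry:
    name: str
    source: str  # hypothetical label for the list the entry came from


def dedup_key(name: str) -> str:
    # Illustrative key only: lowercase and collapse repeated whitespace.
    return " ".join(name.lower().split())


def consolidate(*sources: list[ListEntry]) -> dict[str, set[str]]:
    """Merge several lists into one store: key -> set of source labels."""
    merged: dict[str, set[str]] = {}
    for entries in sources:
        for entry in entries:
            merged.setdefault(dedup_key(entry.name), set()).add(entry.source)
    return merged


list_a = [ListEntry("John  Smith", "list_a")]
list_b = [ListEntry("john smith", "list_b"), ListEntry("Jane Doe", "list_b")]
central = consolidate(list_a, list_b)
```

With a single store, a name that appears in two lists under slightly different spellings is held once, and the redundancy is visible instead of producing duplicate hits.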

Latency of name search queries

Better results in less time

Checking for a name match in sanction and watch lists is like looking for a needle in a haystack. Long lists of names mean hundreds of millions of records and thus a huge number of comparisons when searching for person or entity names.

But when it comes to extracting critical information during customer onboarding and, just as importantly, during your periodic KYC investigations, dealing with such a huge amount of data isn’t the only concern. Doing it in real time is the real challenge.

How to cope with it?

When it comes to improving the speed of data retrieval, data structures and analytical capabilities are among the key concerns in name search engines. One popular tool known for both its analytical performance and its data storage capacity is Elasticsearch.

Elasticsearch is a search engine used by major name screening solutions; it makes it possible to execute complex queries in very little time.
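
To give an idea of what such a query can look like, the sketch below builds the JSON body of an Elasticsearch `match` query with fuzziness enabled. It only constructs the request body as a Python dictionary and does not connect to a cluster. The field name `full_name` and the result size are assumptions made for this example; nothing here describes how any particular screening product lays out its index.

```python
import json


def build_name_query(name: str, field: str = "full_name", size: int = 20) -> dict:
    """Build the body of a fuzzy `match` query for a name search.

    `field` is a hypothetical index field chosen for this sketch.
    """
    return {
        "size": size,
        "query": {
            "match": {
                field: {
                    "query": name,
                    "fuzziness": "AUTO",  # edit distance allowed per term, scaled by term length
                    "operator": "and",    # every term of the name must match
                }
            }
        },
    }


body = build_name_query("Jon Smith")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent to the search endpoint of an index with any Elasticsearch client.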

Input data preprocessing

Quality over quantity

Too much information is equal to no information. Indeed, stop words (the, is, at, which, on, etc.), titles (Dr, Mrs, etc.) or even nicknames appearing in input data can lead to many false positives, that is, countless irrelevant results that make the truly significant information hard to find.

Besides, false positives hurt your team’s productivity, as your analysts must spend considerable time extracting the useful information from a huge set of results.

To reduce false positives in name matching, input data should not be used as is; it should be preprocessed before search queries are executed.

How to cope with it?

Input data preprocessing is a set of operations aiming to improve the quality of data before executing the search process. It mainly consists of data cleansing and name standardization.

Data cleansing is about removing insignificant words such as titles, stop words, prefixes and any other information that is irrelevant to the search query. Then comes the process of name normalization.

Name normalization uses references called translation dictionaries to “translate” aliases to the original name. This makes sure we keep searching for the right name no matter what the input nickname is.

As person and entity names come from different languages, preprocessing operations become more challenging and complicated. Consequently, a good name screening solution must be flexible in defining the irrelevant words to be discarded and the commonly used aliases to be normalized, no matter what the source language is.
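
The two preprocessing steps described above, cleansing and normalization, can be sketched in a few lines of Python. The title list, stop word list and alias dictionary below are tiny illustrative samples invented for this example; a real solution would load much richer, language-specific resources.

```python
import re

# Illustrative reference data (assumptions, not a real configuration).
TITLES = {"dr", "mr", "mrs", "ms", "prof"}
STOP_WORDS = {"the", "is", "at", "which", "on"}
ALIASES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}


def preprocess(raw: str) -> str:
    """Cleanse a raw name, then normalize aliases to a canonical form."""
    # Keep runs of letters only (Unicode-aware), lowercased.
    tokens = re.findall(r"[^\W\d_]+", raw.lower())
    # Data cleansing: drop titles and stop words.
    kept = [t for t in tokens if t not in TITLES and t not in STOP_WORDS]
    # Name normalization: map known aliases to the original name.
    return " ".join(ALIASES.get(t, t) for t in kept)


cleaned = preprocess("Dr. Bill Gates")
print(cleaned)
```

Because the resources are plain sets and dictionaries, they can be swapped or extended per language without touching the function. Note that this simple tokenizer splits on apostrophes and hyphens, which may not suit every naming convention.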

Transliteration complexity

Maintain accuracy with linguistic variations

As international sanction and watch lists contain names of Arabic, Chinese, Persian, Russian and many other origins, the task of detecting name matches becomes even more complicated.

The difficulty is that a name written in a non-Latin script must first be converted to Latin characters before comparison can be executed. Given the linguistic variations between Latin-script and non-Latin-script languages, this conversion does not always deliver accurate results. The same Arabic name, for example, may be romanized as Mohammed, Muhammad or Mohamed.

How to cope with it?

Ultimately, there is no single magic solution that completely resolves the transliteration complexity in name screening. Rather, several techniques are used for this purpose, such as phonetic matching algorithms and distance algorithms.

One well-known distance algorithm is the Levenshtein distance, which measures the similarity between two strings by returning the minimum number of single-character edits (insertions, deletions or substitutions) required to convert one word into the other.
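
For readers who want to see the idea concretely, here is a minimal Python sketch of the Levenshtein distance using the classic dynamic programming approach. The `similarity` helper, which rescales the distance to a value between 0 and 1, is an assumption added for illustration and is only one of several possible ways to turn a distance into a score.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions or substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Illustrative score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("mohammed", "muhammad"))  # 2
```

Here two substitutions (o to u, e to a) separate the two spellings, so the distance is 2.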

Another example is phonetic matching algorithms, which focus on pronunciation when comparing two words. They are very useful for first names and surnames, but not for other attributes where semantic similarity is required.
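
The text above does not name a specific phonetic algorithm. As one well-known example, the sketch below implements American Soundex, which encodes a name as its first letter followed by three digits representing consonant sounds. Soundex was designed for English pronunciation, so it is shown here only to illustrate the principle, not as a recommendation for multilingual screening.

```python
# Consonant groups that share a Soundex digit.
CODES = {}
for digit, letters in (("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                       ("4", "l"), ("5", "mn"), ("6", "r")):
    for letter in letters:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code of a Latin-script name."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the run, so a repeated code counts again
    return (result + "000")[:4]


print(soundex("Mohammed"), soundex("Muhammad"))
```

Two spellings that sound alike receive the same code, so they can be matched even when their characters differ.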

Linguistic variations remain a big challenge for name search engines when it comes to accurate and consistent matching, and each of these techniques has its own strengths and weaknesses in terms of accuracy, maintenance, latency, etc. An ideal approach, applied by some screening solutions, combines these methods into a single one to ensure quick and accurate results whatever the name’s original language is.

Fuzzy matching

Name searching in screening solutions is all about fuzzy matching algorithms. Unlike exact matching methods, these algorithms aim to identify similar names and assign each a similarity score. The score is then a key element in deciding whether the person’s name actually appears in sanction or PEP lists.

Score calculation depends on multiple factors and uses multiple methods, taking into account language variations, input data quality and further measures, which makes it increasingly complicated.

The complexity is even higher when these calculations have to be repeated across long lists of names. Done naively, this is both time-consuming and resource-intensive.

How to cope with it?

To reduce the complexity of score calculation while obtaining more accurate results, some developers of screening solutions break the problem into two parts: coarse-grained search and fine-grained search.

This two-pass process works well. The first pass, coarse-grained search, extracts a set of likely matching candidates using fast methods. The second pass, fine-grained search, applies advanced statistical methods to compare names with a high level of precision.

In this way, the hardest work of score calculation is focused on the most qualified data. Consequently, the complexity level is decreased and the quality of results is improved.
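
The two-pass idea can be sketched as follows. This is a simplified illustration under stated assumptions: the coarse pass uses a hypothetical blocking key (the sorted initials of the name), and the fine pass uses Python’s standard `difflib.SequenceMatcher` ratio as a stand-in for the advanced statistical methods mentioned above, which the text does not specify. The `threshold` parameter is likewise an illustrative cut-off.

```python
from difflib import SequenceMatcher


def coarse_key(name: str) -> str:
    # Hypothetical cheap key: sorted first letters of each token.
    return "".join(sorted(token[0] for token in name.lower().split()))


def build_index(watchlist: list[str]) -> dict[str, list[str]]:
    """Group watchlist names by coarse key so lookups avoid a full scan."""
    index: dict[str, list[str]] = {}
    for entry in watchlist:
        index.setdefault(coarse_key(entry), []).append(entry)
    return index


def screen(query: str, index: dict[str, list[str]],
           threshold: float = 0.8) -> list[tuple[str, float]]:
    # Pass 1 (coarse): fetch only the candidates sharing the query's key.
    candidates = index.get(coarse_key(query), [])
    # Pass 2 (fine): score each candidate and keep those above the threshold.
    scored = [(c, SequenceMatcher(None, query.lower(), c.lower()).ratio())
              for c in candidates]
    hits = [(c, s) for c, s in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


index = build_index(["John Smith", "Jane Smith", "Ivan Petrov"])
hits = screen("Jon Smith", index, threshold=0.9)
```

Only the names sharing the query’s key are scored, so the expensive comparison runs on a small candidate set. An initials-based key misses candidates whose first letters differ, which is why practical coarse passes tend to use more tolerant keys.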

How does Vneuron’s Risk & Compliance solution improve hits in name screening?

To meet all these challenges at once, Vneuron’s Risk & Compliance solution relies on a comprehensive approach to name screening. Several technologies are used to optimize data extraction and exploration while ensuring rigorous results.

The right information at the right time

Vneuron’s Risk & Compliance solution is equipped with connectors to all sanction and watch lists, allowing smooth access to all major data providers, no matter what the data format is. Data is then imported, converted to a single format and finally stored in one place.

As a result, Vneuron’s Risk & Compliance solution provides a regularly updated, centralized database, making it a reliable and easily accessible source for all your name screening queries. With such a global overview of sanction and PEP lists, you are unlikely to miss any important detail in your KYC process.

Robust indexing and analytical features

Vneuron’s Risk & Compliance solution supports Elasticsearch, which brings many indexing features including stemming, tokenization and custom analyzers. As a powerful text search and analytics engine, Elasticsearch can search millions of records and return results in just a few milliseconds.

Furthermore, the tool offers a high-performance aggregation module, and this is where the real power of Elasticsearch kicks in. It makes it a powerful analytical engine able to handle complex search queries with ease.

Rich sources for input data preprocessing

Data cleansing and name standardization are the key operations performed by Vneuron’s Risk & Compliance solution during input data preprocessing. For optimum results, the solution provides rich reference lists for cleaning data, no matter what the target language is.

It is also equipped with translation dictionaries in multiple languages, enabling better results in name normalization. All resources are configurable and can be extended to any language according to your needs, which makes data preparation more flexible.

Optimized fuzzy matching with different languages

The name screening approach in Vneuron’s Risk & Compliance solution applies fuzzy matching through a two-pass hybrid approach (coarse-grained search and fine-grained search) in order to both reduce language complexities and give relevant scores.

In the first pass, the coarse-grained search, the tool uses common key methods to extract as many good candidates for name matching as possible. Then, in the fine-grained search, advanced score calculations are carried out to determine the final matching results against a certain limit: the fuzzy threshold. This value is configurable according to your precision preferences.

The high point of Vneuron’s name screening solution is that it handles multiple languages at once while accounting for common linguistic variations. Accurate and quick results are ensured no matter what the language is.

Name screening is the foundation stone of your process for meeting compliance and KYC regulations. Choosing the right name screening solution is therefore the first decision you should make before you can decide correctly about your business relationships. If you need advice on choosing your name screening solution, feel free to ask our experts.
