Red Flags and Outlier Detection

“The uneducated person perceives only the individual phenomenon, the partly educated person the rule, and the educated person the exception.”
Franz Grillparzer

Although there is still considerable debate on exactly what ‘best execution’ means in the FX markets, one component that has become clear is that any best execution policy should include a process to identify, monitor and record outliers. The recently published Q&A from ESMA reiterated this:

Firms’ processes might involve some combination of front office and compliance monitoring and could use systems that rely on random sampling or exception reporting. ESMA Level 3 Q&A on MiFID 2/R Investor Protection issues.

The question now arises: how should an outlier be defined? As with most things, as soon as you get into the details it becomes clear that this is not necessarily straightforward and involves a number of factors. In this article, we explore these factors and suggest some approaches based on what we at BestX are seeing evolve as best practice.

MiFID II Article 27(1) defines best execution as the obligation on firms to “take all sufficient steps to obtain . . . the best possible result for their clients taking into account price, costs, speed, likelihood of execution and settlement, size, nature or any other consideration relevant to execution” (emphasis added). 
The core components of defining an exception, or outlier, reporting process for each of the best execution factors (namely price, costs, speed, likelihood of execution and settlement, size, nature or any other consideration) can be summarised as follows:

  1. What metrics should be used to define an outlier?
  2. What time stamp should be used as a reference?
  3. What values should be set as thresholds?
  4. At what frequency should the exception report be run?

Challengingly, each of these core components must be analysed for each of the best execution factors, which is why the state of technology is such a core part of a satisfactory regulatory compliance programme under MiFID II:

[a]dvances in technology for monitoring best execution should be considered when applying the best execution framework. MiFID II Recital 92


Metrics

The choice of metrics used to define outliers should be driven by the overriding objectives of the best execution policy. As a minimum, some measure of price and cost should be included, although there are different options here, as we will explore shortly. Within FX, price and cost tend to be used interchangeably, as costs are generally captured within the bid-offer spread and therefore within the price. As the market moves towards a more order-driven structure with explicit commissions, it will become more relevant to measure separately the explicit costs arising from broker and venue fees.

However, other metrics may also be relevant, depending on the best execution policy. For example, for a passive equity index tracking fund where the resulting FX is all benchmarked to the WMR Fix, it may be appropriate to also include slippage to the WMR benchmark as a metric. For a quant fund focused on minimising slippage to Arrival Price, it may be relevant to include slippage to this benchmark in addition to cost. Or, for an active fund that trades a significant proportion of its volume via algos, often in large sizes, the best execution policy may require a focus on identifying those algo trades that potentially create more signalling risk than others.

With regard to price and cost, it is important to understand precisely which measure of cost is being computed and used for outlier detection. Clearly, a key value that needs to be computed as part of any best execution process is the actual cost incurred, as a simple measure of the difference between the price at which a trade was filled and the prevailing market mid. We will discuss the issue of which time stamp to choose for this exercise later, but for now we will assume this is the mid at the time stamp at which the trade was filled (i.e. the completion time stamp). Such values are a key requirement stipulated by various regulators and legislation, for example the recent FCA paper on disclosure of costs for pension funds and the requirements under PRIIPs.

Such a measure can be used as a metric for outlier detection. For example, an institution may want to be notified of any trade which generates a cost of greater than 10 bps (defined as a spread to mid at the completion time stamp). Many institutions which use such a measure may need to set up multiple reports, because different thresholds need to be applied to different currency pairs or groups of currencies. For example, different thresholds would typically be set for EURUSD and USDJPY than for USDZAR and USDTRY.
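As an illustration of the spread-to-mid check described above, the following is a minimal Python sketch. The `Trade` class, the currency grouping and the threshold values are all hypothetical and chosen for illustration only; they are not BestX definitions.

```python
from dataclasses import dataclass

# Hypothetical thresholds in basis points per currency group (illustrative values only).
THRESHOLDS_BPS = {"G10": 10.0, "EM": 40.0}
# Hypothetical mapping of currency pairs to groups.
PAIR_GROUP = {"EURUSD": "G10", "USDJPY": "G10", "USDZAR": "EM", "USDTRY": "EM"}


@dataclass
class Trade:
    pair: str
    side: str          # "buy" or "sell" of the base currency
    fill_price: float
    mid: float         # market mid at the completion time stamp


def cost_bps(trade: Trade) -> float:
    """Cost versus mid in bps; positive means the fill was worse than mid."""
    sign = 1.0 if trade.side == "buy" else -1.0
    return sign * (trade.fill_price - trade.mid) / trade.mid * 1e4


def is_outlier(trade: Trade) -> bool:
    """Flag the trade if its cost exceeds the threshold for its currency group."""
    return cost_bps(trade) > THRESHOLDS_BPS[PAIR_GROUP[trade.pair]]


eurusd = Trade("EURUSD", "buy", fill_price=1.1012, mid=1.1000)
usdzar = Trade("USDZAR", "buy", fill_price=14.02, mid=14.00)
print(is_outlier(eurusd), is_outlier(usdzar))
```

Keeping the thresholds in a single mapping keyed by currency group is one way to avoid maintaining a separate report per pair.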

However, the feedback from clients is increasingly that a simple comparison to mid is too one-dimensional a measure, and does not account for differing liquidity and costs across currency pairs and times of day. Clients have asked for a specific fair value cost measure, against which the actual cost (as defined above) can be compared. So, for example, if a 100 EURUSD trade has generated an actual cost of 4 bps, and the fair value, or expected, cost for such a trade at that time of day in that size was 3.2 bps, then a useful metric to monitor may be the difference between the two (i.e. 0.8 bps). Exceptions can be defined on this difference, so that any trade whose actual cost exceeds the expected cost by more than, for example, 3 bps would be deemed an outlier. This is an elegant way of providing a consistent benchmark that allows for different liquidity conditions. In addition, it doesn’t require absolute precision in the time stamp, as the actual and expected costs are computed at the same time, which can be particularly useful for voice trades.
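The comparison of actual to expected cost can be sketched as follows. This is only an illustration: the expected cost is taken as a given input (how a fair value model produces it is outside the scope of this sketch), and the function names and trade identifiers are hypothetical.

```python
def excess_cost_bps(actual_bps: float, expected_bps: float) -> float:
    """Difference between the actual cost and the fair value (expected) cost."""
    return actual_bps - expected_bps


def flag_outliers(trades, threshold_bps: float = 3.0):
    """Return the ids of trades whose actual cost exceeds expected cost by more than the threshold.

    trades: iterable of (trade_id, actual_bps, expected_bps) tuples.
    """
    return [
        trade_id
        for trade_id, actual, expected in trades
        if excess_cost_bps(actual, expected) > threshold_bps
    ]


# The first trade mirrors the example in the text: 4 bps actual vs 3.2 bps expected.
trades = [("T1", 4.0, 3.2), ("T2", 9.5, 3.2)]
print(flag_outliers(trades))
```

With a 3 bps threshold, the first trade (0.8 bps over expected cost) passes, while the second (6.3 bps over) is reported.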

Time Stamps

When computing the cost used to determine outliers, which time stamp for the trade should be used? There are several options: the time stamp at which the order is first originated, the time it arrives at the execution desk, or the completion time stamp at which the trade was actually executed, as referred to earlier.

Again, there does not yet appear to be a standard approach and, to some extent, it depends on what you are trying to measure and manage. If the time stamp is taken when the order arrives at the execution desk, then two key elements are included: i) the slippage arising from market movements during the time taken for the desk to deliver the order to the market and achieve execution, and ii) the actual cost (i.e. spread) applied by the executing counterparty. An advantage of using this time stamp is that many institutions can record the desk arrival time stamp with some accuracy via their OMS, whereas for voice trades the completion time stamp can be subject to error.

If, however, you want to focus purely on the performance of the counterparty, and you are confident in the accuracy of the time stamps, then the completion time stamp is more applicable. This allows outliers to be identified purely on the basis of the actual execution cost, unpolluted by slippage from adverse market movements whilst the order was being delivered to the market.
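To make the difference between the two time stamps concrete, the sketch below splits a cost measured against the desk arrival mid into the two elements described above: market slippage between arrival and completion, and the spread applied at completion. The function name is hypothetical, and expressing both components in bps of the arrival mid (so that they add up exactly) is an assumption of this sketch rather than a market convention.

```python
def decompose_cost_bps(side: str, arrival_mid: float, completion_mid: float, fill_price: float):
    """Split cost versus the arrival mid into slippage and spread, both in bps of the arrival mid."""
    sign = 1.0 if side == "buy" else -1.0
    scale = 1e4 / arrival_mid
    # Market movement between desk arrival and execution.
    slippage = sign * (completion_mid - arrival_mid) * scale
    # Spread to mid applied by the executing counterparty at completion.
    spread = sign * (fill_price - completion_mid) * scale
    return {"slippage_bps": slippage, "spread_bps": spread, "total_bps": slippage + spread}


parts = decompose_cost_bps("buy", arrival_mid=1.1000, completion_mid=1.1005, fill_price=1.1008)
print({name: round(value, 2) for name, value in parts.items()})
```

An outlier rule based on `total_bps` corresponds to the desk arrival time stamp, while a rule based on `spread_bps` alone corresponds to the completion time stamp.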


Thresholds

So, you’ve decided on the metrics and the time stamps to be used in the exception reporting process, but what values should you set for the thresholds above which outliers are reported? Given the credit-driven nature of the FX market, and the heterogeneity of participants, there are no simple rules here. Threshold values are going to be institution-specific, and should be agreed and included in the relevant best execution policy. Clearly, the goal here is to identify trades that warrant further investigation and explanation, i.e. trades that are executed outside of ‘normal’ expectations. Thresholds should therefore be set at levels that do not create thousands of outliers per day, as such noise masks the real red flag trades that should be investigated, never mind the time taken to process such a volume of exceptions. At BestX we’ve seen many institutions set threshold values based on empirical results, i.e. by reviewing results over historical periods and estimating appropriate values per currency pair group, product and tenor.
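One possible way to derive thresholds empirically is to take a high percentile of historical costs per group, so that only a small share of historical trades would have been flagged. The sketch below uses a simple nearest-rank percentile; the choice of the 95th percentile, the grouping key and the function names are assumptions for illustration, not a recommended calibration.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(values, pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def empirical_thresholds(history, pct: float = 95.0):
    """Estimate a threshold per group from historical costs.

    history: iterable of (group, cost_bps) pairs, where group could be a
    (currency pair group, product, tenor) tuple or any other hashable key.
    """
    by_group = defaultdict(list)
    for group, cost in history:
        by_group[group].append(cost)
    return {group: nearest_rank_percentile(costs, pct) for group, costs in by_group.items()}


# Synthetic history purely for demonstration.
history = [("G10", float(i)) for i in range(1, 101)] + [("EM", 5.0 * i) for i in range(1, 21)]
print(empirical_thresholds(history))
```

Raising the percentile reduces the number of exceptions generated, which is one lever for keeping the daily volume of outliers manageable.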

It is important to review regularly whether such levels remain appropriate given changing market conditions and structural changes in execution. For example, an institution that has historically outsourced its execution to a custodian may need to review the threshold values if the policy is changed such that execution is brought in house and then traded with multiple counterparties in competition. Equally, if the FX market moves into a new volatility regime, this may warrant adjusting the levels accordingly. There is an argument that levels should also be set dynamically to take into account forthcoming key event days. For example, it would have been justified to widen the threshold levels for GBP pairs around Brexit.
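A dynamic adjustment for key event days could be sketched as a calendar of multipliers applied to the base threshold. Everything here is hypothetical: the calendar structure, the multiplier of 3 and the function name are illustrative assumptions, with the date of the UK referendum (23 June 2016) used as the example event.

```python
from datetime import date

# Hypothetical event calendar: (currency, event date) -> threshold multiplier.
EVENT_MULTIPLIERS = {("GBP", date(2016, 6, 23)): 3.0}


def effective_threshold(base_bps: float, pair: str, trade_date: date) -> float:
    """Widen the base threshold if either currency in the pair has an event on the trade date."""
    currencies = (pair[:3], pair[3:])
    multiplier = 1.0
    for (ccy, event_date), event_multiplier in EVENT_MULTIPLIERS.items():
        if event_date == trade_date and ccy in currencies:
            multiplier = max(multiplier, event_multiplier)
    return base_bps * multiplier


print(effective_threshold(10.0, "GBPUSD", date(2016, 6, 23)))
print(effective_threshold(10.0, "EURUSD", date(2016, 6, 23)))
```

Recording the multiplier used alongside each exception report keeps the audit trail clear when thresholds have been widened.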


Frequency

We have seen a variety of use cases with regard to the frequency of outlier reporting and monitoring. Many institutions have implemented a daily process, usually run at the end of the day or overnight. The advantage of such a frequency is that outliers are explained, recorded and managed whilst the experience is fresh in everyone’s minds. Where the OMS/EMS allows, real-time outlier identification can also be valuable, as it allows discussion with executing counterparties at the point of trade rather than on a T+1 basis.

Generally, such a process is supplemented with aggregated summary reporting at a lower frequency. For example, Heads of Execution often receive a weekly summary of all outliers generated that week, with explanations and approval status clearly recorded. In addition, many institutions have monthly Best Execution committees, where summary outlier reports are presented and reviewed.

It should also be noted that on-demand reporting, for any time period, is increasingly important in order to respond to ad hoc requests from, for example, regulators.
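The weekly summary described above can be sketched as a simple aggregation of daily outlier records by ISO week. The record fields (`trade_date`, `explained`, `approved`) are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from datetime import date


def weekly_summary(outliers):
    """Aggregate outlier records by ISO (year, week).

    outliers: iterable of dicts with keys 'trade_date' (date),
    'explained' (bool) and 'approved' (bool).
    """
    summary = defaultdict(lambda: {"total": 0, "explained": 0, "approved": 0})
    for record in outliers:
        iso = record["trade_date"].isocalendar()
        row = summary[(iso[0], iso[1])]
        row["total"] += 1
        row["explained"] += int(record["explained"])
        row["approved"] += int(record["approved"])
    return dict(summary)


records = [
    {"trade_date": date(2024, 1, 2), "explained": True, "approved": True},
    {"trade_date": date(2024, 1, 4), "explained": True, "approved": False},
    {"trade_date": date(2024, 1, 9), "explained": False, "approved": False},
]
print(weekly_summary(records))
```

The same function can serve on-demand requests by filtering the records to the requested period before aggregating.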


Conclusion

As with most aspects of best execution within OTC markets, there is currently a lack of standardisation. In some respects this feels appropriate, as best execution is a concept that is very specific to a particular institution, albeit there are core components that would benefit from some market standards. Outlier detection and reporting is no exception and, as discussed in this article, processes for identifying exception trades are also institution-specific to some extent. As we have seen, there are a number of factors to consider in any outlier process, and a key conclusion is that it is important to have a flexible technology solution that allows the exception reporting process to be tailored and adjusted dynamically.

A well-designed and well-implemented exception reporting process can add more value than simply satisfying fiduciary and regulatory best execution responsibilities. With appropriate reporting and analysis, it is possible to use the output of such a process to identify potential abnormalities in an execution process, e.g. one particular counterparty may be responsible for the vast majority of outliers in a specific EM currency pair. This may then allow the positive feedback loop discussed in a previous article to be engaged, whereby adjustments are made to the execution policy (e.g. this particular counterparty is no longer used for this specific pair), resulting in improved performance.
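The kind of analysis mentioned above, spotting a counterparty that accounts for most of the outliers in a given pair, can be sketched with a simple count. The record layout, the function names and the 50% cut-off are assumptions made for illustration only.

```python
from collections import Counter


def outlier_share_by_counterparty(outliers, pair: str):
    """Share of the outliers in the given pair attributable to each counterparty.

    outliers: iterable of (pair, counterparty) tuples, one per outlier trade.
    """
    counts = Counter(counterparty for p, counterparty in outliers if p == pair)
    total = sum(counts.values())
    return {cp: n / total for cp, n in counts.items()} if total else {}


def dominant_counterparties(outliers, pair: str, min_share: float = 0.5):
    """Counterparties responsible for at least min_share of the outliers in the pair."""
    shares = outlier_share_by_counterparty(outliers, pair)
    return [cp for cp, share in shares.items() if share >= min_share]


# Synthetic records purely for demonstration.
outliers = [("USDZAR", "A")] * 4 + [("USDZAR", "B")] + [("EURUSD", "B")] * 2
print(outlier_share_by_counterparty(outliers, "USDZAR"))
print(dominant_counterparties(outliers, "USDZAR"))
```

A raw share of outliers should be read alongside each counterparty's share of total volume in the pair, since a counterparty that receives most of the flow will naturally generate most of the exceptions.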