
How Big Data Helps to Find Rogue Traders


After the problems in the financial services industry and the billions invested in “rescuing” some of the banks, you would think folks would have figured it out by now. Well, it’s obviously not an easy feat, so let’s discuss what a rogue trader is and what the technical challenges are in identifying them.

Under Basel II, a regulatory framework first published in 2004, banks must hold adequate financial capital against the risk they are exposed to through lending and investing (e.g. assets such as stocks and loans). Obviously, asset prices are constantly changing, so each asset carries a rating that indicates its particular risk. Traders can only buy assets within a certain risk category. Higher-risk assets theoretically have greater potential return, so traders who are paid on commission are somewhat incentivized to buy and sell higher-risk assets, because the return on investment (the bonus check) is clearly associated with risk taking – and it’s not even their money.
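To make the relationship between risk and capital concrete, here is a deliberately simplified sketch. The 8% minimum capital ratio is the Basel II headline figure; the positions and risk weights below are invented purely for illustration.

```python
# Simplified illustration of risk-weighted capital under Basel II.
# The positions and risk weights are made up; the real rules are far more detailed.

MIN_CAPITAL_RATIO = 0.08  # Basel II minimum: capital >= 8% of risk-weighted assets

# Hypothetical positions: (asset value, risk weight implied by its rating)
positions = [
    (10_000_000, 0.20),  # highly rated debt
    (5_000_000, 0.50),   # mid-rated assets
    (2_000_000, 1.00),   # unrated or risky assets
]

risk_weighted_assets = sum(value * weight for value, weight in positions)
required_capital = MIN_CAPITAL_RATIO * risk_weighted_assets

print(f"Risk-weighted assets: {risk_weighted_assets:,.0f}")  # 6,500,000
print(f"Minimum capital to hold: {required_capital:,.0f}")   # 520,000
```

The higher the risk weight of what a trader buys, the more capital the bank has to set aside against it – which is exactly why each trader is limited to certain risk categories.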

If an asset or a portfolio turns upside down and cannot be sold for a profit, the trader has a problem. Optimistically, a rogue trader might hang on to the asset that now represents a large loss and hope it recovers to a higher value. He might even trade other risky assets, hoping the new trades’ potential profits will cover the loss, and so the downward spiral continues.

You would think it would be impossible to hide a loss of hundreds of millions or even billions of dollars, given that all trades are executed digitally and assets are, technically, just records in a database.

If only it were that easy…

After the financial shakeout, many banks merged. Today, banks have hundreds of databases storing asset data. Almost every mutual fund company, trading group, or asset portfolio group can have its own database. Just one of our customers alone has over 250 databases. That’s a jungle in and of itself, without even taking into consideration external data sources, like the data that comes from rating providers, semi-structured trading log files, and more. Data analytics is extremely slow to implement in this environment – we’re talking months, if not years – because of the complex, 3-tiered data architecture. First there is an ETL process that extracts data from one database and transforms it into the static schema of another database. We call this process “schema on write.” Then, on top of that, we install a BI system, where the only people who can do the analytics are the ones who understand the static schema and the ETL process.
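To make “schema on write” concrete, here is a minimal, hypothetical sketch: every source record has to be bent into one fixed warehouse schema before it is loaded, and anything without a slot in that schema is simply lost. The field names are invented for illustration.

```python
# Hypothetical "schema on write": data is forced into a fixed target schema
# at load time, long before anyone asks an analytical question.

TARGET_SCHEMA = {"trade_id", "trader", "asset_class", "notional", "risk_rating"}

def transform(raw_record: dict) -> dict:
    """Map one source system's record into the warehouse's static schema."""
    return {
        "trade_id": raw_record["id"],
        "trader": raw_record["trader_name"],
        "asset_class": raw_record.get("class", "UNKNOWN"),
        "notional": float(raw_record["amount"]),
        "risk_rating": raw_record.get("rating"),
        # Any source field without a slot in TARGET_SCHEMA is silently dropped here.
    }

def load(raw_records, warehouse):
    for raw in raw_records:
        row = transform(raw)              # transformation happens on write ...
        assert set(row) == TARGET_SCHEMA  # ... and the schema is enforced on write
        warehouse.append(row)

warehouse: list = []
load([{"id": "T-1", "trader_name": "A. Trader", "amount": "1000000", "rating": "BBB"}], warehouse)
```

Every new source system means new transform code and, often, a schema change in the warehouse itself, which is why integrating hundreds of databases this way takes months or years.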

Now think about coming up with a perfect schema for merging 250+ different databases into one data warehouse. That is an academic exercise that quite frankly isn’t solvable, especially with the environment changing faster than any such model can be implemented.

A rogue trader takes advantage of this data jungle and either moves assets around or hides them where nobody is looking, like in account 88888 in the famous movie “Rogue Trader” with Ewan McGregor.

Using this technique, certain portfolios might show an unreasonably high risk exposure today, but because assets can be moved around in the data jungle, tomorrow things can appear to be just fine.
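Once daily snapshots from all of those sources can be laid side by side, that day-over-day pattern is not hard to surface. Here is a hypothetical sketch of such a check; the account numbers, exposures, and threshold are invented.

```python
# Hypothetical early-warning check: flag accounts whose reported risk exposure
# swings sharply between daily snapshots, a pattern consistent with assets
# being parked somewhere overnight and moved back the next day.

from collections import defaultdict

def exposure_swings(snapshots, threshold):
    """snapshots: iterable of (date, account, exposure). Returns suspicious jumps."""
    by_account = defaultdict(list)
    for date, account, exposure in snapshots:
        by_account[account].append((date, exposure))

    flagged = []
    for account, series in by_account.items():
        series.sort()  # ISO dates sort chronologically
        for (day1, exp1), (day2, exp2) in zip(series, series[1:]):
            if abs(exp2 - exp1) > threshold:
                flagged.append((account, day1, day2, exp2 - exp1))
    return flagged

snapshots = [
    ("2012-05-01", "88888", 900_000_000),
    ("2012-05-02", "88888", 20_000_000),   # exposure "disappears" overnight
    ("2012-05-01", "10001", 5_000_000),
    ("2012-05-02", "10001", 5_500_000),
]
print(exposure_swings(snapshots, threshold=100_000_000))
# [('88888', '2012-05-01', '2012-05-02', -880000000)]
```

The hard part is not this check; it’s getting consistent daily snapshots out of 250+ databases in the first place.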

However, there is light at the end of the tunnel, and it’s called Hadoop. Hadoop is what will make it possible to cut through the jungle, because it allows banks to implement a concept called “schema on read” instead of the slow “schema on write” process we previously discussed. Instead of a slow ETL process where data has to be modeled before it goes into a data warehouse, banks can simply dump all of their raw, untouched data into a gigantic (and still cheap) Hadoop cluster and weave the data together when it needs to be read, on demand. So instead of spending months or years getting the data ready before it lands in a single store, it is now possible to weave all the data together in hours or days on a Hadoop cluster.
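As a contrast to the schema-on-write sketch above, here is a minimal, hypothetical illustration of schema on read: raw records from different sources are stored exactly as they arrived, and each source’s mapping is written and applied only at query time. On a real cluster the raw files would sit in HDFS and the read-time logic would run as a distributed job; the sources and field names here are invented.

```python
# Hypothetical "schema on read": raw records are stored untouched;
# per-source mappings are applied only when a question is asked of the data.

import json

RAW_STORE = [
    # Dumped as-is from two different source systems, schemas left alone.
    '{"src": "fund_db", "id": "T-1", "trader_name": "A. Trader", "amount": "1000000"}',
    '{"src": "trading_log", "trade": "T-2", "who": "A. Trader", "notional": 2500000}',
]

# Mappings written when we need to read the data, not when we load it.
READ_MAPPINGS = {
    "fund_db":     lambda r: {"trade_id": r["id"],    "trader": r["trader_name"], "notional": float(r["amount"])},
    "trading_log": lambda r: {"trade_id": r["trade"], "trader": r["who"],         "notional": float(r["notional"])},
}

def read_trades(raw_store):
    """Weave the raw sources into one unified view, on demand."""
    for line in raw_store:
        record = json.loads(line)
        yield READ_MAPPINGS[record["src"]](record)

exposure_by_trader = {}
for trade in read_trades(RAW_STORE):
    exposure_by_trader[trade["trader"]] = exposure_by_trader.get(trade["trader"], 0) + trade["notional"]
print(exposure_by_trader)  # {'A. Trader': 3500000.0}
```

Adding a new source now means writing one more mapping at read time, not reworking a warehouse schema and every ETL job that feeds it.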

It’s also important to note that Hadoop is by no means just a data store. Hadoop is a storage and compute engine, and it is highly optimized for analytical workloads, unlike traditional data warehouses.

As to be expected with any technological disruption, at first, traditional vendors didn’t take Hadoop seriously, and then they said it wasn’t enterprise ready. Today we see they’re “connecting” to Hadoop in an effort to maintain the status quo of their traditional 3-tier approach, by claiming that Hadoop “is just another data source”.

We’re proud to report that at Datameer, we’ve helped multiple financial institutions cut through the data jungle without interruption. Datameer makes it possible for business users themselves to integrate a large number of data sources and get the fast insights about trading patterns and current risk exposure that they’ve long needed.

If you’re trying to set up a rogue trader early warning system, give us a call.

