
Experiment: How irrational is trading at short intervals (scalping)?



The developer and trader Johann Christian Lotter, creator of the Financial Hacker blog, wrote an interesting article describing his experiment to find out whether it makes sense to trade on short and ultra-short timeframes. Here are the main points of that article.

Novice traders often want to work at very short time intervals. Some are inspired by stories of someone earning $2,000 in five minutes - financial forums are full of such tales. Others have heard of high-frequency trading and conclude that the higher the frequency of trades, the better the final result. Is this a completely useless activity, or does short-term trading have some quantitative advantage? An experiment designed to find out produced unexpected results.
The temptation to earn an income within minutes is great. In addition, trading on short time intervals (timeframes) produces more trades and more bars for analyzing and improving a strategy - and the quality of backtests and training depends on the amount of available data. Nevertheless, many algo traders consider scalping - opening and closing positions within minutes or even seconds - nonsense and irrational. There are several reasons for this:

  1. Short timeframes increase costs - slippage, spread, commission - relative to the expected income.
  2. On short intervals the price curve contains more noise, randomness, and artifacts - all of which reduce income and increase risk.
  3. Each trading algorithm has to be adapted to a specific broker or financial data provider, since clean handling of the price data stream is critical for successful trading.
  4. Algorithmic strategies often simply stop working on timeframes below a certain value.

High costs, less profit, greater risk, data-feed dependence, non-working algorithms - these are solid arguments against scalping (HFT is a completely different story). Much of the above may well be true; experiments show, for example, that backtesting strategies on timeframes below 10 minutes with data from different brokers produces very different results. But does that mean this way of trading makes no sense, or does working with small timeframes simply require a special approach?

To find out, Lotter conducted an experiment that included the creation of a real scalping strategy.

Impact of trading costs


The first part of the experiment analyzed the statistical impact of trading costs on the final result. Logically, higher costs require more revenue to offset them. What share of trades must be winners for the profit to exceed the cost of trading on different timeframes? The small script below answers this question:

function run()
{
  BarPeriod = 1;
  LookBack = 1440;
  Commission = 0.60;
  Spread = 0.5*PIP;

  int duration = 1, i = 0;
  if(!is(LOOKBACK)) while(duration <= 1440) {
    var Return = abs(priceClose(0)-priceClose(duration))*PIPCost/PIP;
    var Cost = Commission*LotAmount/10000. + Spread*PIPCost/PIP;
    var Rate = ifelse(Return > Cost, Cost/(2*Return) + 0.5, 1.);
    plotBar("Min Rate",i++,duration,100*Rate,AVG+BARS,RED);
    if(duration < 10) duration += 1;
    else if(duration < 60) duration += 5;
    else if(duration < 180) duration += 30;
    else duration += 60;
  }
  Bar += 100; // hack!
}

The script calculates the minimum win rate needed to offset the costs of trading at different holding times. It assumes a spread of 0.5 pips (a pip is the smallest price increment of a currency pair) and a total commission of 60 cents per 10,000 contracts - typical values in the forex market. PIPCost/PIP in the code above is the factor converting a price difference into a monetary gain or loss on the account. It is also assumed that psychological factors have no influence: winning and losing trades have, on average, the same size, Return. If WinRate is the share of winning trades, a trade returns WinRate*Return on average when it wins and loses (1-WinRate)*Return when it does not. To break even, the difference between wins and losses must cover the costs, so the required win rate is:

WinRate = Cost/(2*Return) + 0.5

This rate is averaged over all bars and plotted as a histogram over trade durations from 1 minute to 1 day; the duration step increases from 1 to 5, 30, and 60 minutes. A trade of every duration is simulated every 101 minutes (Bar += 100 in the script is a trick that advances the simulation by 100 bars per step while keeping the bar period at 1 minute).
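
The break-even arithmetic can be checked outside the trading platform. Below is a minimal Python sketch (illustrative only; the cost and return figures are made up, while the real script derives Return from EUR/USD price data):

```python
def min_win_rate(avg_return, cost):
    """Minimum share of winning trades needed to break even, assuming
    winning and losing trades have the same average size avg_return:
    WinRate*Return - (1-WinRate)*Return - Cost >= 0."""
    if avg_return <= cost:
        return 1.0  # even a 100% win rate cannot cover the costs
    return cost / (2.0 * avg_return) + 0.5

# Hypothetical figures: a fixed cost of 1.1 per trade and average
# absolute price moves that grow with the holding time.
for avg_return in (1.5, 5.0, 10.0):
    rate = min_win_rate(avg_return, 1.1)
    print(f"avg return {avg_return:5.1f} -> min win rate {100*rate:.1f}%")
```

Longer holding times push the required win rate down toward the 50% coin-flip level, which is exactly the shape of the histogram the script produces.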

The script runs for a few seconds and then produces the following histogram for the EUR/USD currency pair in 2015:



To cover the costs of trading at the 1-day interval (the rightmost bar), a win rate of 53% is needed; for 1-minute trades it grows to 90%. A 90% win rate requirement is equivalent to demanding a 9:1 reward-to-risk ratio at a 50% win rate - far beyond the real returns that trading systems generate. It would seem that the first point from the list above is confirmed, and the stories of “successful scalpers” on trading forums should be taken with a healthy degree of skepticism.

But what about the second thesis - that short timeframes contain more noise and randomness? Or is there some way of working with them that allows greater predictability? Finding out is harder.

Measuring randomness


“Noise” is often described as the high-frequency component of a signal. Short time intervals usually generate more high-frequency components than longer timeframes. These can be detected with a high-pass filter and removed with a low-pass filter. But there is a problem: the noise of a price curve is not always associated with high frequencies. Noise is simply the part of the curve that carries no information about trading signals. For cycle trading, the high frequencies are the signal and the low-frequency trend is the noise. In other words, what counts as noise depends on the specific strategy - there is no “universal” noise.

Thus, determining the “tradeability” of a price curve requires a different criterion: randomness. It can be measured via the information content of the price curve. A good measure of information content is Shannon entropy. It is defined as:

H(S) = -Σ P(s_i) · log2 P(s_i)

This formula measures disorder. An ordered, predictable signal has low entropy; a random, unpredictable signal has high entropy. In the formula, P(s_i) is the relative frequency of occurrence of a particular pattern s_i in the signal S. Entropy is maximal when all patterns are evenly distributed and all P(s_i) are approximately equal. If some patterns occur more often than others, the entropy decreases; the signal is then less random and more predictable. Shannon entropy is measured in bits.
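
For illustration (this is not part of the original article's code), the definition can be reproduced in a few lines of Python:

```python
from collections import Counter
from math import log2

def shannon_entropy(signal):
    """H(S) = sum over symbols s_i of -P(s_i) * log2(P(s_i))."""
    n = len(signal)
    return sum(-(c / n) * log2(c / n) for c in Counter(signal).values())

print(shannon_entropy("aaaaaaaa"))  # one symbol, fully ordered: 0 bits
print(shannon_entropy("abababab"))  # two equally frequent symbols: 1 bit
print(shannon_entropy("abcdefgh"))  # eight distinct symbols: 3 bits
```

The maximum is reached when all symbols are equally frequent, matching the description above.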

Below is the code of a Shannon entropy indicator for a signal represented as a string of characters:

var ShannonEntropy(char *S,int Length)
{
  static var Hist[256];
  memset(Hist,0,256*sizeof(var));
  var Step = 1./Length;
  int i;
  for(i=0; i<Length; i++)
    Hist[S[i]] += Step;
  var H = 0;
  for(i=0; i<256; i++) {
    if(Hist[i] > 0.)
      H -= Hist[i]*log2(Hist[i]);
  }
  return H;
}

A character is 8 bits, so a string can contain 2^8 = 256 different characters. The frequency of each character is counted and stored in the Hist array; this array holds the P(s_i) values from the entropy formula above. These values are then multiplied by their binary logarithm and summed up. The result H is the Shannon entropy H(S).

In the code above, a character encodes one pattern of the signal. The price curve therefore has to be converted into character patterns first. This is done with a second ShannonEntropy function that calls the previous one:

var ShannonEntropy(var *Data,int Length,int PatternSize)
{
  static char S[1024]; // hack!
  int i,j;
  int Size = min(Length-PatternSize-1,1024);
  for(i=0; i<Size; i++) {
    int C = 0;
    for(j=0; j<PatternSize; j++) {
      if(Data[i+j] > Data[i+j+1])
        C += 1<<j;
    }
    S[i] = C;
  }
  return ShannonEntropy(S,Size);
}

PatternSize determines the segmentation of the price curve. A pattern is derived from the signs of price changes: each new price is either higher than the previous one or not - binary information describing one bit of the pattern. A pattern can include up to 8 bits, which is equivalent to 256 combinations of price changes. The patterns are stored in a character string, and their entropy is determined by calling the first ShannonEntropy function on that string (as you can see, both functions have the same name, but the compiler distinguishes them by their parameters). A pattern is generated from a price and the PatternSize subsequent prices; then the procedure is repeated from the next price, so the patterns overlap.
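
The same two-step procedure - binarize price changes into patterns, then measure the entropy of the pattern distribution - can be sketched in Python (an illustration, not the article's code; note that a Zorro series is ordered newest-first, so the comparison direction here is only a convention):

```python
from collections import Counter
from math import log2

def pattern_entropy(prices, pattern_size=3):
    """Encode each window of pattern_size consecutive price changes as a
    binary up/down pattern, then return the Shannon entropy (in bits)
    of the pattern distribution; the maximum is pattern_size bits."""
    symbols = []
    for i in range(len(prices) - pattern_size):
        code = 0
        for j in range(pattern_size):
            if prices[i + j] > prices[i + j + 1]:
                code |= 1 << j  # bit j set when the earlier price is higher
        symbols.append(code)
    n = len(symbols)
    return sum(-(c / n) * log2(c / n) for c in Counter(symbols).values())

# A strictly rising series produces a single pattern: 0 bits.
print(pattern_entropy(list(range(100))))
# A perfectly alternating series produces two patterns: 1 bit.
print(pattern_entropy([0, 1] * 25 + [0]))
```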

Unexpected result


Now it only remains to plot the Shannon entropy histogram, as we did above with the required win rate:

function run()
{
  BarPeriod = 1;
  LookBack = 1440*300;
  StartWeek = 10000;

  int Duration = 1, i = 0;
  while(Duration <= 1440) {
    TimeFrame = frameSync(Duration);
    var *Prices = series(price(),300);
    if(!is(LOOKBACK) && 0 == (Bar%101)) {
      var H = ShannonEntropy(Prices,300,3);
      plotBar("Randomness",i++,Duration,H,AVG+BARS,BLUE);
    }
    if(Duration < 10) Duration += 1;
    else if(Duration < 60) Duration += 5;
    else if(Duration < 240) Duration += 30;
    else if(Duration < 720) Duration += 120;
    else Duration += 720;
  }
}

Entropy is calculated for all timeframes on every 101st bar (the number is chosen to avoid synchronization artifacts). We cannot simply skip a hundred bars, as in the previous script, because the price series must be updated on every bar. The script therefore has to step through every trading minute of the last three years, which takes a few minutes.

There are two lines in the code that are important to explain - they matter when measuring the entropy of daily candles using bar periods of less than one day:

StartWeek = 10000;

This makes the week start at midnight on Monday (1 = Monday, 0000 = midnight) instead of 11 pm on Sunday. With the latter value, the script would treat that single Sunday hour as a full day, which artificially increases the randomness of the daily candles.

TimeFrame = frameSync(Duration);

This line synchronizes the timeframe with the time of day. Without it, the Shannon entropy of the daily candles would come out too high.

Shannon entropy is calculated for a pattern size of 3 price changes, which gives 8 different patterns. The maximum entropy of 8 patterns is 3 bits. Since price changes are not completely random, one would expect the entropy to be below 3 bits and to increase as the timeframe decreases. However, for EUR/USD trading data from 2013-2015 the following histogram is obtained:



The entropy is almost 3 bits everywhere, which confirms that price is not completely random - but only barely. As can be seen, Shannon entropy is smallest for the 1440-minute timeframe, at about 2.9 bits. This is an expected result: daily cycles strongly affect the price curve, so daily candles are more predictable, which is why price-pattern algorithms often use them. Entropy increases as the timeframe decreases - but only down to about the ten-minute mark. Even shorter intervals are actually less random.

And this is a surprise. The smaller the timeframe, the fewer price quotes it contains, which should increase the element of randomness. But the opposite happens - and it happens across different time intervals (even ultra-short ones of 2, 5, 10, 15, 30, 45, and 60 seconds) and across different assets.



Entropy vs timeframe (in seconds)

Now the x axis is in seconds, not minutes. Randomness still decreases as the timeframe shrinks.

There may be several explanations. Price granularity plays a bigger role on small timeframes, since they contain fewer ticks (minimum price movements). Large orders are often split into many small parts (“iceberg orders”), which can produce sequences of similar price quotes at short time intervals. All this reduces the price entropy of short timeframes, but does not necessarily create trading opportunities.

A sequence of identical quotes has zero entropy - it is 100% predictable - yet cannot be used for trading. Iceberg orders do represent a certain market inefficiency that could in theory be exploited, but transaction costs stand in the way: without brokerage commissions it might be possible, but not in the current situation.
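
The iceberg effect can be mimicked in a toy Python experiment (hypothetical, using a pattern-entropy measure in the spirit of the article's ShannonEntropy functions): repeating every quote of a random walk creates runs of identical prices, and the measured pattern entropy drops well below the random-walk level.

```python
import random
from collections import Counter
from math import log2

def pattern_entropy(prices, pattern_size=3):
    """Shannon entropy (bits) of binary up/down patterns built from
    pattern_size consecutive price changes."""
    symbols = []
    for i in range(len(prices) - pattern_size):
        code = 0
        for j in range(pattern_size):
            if prices[i + j] > prices[i + j + 1]:
                code |= 1 << j
        symbols.append(code)
    n = len(symbols)
    return sum(-(c / n) * log2(c / n) for c in Counter(symbols).values())

random.seed(7)
# A pure random walk: every 3-change pattern is equally likely,
# so the entropy comes out near the 3-bit maximum.
walk = [0.0]
for _ in range(5000):
    walk.append(walk[-1] + random.choice((-1.0, 1.0)))

# Emulated iceberg executions: every quote repeated three times,
# producing runs of identical prices in the quote stream.
iceberg = [p for p in walk for _ in range(3)]

print(round(pattern_entropy(walk), 2))     # close to 3 bits
print(round(pattern_entropy(iceberg), 2))  # clearly lower
```

Predictable, yes - but as noted above, runs of identical quotes offer nothing to trade on once costs are considered.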

Conclusions


Source: https://habr.com/ru/post/277337/

