This article shares the results of a study of the structure of stock prices traded on the Moscow Exchange and the NYSE. The prices are checked for stationarity using the Dickey-Fuller test.
There is a small class of stocks whose prices form a non-stationary process with stationary increments. The t-statistic for such a process behaves in a rather curious way: its distribution does not tend to the standard normal as the number of observations grows. How can such stocks be identified?
Data collection
The first thing we need is a list of tickers that can actually be traded through a broker. A ticker is the short name under which a quoted instrument (here, a stock) appears in exchange data. I will start with the Russian stock market, where the most popular venue is the Moscow Exchange.
At some point in my life I went to an office in person and signed a brokerage service agreement with Sberbank. In that case the list of tickers is easy to obtain: start the QUIK terminal and subscribe to the stock quotes feed.
Then add the quotes to the Trade tab and save the table to a file. That gives 296 tickers in total. If you would rather not set yourself a quest, a list of tickers is available in the Data examples section of the Moscow Exchange website, although that list may be outdated.
The second thing we need is stock price data. How much data is needed for an adequate picture of the market, and over which interval, is debatable. The technical limitation we will run into later is that each stock needs at least 10 values.
For the tickers obtained earlier I wanted the closing prices for 2016 (252 trading days), but you can take a shorter period, or even do the study within a single day. By and large, it does not matter much right now which data we take. The catch is that if you ask the Moscow Exchange for archival data for 2016, it will tell you that this costs 32,400 rubles (the data used to be even more expensive, by the way). So I wrote a parser out of spite.
In fact, it is much faster to get the data from Finam or Yahoo Finance, but somehow that is not as much fun. At one point I even contacted the head of the information audit team and asked whether I could parse their website for research. I was allowed to.
[There was a lot of text and code about the parser here, but I deleted it all because it is probably not very interesting. If I am wrong, let me know and I will publish the parser as a separate article.]
I managed to collect data for 289 stocks (there was no data for the rest). For convenience, the tickers and prices were saved in a Microsoft SQL Server database. Now on to the math.
About stationarity
Leaving aside formulas and complex concepts, a stationary series is one whose characteristics, such as the mean, variance, and autocovariances, do not change over time.
The stock price can be viewed as a first-order autoregressive process:

$$y_t = \rho y_{t-1} + \varepsilon_t \qquad (1)$$

where $\rho$ is the model parameter and $\varepsilon_t$ is white noise. Such a process is stationary provided $|\rho| < 1$.
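To make the stationarity condition concrete, here is a minimal Python sketch (the article's own code is MATLAB; Python is used here only for illustration). It simulates the first-order autoregression with the model parameter written as rho, once with rho = 0.5 and once with rho = 1, and compares the variance of the first and second halves of each series. The function names, the Gaussian noise, the zero starting value, and the sample size are all assumptions of this sketch.

```python
import random
import statistics

def simulate_ar1(rho, n, seed=0):
    """Simulate y_t = rho * y_{t-1} + eps_t with Gaussian white noise, starting from 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y[1:]  # drop the artificial starting value

def half_variances(series):
    """Variance of the first half and of the second half of a series."""
    mid = len(series) // 2
    return statistics.pvariance(series[:mid]), statistics.pvariance(series[mid:])

stationary = simulate_ar1(rho=0.5, n=5000, seed=1)
random_walk = simulate_ar1(rho=1.0, n=5000, seed=1)

print("rho = 0.5:", half_variances(stationary))
print("rho = 1.0:", half_variances(random_walk))
```

For rho = 0.5 both halves should show a variance near the theoretical value 1 / (1 - 0.5^2), roughly 1.33, while for rho = 1 the two halves generally disagree and the overall variance grows with the length of the series.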
Suppose we have stock prices for 252 trading days. How can we determine from the available observations whether such an autoregressive process is stationary or not? A standard hypothesis testing procedure is required.
We test the null hypothesis $H_0$: $\rho = 1$, where $\rho$ is the parameter of model (1) (i.e., the process is non-stationary), against the alternative hypothesis $H_1$: $|\rho| < 1$ (that is, the process is stationary).
Dickey-Fuller distribution
In fact, testing this hypothesis is not so simple: if the true value is $\rho = 1$, the t-statistic does not follow Student's distribution, and its distribution does not tend to the standard normal as the number of observations grows. In this case we cannot simply take a table of Student's critical values and test the hypothesis with it.
The t-statistic here is the ratio of the deviation of the estimated autoregressive parameter from its true value to the standard error of the estimate:

$$t = \frac{\hat{\rho} - \rho}{\operatorname{se}(\hat{\rho})}$$

where $\hat{\rho}$ is the estimate of the parameter of the autoregressive model (1) and $\operatorname{se}(\hat{\rho})$ is the standard error of the estimate $\hat{\rho}$. The estimate $\hat{\rho}$ in the alternative model can be obtained by ordinary least squares (OLS).
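The t-statistic described above can be computed by hand for model (1). The following is a minimal Python sketch, not the article's MATLAB code: it fits the autoregression without a constant by OLS and forms the ratio of the deviation from the hypothesized value (1 under the null) to the standard error. The function names and the simulated random-walk prices are hypothetical and serve only as an illustration.

```python
import math
import random

def ar1_ols(y):
    """OLS fit of y_t = rho * y_{t-1} + eps_t (no constant).

    Returns (rho_hat, standard_error_of_rho_hat).
    """
    lagged, current = y[:-1], y[1:]
    sxx = sum(x * x for x in lagged)
    rho_hat = sum(x * c for x, c in zip(lagged, current)) / sxx
    residuals = [c - rho_hat * x for x, c in zip(lagged, current)]
    dof = len(current) - 1  # observations minus one estimated parameter
    sigma2 = sum(e * e for e in residuals) / dof
    return rho_hat, math.sqrt(sigma2 / sxx)

def t_statistic(y, rho_null=1.0):
    """Deviation of the estimate from the hypothesized value, in standard errors."""
    rho_hat, se = ar1_ols(y)
    return (rho_hat - rho_null) / se

# Illustrative data: a simulated random walk standing in for 252 closing prices.
rng = random.Random(42)
prices = [100.0]
for _ in range(251):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))

rho_hat, se = ar1_ols(prices)
print(rho_hat, se, t_statistic(prices))
```

The value returned by `t_statistic` is what gets compared with a critical value; the point of the article is that the critical value must not come from Student's table.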
Wayne Fuller was the first to point out, in 1976, that something was off with the t-statistic. Then in 1979 he and David Dickey wrote a curious article titled “Distribution of the Estimators for Autoregressive Time Series with a Unit Root”.
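It is almost impossible to work through it with a sober head, but that is where they presented the distribution of the t-statistic under $H_0$, that is, when $\rho = 1$ (it became known as the Dickey-Fuller statistic), for equation (1) and its two modifications:

$$y_t = \alpha + \rho y_{t-1} + \varepsilon_t \qquad (2)$$

$$y_t = \alpha + \beta t + \rho y_{t-1} + \varepsilon_t \qquad (3)$$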
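For equation (1), the Dickey-Fuller distribution is:

$$t \xrightarrow{d} \frac{\tfrac{1}{2}\left(W(1)^2 - 1\right)}{\sqrt{\int_0^1 W(r)^2\,dr}}$$

where $t$ is the t-statistic for process (1) and $W$ is a standard Wiener process.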
The critical values of the Dickey-Fuller statistic are tabulated in the book “Introduction to Statistical Time Series”. Thus, to test an autoregressive process for stationarity, use the standard hypothesis testing procedure, except that the table of critical values for Student's distribution is replaced with a table of critical values for the Dickey-Fuller distribution.
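The difference between the two tables can be seen with a small Monte Carlo experiment. The Python sketch below (an illustration, not the article's MATLAB code) simulates pure random walks, computes the t-statistic of model (1) for each, and takes the empirical 5% quantile. The function names, the sample size of 100, the 2000 replications, and the Gaussian noise are assumptions of this sketch.

```python
import math
import random

def df_t_statistic(y):
    """t-statistic for the unit-root null in y_t = rho * y_{t-1} + eps_t (no constant)."""
    lagged, current = y[:-1], y[1:]
    sxx = sum(x * x for x in lagged)
    rho_hat = sum(x * c for x, c in zip(lagged, current)) / sxx
    sse = sum((c - rho_hat * x) ** 2 for x, c in zip(lagged, current))
    se = math.sqrt(sse / (len(current) - 1) / sxx)
    return (rho_hat - 1.0) / se

def simulate_df_quantile(level=0.05, n_obs=100, n_rep=2000, seed=0):
    """Empirical lower quantile of the t-statistic when the data are a random walk."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_rep):
        y = [0.0]
        for _ in range(n_obs):
            y.append(y[-1] + rng.gauss(0.0, 1.0))
        stats.append(df_t_statistic(y))
    stats.sort()
    return stats[int(level * n_rep)]

critical_5pct = simulate_df_quantile()
print(critical_5pct)
```

The simulated quantile should land in the neighborhood of -1.95, the tabulated 5% value for this model, rather than near the one-sided normal value of about -1.645. Using the normal table would therefore reject the unit-root hypothesis too often.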
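It is also important to note that equations (1), (2) and (3) can be rewritten in the form:

$$\Delta y_t = \delta y_{t-1} + \varepsilon_t \qquad (4)$$

$$\Delta y_t = \alpha + \delta y_{t-1} + \varepsilon_t \qquad (5)$$

$$\Delta y_t = \alpha + \beta t + \delta y_{t-1} + \varepsilon_t \qquad (6)$$

where $\delta = \rho - 1$ and $\Delta y_t = y_t - y_{t-1}$. Processes (4), (5) and (6) can be estimated and tested with $H_0$: $\delta = 0$, which is equivalent to testing the hypothesis $\rho = 1$. Consequently, the Dickey-Fuller statistic lets you check for stationarity not only the process itself but also its first-order differences.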
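The equivalence of the two forms is easy to check numerically. In the Python sketch below (an illustration only), the coefficient of the differenced form (4) is called delta and the parameter of form (1) is called rho; these names, the helper function, and the made-up price series are assumptions of the sketch. Regressing the level on its lag and regressing the first difference on the same lag give estimates that differ by exactly 1.

```python
def ols_no_constant(x, z):
    """Slope of the regression z_t = b * x_t + e_t without a constant."""
    return sum(a * b for a, b in zip(x, z)) / sum(a * a for a in x)

prices = [10.0, 10.4, 10.1, 10.9, 11.3, 11.0, 11.8]  # made-up illustrative prices
lagged = prices[:-1]
current = prices[1:]
diffs = [c - l for l, c in zip(lagged, current)]  # first-order differences

rho_hat = ols_no_constant(lagged, current)  # form (1): level on lagged level
delta_hat = ols_no_constant(lagged, diffs)  # form (4): difference on lagged level

print(rho_hat, delta_hat, rho_hat - 1.0)
```

Since the residuals of the two regressions coincide as well, the t-statistic for delta = 0 equals the t-statistic for rho = 1, which is why packages usually estimate the differenced form.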
About the Dickey-Fuller test
The Dickey-Fuller test is available in all standard packages, so the stock prices obtained at the data collection stage can be checked for stationarity in, for example, MATLAB. The code below connects to the Microsoft SQL Server database (where the prices and tickers are stored) and creates two arrays: the first holds the prices, the second holds only those tickers for which price data is available:
% Connect to the SQL Server database that stores tickers and prices
conn = database.ODBCConnection('uXXXXXX.mssql.masterhost.ru', 'uXXXXXX', 'XXXXXXXXXX');

% Load all price records: PriceId, StockId, Date, Price
curs = exec(conn, 'SELECT ALL PriceId, StockId, Date, Price FROM StockPrices');
curs = fetch(curs);
data = curs.Data;

% Build a query for only those stocks that actually have price data
idsArr = unique(cell2mat(data(:,2)));
sqlquery = 'SELECT ALL StockId, ShortName, Code FROM Stocks WHERE StockId IN (';
for i = 1:length(idsArr)
    if i == length(idsArr)
        sqlquery = strcat(sqlquery, int2str(idsArr(i)), ')');
    else
        sqlquery = strcat(sqlquery, int2str(idsArr(i)), ',');
    end
end

% Load the matching tickers: StockId, ShortName, Code
curs = exec(conn, sqlquery);
curs = fetch(curs);
names = curs.Data;
close(conn);
The Dickey-Fuller test is performed with the adftest function, which takes a one-dimensional time series as input and returns a logical 1 if the null hypothesis is rejected in favor of the alternative, and 0 otherwise. We run the Dickey-Fuller test at the 5% significance level for a model of the form (1):
for i = 1:length(names)
The program rejects the null hypothesis in favor of the alternative model 5 times. Let us plot these time series:
for i = 1:length(stat)
    % Rows of the price table that belong to the i-th selected stock
    indexes = find(cell2mat(data(:,2)) == cell2mat(names(stat(i),1)));
    figure
    plot(datetime(data(indexes,3)), cell2mat(data(indexes,4)))
    legend(names(stat(i),3));
end
Let's look at the chart of the price change of one of the shares.
Here it can be seen that the time series of the stock price is not stationary.
Next we compute the first-order differences of this time series.
It seems that first-order differences for a time series of stock prices do satisfy the stationarity condition.
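The visual impression can be backed by the same test applied twice: once to the price levels and once to their first differences. The Python sketch below is only an illustration of that idea, not the article's MATLAB code. The helper names, the simulated random-walk prices, and the approximate 5% critical value of -1.95 for the model without a constant are assumptions of the sketch.

```python
import math
import random

CRITICAL_5PCT = -1.95  # assumed approximate 5% Dickey-Fuller critical value, no-constant model

def df_t_statistic(y):
    """t-statistic for the unit-root null in y_t = rho * y_{t-1} + eps_t (no constant)."""
    lagged, current = y[:-1], y[1:]
    sxx = sum(x * x for x in lagged)
    rho_hat = sum(x * c for x, c in zip(lagged, current)) / sxx
    sse = sum((c - rho_hat * x) ** 2 for x, c in zip(lagged, current))
    se = math.sqrt(sse / (len(current) - 1) / sxx)
    return (rho_hat - 1.0) / se

def first_differences(y):
    """Return the series y_t - y_{t-1}."""
    return [b - a for a, b in zip(y, y[1:])]

def rejects_unit_root(y, critical=CRITICAL_5PCT):
    """True if the unit-root null is rejected in favor of stationarity."""
    return df_t_statistic(y) < critical

# Simulated random walk standing in for one year of closing prices.
rng = random.Random(7)
prices = [100.0]
for _ in range(251):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))

increments = first_differences(prices)
print("levels rejected:", rejects_unit_root(prices))
print("increments rejected:", rejects_unit_root(increments))
```

A stock is a candidate for "non-stationary with stationary increments" when the test does not reject the null for the levels but does reject it for the increments.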
NYSE Results
The same study was carried out for the American stock market, namely the New York Stock Exchange. The list of tickers was taken from the NASDAQ website; it currently contains 2,714 usable tickers. Price data came from Yahoo Finance. Price data for 2016 was available for 2,647 tickers, and testing for stationarity yielded 26 stocks with stationary increments.
Conclusions
In stock markets there are many assets whose price is a non-stationary process with stationary increments. The presence of such processes provides a basis for further research and for a stable profit, but we will talk about that next time.
What to read on the topic?
Magnus, Ya. R., Katyshev, P. K., Peresetsky, A. A. Econometrics. Initial course. Moscow: Delo, 2004. 576 p.
This is a very good econometrics textbook, no worse than the foreign ones, and well written, so you can actually work through it.
UPD.
Analytics on stocks with stationary increments on the Moscow Exchange for 2017.