Can you Deep Learn the Stock Market? PowerPoint Presentation


Can you Deep Learn the Stock Market? Presentation Transcript

Slide 1 - Can you Deep Learn the Stock Market? Gaetan Lion, March 20, 2022
Slide 2 - 2 Introduction Objectives: We will test whether: Sequential Deep Neural Networks (DNNs) can predict the stock market better than OLS regression; DNNs using smooth Rectified Linear activation functions perform better than ones using Sigmoid (Logit) activation functions. Data: Quarterly data from 1959 Q2 to 2021 Q3. All variables are fully detrended, as quarterly % change or first-differenced in % (for interest-rate variables). The models use standardized variables; predictions are converted back into quarterly % change. Data sources are FRED for the economic variables and the Federal Reserve H.15 release for interest rates. Software used for DNNs: the R neuralnet package, with a customized function added to use a smooth ReLU (SoftPlus) activation function.
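The detrending and standardization pipeline described above can be sketched in Python with pandas (the function and series names here are illustrative, not taken from the deck):

```python
import pandas as pd

def prepare_quarterly(levels: pd.Series, is_rate: bool = False) -> pd.Series:
    """Detrend a quarterly series: % change for level variables,
    first difference (in percentage points) for interest-rate variables,
    then standardize (z-score) for model input."""
    if is_rate:
        detrended = levels.diff()              # first difference, in %
    else:
        detrended = levels.pct_change() * 100  # quarterly % change
    detrended = detrended.dropna()
    return (detrended - detrended.mean()) / detrended.std()

# Illustrative data, not the actual FRED series
sp500 = pd.Series([100.0, 104.0, 102.0, 110.0, 108.0])
z = prepare_quarterly(sp500)
print(z.round(3).tolist())
```

Predictions from models fit on these z-scores would then be rescaled back into quarterly % change, as the slide notes.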
Slide 3 - The underlying OLS Regression model 3
Slide 4 - 4 The best underlying OLS Regression model After testing many macroeconomic variables (interest rates, monetary policy (QE), fiscal variables, and many others), the best OLS regression included the following variables, in order of predominant selection: Consumer Sentiment (U of Michigan); Housing starts; Yield curve (5-Year Treasury minus Federal Funds rate); Real GDP growth.
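A minimal sketch of fitting a four-variable OLS regression of this form, on synthetic stand-in data rather than the actual FRED/H.15 series:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 250  # roughly the number of quarters from 1959 Q2 to 2021 Q3
# Synthetic standardized predictors: sentiment, housing starts, yield curve, RGDP
X = rng.standard_normal((n, 4))
beta_true = np.array([0.4, 0.35, 0.1, 0.1])       # sentiment dominates, per the deck
y = X @ beta_true + rng.standard_normal(n) * 0.8  # noisy standardized target

# OLS via least squares, with an intercept column
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ coef
r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
print(coef.round(2), round(r2, 3))
```

The coefficient magnitudes and R Square here are artifacts of the synthetic data, chosen only to mirror the deck's qualitative ordering of the variables.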
Slide 5 - 5 Explanatory logic of OLS Regression to estimate and predict the S&P 500 level Consumer Sentiment is by far the most predominant variable. This is supported by the behavioral finance (Richard Thaler) literature. Housing Starts (the 2nd variable) is supported by the research of Edward E. Leamer, who argues that the housing sector is a leading indicator of overall economic activity, which in turn impacts the stock market. Next, the Yield Curve (5-Year Treasury minus Federal Funds) and economic activity (RGDP growth) are well-established exogenous variables that influence the stock market. Neither is quite statistically significant, and their influence is much smaller than that of the first two variables. Nevertheless, they add much explanatory logic to our OLS regression fitting the S&P 500.
Slide 6 - Relationships between the S&P 500 and the independent variables 6
Slide 7 - 7 Scatter Plot Matrix of Variables The Yield curve has a surprisingly low correlation with the S&P 500 quarterly % change. Otherwise, the three other independent variables have material correlations with the S&P 500. There is no multicollinearity between the X variables, as their respective correlations are well below standard multicollinearity thresholds.
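The multicollinearity screen described here boils down to checking the largest absolute pairwise correlation among the X variables; a rough sketch (the 0.7 cutoff is an illustrative threshold, not the deck's):

```python
import numpy as np

def max_pairwise_corr(X: np.ndarray) -> float:
    """Largest absolute off-diagonal correlation among the predictors."""
    corr = np.corrcoef(X, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.abs(off_diagonal).max())

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 4))  # four independent synthetic predictors
print(max_pairwise_corr(X))        # well below a 0.7 multicollinearity cutoff
```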
Slide 8 - 8 A closer look: Consumer Sentiment, Housing Start Both variables have a correlation around 0.4 with the S&P 500 quarterly % change. As shown, a 0.4 correlation is associated with much randomness. The data points show a wide dispersion around the estimated regression trend line.
Slide 9 - 9 A closer look: Yield Curve, and RGDP Same comment as on the previous slide. Also, you can see that the relationship between the S&P 500 and the Yield Curve (on the left) is the weakest, as the slope of the regression trendline is almost flat (close to zero).
Slide 10 - 10 A quick word about DNNs Activation Functions
Slide 11 - 11 Common DNNs Activation Functions Until around 2017, the preferred DNN activation function was the Sigmoid or Logistic one, as it gives an implicit probabilistic weight to the Yes/No loading of a node or neuron. Soon after, however, the Rectified Linear Unit (ReLU) became the preferred DNN activation function. We will argue that SoftPlus, also called smooth ReLU, should be considered a superior alternative to ReLU. See further explanation on the next slide.
Slide 12 - 12 The Sigmoid or Logistic Activation Function There is nothing wrong with the Sigmoid function per se. The problem occurs when you take the first derivative of this function: its values range only from 0 to 0.25, so backpropagated gradients are sharply compressed at each step. In iterative DNN models, the output of one hidden layer becomes the input for the next layer, and this repeated compression from one layer to the next can generate gradient values that converge close to zero. This problem is called the “vanishing gradient” problem. We will see that in our situation, this problem is not material.
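The compression the slide describes comes from the Sigmoid's first derivative, which peaks at 0.25; a quick numerical check:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

x = np.linspace(-6, 6, 1001)
print(float(sigmoid_deriv(x).max()))  # 0.25
```

Chaining several such derivatives in backpropagation multiplies these sub-0.25 factors together, which is exactly how gradients can vanish in deeper networks.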
Slide 13 - 13 ReLU and smooth ReLU or SoftPlus Activation Functions SoftPlus appears superior to ReLU because it captures the weights of many more neurons’ features, as it does not zero out features with an input value < 0. Also, it generates a continuous set of derivative values ranging from 0 to 1, whereas ReLU derivative values are limited to a binary outcome (0 or 1).
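The contrast between the two activation functions can be verified directly (a minimal sketch; note that the derivative of SoftPlus is exactly the Sigmoid):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # zeroes out all negative inputs

def softplus(x):
    # Smooth ReLU: log(1 + e^x), written in a numerically stable form
    return np.logaddexp(0.0, x)

def softplus_deriv(x):
    # Derivative of SoftPlus is the Sigmoid: continuous values in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))                 # negative input zeroed out
print(softplus(x).round(3))    # negative input kept, small but nonzero
```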
Slide 14 - 14 The Models
Slide 15 - 15 The DNNs structure One input layer with 4 independent variables: Consumer Sentiment, Housing Starts, Yield Curve, and RGDP. Two hidden layers: the first with 3 nodes, the second with 2 nodes. The activation function for the two hidden layers is SoftPlus for the 1st DNN model and Sigmoid for the 2nd. One output layer with one node: the dependent variable, the S&P 500 quarterly % change. The output layer has a linear activation function. The DNN loss function minimizes the sum of squared errors (SSE), same as for OLS. The balance of the DNN structure is appropriate: it is recommended that the hidden layers have fewer nodes than the input layer and more nodes than the output layer. Given that, the choice of nodes at each layer is just about predetermined. More extensive DNNs would not have worked anyway, because the DNNs as structured already had trouble converging towards a solution given an acceptable error threshold.
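The 4-3-2-1 architecture can be sketched as a plain forward pass; the weights below are random placeholders, since the actual models were fit with the R neuralnet package:

```python
import numpy as np

def softplus(x):
    return np.logaddexp(0.0, x)  # smooth ReLU activation

def forward(x, params):
    """Forward pass: 4 inputs -> 3 hidden -> 2 hidden (SoftPlus) -> 1 linear output."""
    W1, b1, W2, b2, W3, b3 = params
    h1 = softplus(x @ W1 + b1)
    h2 = softplus(h1 @ W2 + b2)
    return h2 @ W3 + b3  # linear output layer, as in the deck

rng = np.random.default_rng(42)
params = (rng.standard_normal((4, 3)), np.zeros(3),
          rng.standard_normal((3, 2)), np.zeros(2),
          rng.standard_normal((2, 1)), np.zeros(1))
x = rng.standard_normal((5, 4))  # 5 quarters of 4 standardized inputs
y_hat = forward(x, params)
print(y_hat.shape)  # (5, 1)
```

Training would adjust `params` to minimize SSE, matching the loss function the slide specifies; swapping `softplus` for a sigmoid gives the second model.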
Slide 16 - 16 The 3 Models’ fit of the historical data Despite the mentioned limitation of the Sigmoid activation function, the SoftPlus and Sigmoid DNN models perform virtually identically. And, they both fit the complete historical data quite a bit better than the OLS Regression model. However, as we will soon see, none of the models fit the historical data particularly well.
Slide 17 - 17 The three models’ fit of the historical data: scatter plots Visually, you can’t distinguish any difference in tightness of fit between the two DNNs (SoftPlus on the left, Sigmoid in middle). As mentioned, the Sigmoid “vanishing gradient” problem did not materialize. (R Squares: SoftPlus 0.415, Sigmoid 0.412, OLS Regression 0.27.)
Slide 18 - 18 The DNN models’ fit of the historical data: time series plots Again, you can’t visually distinguish between the SoftPlus (top) vs. the Sigmoid (bottom) model.
Slide 19 - 19 The OLS Regression model fit of the historical data: time series plots The OLS Regression model fit is weaker than the two DNNs. This is by definition: the DNNs use so many non-linear segmentations of the variable relationships that they are bound to generate a superior fit of historical data. As we will see, the DNNs’ superior fit does not translate into superior out-of-sample predictions.
Slide 20 - 20 All model estimates (or fit) time series on the same graph
Slide 21 - 21 Same visual data as on previous slide but disaggregated The DNN models capture a bit more of the volatility in the S&P 500 quarterly % change. The standard deviation of Actuals is 7.4%; for the DNNs it is about 4.8%; and for the OLS regression it is 3.8%.
Slide 22 - 22 How do the models fit abrupt changes in the S&P 500, defined as absolute changes of > 14%? The models do not do a very good job of picking up these outliers. The performance of the two DNNs is indistinguishable. And, it is only incrementally better than the OLS Regression model.
Slide 23 - 23 Testing the 3 models Can these 3 models predict? By predicting we mean whether they can generate decent S&P 500 quarterly % change estimates based on “new data” not included in the training of the models.
Slide 24 - 24 Three different Testing Periods Each testing period is 12 quarters long. And, it is a true Hold Out or out-of-sample test. The training data consists of all the earlier data from 1959 Q2 up to the onset of the Hold Out period. Thus, for the Dot.com period, the training data runs from 1959 Q2 to 2000 Q1. The quarters highlighted in orange denote recessions. We call the three periods, Dot.com, Great Recession, and COVID periods as each respective period covers the mentioned events.
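The expanding-window hold-out scheme described above can be sketched with simple index arithmetic (1959 Q2 through 2000 Q1 is 164 quarters, so the Dot.com hold-out starts at index 164):

```python
import numpy as np

def holdout_split(X, y, holdout_start: int, horizon: int = 12):
    """True out-of-sample split: train on everything before the hold-out period,
    test on the next `horizon` quarters of new data."""
    X_train, y_train = X[:holdout_start], y[:holdout_start]
    X_test = X[holdout_start:holdout_start + horizon]
    y_test = y[holdout_start:holdout_start + horizon]
    return X_train, y_train, X_test, y_test

# Illustrative: 250 quarters of placeholder data, Dot.com split at index 164
X = np.arange(250 * 4, dtype=float).reshape(250, 4)
y = np.arange(250, dtype=float)
X_tr, y_tr, X_te, y_te = holdout_split(X, y, holdout_start=164)
print(len(y_tr), len(y_te))  # 164 12
```

The Great Recession and COVID splits would reuse the same function with later `holdout_start` indices, so each model is always trained only on data preceding its test window.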
Slide 25 - 25 Testing Performance Part 1: Dot.com period The performance of all 3 models during the Dot.com period is really bad. None of them captured the severe market downturn over this entire period. But, at the margin, notice that the OLS model performed best. We are showing the model predictions on an indexed basis where Period 0, 2000 Q1, is equal to 100. The next 12 quarters represent the 12 quarterly periods of the forecast within this Hold Out test.
Slide 26 - 26 Testing Performance Part 2: Dot.com period Here we are showing the annual % change in the S&P 500 in the 1st, 2nd, and 3rd year of projections. And, we are aggregating the predictions by model, so we see what the “skyline” looks like for each model. As shown, for all 3 models, the predictions are really pretty bad. None of the models captured the Dot.com protracted long market correction.
Slide 27 - 27 Testing Performance Part 3: Dot.com period This is the same visual data as shown on the previous slide, except that the data is clustered by Years instead of by models. The conclusion is the same. All three models predicted poorly over the Dot.com period.
Slide 28 - 28 Testing Performance Part 4: Dot.com period This compares the Goodness-of-fit metrics for the Training model vs. the same metrics for the 12-quarter Testing period, consisting of new data. Surprisingly, in this case the R Square is often higher during the Testing period than the Training one. This is unusual. Yet, despite those occasional higher R Squares during the Testing periods, the predictions were rather dismal. Focusing on the OLS Regression is interesting. It has a surprisingly high R Square of 0.76. So, it picked up the directional changes of the S&P 500 reasonably well. However, it grossly overestimated the average quarterly change at +1.3% vs. an Actual of –2.7% during this Dot.com period. As a result, despite a surprisingly high R Square, the OLS Regression generated a really poor prediction. Yet, it was still better than the DNNs.
Slide 29 - 29 Testing Performance Part 5: Dot.com period Here we are comparing the R Square and the Mean Absolute Error (MAE) during the Training period vs. the Testing one. By doing so, we derive an Overfit multiple. If this Overfit multiple is > 1, it means a model may be overfit; otherwise not. Surprisingly, when looking at R Squares, none of the models suffer from any material overfitting. When we look at MAEs, the Overfit multiples are > 1. This suggests that on this count, the models could be considered overfit. However, this may be simply due to the greater data volatility during the Testing period. The main takeaway is that the DNNs, despite their greater complexity, performed worse than the OLS Regression.
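One plausible reading of the MAE-based Overfit multiple is simply the testing error divided by the training error; a sketch with made-up error numbers, not the deck's actuals:

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error between actuals and predictions."""
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(y_hat))))

def overfit_multiple(train_mae: float, test_mae: float) -> float:
    """Testing MAE / Training MAE; a value > 1 suggests possible overfitting."""
    return test_mae / train_mae

# Illustrative errors only
train_err = mae([1.0, -2.0, 3.0], [0.5, -1.5, 2.0])   # small in-sample errors
test_err = mae([4.0, -5.0, 6.0], [2.0, -2.0, 3.0])    # larger out-of-sample errors
print(round(overfit_multiple(train_err, test_err), 2))  # 4.0
```

As the slide cautions, a multiple above 1 can also reflect greater volatility in the testing window rather than genuine overfitting.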
Slide 30 - 30 Testing Performance Part 1: Great Recession period The models’ projections look quite a bit better than during the Dot.com period. At least they are directionally correct. All three models convey a market downturn during the Great Recession.
Slide 31 - 31 Testing Performance Part 2: Great Recession period The “skylines” are quite a bit better for this Great Recession period than the ones for the Dot.com period. The skylines of the Sigmoid and OLS Regression models are more convergent with Actuals than the SoftPlus model’s.
Slide 32 - 32 Testing Performance Part 3: Great Recession period Same comment as on the previous slide.
Slide 33 - 33 Testing Performance Part 4: Great Recession period Focusing on the Testing period, the R Square and MAE both show fairly material deterioration. This is expected, since the models have not been trained on the new data, as specified. However, the projections are better than during the Dot.com period because the models’ predicted average quarterly % changes in the S&P 500 are at least of the same sign as the Actual data. The performance of the DNNs is not readily differentiable from the OLS one. Again, no gain from the added complexity. Note that the SoftPlus model, with the supposedly better activation function, has the worst R Square and MAE.
Slide 34 - 34 Testing Performance Part 5: Great Recession period Now, we see rather stronger cases of model overfitting. And, the overfitting is typically more pronounced for the DNNs, just as we expected.
Slide 35 - 35 Testing Performance Part 1: COVID period The SoftPlus model exaggerated the market downturn in 2020 Q1. As a result, its predictions out to 2021 Q3 ended up way too low. The Sigmoid pretty much missed all the market turns, but ended up generating the best begin-point to end-point prediction. The OLS model tracked Actuals best up to 2020 Q1, but thereafter it missed much of the strength of the spectacular Bull market over the remaining quarters. On a relative basis, these projections are not quite as good as during the Great Recession period, but they are better than during the Dot.com period.
Slide 36 - 36 Testing Performance Part 2: COVID period Looking at these skylines, none of them look visually convergent with Actuals.
Slide 37 - 37 Testing Performance Part 3: COVID period Same comment as on previous slide.
Slide 38 - 38 Testing Performance Part 4: COVID period During the Testing period, all models underestimate the average pace of the market. They all underestimate by a wide margin the bull market strength during the 3rd year.
Slide 39 - 39 Testing Performance Part 5: COVID period Not much overfitting, as specified. But, as expected overfitting if any is lesser within the OLS Regression than within the DNNs.
Slide 40 - 40 Testing Performance just looking at Averages None of the models do that well on this count. As mentioned elsewhere, the simpler OLS Regression model is typically competitive with the more complex DNNs models.
Slide 41 - 41 Testing Performance looking at Averages and Standard Deviation Given DNNs’ structures, you would expect DNNs to better capture the volatility (standard deviation) of a Y variable than the OLS Regression. But that is not always the case.
Slide 42 - Why do the Models not perform well? 42
Slide 43 - 43 The models do not fit the historical data well enough to predict well
Slide 44 - 44 The models’ weak historical fit is due to the variable relationships being very unstable The graphs show 12-quarter correlations between the Y and X variables. Correlations are very volatile; they often flip sign.
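Rolling 12-quarter correlations of this kind can be computed with pandas; the synthetic series below is built so that the X-Y relationship flips sign midway, mimicking the instability the slide describes:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 120  # 120 illustrative quarters
x = pd.Series(rng.standard_normal(n))
# Relationship flips sign at the midpoint, plus noise
sign = np.where(np.arange(n) < n // 2, 1.0, -1.0)
y = pd.Series(sign * x.to_numpy() + rng.standard_normal(n) * 0.5)

rolling_corr = y.rolling(12).corr(x)  # 12-quarter rolling correlation
print(round(float(rolling_corr.iloc[30]), 2),
      round(float(rolling_corr.iloc[100]), 2))
```

A plot of `rolling_corr` over time would show exactly the sign-flipping pattern the slide's graphs display for the actual macro variables.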
Slide 45 - 45 Correlations during Training and Testing are very different Correlations between Y and the Xs are very different during the respective Training and Testing periods. Given that, the models have no chance of predicting reasonably accurately.
Slide 46 - 46 Considerations Macroeconomic relationships are far too unstable to facilitate the development of effective predictive models. Even fitting historical data is already challenging. DNNs provide no advantage whatsoever over simpler OLS Regression. The DNNs’ promoted capacity for capturing non-linear relationships is more likely to overfit on randomness. These models’ inability to predict the stock market is probably not due to any missing confounding variables, but rather to unstable variable relationships and pervasive data randomness. More complex DNNs with more variables, more hidden layers, and more nodes would probably not perform better. They may not even be feasible: the presented DNNs already had trouble converging towards a solution.