In this contribution we use Yahoo Finance (yfinance) and wiki data to download the daily stock prices for a list of companies, and NetworkX to analyze how they relate.
We examine how the companies are related to each other in average daily stock price. Data are downloaded from the wiki and yfinance, and missing values are imputed. A correlation graph of average daily stock prices among the companies in a given time period is provided. Company clusters are displayed based on average daily stock price. Dependencies between average daily stock prices in the given time period are displayed for pairs whose correlation coefficient passes a chosen threshold.
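As a rough illustration of the imputation and correlation steps, here is a minimal sketch using pandas. The tickers and prices are made up, and the fill strategy (last observation carried forward, then next observation carried backward) is an assumption; the original notebook may impute differently.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices for three hypothetical tickers (not real market data).
dates = pd.date_range("2023-01-02", periods=6, freq="B")
prices = pd.DataFrame(
    {
        "AAA": [10.0, 11.0, np.nan, 13.0, 14.0, 15.0],
        "BBB": [20.0, 22.0, 24.0, np.nan, 28.0, 30.0],
        "CCC": [9.0, 8.0, 7.0, 6.0, np.nan, 4.0],
    },
    index=dates,
)

# Assumed imputation: carry the last observation forward,
# then fill any leading gap with the next observation.
filled = prices.ffill().bfill()

# Pairwise Pearson correlation between the companies' price series.
corr = filled.corr()
print(corr.round(2))
```

In this toy data AAA and BBB rise together while CCC falls, so the first pair is positively correlated and the pairs involving CCC are negatively correlated.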
1. Theory
We can create a threshold slider and a network figure to display the multicollinearity, and we can set the optimal node distance to k=2.5. This contribution displays the result with the help of the Fruchterman-Reingold layout algorithm. The Fruchterman-Reingold layout is force-directed: it searches for an equilibrium between attractive forces, which pull connected nodes together, and repulsive forces, which push all nodes apart. The layout is commonly used for social networks.
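The thresholding and layout steps could look roughly like the sketch below. The correlation values, the ticker names and the helper `correlation_graph` are hypothetical; `nx.spring_layout` is NetworkX's Fruchterman-Reingold implementation, and its `k` parameter is the optimal distance between nodes.

```python
import networkx as nx
import numpy as np

# Hypothetical correlation matrix for four tickers (illustrative values only).
tickers = ["AAA", "BBB", "CCC", "DDD"]
corr = np.array(
    [
        [1.0, 0.9, 0.2, 0.7],
        [0.9, 1.0, 0.1, 0.8],
        [0.2, 0.1, 1.0, 0.3],
        [0.7, 0.8, 0.3, 1.0],
    ]
)


def correlation_graph(corr, labels, threshold):
    """Build a graph with an edge for every pair whose |correlation| >= threshold."""
    graph = nx.Graph()
    graph.add_nodes_from(labels)
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= threshold:
                graph.add_edge(labels[i], labels[j], weight=float(corr[i, j]))
    return graph


graph = correlation_graph(corr, tickers, threshold=0.6)

# Fruchterman-Reingold positions; seed makes the layout repeatable.
positions = nx.spring_layout(graph, k=2.5, seed=42)
print(sorted(graph.edges()))
```

A slider would simply call `correlation_graph` again with a new `threshold` and redraw; raising the threshold removes weaker edges, so only the most strongly related companies stay connected.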
Maximum spanning tree - a spanning tree whose weight is greater than or equal to the weight of every other spanning tree. We can find it by running a minimum spanning tree algorithm (Prim, Kruskal or Borůvka) after multiplying the edge weights by -1.
Kruskal's algorithm - repeatedly adds the next lowest-weight edge that will not form a cycle to the minimum spanning forest.
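The two definitions above can be combined in a short pure-Python sketch: Kruskal's algorithm with a union-find structure, applied to negated weights to obtain a maximum spanning tree. The function names, tickers and weights are illustrative assumptions; in practice `networkx.maximum_spanning_tree` (see the references) does the same job.

```python
def kruskal_min_spanning_tree(nodes, edges):
    """Return minimum spanning forest edges; edges are (u, v, weight) tuples."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the set representative, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    # Visit edges from lowest to highest weight.
    for u, v, weight in sorted(edges, key=lambda edge: edge[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


def maximum_spanning_tree(nodes, edges):
    """Negate weights, run Kruskal, then restore the original weights."""
    negated = [(u, v, -weight) for u, v, weight in edges]
    result = kruskal_min_spanning_tree(nodes, negated)
    return [(u, v, -weight) for u, v, weight in result]


# Hypothetical tickers and correlation weights, for illustration only.
nodes = ["AAA", "BBB", "CCC", "DDD"]
edges = [
    ("AAA", "BBB", 0.9),
    ("AAA", "CCC", 0.2),
    ("AAA", "DDD", 0.7),
    ("BBB", "CCC", 0.1),
    ("BBB", "DDD", 0.8),
    ("CCC", "DDD", 0.3),
]
mst = maximum_spanning_tree(nodes, edges)
print(mst)
```

Here the AAA-DDD edge is skipped because AAA and DDD are already connected through BBB, so adding it would close a cycle.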
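Triu indices (numpy.triu_indices) - the row and column indices of the upper triangular part of a matrix, returned as arrays. Because a correlation matrix is symmetric, these indices visit each pair of companies exactly once.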
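A small sketch of how `numpy.triu_indices` can list each company pair once. The matrix values and ticker names are made up; `k=1` is chosen here to skip the main diagonal, where every series correlates perfectly with itself.

```python
import numpy as np

# Hypothetical symmetric correlation matrix for three tickers.
tickers = ["AAA", "BBB", "CCC"]
corr = np.array(
    [
        [1.0, 0.9, 0.2],
        [0.9, 1.0, 0.4],
        [0.2, 0.4, 1.0],
    ]
)

# Row and column indices of the entries strictly above the diagonal.
rows, cols = np.triu_indices(len(tickers), k=1)

# One (ticker, ticker, correlation) triple per unordered pair.
pairs = [(tickers[i], tickers[j], float(corr[i, j])) for i, j in zip(rows, cols)]
print(pairs)
```

The resulting triples can be used directly as weighted edges when building the correlation graph.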
2. Python code
3. References
https://en.wikipedia.org/wiki/Minimum_spanning_tree
https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.tree.mst.maximum_spanning_tree.html
https://numpy.org/doc/stable/reference/generated/numpy.triu_indices.html
https://en.wikipedia.org/wiki/Kruskal%27s_algorithm
https://www.researchgate.net/figure/Force-functions-of-the-Fruchterman-Reingold-Layout-algorithm-The-vertical-line_fig6_256443143
https://www.sciencedirect.com/book/9780128177563/analyzing-social-media-networks-with-nodexl
https://medium.com/@kenanekici/visualizing-multicollinearity-in-python-b5feedc9b3f1
https://giphy.com/apps/giphycapture
https://python-bloggers.com/2020/10/corr-correlation/
https://en.wikipedia.org/wiki/Imputation_(statistics)
https://www.section.io/engineering-education/missing-values-in-time-series/#4-next-observation-carried-backwardnocb
https://towardsdatascience.com/4-techniques-to-handle-missing-values-in-time-series-data-c3568589b5a8
https://techhelpnotes.com/python-seaborn-heatmap-display-the-heatmap-only-if-values-are-above-given-threshold-2/
https://networkx.org/documentation/networkx-1.11/reference/generated/networkx.drawing.layout.fruchterman_reingold_layout.html
https://www.sciencedirect.com/topics/computer-science/reingold-layout