
Python - Network Multicollinearity

In this contribution we use Yahoo Finance (yfinance) and wiki data to download the daily stock prices for a list of companies, and networkx to display how those companies relate to each other.

We examine how the companies are related to each other in terms of average daily stock price. Data are downloaded from wiki and yfinance, and missing values are imputed. A correlation graph of the average daily stock price among the companies in a given time period is provided. Company clusters are displayed based on average daily stock price. Dependencies between average daily stock prices in the given time period are displayed when their correlation coefficient passes a chosen threshold.
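The imputation and correlation steps can be sketched on synthetic data. This is only an illustration under stated assumptions: the tickers `AAA`, `BBB`, `CCC`, the random-walk prices, the forward-fill/backward-fill imputation and the threshold of 0.8 are all hypothetical choices and are not taken from the original code.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices stand in for the downloaded data (hypothetical tickers).
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-02", periods=60, freq="B")
base = rng.normal(0, 1, size=60).cumsum()
prices = pd.DataFrame(
    {
        "AAA": 100 + base + rng.normal(0, 0.1, 60),
        "BBB": 50 + 0.5 * base + rng.normal(0, 0.1, 60),
        "CCC": 80 + rng.normal(0, 1, 60).cumsum(),
    },
    index=dates,
)

# Simulate gaps such as missing quotes.
prices.iloc[5, 0] = np.nan
prices.iloc[0, 1] = np.nan

# Assumed imputation: carry the last observation forward, then fill any
# leading gap with the next observation.
prices_filled = prices.ffill().bfill()

# Pearson correlation between the price series of each pair of companies.
corr = prices_filled.corr()

# Keep only the coefficients whose absolute value reaches the threshold.
threshold = 0.8  # assumed value
strong = corr.where(corr.abs() >= threshold)
print(strong)
```

Entries below the threshold become NaN in `strong`, so only the pairs that would be drawn as edges remain.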


1. Theory


We can create a threshold slider and a network figure to display the multicollinearity, and we can set the optimal node distance to k=2.5. The result is displayed with the help of the Fruchterman-Reingold layout algorithm. This force-directed layout searches for an equilibrium between the attractive forces that pull connected nodes together and the repelling forces that push nodes apart. The layout is commonly used for social networks.
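A minimal sketch of the layout step, assuming networkx's `spring_layout` (which implements the Fruchterman-Reingold algorithm) is the function used. The three-node graph, its weights and the seed are made up for illustration.

```python
import networkx as nx

# Toy graph with correlation-like edge weights (hypothetical values).
G = nx.Graph()
G.add_weighted_edges_from(
    [("AAA", "BBB", 0.9), ("BBB", "CCC", 0.85), ("AAA", "CCC", 0.4)]
)

# k sets the optimal distance between nodes; a fixed seed makes the
# layout reproducible between runs.
pos = nx.spring_layout(G, k=2.5, seed=42)

for node, (x, y) in pos.items():
    print(f"{node}: ({x:.2f}, {y:.2f})")
```

The returned `pos` dictionary maps each node to 2D coordinates and can be passed to a drawing function such as `nx.draw`.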


Maximum spanning tree - a spanning tree whose weight is greater than or equal to the weight of every other spanning tree. It can be found with Prim's, Kruskal's or Borůvka's algorithm after multiplying the edge weights by -1, which turns the problem into a minimum spanning tree (MST) search.
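The negation trick can be checked on a small example. The sketch below compares networkx's `maximum_spanning_tree` with `minimum_spanning_tree` run on the same graph with negated weights; the graph and its weights are hypothetical.

```python
import networkx as nx

# Toy weighted graph (hypothetical weights, all distinct).
G = nx.Graph()
G.add_weighted_edges_from(
    [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.3),
     ("C", "D", 0.7), ("B", "D", 0.2)]
)

# Direct computation of the maximum spanning tree.
max_tree = nx.maximum_spanning_tree(G, weight="weight", algorithm="kruskal")

# Same result via a minimum spanning tree on weights multiplied by -1.
H = nx.Graph()
H.add_weighted_edges_from((u, v, -w) for u, v, w in G.edges(data="weight"))
min_tree_neg = nx.minimum_spanning_tree(H, weight="weight", algorithm="kruskal")

max_edges = {frozenset(e) for e in max_tree.edges()}
neg_edges = {frozenset(e) for e in min_tree_neg.edges()}
print(max_edges == neg_edges)
```

Because all weights here are distinct, the spanning tree is unique and both approaches select the same three edges.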


Kruskal's algorithm - repeatedly adds the next lowest-weight edge that does not form a cycle to the minimum spanning forest.
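To make the cycle check concrete, here is a small self-contained sketch of Kruskal's algorithm using a union-find structure. The function name `kruskal_mst` and the example edges are illustrative only; in practice networkx provides this algorithm.

```python
def kruskal_mst(nodes, edges):
    """Return the edges of a minimum spanning forest as (u, v, weight) tuples."""
    # Each node starts in its own component.
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the component root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Visit edges from the lowest weight to the highest.
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        # Different roots mean the edge joins two components, so no cycle.
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree


edges = [("A", "B", 4), ("A", "C", 1), ("B", "C", 2), ("B", "D", 5), ("C", "D", 3)]
mst = kruskal_mst("ABCD", edges)
print(mst)  # [('A', 'C', 1), ('B', 'C', 2), ('C', 'D', 3)]
```

The edges A-B and B-D are skipped because their endpoints are already connected when they are considered.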


Triu indices - the indices of the upper triangular part of a matrix, returned as arrays (numpy.triu_indices).
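A correlation matrix is symmetric, so each company pair appears twice. The short sketch below uses `numpy.triu_indices` with `k=1` to visit every pair once and skip the diagonal; the 3x3 matrix is a made-up example.

```python
import numpy as np

# Hypothetical symmetric correlation matrix for three companies.
corr = np.array(
    [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.5],
     [0.2, 0.5, 1.0]]
)

# k=1 starts one step above the main diagonal, excluding self-correlations.
rows, cols = np.triu_indices(3, k=1)

pairs = list(zip(rows.tolist(), cols.tolist()))
values = corr[rows, cols]

print(pairs)   # [(0, 1), (0, 2), (1, 2)]
print(values)  # [0.9 0.2 0.5]
```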



2. Python code
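The original code for this section is not reproduced here. As a rough sketch of how the pieces from the theory section could fit together, the hypothetical helper `correlation_graph` below turns a correlation matrix into a networkx graph for a given threshold, which is the kind of function a threshold slider could call. The helper name, the 4x4 matrix, the use of absolute correlation as edge weight and the threshold of 0.8 are all assumptions.

```python
import networkx as nx
import numpy as np
import pandas as pd


def correlation_graph(corr: pd.DataFrame, threshold: float) -> nx.Graph:
    """Build a graph with one edge per pair whose |correlation| >= threshold."""
    names = list(corr.columns)
    graph = nx.Graph()
    graph.add_nodes_from(names)
    values = corr.to_numpy()
    # Upper triangle only: each pair once, diagonal excluded.
    rows, cols = np.triu_indices(len(names), k=1)
    for i, j in zip(rows, cols):
        r = float(values[i, j])
        if abs(r) >= threshold:
            # Assumed convention: weight is the absolute correlation,
            # the signed value is kept in a separate attribute.
            graph.add_edge(names[i], names[j], weight=abs(r), corr=r)
    return graph


# Hypothetical correlation matrix for four companies.
corr = pd.DataFrame(
    [[1.0, 0.9, 0.2, -0.85],
     [0.9, 1.0, 0.5, 0.1],
     [0.2, 0.5, 1.0, 0.3],
     [-0.85, 0.1, 0.3, 1.0]],
    index=list("ABCD"),
    columns=list("ABCD"),
)

G = correlation_graph(corr, threshold=0.8)
tree = nx.maximum_spanning_tree(G, weight="weight")
pos = nx.spring_layout(G, k=2.5, seed=0)
print(sorted(G.edges(data="corr")))
```

With this threshold only A-B and A-D survive, so node C is left isolated and the spanning "tree" is a forest over the connected part of the graph.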







3. References


https://en.wikipedia.org/wiki/Minimum_spanning_tree

https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.tree.mst.maximum_spanning_tree.html

https://numpy.org/doc/stable/reference/generated/numpy.triu_indices.html

https://en.wikipedia.org/wiki/Kruskal%27s_algorithm

https://www.researchgate.net/figure/Force-functions-of-the-Fruchterman-Reingold-Layout-algorithm-The-vertical-line_fig6_256443143

https://www.sciencedirect.com/book/9780128177563/analyzing-social-media-networks-with-nodexl

https://medium.com/@kenanekici/visualizing-multicollinearity-in-python-b5feedc9b3f1

https://giphy.com/apps/giphycapture

https://python-bloggers.com/2020/10/corr-correlation/

https://en.wikipedia.org/wiki/Imputation_(statistics)

https://www.section.io/engineering-education/missing-values-in-time-series/#4-next-observation-carried-backwardnocb

https://towardsdatascience.com/4-techniques-to-handle-missing-values-in-time-series-data-c3568589b5a8

https://techhelpnotes.com/python-seaborn-heatmap-display-the-heatmap-only-if-values-are-above-given-threshold-2/

https://networkx.org/documentation/networkx-1.11/reference/generated/networkx.drawing.layout.fruchterman_reingold_layout.html

https://www.sciencedirect.com/topics/computer-science/reingold-layout
