I was introduced to networks and graph theory by a TED Talk that I prepared for my former boss, who happens to be the Deputy Managing Director of the International Monetary Fund.
He talked about the results of a 2011 IMF paper on interconnectedness and clusters. I had to read the paper as I prepared his TED Talk, and that was my curious introduction to graph theory. So when the findings needed to be updated (and the IMF mentioned no intent of doing so), I thought I'd dabble in it along with a few other interested researchers.
I used the 2011 IMF paper as inspiration for my experiment rather than replicating it, and I changed the analysis in two ways. First, I looked only at trade networks. In this exercise I examined trade flows (exports plus imports) between one country and each of its trading partners.
Second, I added a time dimension. I used three snapshots: 1995, 2005, and 2015. Below is an outline of my data prep and analysis. One thing to note is that I didn't use Python. I used STATA (every IMF economist's best friend) and an easy, cool tool called Gephi. But, as always, my GitHub will have the full information.
My data source is UNCTAD. I was introduced to their merchandise trade matrix by a colleague here at the IMF. You can generally map how one country trades with 200+ countries (and country groups) for any product.
Data transformation (spelling fixes, data reshaping, etc.) was done in STATA. After all the magic of data transformation, you should be able to whip up a dataset in the following format (take note that "weight" is a number from 0 to 1, which describes how heavily the source country trades with the target country):
   Source      Target          weight
0  Algeria     Lithuania       0.01
1  Argentina   Australia       0.02
2  Argentina   Belgium         0.01
3  Argentina   Bolivia         0.01
4  Argentina   Brazil          0.18
5  Argentina   Canada          0.01
6  Argentina   Chile           0.07
7  Argentina   China           0.05
8  Argentina   ChinaHongKong   0.01
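For readers who prefer code to STATA, here is a minimal Python sketch of how a weight column like this could be built. It assumes the weight is each partner's share of the source country's total trade (exports plus imports), which is my reading of the description above and not a confirmed formula. The `flows` rows and the `trade_weights` helper are hypothetical, and the numbers are made up for illustration.

```python
from collections import defaultdict

# Hypothetical bilateral flows: (source, target, exports, imports).
# These figures are invented for illustration; they are not UNCTAD data.
flows = [
    ("A", "B", 60.0, 40.0),
    ("A", "C", 30.0, 20.0),
    ("A", "D", 25.0, 25.0),
]


def trade_weights(flows):
    """Return (source, target, weight) rows.

    Assumption: weight = (exports + imports) with the target, divided by the
    source country's total trade across all of its partners.
    """
    totals = defaultdict(float)
    for source, _target, exports, imports in flows:
        totals[source] += exports + imports

    rows = []
    for source, target, exports, imports in flows:
        if totals[source] > 0:  # skip countries with no recorded trade
            share = (exports + imports) / totals[source]
            rows.append((source, target, round(share, 2)))
    return rows


edges = trade_weights(flows)
for source, target, weight in edges:
    print(source, target, weight)
```

Under this definition the weights for a single source country sum to 1 (up to rounding), so they stay between 0 and 1.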
The thing to note here is that it will initially be a very dense network, as each country is connected to roughly 200 others. So, to simplify, I eliminated the connections with near-zero weights. For example, if Russia trades with the Cook Islands with a weight of 0.0001, that connection is eliminated. This leaves me with a network I can work with for analysis. Suggestion for further study: if one were to study small island states (and I have done work on them before too), one could filter on these near-zero weights and analyze them separately. But I don't cover them here.
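The pruning step can be sketched in a few lines of Python. The cutoff of 0.01 is an assumption on my part (the post only says "near-zero"), and `prune_edges` is a hypothetical helper name. The Russia and Cook Islands row uses the example weight from the text, and the two Argentina rows come from the table above.

```python
def prune_edges(edges, threshold=0.01):
    """Split edges into (kept, dropped) around a weight cutoff.

    The default threshold is an assumed value, not the one used in the post.
    """
    kept = [e for e in edges if e[2] >= threshold]
    dropped = [e for e in edges if e[2] < threshold]
    return kept, dropped


edges = [
    ("Russia", "Cook Islands", 0.0001),
    ("Argentina", "Brazil", 0.18),
    ("Argentina", "Chile", 0.07),
]

kept, dropped = prune_edges(edges)
print("kept:", kept)
print("dropped:", dropped)
```

Returning the dropped edges as well keeps the near-zero links around, which is handy for the separate small-island-state analysis suggested above.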
I used this neat little freeware called Gephi. It has a decent number of graph layouts and a host of network statistics available for analysis, which you can test and play around with. It took me a number of YouTube videos to come up with my desired visualization, including making sure that Gephi reads my data as an "undirected" graph.
In a directed graph each connection runs one way, but since my weights combine flows in both directions (exports and imports), the graph had to be undirected. One can figure this out by playing with the tool for a while, plus watching YouTube tutorials. I suggest the one by Jen Golbeck. At one point I even tracked her down at UMD and emailed her!
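To make the directed versus undirected distinction concrete, here is a small Python sketch that collapses a directed edge list into an undirected one. Rows for A to B and B to A are merged into a single A–B edge. Summing the two weights is an assumption chosen for illustration (averaging or taking the maximum would be equally defensible), and `to_undirected` is a hypothetical helper, not something Gephi exposes.

```python
from collections import defaultdict


def to_undirected(edges):
    """Merge (source, target, weight) rows into undirected edges.

    Each pair is keyed by its sorted node names, so A->B and B->A land on
    the same key. Weights of merged rows are summed (an assumed choice).
    """
    merged = defaultdict(float)
    for source, target, weight in edges:
        key = tuple(sorted((source, target)))
        merged[key] += weight
    return dict(merged)


# Toy directed rows with made-up weights.
directed = [
    ("A", "B", 0.5),
    ("B", "A", 0.25),
    ("A", "C", 0.125),
]

undirected = to_undirected(directed)
for (u, v), weight in sorted(undirected.items()):
    print(u, v, weight)
```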
I initially laid out my network using the standard (force-directed) layout. But my boss suggested that I visualize it using a geographic layout, which was a great suggestion, since a map is intuitive for a lay reader. Again, the charts below are aggregate trade networks. The different colors are communities (clusters) detected from the strength of interconnections within a group of countries. The detection is based on a measure called "modularity", which is computed in Gephi. Basically, high modularity means strong clustering.
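To give a feel for what the modularity number measures, here is a minimal Python sketch of the standard (Newman) modularity score for a weighted undirected graph and a given assignment of nodes to communities. It only scores a partition; it does not search for the best one, which is the part Gephi does for you. The `modularity` function, the toy graph, and the community labels are all illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of a weighted undirected graph.

    edges: list of (u, v, weight), each undirected edge listed once.
    community: dict mapping node -> community label.

    Q = sum over communities c of
        (weight inside c) / m  -  (total degree of c / (2 * m)) ** 2
    where m is the total edge weight.
    """
    m = sum(w for _, _, w in edges)
    internal = defaultdict(float)  # edge weight inside each community
    degree = defaultdict(float)    # summed weighted degree per community
    for u, v, w in edges:
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: two triangles joined by a single bridge edge, all weights 1.
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]

two_clusters = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_cluster = {node: 1 for node in two_clusters}

q_split = modularity(edges, two_clusters)
q_single = modularity(edges, one_cluster)
print(round(q_split, 3), round(q_single, 3))
```

Splitting the toy graph into its two triangles scores higher than lumping every node into one community, which is the sense in which high modularity means strong clustering.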
All images are the author's own.
Explore on your own!
Japan was the gateway to Asia in the 1990s. Now, it's China.
Trade between the US and Mexico increased substantially from 2005 to 2015. (If you follow the news, that's most probably NAFTA.) No wonder Mr. Trump is upset!
EU integration is stronger. Trade used to be dominated by Germany in 1995, but it now seems to be more evenly spread within the region.
Africa doesn't seem to trade heavily with the rest of the world, perhaps suggestive of the dominance of intra-regional trade in Africa.
On communities formed: in 1995 and 2005, only two groups were found, an EU cluster and the rest of the world. In 2015 a third cluster seems to be emerging, as the Asian cluster led by China appears to break away.
The data come from the UNCTAD merchandise trade matrix. Python code is available on my GitHub page.