#4 Social Networks in Rural South India

This project is the 4/8 part of my mini-series on social network analysis. The goal of this series is to start with the very basics of network analysis, such as with the concepts of centrality and assortativity, and progress towards more advanced topics, including Erdős-Rényi and Configuration models, as well as Exponential family random graphs. In this fourth episode, I look into the concepts of assortativity and transitivity in rural social networks. I analyze social ties of one South Indian village that was featured as village 35 in a research paper titled Social Capital and Social Quilts: Network Patterns of Favor Exchange (2012) by Matthew Jackson, Tomas Rodriguez-Barraquer, and Xu Tan. The researchers surveyed inhabitants of the village about their religion, caste and wealth (proxied by the number of rooms in their house), and also asked them who they would go to, to borrow money or to borrow food. Similarly, they asked the participants which fellow villagers would ask them for help when in need of money or food. The central question of the present project is whether the exchange of food and money happens preferentially among households from the same caste, same religion and among households with similar levels of wealth (closely tied to the concept of homophily). Or, conversely, whether people exchange money and food with people who are different than them (relating to the concept of heterophily).

Prepping the Data

I use the adjacency matrices to make networks for each relationship type and the metadata file to assign vertex attributes to the vertices in each of those networks. More precisely, I use the attributes for caste (using the castesubcaste column in metadata file), religion (using the hohreligion column), and wealth (in this case, I am using a proxy - the room_no column, making the assumption that wealthier households have bigger houses with more rooms. The four datasets with adjacency metrices are named borrow-money, lend-money, borrow-fufo and lend-fufo. The borrow_money dataset was compiled by asking households in the village whom they would ask, if they suddenly needed to borrow Rs. 50 for a day. Conversely, the lend-money was compiled by asking the households who they would trust enough that if this person needed to borrow Rs. 50 for a day they would lend it to him/her. The Borrow-fufo and Lend-fufo datasets in similar manner ask which households would be willing to exchange kerosene or rice. Also for context, in the money borrowing and lending relationships, fifty Rupees are roughly a dollar and the per capita income in the areas surveyed is currently on the order of three dollars per day or less, although a precise income census is not available.

Creating and Visualizing the Merged Network

I create one merged network by adding all the adjacency matrices together and generating a new network based on that summed adjacency matrix. I plot this network twice, once with nodes coloured by caste, and once coloured by religion, with nodes sized by wealth (number of rooms) and edge width scaled by the number of connections the households have. A legend is included to show the label associated with each colour (so, the colour associated with each religion or each caste).

Calculating Transitivity and Assortativity

I calculate the transitivity of each of the 4 relationship type networks.

t1 <- transitivity(borrow_money_g)
t2 <- transitivity(lend_money_g)
t3 <- transitivity(borrow_fufo_g)
t4 <- transitivity(lend_fufo_g)

Additionally, I calculate the assortativity of each of the 4 relationship type networks based on caste, religion, and wealth. Religion and caste are categorical variables and so, for these, I will calculate modularity/homophily based on enumerative characteristics using the command assortativity.nominal(). In contrast, the number of rooms (proxy of wealth) is a continuous variable and so I will caluculate homophily based on scalar characteristics using the assortativity() command (Newman, 2010).

a11 <- assortativity.nominal(borrow_money_g,factor(V(borrow_money_g)$caste)) 
a21 <- assortativity.nominal(lend_money_g,factor(V(lend_money_g)$caste)) 
a31 <- assortativity.nominal(borrow_fufo_g,factor(V(borrow_fufo_g)$caste)) 
a41 <- assortativity.nominal(lend_fufo_g,factor(V(lend_fufo_g)$caste)) 

a12 <- assortativity.nominal(borrow_money_g,factor(V(borrow_money_g)$religion)) 
a22 <- assortativity.nominal(lend_money_g,factor(V(lend_money_g)$religion)) 
a32 <- assortativity.nominal(borrow_fufo_g,factor(V(borrow_fufo_g)$religion)) 
a42 <- assortativity.nominal(lend_fufo_g,factor(V(lend_fufo_g)$religion)) 

a13 <- assortativity(borrow_money_g,factor(V(borrow_money_g)$wealth)) 
a23 <- assortativity(lend_money_g, V(lend_money_g)$wealth)
a33 <- assortativity(borrow_fufo_g,factor(V(borrow_fufo_g)$wealth)) 
a43 <- assortativity(lend_fufo_g,factor(V(lend_fufo_g)$wealth)) 

Combining into a Dataframe

I now combine these transitivity and assortativity measures into a dataframe (so, each row will be a relationship type, with a column for each of the transitivity and assortativity measures).

df <- data.frame(network = c("money borrowing", "money lending", "fuel/food borrowing", "fuel/food lending"),
                 transitivity_calc  = c(t1, t2, t3, t4),
                 assortativity_caste = c(a11, a21, a31, a41),
                 assortativity_religion = c(a12, a22, a32, a42),
                 assortativity_wealth = c(a13, a23, a33, a43))
print(df)
##               network transitivity_calc assortativity_caste
## 1     money borrowing              0.17                0.30
## 2       money lending              0.19                0.34
## 3 fuel/food borrowing              0.24                0.38
## 4   fuel/food lending              0.27                0.43
##   assortativity_religion assortativity_wealth
## 1                   0.59                0.008
## 2                   0.44               -0.079
## 3                   0.60                0.034
## 4                   0.55               -0.036

Interpreting Transitivity

arranged_transitivity <- arrange(df, transitivity_calc)
select(arranged_transitivity, network, transitivity_calc)
##               network transitivity_calc
## 1     money borrowing              0.17
## 2       money lending              0.19
## 3 fuel/food borrowing              0.24
## 4   fuel/food lending              0.27

Fuel/food lending is the most transitive relationship of the four; 27% of all triads are closed triangles in the fuel/food lending network. In other words, when 2 HHs both name a third person as someone they lend kerosene and rice to, there’s more than 1/4th chance that the 2 HHs also name each other. In contrast, money borrowing is the least transitive. Only 17% of all triads in the money borrowing network are closed triangles. Put differently, when two HHs both name a third person as someone they borrow money from, there’s a 17% chance that they also name each other. To test how our transitivity results compare to random networks with the same number of nodes and edges as our 4 networks, I generate the following random networks:

(A) transitivity of a random network with equivalent characteristics as the kerosene/rice lending network

random_lend_fufo <- replicate(1000, transitivity(sample_gnm(n=vcount(lend_fufo_g), m=ecount(lend_fufo_g), directed = FALSE)))
hist(random_lend_fufo, col = "#CD7F32", breaks = 12, main = "histogram of transitivity values")

range(random_lend_fufo)
## [1] 0.00 0.04
plot(density(random_lend_fufo), xlim=c(0,0.30), "density curve of transitivity values")
par(xpd=FALSE)
abline(v=c((transitivity(lend_fufo_g)), mean(random_lend_fufo)), col=c("blue", "red"), lty = c(2, 3))

In black, the density plot outlines the null distribution of transitivity in random networks with the edge and vertex characteristics of our fuel/food lending network (Erdos-Renyi model). The mean of this null distribution of transitivity is highlighted in red. Our obtained transitivity value from the actual fuel/food lending network is in blue. There is clearly more triadic closure in the actual fuel/food lending network (i.e. more clustering of ties) than in randomly generated networks. This implies that the actual fuel/food lending network has much more triadic closure than what would be predicted by chance.

(B) transitivity of a random network with equivalent characteristics as the kerosene/rice borrowing network

random_bor_fufo <- replicate(1000, transitivity(sample_gnm(n=vcount(borrow_fufo_g), m=ecount(borrow_fufo_g), directed = FALSE)))
hist(random_bor_fufo, col = "#CD7F32", breaks = 12, main = "histogram of transitivity values")

range(random_bor_fufo)
## [1] 0.000 0.042
plot(density(random_bor_fufo), xlim=c(0,0.25), main = "density curve of transitivity values")
par(xpd=FALSE)
abline(v=c((transitivity(borrow_fufo_g)), mean(random_bor_fufo)), col=c("blue", "red"), lty = c(2, 3))

In the observed kerosene/rice borrowing network, we again see much more triadic closure than in random networks with similar vertex and edge characteristics.

(C) transitivity of a network with equivalent characteristics to the money lending network

random_lend_money <- replicate(1000, transitivity(sample_gnm(n=vcount(lend_money_g), m=ecount(lend_money_g),directed = FALSE)))
hist(random_lend_money, col = "#CD7F32", breaks = 12, main = "histogram of transitivity values")

plot(density(random_lend_money), xlim=c(0,0.20), main = "density curve of transitivity values")
par(xpd=FALSE)
abline(v=c((transitivity(lend_money_g)), mean(random_lend_money)), col=c("blue", "red"), lty = c(2, 3))

In the observed money lending network, we also see much more triadic closure than in random networks with similar vertex and edge characteristics.

(D) transitivity of a random network with equivalent characteristics to the money borrowing network

random_bor_money <- replicate(1000, transitivity(sample_gnm(n=vcount(borrow_money_g), m=ecount(borrow_money_g), directed = FALSE)))
hist(random_bor_money, col = "#CD7F32", breaks = 12, main = "histogram of transitivity values")

range(random_bor_money)
## [1] 0.00 0.04
plot(density(random_bor_money), xlim=c(0,0.20), "density curve of transitivity values")
par(xpd=FALSE)
abline(v=c((transitivity(borrow_money_g)), mean(random_bor_money)), col=c("blue", "red"), lty = c(2, 3))

Finally, the money borrowing network had the lowest transitivity of the 4 networks, the transitvity was around 17%. As before, the density plot in black outlines the null distribution of transitivity in random networks with the edge and vertex characteristics of our money borrowing network. In contrast, our obtained transitivity value from the actual money borrowing network is outlined in blue.There is clearly more triadic closure in the actual money borrowing network (i.e. more clustering of ties) than in the random network. This implies that even the money borrowing network has much more triadic closure than what would be predicted by chance.

Interpreting Assortativity by Caste

arranged_caste_assortativity <- arrange(df, assortativity_caste)
select(arranged_caste_assortativity, network, assortativity_caste)
##               network assortativity_caste
## 1     money borrowing                0.30
## 2       money lending                0.34
## 3 fuel/food borrowing                0.38
## 4   fuel/food lending                0.43

All of the 4 favor networks exhibit positive assortative mixing by caste. This implies that there are more favor exchanges (edges) between households belonging to the same caste (e.g. Forward Caste, Scheduled Caste) than what would be predicted by chance. Of the 4 networks, kerosene/rice lending is the most assortative network by caste with modularity value of Q=0.43. And money borrowing is the least assortative by caste with modularity value of Q=0.30.

Interpreting Assortativity by Religion

arranged_religion_assortativity <- arrange(df, assortativity_religion)
select(arranged_religion_assortativity, network, assortativity_religion)
##               network assortativity_religion
## 1       money lending                   0.44
## 2   fuel/food lending                   0.55
## 3     money borrowing                   0.59
## 4 fuel/food borrowing                   0.60

All of the 4 networks have positive assortative mixing by religion. This implies that there are more favor exchanges (edges) between households of the same religion (e.g. Muslim) than what would be predicted by chance. Money lending is the least assortative by religion with modularity value of Q=0.44. In contrast, kerosene/rice borrowing is the most assortative by religion with modularity value of Q=0.60. This could potentially indicate that different religious groups (Hindu vs. Muslim) do not exchange food in favor networks because they might have different food customs/diets. A further research focusing specifically on the high assortativity by religion in the fuel/food borrowing network would be welcome, in order to establish the underlying mechanisms.

Interpreting Assortativity by Wealth

arranged_wealth_assortativity <- arrange(df, assortativity_wealth)
select(arranged_wealth_assortativity, network, assortativity_wealth)
##               network assortativity_wealth
## 1       money lending               -0.079
## 2   fuel/food lending               -0.036
## 3     money borrowing                0.008
## 4 fuel/food borrowing                0.034

Of the 4 networks, fuel/food (kerosene/rice) borrowing is the most assortative by wealth. And money lending is the least assortative (most disassortative) of the 4 networks.

Negative Assortativity

For fuel/food and money borrowing networks, the wealth assortativity coefficient is positive, indicating homophily. In other words, there are more edges between vertices with similar amount of wealth in the kerosene/rice and money borrowing networks, relative to what we would expect by chance. The other 2 networks - money and fuel/food (kerosene/rice) lending networks have negative assortativity coefficients based on wealth, indicating heterophily. In other words, money and fuel/food lending is disassortative by wealth. This implies that 2 households which have different wealth levels are more likely to lend kerosene/rice/money to each other than 2 households which have similar wealth levels. Negative assortativity is relatively rare in social networks, since people usually tend to associate with others who are like them (Newman, 2010). However, there are some well-known examples of dissasortative matching. For example, the majority of romantic and sexual relationships are between people of the opposite sex (same-sex partnerships are less common).

Double Sampling

Since we are considering undirected networks, transitivity and assortativity of the double sampled networks should - in theory - be the same. Fuel/food lending and borrowing networks should yield exactly the same clustering coefficients and equal values on the 3 assortatitivity measures. And, likewise, transitivity and assortativity measures should be equal for the money lending and money borrowing networks. However, we observe considerable differences in these measures due to measurement error. Jackson et al. (2012) point out that there are several potential sources of measurement error in their data on the 75 Indian villages (of which our dataset is a subset, as it covers one of the 75 villages). Firstly, the data was collected using surveys. Generally, survey data is prone to measurement error because people often forget to mention some of their connections, they get fatigued by interviews. And specifically in this research, Jackson et al. (2012) set a cap on how many borrowers/lenders each surveyed participant can name (the survey did not allow individuals to name more than five or eight other people). This means that some households with many connections might have omitted some of their lender/borrower connections (although the authors claim that the cap was reached only in a negligible number of cases).

Comparing Assortativity Results, Suggesting Further Data

mean_transitivity <- (t1+t2+t3+t4)/4
print(mean_transitivity)
## [1] 0.22
mean_assortativity_caste <- (a11+a21+a31+a41)/4
print(mean_assortativity_caste)
## [1] 0.36
mean_assortativity_religion <- (a12+a22+a32+a42)/4
print(mean_assortativity_religion)
## [1] 0.55
mean_assortativity_wealth <- (a13+a23+a33+a43)/4
print(mean_assortativity_wealth)
## [1] -0.018

Overall, we find homophily based on religion and caste, and some degree of heterophily based on wealth in the village. Of the 3 assortativity attribute types considered, we see that the assortativity coefficient for religion is the highest - there is substantial (positive) assortative mixing by religion. This means that a substantial (above-chance) proportion of surveyed households within the village reported favor exchange with households of the same religion.

One potential mechanism underlying such homophily based on religion could be selection (Easley and Kleinberg,2010). A number of studies from rural India report substantial religious segragation. According to Mishra and Bhardwaj (2021), members of the Muslim minority in India often live in separate neighborhoods than Hindus. And, clearly, when people live in neighborhoods, attend schools or work in companies that are relatively religiously homogeneous compared to the population at large (spatially segregated), than the social environment favors opportunities to form friendships and exchange favors with others who have the same religion. Overall, in order to better understand the mechanisms undelying the observed religion homophily, it would be beneficial to also obtain data on spatial distribution of the households within the village and to find out whether the religious groups are spatially segregated.

We also find some, although lower (on average), positive assortative mixing by caste. This means that there is a higher probability of favors being exchanged, for example, between 2 members of the Forward Caste, than what would be predicted by chance. Generally, our dataset differentiates among 5 different caste categories: (1) Forward Caste (incoporating Brahmans, Kshatriyas, Vaishyas), (2) Scheduled Caste (Dalits or “untouchables”), (3) Other Backwards Caste (Shudras), (4) Scheduled Tribe (ethnic minority Adivasis) and (5) Muslims. To better understand the mechanisms behind this caste-homophily, it would be interesting to see whether favors are exchanged primarily among friends or among households that are related. Clearly, related households usually belong to the same casts and so, the caste homophily in favor networks could potentially be inscribed to favor exchange among relatives (Banerjee et al. 2013). In fact, according to Ray (1998), poor Indian families who lack access to formal lending markets most commonly rely on relatives for money loans.

Finally, our results suggest that favor exchange in the village is, on average disassortative by wealth. In other words, villagers are more likely to provide favors to other villagers who own dissimilar amounts of wealth, rather than to villagers who have similar wealth levels. However, all the assortativity coefficients are close to 0 (especially in the money borrowing network). And so, wealth is overall not a strong predictor of connectivity in the 4 favor networks. Furthermore, it must be pointed out, that the wealth assortativity results might have been influenced by our chosen proxy for wealth (i.e. we proxied wealth by the number of rooms). Clearly, future research should establish whether number of rooms is, indeed, a good proxy of wealth in the rural Indian context - whether it closely correlates with wealth. Potentially, future research might consider creating a more complex proxy of wealth, which might include variables electricity, rooftype or number of beds all of which are in the attribute dataset “vil35_meta.csv”

mean_transitivity <- (t1+t2+t3+t4)/4
print(mean_transitivity)
## [1] 0.22

Jackson et al. (2012) raises a further issue with respect to the collected data; he points out that not all villagers were surveyed and so there are missing nodes and links in the dataset. This might have biased the results downwards; especially the reported transitivity coefficient might be biased downward. Therefore, future research might consider surveying a larger proportion of the village population. Also, it might be worth considering to calculate support coefficient (metric proposed by Jackson et al. 2012), in addition to the clustering coefficient. In the village, there is overabundance of informal lending/borrowing, yet the mean clustering coefficient is only about 22% in the 4 networks. This brings about a puzzle - how can pro-social norms of returning favors be enforced if the clustering coefficient is so low? Generally, high clustering coefficient is seen as a social enforcement mechanism/a signal of trust in the network - if two individuals are connected by an embedded edge, it is easier for them to trust one another and to have confidence in the integrity of transactions that take place between them. As Granovetter (1992) states: “My mortification at cheating a friend of long standing may be substantial, even when undiscovered. It may increase when a friend becomes aware of it. But it may become even more unbaarable when our mutual friends uncover the deceit and tell one another.” Overall, villagers in the 75 Indian villages have disjoint groups of connections. And the study by Jackson et al. (2012) presents evidence that - in network structures that contain a lot of disjoint connections - support tends to be several times higher than transitivity, and it might be generally a more approporiate measure of social enforcement in such networks.

References

Banerjee, A., Duflo, E., Ghatak, M., & Lafortune, J. (2013). Marry for what? Caste and mate selection in modern India. American Economic Journal: Microeconomics, 5(2), 33-72.

Easley, D., & Kleinberg, J. (2010). Networks, crowds, and markets (Vol. 8). Cambridge: Cambridge university press.

Granovetter, M. (1992). Problems of explanation in economic sociology. Networks and organizations: Structure, form, and action, 25-56.

Jackson, M. O., Rodriguez-Barraquer, T., & Tan, X. (2012). Social capital and social quilts: Network patterns of favor exchange. American Economic Review, 102(5), 1857-97.

Mishra, A. K., & Bhardwaj, V. (2021). Welfare implications of segregation of social and religious groups in India: analyzing from wealth perspectives. International Journal of Social Economics.

Newman, M. (2018). Networks. Oxford university press.

Ray, D. (1998). Development economics. Princeton University Press.