DRAFT – Note: As I post more of my paper-reading notes into this site, most of them will not be as long or detailed as this one. This was one of our course requirements for a graduate class, and since it overlaps with my research interests and papers that I usually read, I might as well post it here. Ignore the poor math parsing for now. (01Nov2021)
The paper presents an approach for performing inference on social behavior that incorporates network structure by combining multiple concepts from statistical physics. The three key components are 1) the use of a vector space to describe sociodemographic parameters, 2) the use of spin models to represent social dynamics on a network, and 3) capturing dynamics and network structure by inferring model parameters. By combining these concepts, the authors highlight that their approach yields insight into social behavior from sociodemographic data while incorporating social network effects, without requiring explicit information on the network structure.
Blau’s theory of social structure
Blau’s theory claims that individuals can be represented as points in a high-dimensional space of sociodemographic parameters such as age, gender, and income level. The space can then be used to study social structure, as the points cluster in space and create correlations between parameters. We must note that Blau’s theory ignores the possibility of social parameters being confounding factors (i.e., it treats the parameters as independent/orthogonal) and the relevance of historical effects. Embedding sociodemographic parameters into a space implies that some measure of distance can be computed, such as how alike two individuals are based on how close they are in the space. Assuming homophily, where similar vertices are more likely to be connected with each other, one can use the distance to create random geometric graphs (RGG) in which the probability of connection decreases with distance. The Blau space can then be combined with physical distance from geographic coordinates to define a more comprehensive distance metric used for composing the RGG.
Spin models in social science
There is no shortage of spin models in “sociophysics”: they are commonly used for modeling opinion dynamics and social behavior. At the core of spin models is the idea of interaction, where each vertex takes one of a set of states under the influence of the states of neighboring vertices. While spin models in physics are usually defined on regular, symmetric topologies such as N-dimensional lattices, they can also be defined on complex networks, which allows them to be applied to social networks.
An example of a spin model applied to the social sciences is the classical voter model on a network. The voter model involves simplified Ising states and interactions, where states align or flip based on the states of adjacent vertices. Each vertex takes either a discrete binary state $\sigma \in \lbrace -1,1 \rbrace$, as in the Ising model, or a continuous value $\sigma \in [-1,1]$ that expresses how strongly the vertex leans toward either state. The states can then be interpreted as a choice made by an individual, such as a vote or opinion, that is influenced by the opinions of the people connected to them. Since spin models highlight the effect of interaction, the network structure becomes important in determining the outcome state $\lbrace \sigma \rbrace$.
As with their physical counterparts, spin models can also be modified to incorporate nuances such as vertices having multi-valued or vector states, as in the Potts and XY models. Individuals may have multiple states, or interact with each other on their corresponding states in a more nuanced way. These modifications bring assumptions of their own.
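To make the voter dynamics concrete, here is a minimal sketch in plain Python. It assumes the simplest asynchronous update rule (a randomly chosen vertex copies the state of a randomly chosen neighbor); the ring network, the function name `voter_model`, and the step count are illustrative choices of mine, not something taken from the paper.

```python
import random

def voter_model(neighbors, states, steps, rng):
    """Run `steps` asynchronous voter-model updates.

    neighbors: dict mapping each vertex to a list of adjacent vertices
    states:    dict mapping each vertex to -1 or +1 (modified in place)
    """
    vertices = list(neighbors)
    for _ in range(steps):
        i = rng.choice(vertices)          # pick a random vertex
        if neighbors[i]:                  # isolated vertices never change
            j = rng.choice(neighbors[i])  # pick one of its neighbors
            states[i] = states[j]         # copy the neighbor's opinion
    return states

# Hypothetical toy network: a ring of 10 vertices.
n = 10
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
rng = random.Random(0)
initial = {i: rng.choice([-1, 1]) for i in range(n)}
final = voter_model(ring, dict(initial), steps=2000, rng=rng)
print(sum(final.values()) / n)  # mean opinion, always within [-1, 1]
```

Swapping `ring` for a different neighbor dictionary is enough to see how the topology changes the time it takes to reach consensus.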
Network inference
Given a model, the goal is to infer the parameters used for generating a particular network configuration. This requires a network model, which is usually described in terms of how the links (from the adjacency matrix $A$) are determined given the model parameters $\theta$, i.e. $p(A|\theta)$. Bayesian inference then allows for determining which parameters were used to generate a known network under the model assumptions. Network inference problems usually come in the form of link prediction, where one defines the probability that two vertices are connected based on some other information tied to them. For example, stochastic block models assume that a network structure can be inferred from the membership of each vertex in a community and the probability that members of these communities connect with each other. One can then infer the membership of each vertex in an unlabeled network by Bayesian inference. The Bayesian approach usually taken in network inference allows for using complex models with several parameters, which can even incorporate dynamics.
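In symbols, the inference step described above is Bayes' rule applied to the network model. The prior $p(\theta)$ is notation I am introducing here; the paragraph above only names the likelihood $p(A|\theta)$.

$$ p(\theta | A) = \frac{p(A | \theta) \, p(\theta)}{\int p(A | \theta') \, p(\theta') \, d\theta'} \propto p(A | \theta) \, p(\theta) $$

For the stochastic block model example, writing $b_i$ for the community of vertex $i$ and $\omega_{rs}$ for the connection probability between communities $r$ and $s$ (both symbols are my own notation), the Bernoulli version of the likelihood for an undirected network is:

$$ p(A | b, \omega) = \prod_{i<j} \omega_{b_i b_j}^{A_{ij}} \left(1 - \omega_{b_i b_j}\right)^{1 - A_{ij}} $$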
Kernel-Blau-Ising model
The authors combine these three concepts by developing the kernel-Blau-Ising (KBI) model. First, they assume that the connection between any two individuals can be determined by homophily in sociodemographic parameters. Since each individual is represented by a point $z_i$ in Blau space, they define the following connectivity kernel:
$$ \rho(z_i, z_j, \theta) = \frac{1}{1 + \exp(d_{ij})} ;\quad d_{ij} = \theta_0 + \sum_{k} \theta_k |z_{ik} - z_{jk}| $$
The connectivity kernel describes the probability of connection between two individuals as a logistic function of their distance in Blau space, making the model a “soft” RGG. Each dimension is weighted by a parameter $\theta_k$, which adjusts its scale and its influence in determining the distance. The offset parameter $\theta_0$ effectively acts as a network density parameter in the connectivity kernel.
Next, the authors assume that social network structure has a significant effect on population behavior by modeling voting outcomes with an Ising model. Each individual at $z_i$ now has a corresponding state $\sigma_i$ that represents their opinion or vote. The outcome of the network $\lbrace \sigma \rbrace$ is governed by the following Hamiltonian:
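As a rough illustration of the kernel, the sketch below evaluates $\rho$ for pairs of points and draws one network from it. It assumes each pair is connected independently with probability $\rho$, which is the usual reading of a soft RGG but is my assumption here; the function names, the toy coordinates, and the parameter values are all hypothetical.

```python
import math
import random

def kernel(z_i, z_j, theta0, theta):
    """Connection probability rho = 1 / (1 + exp(d_ij)),
    with d_ij = theta0 + sum_k theta_k * |z_ik - z_jk|."""
    d = theta0 + sum(t * abs(a - b) for t, a, b in zip(theta, z_i, z_j))
    return 1.0 / (1.0 + math.exp(d))

def sample_graph(z, theta0, theta, rng):
    """Draw a symmetric 0/1 adjacency matrix, one independent
    Bernoulli trial per pair (assumption: edges are independent)."""
    n = len(z)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < kernel(z[i], z[j], theta0, theta):
                A[i][j] = A[j][i] = 1
    return A

# Hypothetical Blau coordinates: (rescaled age, rescaled income).
z = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9), (0.85, 0.7), (0.5, 0.5)]
theta0, theta = -1.0, [3.0, 2.0]
A = sample_graph(z, theta0, theta, random.Random(0))
for row in A:
    print(row)
```

Two individuals at the same point connect with probability $1 / (1 + \exp(\theta_0))$, so lowering $\theta_0$ raises the overall density, while a larger $\theta_k$ makes differences along dimension $k$ more costly.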
$$ H(\lbrace \sigma \rbrace, h, J, A) = -\sum_{i} (h \cdot z_i) \, \sigma_i - J \sum_{ij} A_{ij} \sigma_i \sigma_j $$
As with the Hamiltonian for the Ising model, $J$ determines the interaction strength, $A_{ij}$ restricts the interaction to connected vertices, and $h$ determines the effect of an external field. The authors define the external field as a vector: by taking the dot product between this vector and an individual's Blau space coordinates, each vector component effectively assigns a weight to how a sociodemographic parameter affects outcomes at the individual level. As in the Ising model, the probability that a network will have a given outcome of states, $p(\lbrace \sigma \rbrace | \beta,h,J,A)$, is described by the Boltzmann distribution.
The distribution $p(\lbrace \sigma \rbrace | \Theta)$ is then used as the network model, where the parameters $\Theta$ can be inferred by Bayesian computation. For the paper, the authors used borough-level census data in London for the Blau space, and three scenarios to describe states: the 2012 and 2016 mayoral elections, and the 2016 EU referendum. Since the Blau space is composed of sociodemographic parameters that are not all discrete or all continuous, the values are binned into $C \gg N$ points in the space. Having a binning scheme conveniently allows for creating a low-dimensional summary statistic $S(\lbrace \sigma \rbrace) \in [0,1]^C$, which can then be used to compare how similar outcomes are.
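The sketch below evaluates the Hamiltonian and draws states with probability proportional to $\exp(-\beta H)$ using single-spin-flip Metropolis updates. Metropolis sampling is my choice for illustration, and I am not claiming it is the sampler the authors used. I also assume that the sum over $ij$ runs over ordered pairs and that $A$ is symmetric with a zero diagonal; the toy values of `z`, `h`, `J`, and `beta` are hypothetical.

```python
import math
import random

def hamiltonian(sigma, z, h, J, A):
    """H = -sum_i (h . z_i) sigma_i - J sum_{ij} A_ij sigma_i sigma_j,
    with the pair sum taken over ordered pairs (i, j) (an assumption)."""
    n = len(sigma)
    field = sum(sum(hk * zk for hk, zk in zip(h, z[i])) * sigma[i]
                for i in range(n))
    pair = sum(A[i][j] * sigma[i] * sigma[j]
               for i in range(n) for j in range(n))
    return -field - J * pair

def delta_h(sigma, z, h, J, A, k):
    """Energy change from flipping spin k (symmetric A, zero diagonal)."""
    local_field = sum(hk * zk for hk, zk in zip(h, z[k]))
    neighbor_sum = sum(A[k][j] * sigma[j] for j in range(len(sigma)))
    return 2.0 * sigma[k] * (local_field + 2.0 * J * neighbor_sum)

def metropolis_sweep(sigma, z, h, J, A, beta, rng):
    """One sweep: n attempted flips, each accepted with
    probability min(1, exp(-beta * dH))."""
    n = len(sigma)
    for _ in range(n):
        k = rng.randrange(n)
        dH = delta_h(sigma, z, h, J, A, k)
        if dH <= 0 or rng.random() < math.exp(-beta * dH):
            sigma[k] = -sigma[k]
    return sigma

# Hypothetical toy system: 4 individuals on a path graph.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
z = [(0.1, 0.9), (0.4, 0.6), (0.7, 0.3), (0.9, 0.2)]
h, J, beta = [0.5, -0.2], 0.3, 1.0
rng = random.Random(0)
sigma = [rng.choice([-1, 1]) for _ in range(4)]
for _ in range(100):
    metropolis_sweep(sigma, z, h, J, A, beta, rng)
print(sigma, hamiltonian(sigma, z, h, J, A))
```

If the pair sum is instead meant to run over unordered pairs, the same code applies with $J$ halved.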
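To illustrate how a binned summary statistic could work, the sketch below assumes that each component of $S$ is the fraction of individuals in a bin whose state is $+1$, and that two outcomes are compared by the Euclidean distance between their statistics. Both are my assumptions for illustration, since these notes do not record the exact definition used in the paper; the function names and the toy data are hypothetical.

```python
import math

def summary_statistic(sigma, bin_of, n_bins):
    """Per-bin fraction of +1 states, a vector in [0, 1]^C.

    sigma:  list of states, each -1 or +1
    bin_of: list giving the bin index of each individual
    Empty bins are reported as 0.0 (an arbitrary convention).
    """
    counts = [0] * n_bins
    ups = [0] * n_bins
    for s, b in zip(sigma, bin_of):
        counts[b] += 1
        if s == 1:
            ups[b] += 1
    return [ups[b] / counts[b] if counts[b] else 0.0 for b in range(n_bins)]

def distance(s1, s2):
    """Euclidean distance between two summary statistics."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))

# Hypothetical example: 6 individuals spread over 3 bins.
simulated = summary_statistic([1, -1, 1, 1, -1, -1], [0, 0, 1, 1, 2, 2], 3)
observed = [0.5, 0.5, 0.5]  # made-up vote shares, for illustration only
print(simulated)            # [0.5, 1.0, 0.0]
print(distance(simulated, observed))
```

A small distance between the simulated and observed statistics then indicates that the parameters used for the simulation are compatible with the data.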
Key findings
When the model is applied to election data (mayoral elections, EU referendum), the posterior distribution for the Blau space external field $h$ behaves similarly to the correlates in census data. The authors make two exceptions: the external field for distance is set to zero, and the external field for education is pinned to 0.45, as they consistently obtain this value from regression across all datasets. From the posterior distribution obtained for each external field parameter, the authors make the following observations: the skew in $h_{\text{age}}$ reveals a preference of older voters to vote conservative (mayor) or leave (referendum), while the skew in $h_{\text{gender}}$ reveals a preference of men to vote conservative/leave. The negative skew of $h_{\text{income}}$ for the mayoral election indicates that high-income individuals are more likely to vote conservative, but its approximately zero-centered distribution for the EU referendum shows that there is no income correlate for the leave/stay preference.
Note that $h$ modulates the influence of each sociodemographic parameter on states at the individual level, outside of the network structure. As such, the inferred posterior $p(h)$ will only reveal information that can already be determined by simpler approaches that also ignore network structure, such as logistic regression.
Despite this, the authors highlight that the power of their approach lies in the interpretability of the inferred parameters. The weights $\theta_k$ of the connectivity kernel effectively regulate the scale of each Blau dimension, under the assumption that homophily and Ising dynamics make the outcomes depend on the network structure. Changing $\theta_0$ alters the connection density, which allows the model to be fitted with certain network measures, such as average degree, held fixed if these are known. The Ising parameters also carry over their interpretation: $J$ sets the interaction strength (influence) and $\beta$ controls the noise level (how likely individuals are to flip from their preferred state).
Review of strengths, limitations, and applications
The key strength of the approach developed by the authors is indeed the interpretability of its parameters: within the model assumptions, the parameters provide insights on social behavior that incorporate network effects while relying only on sociodemographic data. These data are easily obtained from a census, unlike social network information, which requires surveying people about other people they know and is hard to anonymize.
We highlight certain limitations of this approach, beginning with those acknowledged by the authors. Since all data are taken as snapshots of events, the historical component of social behavior, which would require time-series information, is explicitly ignored. Next, the authors caution that the model only infers posterior distributions of the parameters for generating network configurations, not the network itself. There is indeed a temptation to take the outcomes of their paper and treat the approach as a way to unveil the underlying social network structure from sociodemographic data. Instead, they present it as an alternative for studying social behavior that goes beyond the regression models used by social scientists, importantly with network effects incorporated in the analysis.
The biggest limitation of the paper is perhaps the limited interpretability of the parameters. Specifically, the significance of homophily in each parameter (age, income, etc.) cannot be inferred, since the approach does not provide a null distribution for these parameters in the first place. Hence, the analysis of the parameter posterior distributions tends to be subjective and only provides descriptive insights. In its current form, the advantage of the approach over simple regression analysis is the inclusion of network effects, and not much beyond that. Despite this, the simplicity of the approach and of the data it demands makes it feasible to apply to large-scale analyses of social behavior. Countries readily have census data, and certain election and survey data give a snapshot of individual opinion and social behavior. The work could also be extended with a more comprehensive Blau space that incorporates more sociodemographic measures, and with a more nuanced network model that can capture the structure of complex networks (such as fat-tailed degree distributions) and goes beyond naively using the distance in Blau space to create a geometric network.