How Big Data took graph theory to new dimensions

Graph theory is not quite.

The mathematical language for talking about connections, which usually depends on networks – vertices (points) and edges (lines connecting them) – has been an invaluable means of modeling real-world phenomena since at least the 18th century. But decades ago, the emergence of giant datasets forced researchers to expand their toolboxes and, at the same time, gave them sprawling sandboxes in which to apply new mathematical knowledge. Since then said Josh Grochow, a computer scientist at the University of Colorado at Boulder, there has been an exciting period of rapid growth as researchers have developed new types of network models capable of finding complex structures and signals in big data noise.

Grochow is part of a growing chorus of researchers who point out that when it comes to finding connections in big data, graph theory has its limits. A graph represents each relationship as a dyad or a paired interaction. However, many complex systems cannot be represented by binary connections alone. Recent advances in the field show how to move forward.

Consider trying to forge a parent network model. Obviously, each parent has a bond with a child, but the parental relationship is not just the sum of the two bonds, as graph theory might model it. The same goes for trying to model a phenomenon like peer pressure.

“There are many intuitive models. The effect of peer pressure on social dynamics is only captured if you already have groups in your data, ”said Leonie Neuhäuser from RWTH Aachen University in Germany. But binary networks don’t capture group influences.

Mathematicians and computer scientists use the term “higher order interactions” to describe those complex ways in which group dynamics, rather than binary relationships, can influence individual behaviors. These mathematical phenomena appear in everything from entangling interactions in quantum mechanics to the trajectory of a disease spreading through a population. If a pharmacologist wanted to model Drugs interactions, for example, graph theory might show how two drugs react to each other, but what about three? Or four?

While the tools for exploring these interactions are not new, it is only in recent years that large-dimensional datasets have become a powerhouse of discovery, giving mathematicians and theorists networks new ideas. These efforts have yielded interesting results on graph limits and scaling possibilities.

“Now we know the network is just a shadow of it,” Grochow said. If a dataset has a complex underlying structure, modeling it as a graph may reveal only a limited projection of the whole story.

“We realized that the data structures that we used to study things, from a mathematical standpoint, don’t quite match what we see in the data,” the mathematician said. Emilie Purvine of the Pacific Northwest National Laboratory.

This is why mathematicians, computer scientists and other researchers are increasingly focusing on ways to generalize graph theory – in its many forms – to explore higher-order phenomena. The last few years have brought a torrent of means proposed to characterize these interactions and verify them mathematically in large data sets.

For Purvine, the mathematical exploration of higher-order interactions is like mapping new dimensions. “Think of a graphic as a foundation on two-dimensional terrain,” she said. The three-dimensional buildings that can go to the top can vary greatly. “When you’re at ground level, they look the same, but what you build above is different. “

Enter the hypergraph

The search for these higher dimensional structures is where mathematics becomes particularly obscure and interesting. The higher-order analog of a graph, for example, is called a hypergraph, and instead of edges, it has “hyper-edges.” These can connect multiple nodes, which means they can represent multidirectional (or multilinear) relationships. Instead of a line, a hyperbord can be thought of as a surface, like a tarp staked in three or more places.

Which is good, but there’s still a lot we don’t know about how these structures relate to their conventional counterparts. Mathematicians are currently learning which rules of graph theory also apply to higher-order interactions, suggesting new areas of exploration.

To illustrate the types of relationships that a hypergraph can extract from a large data set – and that an ordinary chart cannot – Purvine cites a simple example close to his home, the world of scientific publishing. Imagine two datasets, each containing papers co-authored by up to three mathematicians; for simplicity, let’s name them A, B, and C. A dataset contains six articles, with two articles by each of the three distinct pairs (AB, AC, and BC). The other contains only two articles in total, each co-authored by the three mathematicians (ABC).