Trust the AI Scientist? What Machine Learning Can—and Cannot—Teach Us About Sex Differences in the Brain

A headline recently published by the Institute for Family Studies confidently proclaims: “New Brain Research Confirms That Gender Differences Are Hardwired.” Eye-catching as it is, the title actually buries the lede: this finding was uncovered through the use of artificial intelligence.

In the article, psychologist Leonard Sax reports and comments on a scientific paper published earlier this year in the Proceedings of the National Academy of Sciences. That paper describes a study by researchers at Stanford University that applied artificial neural network algorithms to find sex differences in functional brain organization. The premise of the study is that large datasets of brain activity recorded via functional magnetic resonance imaging (fMRI) can be mined algorithmically at a scale that exceeds human ability in order to identify new patterns distinct to male and female brains.

This is an exciting avenue of research at the frontier of biology—one with potentially important social implications. Still, before jumping to conclusions, it’s important to consider both the promise and the limitations of studies that combine artificial intelligence with neuroimaging techniques. At present, AI tells us much of what we already know: sex differences are real, but we cannot yet reliably distinguish nature from nurture when it comes to the organization and function of the brain.

Is AI Learning Signal or Noise?

The term “artificial intelligence” is a catch-all phrase in computer science, encompassing a wide range of technologies. In the context of the Stanford study, AI means machine learning: the ability of an algorithm to learn from data. More specifically, the machine learning used in this study is a discriminative approach that allows the algorithm to decide whether a signal is characteristic of a male or female brain, and to detect the salient areas of the brain that contributed to that decision. That algorithm is an artificial neural network that can process the time series data generated by fMRI recordings.

Artificial neural networks have achieved success in a number of domains, from recognizing objects in scenes for visual search engines to detecting malignant tumors in radiology. Given that track record, it is natural to apply them to the study of sex differences, an area of biology that is both fundamental to basic science and practical for clinical research, and one that can make good use of new research methods emerging in brain science.

Machine learning is excellent at finding patterns in vast collections of high dimensional data that humans do not have the ability to sift through by hand (like the millions of images that need to be cataloged by a search engine). But sometimes those patterns are meaningless noise. It’s quite possible that machine learning is able to differentiate sex based on irrelevant information. There are several well-known sources of noise in fMRI studies of the brain that could lead to incorrect conclusions about sex differences, including subject motion while in the scanner and physiological fluctuations in other parts of the body.
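To make the concern concrete, here is a minimal sketch in Python that uses simulated numbers rather than fMRI data. Everything in it is an assumption made for illustration: the function names, the `motion_gap` parameter, and above all the premise that a nuisance variable such as head motion differs on average between two groups. It is not the method used in the Stanford study. It shows only that a simple classifier can separate two groups at well above chance even when the "neural" feature carries no group information at all.

```python
import random


def simulate(n_per_group=1000, motion_gap=1.0, seed=0):
    """Simulate two groups of subjects; each row is (neural, motion).

    Assumption for illustration only: the nuisance 'motion' feature differs
    between groups by `motion_gap`, while the 'neural' feature does not.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            neural = rng.gauss(0.0, 1.0)                 # same distribution in both groups
            motion = rng.gauss(label * motion_gap, 1.0)  # hypothetical group-dependent artifact
            rows.append((neural, motion))
            labels.append(label)
    return rows, labels


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def nearest_centroid_accuracy(rows, labels, columns):
    """Fit class centroids on even-indexed subjects, score on odd-indexed ones.

    Only the feature columns listed in `columns` are visible to the classifier.
    """
    pairs = list(zip(rows, labels))
    train, test = pairs[0::2], pairs[1::2]

    centroids = {}
    for label in (0, 1):
        members = [row for row, y in train if y == label]
        centroids[label] = [mean(m[c] for m in members) for c in columns]

    correct = 0
    for row, y in test:
        distance = {
            label: sum((row[c] - centroids[label][k]) ** 2
                       for k, c in enumerate(columns))
            for label in (0, 1)
        }
        predicted = min(distance, key=distance.get)
        correct += int(predicted == y)
    return correct / len(test)
```

Calling `nearest_centroid_accuracy(rows, labels, columns=(0, 1))` on the output of `simulate()` scores well above chance, while restricting the classifier to the neural column with `columns=(0,)` brings it back to roughly a coin flip. The classifier is "right" about group membership in the first case, but for a reason that has nothing to do with the brain.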

This problem has surfaced before in biological research with profound social implications. For example, as researchers began to realize the utility of artificial neural networks, several studies were published that made the astonishing claim that it was possible to use machine learning to predict whether or not somebody was a criminal based on the anatomical features of the human face. After much scrutiny, it was determined that the algorithms in all of the studies were learning latent artifacts in the dataset that had nothing to do with biology, such as the color of mugshot photos. In that instance, it was easy to identify the source of the noise. But this is not always the case.

A strength of this new study is that the researchers’ machine learning approach can identify the salient regions of the brain that serve as discriminating features of the detected difference patterns. This is what is called “explainable AI” (XAI). This term means that the researchers know what data the algorithm uses to come to a decision. Yet, in reality, this doesn’t render the approach fully explainable. To possess that property, the algorithm would have to express the exact function that it uses to make decisions, which was learned from its training data. This is something no artificial neural network can do: they exist as black boxes. While we have more trust in the approach used in this study because it points us to specific regions in the brain used in the decision, the risk of a spurious result remains.
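The gap between "knowing which inputs mattered" and "knowing why" can be seen even in a toy model. The sketch below is a hand-built logistic model, not a neural network and not the Stanford study's method; the region names, weights, and input values are all hypothetical. For a linear model, each input's contribution to the decision is simply its weight times its value, so the inputs can be ranked by how much they pushed the output.

```python
import math


def predict_probability(weights, bias, features):
    """Logistic model: probability of the class labelled 1."""
    logit = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


def rank_contributions(weights, features, names):
    """Rank inputs by the size of their contribution (weight * value).

    This says which inputs drove the decision. It says nothing about
    whether those inputs reflect biology or a measurement artifact.
    """
    contributions = {name: w * x for name, w, x in zip(names, weights, features)}
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical values, chosen only for illustration.
names = ["region_a", "region_b", "region_c"]
weights = [0.2, -1.5, 0.7]
features = [1.0, 0.8, 0.1]
ranking = rank_contributions(weights, features, names)
```

Here `ranking` puts `region_b` first, which is the kind of answer a saliency method provides. In a deep network the learned function is far less transparent than a list of three weights, and in either case the ranking cannot say whether the signal in the top-ranked region is neural activity or noise.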

We can’t easily characterize the information that factored into the decision made by the algorithm used in this study, because the fMRI input data is highly abstract; it could be either signal or noise. Indeed, fMRI is generally considered to be a noisy way to measure the brain. It has good spatial resolution compared to other measurement techniques like electroencephalography (EEG), but poor temporal resolution. This means that fMRI can isolate activity in specific regions of the brain, but the timestamp of that activity may be fuzzy. Furthermore, fMRI does not measure neural activity in a direct manner: it uses the proxy signal of changes in blood flow within the brain. Both the poor temporal resolution and the use of a proxy signal can induce significant noise in the resulting measurements.

Neuroscientists tend to prefer more precise recording tools that directly measure neurons when attempting to make strong claims related to neural activity (for example, multi-electrode array recordings or two-photon microscopy targeted at cell populations). Of course, fMRI is non-invasive and very safe for human experimentation, making it a good starting place for a study of sex differences in functional brain organization. But the fact remains that the data used in the Stanford study are noisy recordings of activity in large brain regions.

Structure, Function, Genetics, and Learned Behavior

The Stanford study is also limited to functional brain organization. In other words, it does not consider the underlying anatomical structure that generates the neural activity that is being measured. Structure, in this sense, is the network of individual neurons in the brain, which connect via their axons and dendrites. A great deal of that structure, which continuously changes over a lifetime, is learned through interaction with an environment. This is what is commonly referred to as “neural plasticity.” What isn’t learned is determined by genetics. But for the most part, genetics defines the basic building blocks (e.g., cell types) used to form the brain’s structure, not the bulk of its connectivity.

To establish that sex difference is “hardwired,” as the headline of Sax’s article claims, one would have to demonstrate that such a difference is not learned. The paper’s authors are in fact more cautious in their claims, as this is not possible by studying functional brain organization alone.

Sax raises doubt over a critique made by the neurobiologist and feminist Gina Rippon, who correctly noted that subjects in the study were at least 20 years old and had common experiences in the Western cultural environments where they lived. Because one’s environment defines so much of the brain’s connectivity, which in turn influences its activity and functional organization, she raises the possibility that the sex differences discovered by the AI algorithm were the result of learned behaviors. Rippon’s line of reasoning is consistent with what we’ve said above about the organization of the brain. Learned behavior is one possible explanation for the study’s findings. Another, of course, is a genetically determined difference. Given the black box nature of the artificial neural network algorithm used in the study, we simply do not know which type of evidence (nature or nurture) was considered in its decisions.

Sax cites a host of other studies that show male/female differences across brains as a counterpoint to Rippon, but he mixes together studies looking at structure, function, and genetics, arguing they all come to the same conclusion. However, that doesn’t follow logically, as these things exist at different levels of the nervous system, with significant aspects of structure and function known to be influenced by learning and the environment. That leaves genetics as an important differentiator that is far less influenced by environmental pressures.

What about Hormones?

Genetics also defines hormone regulation. This is a big part of the study of sex differences. Surprisingly, Sax doesn’t mention hormones, which do in fact differ by sex (estradiol vs. androgens) and affect the activity patterns of the brain.

The female hormone estradiol has been studied extensively during brain development in a number of animal model systems. Scientists have observed that estradiol promotes neurite growth, a key process of brain development involved in its wiring. Studies have shown that estradiol and other sex hormones do impact the function of the brain in specific regions, but it is important to note that the differences observed during brain development tend to be allomorphic, meaning that they exist along a continuum with large overlap between populations. Chemical signaling throughout the body can be and is driven by the different male and female sex hormones, but as any cell biologist knows, signaling pathways frequently overlap, with many canonical and non-canonical routes of interaction.
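What "large overlap between populations" means can be put in numbers. The sketch below assumes, purely for illustration, that a trait is normally distributed in two groups with equal spread and that the group means differ by `d` standard deviations. Under those assumptions the shared area under the two curves (the overlapping coefficient) is 2Φ(−|d|/2), where Φ is the standard normal cumulative distribution function. The values of `d` used here are hypothetical and are not taken from the Stanford study or any hormone study.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def overlap_coefficient(d):
    """Shared area under two equal-variance normal curves.

    `d` is the difference between the group means in units of the common
    standard deviation. The equal-variance normal shape is an assumption.
    """
    return 2.0 * normal_cdf(-abs(d) / 2.0)


# Hypothetical effect sizes, for illustration only.
for d in (0.0, 0.2, 0.5, 1.0):
    print(f"d = {d:.1f}: overlap = {overlap_coefficient(d):.2f}")
```

With no difference in means the two curves coincide completely, and even at a gap of half a standard deviation roughly four fifths of the area is still shared. A difference that is statistically real at the population level can therefore say very little about any one individual.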

Consider a helpful analogy: USPS, DHL, and FedEx all deliver mail via paid postage, but they require their own postage and take their own routes, working independently of each other. You can get mail delivered by any of them, but they may deliver faster or slower depending on the chosen route and postage. Sex hormones function a bit like that. Each may increase growth or activation in its own way, and the hormones are not interchangeable one-to-one. But when one signaling molecule is missing, it doesn’t necessarily mean everything in the system is halted. Cell signaling pathways frequently possess redundancy, with multiple activators to keep the normal developmental processes moving under multiple environmental conditions. If one mail service is experiencing delivery delays, one can opt to use an alternative service, although the route and delivery time may be different.

The brain is a complex and adapting organ that develops through stimulatory input from the environment and from internal signaling mechanisms. Thus, sex hormones could set “hardwired” default pathways in the brain specific to each sex. But the overlaps and complex interactions between cellular signaling pathways likely create patterns of brain activity reinforced by individual actions that lead more to sex commonality than difference at the level of the entire brain. More sophisticated methods of study, perhaps including the use of AI algorithms, are needed to disentangle the sex-specific role of hormones from the overall operation of the brain.

It is quite possible that the conclusions in the Stanford study are valid, but it is too early to use strong language like “confirmed.” Perhaps we will soon see complementary studies that investigate genetics and hormones using the tools of machine learning. Enormous databases of information related to these aspects of biology are waiting to be mined with artificial intelligence to aid in a rigorous understanding of fundamental sex differences.

One result from a black box machine learning algorithm might be misleading or incorrect. But if, over time, a consensus arises from the findings of multiple studies using AI to investigate different interconnected features of the brain, we can be much more confident in accepting their findings. Even if we readily embrace the reality of sex differences in our daily lives, it’s not yet time to trust the AI scientist and declare that those differences are hardwired in our brains.
