The article discusses how artificial intelligence (AI) can be used to predict the diagnosis of psoriatic arthritis, a type of arthritis that affects people with psoriasis. The study used a national dataset of 282 patients with psoriasis, with 55 of them also affected by psoriatic arthritis. Machine learning systems were trained to predict the diagnosis from input variables, such as sex, age, weight, and other clinical information. The results showed that AI models that included information on gender were more accurate in predicting psoriatic arthritis diagnosis compared to those that did not.
Gender Bias in Medical Research
In the world of medical research, there has been a long-standing gender bias, with most studies focusing predominantly on men. However, there have been continuous efforts to overcome this bias and include more women in clinical and preclinical studies. With the rise of artificial intelligence (AI) in healthcare, it's important to examine the impact of gender information in predictive models. In a retrospective study, our team analyzed 21 datasets containing gender as a variable and found that gender was included in the heuristic predictive model 19 out of 21 times. This shows that even for highly advanced AI tools like Artificial Neural Networks (ANNs), information on sex carries a specific value.
Gender differences have been described in many chronic diseases and physiological processes such as physical and cognitive aging. In addition, gender differences in lifestyles are correlated with disease epidemiological trends. Despite significant scientific advances, most biomedical AI technologies currently in use take little account of sex-related bias. This can result in inequalities in healthcare, limiting the expected improvement in care to a subset of patients. Attention to the selection of gender characteristics in medical datasets is important to mitigate biases related to the underrepresentation of women in medical science.
Artificial Neural Networks (ANNs) are information processing paradigms inspired by the analytical processes of the human brain. They can learn to recognize the complex patterns existing between the input signals and the corresponding outputs. ANNs are particularly suited for solving non-linear problems and analyzing complex datasets. However, several variables that do not contain specific information pertinent to the problem can interfere with the network's generalization ability, reducing the overall accuracy. Feature selection methods are intended to reduce the number of input variables to those that are believed to be most useful to predict the target variable with high accuracy. Our group has applied ANNs to address a variety of medical problems and has systematically used automatic procedures to reduce redundant information from a dataset.
Evolutionary algorithms are adaptive systems able to find optimal solutions when fixed rules or constraints have to be respected. They become fundamental when the space of possible states in a dynamic system tends toward infinity, as in variable selection. The introduction of variable selection systems generally results in a dramatic improvement in ANNs' performance. The input selection (IS) system is an example of the adaptation of an evolutionary algorithm to this problem. An IS system based on the evolutionary algorithm GenD proved able to handle the relevance of the different variables of the dataset in an intelligent way and therefore became a standard in our work.
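The idea of evolutionary input selection can be illustrated with a minimal sketch. This is a generic genetic algorithm over 0/1 variable masks, not GenD or the IS system itself, whose internals are not described here. The function name `evolve_mask`, the `INFORMATIVE` set, and the toy fitness function are hypothetical stand-ins: in a real setting the fitness would be the test accuracy of a network trained on the selected variables.

```python
import random

def evolve_mask(n_vars, fitness, rng, pop_size=20, generations=30, mutation_rate=0.05):
    """Search for a high-fitness subset of variables encoded as a 0/1 mask."""
    population = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank candidates and keep the fitter half unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each bit with a small probability (mutation).
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Hypothetical ground truth for the toy problem: only these inputs carry signal.
INFORMATIVE = {0, 3, 7}

def toy_fitness(mask):
    """Reward informative variables, penalize every extra variable kept."""
    hits = sum(1 for i in INFORMATIVE if mask[i])
    extras = sum(mask) - hits
    return hits - 0.25 * extras

best = evolve_mask(n_vars=17, fitness=toy_fitness, rng=random.Random(42))
selected = [i for i, bit in enumerate(best) if bit]
print("selected variable indices:", selected)
```

The penalty on extra variables plays the role of the pressure toward smaller input sets. Without it, a mask that keeps everything would score as well as a selective one.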
To further improve the difficult task of variable selection, the TWIST algorithm was developed. TWIST dynamically searches for the subset of variables that allows a neural network to build the model with the best predictive performance in terms of global accuracy, while simultaneously optimizing the distribution of records between the training and testing subsets. Many papers using the TWIST algorithm have been published in different fields of medicine. To answer the question "Is sex information useful for a neural network to build a predictive model?", we retrospectively checked how often sex was retained in the subset of informative variables after preprocessing with TWIST. The check covered 21 datasets used by our group in scientific studies in which ANNs were employed to answer critical questions across a wide range of diseases and conditions. Sex was included in the heuristic predictive model 19 out of 21 times (about 90%). This supports the importance of sex information in building high-performance predictive models in the field of AI.
The Psoriatic Arthritis Example
Artificial intelligence (AI) has revolutionized many aspects of healthcare, including disease diagnosis, treatment planning, and drug development. However, despite the potential benefits of AI, there are concerns about bias and the potential for AI to perpetuate existing inequalities. One area where bias based on sex has been reported is in the field of rheumatology, where sex-specific differences have consistently been observed for various diseases, including rheumatoid arthritis, axial spondyloarthritis, and ankylosing spondylitis.
In psoriatic arthritis, for example, sex differences may have important implications for clinical research in terms of epidemiology, clinical, radiological, and laboratory features, and response to treatment. Despite this, bias based on sex persists in clinical rheumatology. To address this issue within the context of artificial intelligence, researchers have used a national data set made available for analysis. The data set was derived from a multicenter study conducted in the dermatological outpatient clinics of Italian universities, and included information on 282 patients with psoriasis, of whom 55 were affected by psoriatic arthritis and 227 were not affected.
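Table III: Predictive Performance of Machine Learning in Psoriatic Arthritis Diagnosis with or Without Information Regarding Sex
| Experiment | Machine learning | Records | Arthritis | No arthritis | Sensitivity | Specificity | Overall accuracy |
|---|---|---|---|---|---|---|---|
| Without sex | Back propagation ANN, a-b sequence | 144 | 29 | 115 | 89.66% | | |
| Without sex | Back propagation ANN, b-a sequence | 138 | 26 | 112 | 80.77% | | |
| Without sex | Mean of both sequences | | | | 85.21% | 90.32% | 87.77% |
| With sex | Back propagation ANN, a-b sequence | 130 | 25 | 105 | 100% | 88.57% | |
| With sex | Back propagation ANN, b-a sequence | 152 | 30 | 122 | 96.67% | 92.62% | |
| With sex | Mean of both sequences | | | | 98.33% | 90.6% | 94.47% |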
This table shows the predictive performance of machine learning in diagnosing psoriatic arthritis with or without information regarding sex. The classifier was a back propagation artificial neural network (ANN), run in two sequences: a-b (training on sub-sample A, testing on sub-sample B) and b-a (training on B, testing on A). The table presents the results of two experiments: one with sex information available and one without.
In the "Without sex" section of the table, the network was trained without the sex variable. In the a-b sequence the table lists 144 records, 29 with psoriatic arthritis and 115 without, and a sensitivity of 89.66%. In the b-a sequence it lists 138 records, 26 with psoriatic arthritis and 112 without, and a sensitivity of 80.77%. Averaged over both sequences, sensitivity was 85.21%, specificity 90.32%, and overall accuracy 87.77%.
In the "With sex" section of the table, the network was trained with the sex variable. In the a-b sequence the table lists 130 records, 25 with psoriatic arthritis and 105 without, with 100% sensitivity and 88.57% specificity. In the b-a sequence it lists 152 records, 30 with psoriatic arthritis and 122 without, with 96.67% sensitivity and 92.62% specificity. Averaged over both sequences, sensitivity was 98.33%, specificity 90.6%, and overall accuracy 94.47%.
The table shows that using information on sex improved the predictive performance of the machine learning method in diagnosing psoriatic arthritis. The overall accuracy was higher with sex information (94.47%) than without it (87.77%), a gain of 6.7 percentage points. This suggests that incorporating sex information into predictive models can help avoid disparities in predictive performance between the sexes.
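The reported averages can be checked with a few lines of arithmetic. The sketch below makes two assumptions that are not stated in the source. First, the counts of correctly classified patients (for example 29 of 30, or 93 of 105) are back-calculated from the reported percentages. Second, overall accuracy is taken to be the mean of sensitivity and specificity, which is consistent with the reported values.

```python
def pct(correct, total):
    """Return a percentage rounded to two decimals."""
    return round(100.0 * correct / total, 2)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# "With sex" experiment, per sequence:
# (arthritis cases, correctly flagged, non-arthritis cases, correctly cleared)
# The "correctly ..." counts are back-calculated assumptions.
with_sex = {"a-b": (25, 25, 105, 93), "b-a": (30, 29, 122, 113)}

sens_with = {seq: pct(tp, pos) for seq, (pos, tp, neg, tn) in with_sex.items()}
spec_with = {seq: pct(tn, neg) for seq, (pos, tp, neg, tn) in with_sex.items()}
mean_sens_with = mean(sens_with.values())
mean_spec_with = mean(spec_with.values())
overall_with = mean([mean_sens_with, mean_spec_with])

# "Without sex" experiment: only the per-sequence sensitivities are given,
# so the reported mean specificity (90.32%) is used directly.
sens_without = {"a-b": pct(26, 29), "b-a": pct(21, 26)}
mean_sens_without = mean(sens_without.values())
overall_without = mean([mean_sens_without, 90.32])

print(f"with sex:    sensitivity {mean_sens_with:.2f}, "
      f"specificity {mean_spec_with:.2f}, overall {overall_with:.2f}")
print(f"without sex: sensitivity {mean_sens_without:.2f}, overall {overall_without:.2f}")
print(f"gain in overall accuracy: {overall_with - overall_without:.2f} points")
```

Under these assumptions the computed means agree with the reported figures to within rounding.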
The researchers used machine learning systems to predict psoriatic arthritis diagnosis from 17 input variables, including sex, age, weight, BMI, family history of psoriasis, family history of psoriatic arthritis, nail involvement, osteoarthritis, use of NSAIDs, PGA, PASI, and six other variables. Two experiments were carried out: one with sex information available, after automatic selection of variables with the TWIST system, and one without sex information available. Both used the same validation protocol, a training-testing procedure with crossover. The experiments with and without sex were conducted in a blind and independent manner in two directions: training with sub-sample A and blind testing with sub-sample B, then training with sub-sample B and blind testing with sub-sample A. The best results obtained using a classical Back Propagation artificial neural network as classifier are reported in Table III.
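The two-direction training-testing protocol can be sketched as follows. This is an illustration only: the data are synthetic, the split is a plain random halving rather than the TWIST-optimized one, and a nearest-centroid classifier stands in for the back propagation network. All function names are hypothetical.

```python
import random

def train_centroids(rows):
    """Fit a nearest-centroid classifier: one mean vector per class label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label of the closest class centroid."""
    def dist2(center):
        return sum((f - c) ** 2 for f, c in zip(features, center))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def evaluate(centroids, rows):
    """Return (sensitivity, specificity) on held-out rows."""
    tp = fn = tn = fp = 0
    for features, label in rows:
        guess = predict(centroids, features)
        if label == 1:
            tp += guess == 1
            fn += guess == 0
        else:
            tn += guess == 0
            fp += guess == 1
    return tp / (tp + fn), tn / (tn + fp)

def make_synthetic(n_pos, n_neg, rng):
    """Synthetic two-feature records; label 1 = arthritis, 0 = no arthritis."""
    rows = [([rng.gauss(1.5, 1.0), rng.gauss(1.0, 1.0)], 1) for _ in range(n_pos)]
    rows += [([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)], 0) for _ in range(n_neg)]
    rng.shuffle(rows)
    return rows

rng = random.Random(0)
data = make_synthetic(55, 227, rng)  # same class sizes as the study, fake values
half = len(data) // 2
sub_a, sub_b = data[:half], data[half:]

# Train on one sub-sample, test blindly on the other, then swap roles.
results = {}
for name, train, test in (("a-b", sub_a, sub_b), ("b-a", sub_b, sub_a)):
    results[name] = evaluate(train_centroids(train), test)

mean_sensitivity = sum(r[0] for r in results.values()) / 2
mean_specificity = sum(r[1] for r in results.values()) / 2
print(results, mean_sensitivity, mean_specificity)
```

Because each record is used for testing exactly once, the averaged figures reflect performance on data the model did not see during training.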
Information on sex allowed the machine learning system to reach an overall accuracy of 94.47%, while the absence of this information was associated with a lower overall accuracy (87.77%). The improvement in overall accuracy was statistically significant (p<0.05). The researchers noted that neural networks can take multiple factor values as input simultaneously, combining and recombining them in different ways according to specific equations which are generally non-linear. In comparison with classical statistics, neural networks allow a high number of independent models to be built, each with a different predictive capacity in classifying patients according to certain targets, due to slight differences in their architecture, topology and learning laws. Overall, neural networks with a given configuration do not provide a unique solution, because their performance is determined by several factors, such as the initial random values of the interconnections between nodes, the order of presentation of cases during the training cycle and the number of training cycles.
Other variables pertaining to the mathematical attributes of a specific neural network will also affect the final state of a trained neural network, allowing for a very high number of different possible combinations. Evolutionary algorithms (EA) have been proposed to find the most suitable design of neural networks and so allow better prediction, given the high number of possible combinations of parameters. In this paper, the researchers show that when ANNs are combined with EA, the variable sets selected for models of a wide range of chronic diseases almost always contain sex.
Equalizing Gender Information in AI
The historical absence of women from the health professions and clinical research has led to medical knowledge that focuses on the male body and neglects female physiological differences. To ensure that gender-based inequalities do not manifest themselves in AI applied to medicine, great care is needed to incorporate gender information into predictive models in order to avoid disparities in predictive performance in the two genders.
The study found that including information about sex in the machine learning model significantly improved the accuracy of predicting psoriatic arthritis diagnosis, with an overall accuracy of 94.47% when sex was included compared to 87.77% when it was excluded. This indicates that sex has an important role in the predictive model, and that omitting it may contribute to disparities in predictive performance between genders.
The study also highlights the importance of incorporating gender information in predictive models to avoid gender-based inequalities. Doing so can help to address the disparities left by a male-centered body of medical knowledge and ensure that the benefits of AI in medicine are available to all genders.