Publications

2025

Cummins JA, Gottlieb DJ, Sofer T, Wallace DA. Applying Natural Language Processing Techniques to Map Trends in Insomnia Treatment Terms on the r/Insomnia Subreddit: Infodemiology Study. Journal of medical Internet research. 2025;27:e58902.

BACKGROUND: People share health-related experiences and treatments, such as for insomnia, in digital communities. Natural language processing tools can be leveraged to understand the terms used in digital spaces to discuss insomnia and insomnia treatments.

OBJECTIVE: The aim of this study is to summarize and chart trends of insomnia treatment terms on a digital insomnia message board.

METHODS: We performed a natural language processing analysis of the r/insomnia subreddit. Using Pushshift, we obtained all r/insomnia subreddit comments from 2008 to 2022. A bag of words model was used to identify the top 1000 most frequently used terms, which were manually reduced to 35 terms related to treatment and medication use. Regular expression analysis was used to identify and count comments containing specific words, followed by sentiment analysis to estimate the tonality (positive or negative) of comments. Data from 2013 to 2022 were visually examined for trends.
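
The regular expression counting step described above can be illustrated with a minimal sketch. This is not the authors' code: the sample comments, the three patterns, and the function name `count_comments_with_terms` are hypothetical stand-ins for the Pushshift data and the 35-term list, and the sentiment step is omitted.

```python
import re
from collections import Counter

# Hypothetical sample comments; the study used r/insomnia comments from Pushshift.
comments = [
    "Melatonin did nothing for me, but CBT-I helped a lot.",
    "I took melatonin and Benadryl last night. Melatonin again tonight.",
    "My doctor suggested cognitive behavioral therapy for insomnia.",
    "Ambien works but I worry about dependence.",
]

# Hypothetical term patterns (illustrative only, not the study's term list).
patterns = {
    "melatonin": re.compile(r"\bmelatonin\b", re.IGNORECASE),
    "cbt-i": re.compile(
        r"\bcbt(-?i)?\b|\bcognitive behaviou?ral therapy\b", re.IGNORECASE
    ),
    "ambien": re.compile(r"\bambien\b", re.IGNORECASE),
}


def count_comments_with_terms(comments, patterns):
    """Count comments containing each term (one count per comment, not per mention)."""
    counts = Counter()
    for text in comments:
        for term, pattern in patterns.items():
            if pattern.search(text):
                counts[term] += 1
    return counts


counts = count_comments_with_terms(comments, patterns)

# Share of all comments mentioning each term, comparable to the
# percentage-of-comments figures reported in the abstract.
shares = {term: counts[term] / len(comments) for term in patterns}
```

Counting each comment once per term, rather than every mention, matches the abstract's phrasing of "comments containing specific words"; grouping the same counts by year would give a trend series.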

RESULTS: There were 340,130 comments on r/insomnia from 2008, the beginning of the subreddit, to 2022. Of the 35 top treatment and medication terms that were identified, melatonin, cognitive behavioral therapy for insomnia (CBT-I), and Ambien were the most frequently used (n=15,005, n=13,461, and n=11,256 comments, respectively). When the frequency of individual terms was compared over time, terms related to CBT-I increased over time (doubling from approximately 2% in 2013-2014 to a peak of over 5% of comments in 2018); in contrast, terms related to nonprescription over-the-counter (OTC) sleep aids (such as Benadryl or melatonin) decreased over time. CBT-I-related terms also had the highest positive sentiment and showed a spike in frequency in 2017. Terms with the most positive sentiment included "hygiene" (median sentiment 0.47, IQR 0.31-0.88), "valerian" (median sentiment 0.47, IQR 0-0.85), and "CBT" (median sentiment 0.42, IQR 0.14-0.82).

CONCLUSIONS: The Reddit r/insomnia discussion board provides an alternative way to capture trends in both prescription and nonprescription sleep aids among people experiencing sleeplessness and using social media. This analysis suggests that language related to CBT-I (with a spike in 2017, perhaps following the 2016 recommendations by the American College of Physicians for CBT-I as a treatment for insomnia), benzodiazepines, trazodone, and antidepressant medication use has increased from 2013 to 2022. The findings also suggest that the use of OTC or other alternative therapies, such as melatonin and cannabis, among r/insomnia Reddit contributors is common and has also exhibited fluctuations over time. Future studies could consider incorporating alternative data sources in addition to prescription medication to track trends in prescription and nonprescription sleep aid use. Additionally, future prospective studies of insomnia should consider collecting data on the use of OTC or other alternative therapies, such as cannabis. More broadly, digital communities such as r/insomnia may be useful in understanding how social and societal factors influence sleep health.

See also: Insomnia, Sleep

Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium (electronic address: andrew.mcintosh@ed.ac.uk). Trans-ancestry genome-wide study of depression identifies 697 associations implicating cell types and pharmacotherapies. Cell. 2025;188(3):640-652.e9.

In a genome-wide association study (GWAS) meta-analysis of 688,808 individuals with major depression (MD) and 4,364,225 controls from 29 countries across diverse and admixed ancestries, we identify 697 associations at 635 loci, 293 of which are novel. Using fine-mapping and functional tools, we find 308 high-confidence gene associations and enrichment of postsynaptic density and receptor clustering. A neural cell-type enrichment analysis utilizing single-cell data implicates excitatory, inhibitory, and medium spiny neurons and the involvement of amygdala neurons in both mouse and human single-cell analyses. The associations are enriched for antidepressant targets and provide potential repurposing opportunities. Polygenic scores trained using European or multi-ancestry data predicted MD status across all ancestries, explaining up to 5.8% of MD liability variance in Europeans. These findings advance our global understanding of MD and reveal biological targets that may be used to target and develop pharmacotherapies addressing the unmet need for effective treatment.

Goodman MO, Faquih T, Paz V, Nagarajan P, Lane JM, Spitzer B, et al. Genome-wide association analysis of composite sleep health scores in 413,904 individuals. Communications biology. 2025;8(1):115.

Recent genome-wide association studies (GWASs) of several individual sleep traits have identified hundreds of genetic loci, suggesting diverse mechanisms. Moreover, sleep traits are moderately correlated, so together may provide a more complete picture of sleep health, while illuminating distinct domains. Here we construct novel sleep health scores (SHSs) incorporating five core self-report measures: sleep duration, insomnia symptoms, chronotype, snoring, and daytime sleepiness, using additive (SHS-ADD) and five principal components-based (SHS-PCs) approaches. GWASs of these six SHSs identify 28 significant novel loci adjusting for multiple testing on six traits (p < 8.3e-9), along with 341 previously reported loci (p < 5e-08). The heritability of the first three SHS-PCs equals or exceeds that of SHS-ADD (SNP-h2 = 0.094), while revealing sleep-domain-specific genetic discoveries. Significant loci enrich in multiple brain tissues and in metabolic and neuronal pathways. Post-GWAS analyses uncover novel genetic mechanisms underlying sleep health and reveal connections (including potential causal links) to behavioral, psychological, and cardiometabolic traits.
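
The reported novel-locus threshold (p < 8.3e-9) is consistent with dividing the conventional genome-wide significance level (5e-08) by the six traits tested. The sketch below assumes a simple Bonferroni-style division, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Conventional genome-wide significance level, as quoted in the abstract.
GENOME_WIDE_ALPHA = 5e-8

# Six sleep health scores: SHS-ADD plus five SHS-PCs.
N_TRAITS = 6

# Assumed Bonferroni-style adjustment for testing six traits.
adjusted_threshold = GENOME_WIDE_ALPHA / N_TRAITS

print(f"{adjusted_threshold:.1e}")  # 8.3e-09
```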

Wallace DA, Evenson KR, Isasi CR, Patel SR, Sotres-Alvarez D, Zee PC, et al. Characteristics of objectively-measured naturalistic light exposure patterns in U.S. adults: A cross-sectional analysis of two cohorts. The Science of the total environment. 2025;969:178839.

Light is an environmental feature important for human physiology. Investigation of how light affects population health requires exposure assessment and personal biomonitoring efforts. Here, we derived measures of amount, duration, regularity, and timing from objective personal light (lux) measurement in >4000 participants across two United States (US)-based cohort studies, the Multi-Ethnic Study of Atherosclerosis (MESA) and the Hispanic Community Health Study / Study of Latinos (HCHS/SOL), encompassing eight geographic regions. Objective light and actigraphy data were collected over a week using wrist-worn devices (Actiwatch Spectrum). Cohort-stratified light exposure metrics were analyzed in relation to sex, season, time-of-day, location, and demographic and sleep health characteristics using Spearman correlation and linear and logistic regressions (separately by cohort) adjusted for age, sex (where applicable), and exam site. Light exposure showed sex-specific patterns and had seasonal, diurnal, geographic, and demographic and sleep health-related correlates. Results between independent cohorts were strongly consistent, supporting the utility and feasibility of light biomonitoring. These findings provide a fundamental first characterization of light exposure patterns in a large US sample and will inform future work to incorporate light as a biologically relevant exposure in environmental public health and key component of the human exposome.

2024

Chung J, Goodman MO, Huang T, Castro-Diehl C, Chen JT, Sofer T, et al. Objectively regular sleep patterns and mortality in a prospective cohort: The Multi-Ethnic Study of Atherosclerosis. Journal of sleep research. 2024;33(1):e14048.

Irregular sleep and non-optimal sleep duration separately have been shown to be associated with increased disease and mortality risk. We used data from the prospective cohort Multi-Ethnic Study of Atherosclerosis sleep study (2010-2013) to investigate: do aging adults whose sleep is objectively high in regularity in timing and duration, and of sufficient duration tend to have increased survival compared with those whose sleep is lower in regularity and duration, in a diverse US sample? At baseline, sleep was measured by 7-day wrist actigraphy, concurrent with at-home polysomnography and questionnaires. Objective metrics of sleep regularity and duration from actigraphy were used for statistical clustering using sparse k-means clustering. Two sleep patterns were identified: "regular-optimal" (average duration: 7.0 ± 1.0 hr obtained regularly) and "irregular-insufficient" (duration: 5.8 ± 1.4 hr obtained with twice the irregularity). Using proportional hazard models with multivariate adjustment, we estimated all-cause mortality hazard ratios. Among 1759 participants followed for a median of 7.0 years (Q1-Q3, 6.4-7.4 years), 176 deaths were recorded. The "regular-optimal" group had a 39% lower mortality hazard than did the "irregular-insufficient" sleep group (hazard ratio [95% confidence interval]: 0.61 [0.45, 0.83]) after adjusting for socio-demographics, lifestyle, medical comorbidities and sleep disorders. In conclusion, a "regular-optimal" sleep pattern was significantly associated with a lower hazard of all-cause mortality. The regular-optimal phenotype maps behaviourally to regular bed and wake times, suggesting sleep benefits of adherence to recommended healthy sleep practices, with further potential benefits for longevity.

See also: Sleep

Li Y, Wong KY, Howard AG, Gordon-Larsen P, Highland HM, Graff M, et al. Mendelian randomization with incomplete measurements on the exposure in the Hispanic Community Health Study/Study of Latinos. HGG advances. 2024;5(1):100245.

Mendelian randomization has been widely used to assess the causal effect of a heritable exposure variable on an outcome of interest, using genetic variants as instrumental variables. In practice, data on the exposure variable can be incomplete due to high cost of measurement and technical limits of detection. In this paper, we propose a valid and efficient method to handle both unmeasured and undetectable values of the exposure variable in one-sample Mendelian randomization analysis with individual-level data. We estimate the causal effect of the exposure variable on the outcome using maximum likelihood estimation and develop an expectation maximization algorithm for the computation of the estimator. Simulation studies show that the proposed method performs well in making inference on the causal effect. We apply our method to the Hispanic Community Health Study/Study of Latinos, a community-based prospective cohort study, and estimate the causal effect of several metabolites on phenotypes of interest.

Wallace DA, Qiu X, Schwartz J, Huang T, Scheer FAJL, Redline S, et al. Light exposure during sleep is bidirectionally associated with irregular sleep timing: The multi-ethnic study of atherosclerosis (MESA). Environmental pollution (Barking, Essex : 1987). 2024;344:123258.

Exposure to light at night (LAN) may influence sleep timing and regularity. Here, we test whether greater light exposure during sleep (LEDS) is bidirectionally associated with greater irregularity in sleep onset timing in a large cohort of older adults in cross-sectional and short-term longitudinal (days) analyses. Light exposure and activity patterns, measured via wrist-worn actigraphy (ActiWatch Spectrum), were analyzed in 1933 participants with 6+ valid days of data in the Multi-Ethnic Study of Atherosclerosis (MESA) Exam 5 Sleep Study. Summary measures of LEDS averaged across nights were evaluated in linear and logistic regression analyses to test the association with standard deviation (SD) in sleep onset timing (continuous variable) and irregular sleep onset timing (SD > 90 min, binary). Night-to-night associations between LEDS and absolute differences in nightly sleep onset timing were also evaluated with distributed lag non-linear models and mixed models. In between-individual linear and logistic models adjusted for demographic, health, and seasonal factors, every 5-lux unit increase in LEDS was associated with a 7.8-min increase in sleep onset SD (β = 0.13 h, 95% CI: 0.09-0.17) and 32% greater odds (OR = 1.32, 95% CI: 1.17-1.50) of irregular sleep onset. In within-individual night-to-night mixed model analyses, every 5-lux unit increase in LEDS the night prior was associated with a 2.2-min greater deviation of sleep onset the next night (β = 0.036 h, p < 0.05). Conversely, every 1-h increase in sleep deviation was associated with a 0.35-lux increase in future LEDS (β = 0.348 lux, p < 0.05). LEDS was associated with greater irregularity in sleep onset in between-individual analyses and subsequent deviation in sleep timing in within-individual analyses, supporting a role for LEDS in irregular sleep onset timing. Greater deviation in sleep onset was also associated with greater future LEDS, suggesting a bidirectional relationship. Maintaining a dark sleeping environment and preventing LEDS may promote sleep regularity and following a regular sleep schedule may limit LEDS.
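
The minute and percentage figures quoted in this abstract follow from simple conversions of the reported coefficients. The sketch below reproduces that arithmetic only; the helper names are hypothetical and no model is refit.

```python
def hours_to_minutes(beta_hours):
    """Convert a coefficient expressed in hours to minutes."""
    return beta_hours * 60


def odds_ratio_to_percent_change(odds_ratio):
    """Express an odds ratio as a percent change in odds."""
    return (odds_ratio - 1) * 100


# Coefficients as reported in the abstract (per 5-lux increase in LEDS).
between_person_minutes = round(hours_to_minutes(0.13), 1)       # sleep onset SD
within_person_minutes = round(hours_to_minutes(0.036), 1)       # next-night deviation
percent_greater_odds = round(odds_ratio_to_percent_change(1.32))  # irregular onset

print(between_person_minutes, within_person_minutes, percent_greater_odds)
```

The rounded values agree with the 7.8-min, 2.2-min, and 32% figures stated in the text.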

See also: Sleep

Meng X, Navoly G, Giannakopoulou O, Levey DF, Koller D, Pathak GA, et al. Multi-ancestry genome-wide association study of major depression aids locus discovery, fine mapping, gene prioritization and causal inference. Nature genetics. 2024;56(2):222-33.

Most genome-wide association studies (GWAS) of major depression (MD) have been conducted in samples of European ancestry. Here we report a multi-ancestry GWAS of MD, adding data from 21 cohorts with 88,316 MD cases and 902,757 controls to previously reported data. This analysis used a range of measures to define MD and included samples of African (36% of effective sample size), East Asian (26%) and South Asian (6%) ancestry and Hispanic/Latin American participants (32%). The multi-ancestry GWAS identified 53 significantly associated novel loci. For loci from GWAS in European ancestry samples, fewer than expected were transferable to other ancestry groups. Fine mapping benefited from additional sample diversity. A transcriptome-wide association study identified 205 significantly associated novel genes. These findings suggest that, for MD, increasing ancestral and global diversity in genetic studies may be particularly important to ensure discovery of core genes and inform about transferability of findings.