Document Type

Article

Publication Date

2025

Abstract

This project explores the application of Big Data technologies to large-scale mental health analysis, focusing on the prevalence of depressive disorder symptoms across demographic and geographic subgroups. Using Apache Spark on Google Cloud Dataproc, the system processed millions of survey records stored in the Hadoop Distributed File System (HDFS). Through data preprocessing, aggregation, and visualization, the analysis revealed trends and disparities in mental health outcomes by age, race, education level, gender, and state. Seasonal variations and subgroup-specific confidence intervals were also examined to identify high-risk populations and areas of measurement uncertainty. The results offer actionable insights for public health decision-makers, supporting targeted interventions and equitable resource allocation. This work demonstrates the potential of scalable data processing frameworks to inform data-driven mental health strategies and highlights the role of computational tools in addressing public health challenges.
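
As a rough illustration of the subgroup aggregation and confidence intervals mentioned in the abstract, the sketch below computes a prevalence estimate and a 95% normal-approximation (Wald) interval per subgroup. It is not the authors' pipeline: the abstract describes Apache Spark on Dataproc, whereas this uses plain Python on a toy in-memory list, and the field names `age_group` and `has_symptoms`, the function `prevalence_by_group`, and the sample records are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def prevalence_by_group(records, group_key, flag_key="has_symptoms", z=1.96):
    """Return {group: (prevalence, ci_low, ci_high, n)} using a Wald interval.

    Assumes each record is a dict with a subgroup label under `group_key`
    and a boolean symptom indicator under `flag_key` (hypothetical schema).
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for rec in records:
        tally = counts[rec[group_key]]
        tally[0] += 1 if rec[flag_key] else 0
        tally[1] += 1

    summary = {}
    for group, (positives, total) in counts.items():
        p = positives / total
        half_width = z * sqrt(p * (1 - p) / total)
        # Clamp the interval to the valid probability range [0, 1].
        summary[group] = (p, max(0.0, p - half_width), min(1.0, p + half_width), total)
    return summary


# Hypothetical toy data, not drawn from the study.
records = [
    {"age_group": "18-29", "has_symptoms": True},
    {"age_group": "18-29", "has_symptoms": True},
    {"age_group": "18-29", "has_symptoms": True},
    {"age_group": "18-29", "has_symptoms": False},
    {"age_group": "30-39", "has_symptoms": True},
    {"age_group": "30-39", "has_symptoms": False},
    {"age_group": "30-39", "has_symptoms": False},
    {"age_group": "30-39", "has_symptoms": False},
]

summary = prevalence_by_group(records, "age_group")
for group, (p, low, high, n) in sorted(summary.items()):
    print(f"{group}: prevalence={p:.2f}, 95% CI=({low:.2f}, {high:.2f}), n={n}")
```

With subgroups this small the interval spans most of the probability range, which is the kind of measurement uncertainty a wide subgroup confidence interval signals. A real analysis of survey data would likely also need survey weights and a distributed group-by, both omitted here.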

Program or Discipline Name

Computer and Information Sciences

Publication Title

Scalable Mental Health Analysis Using Big Data: A Demographic and Geographic Study of Depressive Symptoms

Start Page No.

1

End Page No.

13

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
