Datawiz-IN: fostering representative innovation in health data science—outcomes from a summer research experience

Afreen, Sadia; Krohannon, Alexander; Purkayastha, Saptarshi; Janga, Sarath Chandra

doi:10.1186/s12909-025-07298-1

Research
Open access
Published: 28 May 2025

BMC Medical Education volume 25, Article number: 793 (2025)

Abstract

The growing adoption of Artificial Intelligence (AI) across sectors highlights the importance of diverse perspectives in guiding its development and implementation. This study examines “Datawiz-IN,” an educational initiative that provides data science and machine learning research experience to students from various backgrounds in biomedicine. Supported by a National Institutes of Health R25 grant from the National Library of Medicine, the program engaged cohorts of 14 students in Summer 2023 and 13 students in Summer 2024. Initial data suggest modest increases in representation, with higher participation rates of women and underrepresented students compared to typical AI research programs. Student projects addressed various aspects of biomedical data science, including disease mechanism analysis, clinical decision support systems, and health disparity investigations. While the program’s limited scale and short duration constrain broad generalizations, preliminary results indicate the potential benefits of structured inclusion efforts in expanding participation in AI research and development. This case study contributes to ongoing discussions about approaches for developing more representative AI systems and research communities, though longer-term studies will be needed to assess sustained impact. The findings suggest that targeted educational initiatives may play a role in broadening participation in AI development, while acknowledging that meaningful change requires sustained, systemic efforts across multiple institutions and career stages.

Introduction

Artificial intelligence (AI) adoption is rapidly expanding across sectors, yet balanced access remains elusive [1]. Further, AI systems may unintentionally perpetuate variability, yielding marginalized outcomes, particularly in healthcare applications [2]. Varied perspectives must inform AI ethics and governance to mitigate such risks, especially as these systems increasingly influence critical healthcare decisions. Currently, AI guidelines and regulations disproportionately reflect the viewpoints of industrialized nations, failing to account for the distinct values of less developed regions [3, 4]. These homogeneous perspectives risk perpetuating variability and limiting societal benefits [1, 5].

The scarcity of underrepresented voices in AI development is particularly concerning, with women and racial minorities comprising only 10–15% of the AI field [6]. This disparity reflects longstanding representation gaps in technology-related disciplines [7] and has direct implications for healthcare AI development and deployment.

Research demonstrates that representation spurs innovation, corrects variability, and promotes user-centric design [9], making open participation crucial for advancing access-driven AI solutions [10] (Figs 1 and 2).

Several initiatives have emerged to address this gap in AI diversity. While programs like AI4ALL have achieved 40% women participation [11] and Carnegie Mellon University’s Data Science for All program [12] provides data literacy training, few focus specifically on healthcare AI applications. Recognizing this need, the National Institutes of Health (NIH) established the R25 initiative to support biomedical informatics training for historically excluded groups. This initiative provides targeted grants for specialized training programs developing hands-on expertise, rather than broad frameworks, to confront marginalization in biomedicine. As summarized by the NLM (2019), these programs actively strive to rectify representation gaps and power impactful research by students from underrepresented backgrounds. Through research training collaborations across over 30 higher education institutions (Fig. 3), the R25 initiative has created a nationwide network supporting diversity in biomedical informatics [13].

To address this critical need, we developed Datawiz-IN in 2023, building upon Indiana University’s long-standing commitment to diversity in STEM education. The program leverages the established Indiana University-Minority Serving Institutions (IU-MSI) STEM Initiative, which since 2006 has fostered partnerships with Historically Black Colleges and Universities (HBCUs), Hispanic-serving institutions, and Tribal Colleges [14]. Through these strategic collaborations, Datawiz-IN specifically aims to increase the participation of students from marginalized backgrounds in healthcare AI development.

The program represents a targeted response to the “leaky academic pipeline” [15] that has historically limited diversity in advanced scientific fields. Through NIH R25 funding, Datawiz-IN pursues two central priorities: broadening academic and research career opportunities for marginalized students while equipping them with impactful emerging technologies like AI [16]. The program provides:

Immersive research experiences in biomedical informatics and healthcare AI
Faculty mentorship tailored to participants’ contexts
Comprehensive professional development support
Structured pathways to graduate education and research careers [17]

Building on feedback from our inaugural 2023 cohort, we enhanced the program in 2024 through several key improvements:

Introduction of biweekly reflection dinners for strengthening mentor–mentee relationships
Enhanced academic support through GRE preparation workshops and graduate school guidance
Streamlined administrative processes, including pre-enrollment procedures
Development of a sophisticated project matching system aligning mentors and mentees
Expanded professional development opportunities through weekly seminars and research ethics training

This paper evaluates the implementation and outcomes of the Datawiz-IN program across its first two cohorts (2023–2024). We examine:

1. Program structure and implementation strategies, including our approaches to fostering supportive environments and building a sense of belonging through mentorship programs, affinity groups, and mindset interventions
2. Effectiveness of our recruitment and support mechanisms
3. Participant research outcomes in biomedical AI applications
4. Impact on participants’ technical skills and professional development

Through this analysis, we demonstrate how targeted educational initiatives can successfully promote diversity in AI while advancing healthcare innovations.

Case study in focus: Datawiz-IN

Datawiz-IN: NLM R25-funded program

Datawiz-IN represents one component of broader institutional initiatives addressing representation in STEM fields. Through an NIH R25 grant, the program was established to provide research opportunities in biomedical informatics for students from historically underrepresented groups. The program structure combines technical training with mentorship support, offering undergraduate and graduate students an eight-week summer research experience. The curriculum includes both research activities and professional development components, with institutional funding covering participants’ expenses [18, 19].

Evaluation framework

The program assessment employed a mixed-methods approach examining three primary areas across the 2023 and 2024 cohorts. Demographic data were collected to examine participant representation. Research outputs were analyzed through a systematic review of methodological approaches, data visualization techniques, and potential applications within their respective domains. Additionally, participant experiences were assessed through surveys addressing skill development, mentorship effectiveness, encountered challenges, and overall program perceptions. This evaluation structure was designed to examine the program’s progress toward its stated objectives of expanding participation in biomedical informatics research while supporting participants’ professional development.

Materials and methods

Participant recruitment process

Participant recruitment utilized existing institutional networks and partnerships, primarily through established relationships with Minority-Serving Institutions (MSIs). While this approach facilitated outreach to certain underrepresented groups, it may have limited access for potential candidates outside these established networks. The recruitment process would benefit from systematic documentation of outreach methods and response rates to better understand potential selection biases.

The Datawiz-IN pathway program took a holistic approach to recruiting and selecting Scholars. Rather than relying solely on traditional metrics such as college grade point average (GPA) and Graduate Record Examinations (GRE) scores, the program developed strategies to identify broader indicators of potential among students from educationally disadvantaged and underrepresented minority (ED-URM) backgrounds. The recruitment process emphasized identifying research aptitude and interpersonal capabilities that might not have been reflected in conventional academic measures.

The program implemented a targeted marketing plan through established community partnerships, building on successful recruitment patterns from IU-MSI SSI, LiFT, and related initiatives. Program faculty activated their professional networks and distributed recruitment materials at biomedical informatics conferences. Additional outreach efforts focused specifically on conferences and venues with higher representation of women and students in technology and biosciences.

Candidate evaluation incorporated multiple sources of evidence of potential and achievement. Applicants submitted personal statements that addressed their identification with the program’s goals and their potential contributions to broadening participation in biosciences. The selection process considered letters of recommendation from college teachers, counselors, and advisors specifically addressing qualities such as persistence, innovation, collaborative abilities, and problem-solving capabilities in project-based environments. To recognize diverse forms of achievement, the program also accepted recommendations from community leaders documenting candidates’ civic engagement and participation in representation-focused college organizations.

Analysis data collection

Initial demographic data collection occurred during the application process, with information stored in the NIH XTrain portal system. While this system enables basic longitudinal tracking of participant metrics, limitations in the standardization of data entry and categorization should be noted. The portal’s predetermined classification systems may not fully capture the complexity of participants’ backgrounds and experiences.

Project collection and assessment

Project evaluation followed a structured protocol comprising both formative and summative components:

Regular progress reports submitted to graduate teaching assistants
Documentation of milestone completion against predetermined metrics
Assessment of methodological approach and implementation
Evaluation of deliverables against initial project specifications

Graduate teaching assistants maintained progress documentation, though standardization of monitoring criteria varied across projects. Faculty mentors conducted final evaluations based on project-specific rubrics, with potential variation in assessment criteria across different research domains.

Survey design and administration

Participant experiences were evaluated through a comprehensive exit survey comprising both structured and open-ended questions. The survey instrument included:

Eight questions using 4-point Likert scales (Strongly agree, Agree, Disagree, Strongly disagree) assessing:
- Program learning opportunities
- Content and structure alignment with career goals
- Skills confidence
- Workload satisfaction
- Performance feedback adequacy
- Professional competence development
- Career advancement aspirations
One mentor evaluation question with a 4-point scale (Exceeded Expectations, Met Expectations, Neutral, Did not meet expectations)
One overall satisfaction question with a 4-point scale (Extremely satisfied, Somewhat satisfied, Somewhat dissatisfied, Extremely dissatisfied)

Four open-ended questions capturing:
- Significant skills and knowledge gained
- Challenges encountered and potential solutions
- Plans for implementing gained experience
- Additional feedback

The survey was administered to all participants (n = 27) at the conclusion of each cohort’s summer experience.

Data analysis

Program outcomes were analyzed across three key dimensions: representation, innovation, and experience. For demographic analysis, we calculated percentages of participant representation across gender and ethnic categories to demonstrate diversity achievement in both 2023 and 2024 cohorts.

Project outcomes were systematically documented and analyzed through:

Categorization of technical approaches (e.g., machine learning, genomics, imaging techniques)
Documentation of healthcare domains addressed (e.g., neuroscience, RNA biology, kidney disease)
Compilation of participants' home institutions to demonstrate geographic and institutional representation
Recording of project deliverables and outcomes

Participant experiences were evaluated through:

Quantitative analysis of Likert-scale responses (reported as percentages of agreement levels)
Qualitative analysis of open-ended responses through sentiment analysis using the syuzhet R package
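
As a minimal sketch of the quantitative step, the Python snippet below tabulates the responses to a single Likert item into percentages per agreement level. The responses are invented placeholders rather than study data, and the function name `likert_percentages` is hypothetical.

```python
from collections import Counter

# The four response levels used in the exit survey's Likert items.
LEVELS = ["Strongly agree", "Agree", "Disagree", "Strongly disagree"]


def likert_percentages(responses):
    """Return the percentage of responses at each level, rounded to 1 decimal."""
    counts = Counter(responses)
    total = len(responses)
    if total == 0:
        raise ValueError("no responses to summarize")
    return {level: round(100 * counts[level] / total, 1) for level in LEVELS}


# Placeholder responses for a single item (NOT data from the study).
example = ["Strongly agree"] * 6 + ["Agree"] * 3 + ["Disagree"] * 1
summary = likert_percentages(example)
for level, pct in summary.items():
    print(f"{level}: {pct}%")
```

Reporting every level, including those with zero responses, keeps the summaries comparable across items.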

The sentiment analysis component generated numerical scores from −1 to +10 for participant testimonials. Its limitations include:

Potential misinterpretation of technical terms in participant feedback
Limited ability to capture nuanced responses about specific program aspects
Challenges in accurately scoring complex statements about technical learning experiences
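
To make the scoring idea concrete, the following Python sketch shows lexicon-based sentiment scoring in the general style of dictionary methods: each word is looked up in a valence lexicon and the values are summed. The study itself used the syuzhet R package; the toy lexicon, its values, and the function name `sentiment_score` are illustrative assumptions and do not reproduce syuzhet's dictionary or implementation.

```python
import re

# Hypothetical toy lexicon: word -> valence. NOT the syuzhet dictionary.
TOY_LEXICON = {
    "loved": 0.75,
    "confident": 0.5,
    "success": 0.5,
    "growth": 0.5,
    "challenging": -0.5,
    "shock": -0.75,  # a technical term ("heat shock") can be mis-scored
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every lexicon word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0.0) for word in words)


positive = sentiment_score("I loved the program and feel confident")
technical = sentiment_score("We measured heat shock response genes")
print(positive, technical)
```

In this sketch the score is a sum rather than an average, so longer responses can accumulate larger totals, and a neutral technical sentence receives a negative score purely because of one domain term.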

Results are presented primarily through:

Tabulated project summaries (Tables 1 and 2) showing the range of research approaches and institutional diversity
Sentiment analysis scores for participant feedback (Table 3)
Visual representations of participant diversity and selected project outcomes

Table 1 2023 Datawiz-IN Projects

Table 2 2024 Datawiz-IN Projects

Table 3 Sentiment Analysis Results for Program Satisfaction, Skills Confidence, Valuable Learning Opportunities, Content and Structure Alignment, and Professional Competence Development

Results

Outcomes and impact of the program

Participant demographics

The Datawiz-IN program’s recruitment efforts focused on increasing participation from historically underrepresented groups in biomedical informatics. Demographic data from the 2023 and 2024 cohorts indicate that women and underrepresented students comprised approximately 60% of participants. While these percentages suggest improved representation compared to typical AI research programs, the limited sample size (n = 27 across both cohorts) necessitates caution in drawing broader conclusions about the program’s impact on field-wide representation.
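
The caution about sample size can be illustrated with a confidence interval. The sketch below computes a 95% Wilson score interval for a proportion in Python; the count of 16 out of 27 is an assumption chosen to match "approximately 60%" and is not a figure reported in the study.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed count: 16 of 27 participants (about 59%); illustrative only.
low, high = wilson_interval(16, 27)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

Under this assumed count the interval runs from roughly 0.41 to 0.75, which shows how much uncertainty a sample of 27 leaves around the observed proportion.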

Innovations and impacts: spotlight on participant projects

The Datawiz-IN program facilitated impactful AI and data science projects across healthcare domains. Participants leveraged diverse computational approaches, including machine learning, deep learning, and AI-driven analysis techniques, complemented by domain-specific methods such as gene sequencing and spatial transcriptomics (see Table 1 & Table 2). Their work demonstrates how diverse perspectives can enhance AI applications in healthcare, particularly in addressing health disparities and improving patient care for underserved populations.

To illustrate the impact of diverse perspectives in AI research, we highlight two exemplar projects that showcase how participants’ backgrounds influenced their approach to healthcare challenges:

1. AI-Driven Wound Care Classification (Fig. 4): A participant from North Carolina A&T State University developed a machine learning model to address wound care disparities in underserved communities. The project analyzed a dataset of 19,896 patients, creating an automated classification system for wound healing status. The AI model was specifically designed to account for varying wound presentations across different ethnic groups and socioeconomic backgrounds, addressing a known gap in existing wound care algorithms. The visualization shows the distribution of wound types used to train the model, with particular attention to “gray area” cases (2,574 patients) that often present classification challenges in minority populations.
2. AI-Enhanced Genomic Analysis (Fig. 5): A participant integrated machine learning techniques with genomic analysis to study stress response mechanisms in disease. The project employed deep learning models to analyze m6A modification patterns, identifying previously unknown stress response genes. The participant’s background in both computer science and biology enabled them to develop an AI approach that could process complex genomic data while maintaining biological interpretability. The visualization demonstrates how AI-driven analysis revealed significant differences in m6A sites under heat shock conditions (median 16 vs. 8), with the IGV browser validation confirming the AI model’s predictions.

These projects exemplify how Datawiz-IN achieves dual objectives:

Providing advanced AI and computational training to students from underrepresented backgrounds
Enabling diverse perspectives to influence the development of AI healthcare solutions

The program’s impact extends beyond technical achievements, as participants bring their lived experiences and cultural understanding to address healthcare challenges through AI applications. This integration of diverse viewpoints with AI expertise has led to more comprehensive and equitable healthcare solutions, as evidenced by the attention to population-specific factors in project designs.

This bar chart illustrates the distribution of wound types in the dataset, including “not healing” (8,334 patients), “healed” (6,320 patients), “healing” (2,668 patients), and “gray area” (2,574 patients).

The dataset highlights the prevalence of non-healing wounds, which are critical for clinical intervention and pose significant challenges in wound care management. These distributions formed the basis for training the machine learning model to classify wound types effectively.
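
The reported counts can be checked and converted to class shares with a few lines of Python. The counts come from the figure description above. The inverse-frequency weights are a common way to compensate for class imbalance and are shown only as an assumption, since the study does not state how, or whether, the model handled imbalance.

```python
# Wound status counts as reported for the dataset.
counts = {
    "not healing": 8334,
    "healed": 6320,
    "healing": 2668,
    "gray area": 2574,
}

total = sum(counts.values())  # matches the 19,896 patients reported

# Share of each class as a percentage of all patients.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Hypothetical inverse-frequency class weights: total / (num_classes * count).
# Rarer classes get weights above 1; an illustration, not the study's method.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in counts:
    print(f"{label}: {shares[label]}% of patients, weight {weights[label]:.2f}")
```

The four counts sum to the 19,896 patients cited for the project, with non-healing wounds forming the largest class at about 42% and "gray area" cases the smallest at about 13%.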

Experience evaluation

All 27 participants completed the exit survey, achieving a 100% response rate. The quantitative responses demonstrated strong positive outcomes, with 87% of respondents strongly agreeing that the program provided valuable learning opportunities. The 4-point Likert scale responses (Strongly agree to Strongly disagree) revealed high satisfaction across multiple dimensions, with over 90% of participants reporting significant gains in skills confidence.

Five key themes emerged from the sentiment analysis of open-ended responses, with scores ranging from −1 (most negative) to +10 (most positive):

Overall Program Satisfaction (mean score: 2.205)
Confidence About Skills Gained (mean score: 0.9909)
Value of Learning Opportunities (mean score: 1.073)
Career Goals Alignment (mean score: 0.5545)
Professional Competence Development (mean score: 0.8545)

Qualitative feedback highlighted both program strengths and areas for improvement. Participant 2 emphasized the program’s collaborative environment: ‘I loved how the fellowship was set up. Meeting so many different people helped boost my confidence’. Technical skill development was frequently mentioned, as illustrated by participant 5: ‘I gained proficiency in retrieving promoter sequences from UNIPROT…I am confident that these newly acquired skills will greatly contribute to my success and growth’.

Despite the overall positive outcomes, approximately one-third of participants noted challenges related to adapting to new skills, managing time constraints, and balancing personal responsibilities. As participant 11 reflected: ‘I was new to everything so I had to learn as I went along. Though that was a bit challenging, it paid off’, demonstrating resilience through the learning process.

The sentiment analysis results (Table 3) revealed varying degrees of program effectiveness. While Overall Program Satisfaction showed the highest mean score (2.205) and widest range (−0.45 to 10.100), indicating diverse experiences, consistent positive sentiment was observed in Skills Confidence (mean: 0.9909) and Learning Opportunities (mean: 1.073). The lower mean score for Career Goals Alignment (0.5545) suggests an opportunity for better customization to participant aspirations. Professional Competence Development showed moderate positive sentiment (mean: 0.8545), though with room for enhancement.

Discussion

This analysis examines the implementation of the Datawiz-IN program in healthcare AI education and research. While the program’s representation of women students and underrepresented participants (approximately 60%) exceeded typical field demographics of 10–15% [6], the small sample size (n = 27) limits broader generalizations about the program’s impact on diversity in AI.

The participants’ research projects addressed various healthcare applications, with some projects incorporating demographic considerations:

The wound care classification project addressed racial and socioeconomic disparities in wound healing assessment
The diabetes phenotype study revealed important demographic variations in rural and Hispanic populations
The COVID-19 outcomes analysis identified disparities between metropolitan and non-metropolitan hospitals

Through implementing the Datawiz-IN program, we identified several effective strategies for fostering inclusion in health data science education:

1. Fostering inclusive environments: Creating welcoming spaces for peer interactions and communication has shown positive impact on belonging for marginalized STEM students [20]. Our implementation included inclusive language and active listening practices.
2. Role model mentors: Mentoring connects students to role models with shared identities and experiences, enhancing belonging [21]. Our mentor schemes paired undergraduates with senior STEM students and faculty for both academic and psychosocial support.
3. Affinity groups through cohort-based activities: Cultural sharing activities, including visits to the Indianapolis Canal and shared meals, strengthened community bonds.
4. Mindset interventions: Activities addressing belonging uncertainty and stereotype threats improved persistence among women and minorities in STEM [22], helping participants reframe challenges as surmountable.
5. Growth mindset training: Workshops helped students view intellectual abilities as malleable through effort [23], building resilience particularly among negatively stereotyped groups.

Several limitations should be noted. The program’s small scale (14 students in 2023, 13 in 2024) and short duration (8–10 weeks) constrain the generalizability of outcomes. Following approaches used in other NIH-funded programs [24], longitudinal tracking would be necessary to assess career impacts and retention in the field.

Implementation challenges included institutional barriers and resource constraints. While other IU initiatives address similar challenges [25], systematic changes would require broader institutional commitment. The Datawiz-IN program represents one approach to increasing representation in AI research, though its effectiveness compared to alternative interventions remains to be established.

As AI applications in healthcare expand, the need for diverse perspectives in development and implementation grows. While this program suggests potential approaches for supporting underrepresented students in AI research, more extensive studies would be needed to validate these methods. The participants’ projects indicate possibilities for incorporating varied perspectives in healthcare AI development, though their long-term impact on healthcare delivery remains to be determined.

This early-stage initiative provides preliminary insights into supporting diversity in AI education, while acknowledging that sustained efforts across multiple institutional levels would be necessary for systemic change. Future work should focus on rigorous evaluation of intervention effectiveness and the development of scalable, evidence-based support structures.

Conclusion

This analysis of the Datawiz-IN program offers insights into implementing biomedical data science research experiences for underrepresented students. While our sample size (n = 27 across two cohorts) precludes definitive conclusions, several observations may inform similar initiatives. The program’s approach to recruitment and support structures suggests potential strategies for addressing persistent representation gaps in biomedical data science, though longer-term studies would be needed to validate their effectiveness.

The program’s experience highlights three key considerations for future initiatives: First, recruitment strategies that look beyond traditional academic metrics may help identify promising candidates from diverse backgrounds. However, such approaches require careful validation and refinement to ensure both equity and excellence in selection processes. Second, structured mentorship frameworks combining technical guidance with professional development support appear beneficial, though their optimal implementation may vary by institutional context. Third, integrating discussions of health disparities into technical training may help participants connect their research to broader societal impacts, while maintaining rigorous methodological standards.

Data availability

Due to the limited sample size, releasing even de-identified participant data raises ethical concerns and a risk of re-identification. Sharing the data is therefore not feasible.

Abbreviations

NLM: National Library of Medicine
NIH: National Institutes of Health
IU-MSI: Indiana University and Minority Serving Institutions
STEM: Science, Technology, Engineering, Mathematics
HBCU: Historically Black College or University
AI: Artificial Intelligence
RNA: Ribonucleic acid
ICU: Intensive Care Unit
IU: Indiana University
IUI: Indiana University Indianapolis
UNIPROT: The Universal Protein Resource
MS: Master of Science
BS: Bachelor of Science

References

1. Roche C, Wall P, Lewis D. Ethics and diversity in artificial intelligence policies, strategies and initiatives. AI and Ethics. 2023;3(4):1095–115.
2. Owosela BO, Steinberg RS, Leslie SL, Celi LA, Purkayastha S, Shiradkar R, et al. Identifying and improving the “ground truth” of race in disparities research through improved EMR data reporting. A systematic review. International Journal of Medical Informatics. 2023;p. 105303.
3. Alberto IRI, Alberto NRI, Altinel Y, Blacker S, Binotti WW, Celi LA, et al. A scientometric analysis of fairness in health AI literature. PLOS Global Public Health. 2024;4(1):e0002513.
4. Gallifant J, Zhang J, Whebell S, Quion J, Escobar B, Gichoya J, et al. A new tool for evaluating health equity in academic journals; the Diversity Factor. PLOS Global Public Health. 2023;3(8):e0002252.
5. Gichoya JW, Thomas K, Celi LA, Safdar N, Banerjee I, Banja JD, et al. AI pitfalls and what not to do: mitigating bias in AI. Br J Radiol. 2023;96(1150):20230023.
6. Salas-Pilco SZ, Xiao K, Oshima J. Artificial intelligence and new technologies in inclusive education for minority students: a systematic review. Sustainability. 2022;14(20):13572.
7. West SM, Whittaker M, Crawford K. Discriminating systems. AI Now. 2019;p. 1–33.
8. World Economic Forum. A Blueprint for Equity and Inclusion in Artificial Intelligence. Accessed: insert-date-of-access. https://www.weforum.org/.
9. Stathoulopoulos K, Mateos-Garcia JC. Gender diversity in AI research. Available at SSRN 3428240. 2019.
10. Whittaker M, Crawford K, Dobbe R, Fried G, Kaziunas E, Mathur V, et al. AI now report 2018. AI Now Institute at New York University New York; 2018.
11. AI4ALL: Opening doors to artificial intelligence for all. AI4ALL. Accessed: 2024-02-20. https://ai-4-all.org/.
12. Data Literacy Education Program. Carnegie Mellon University Library. Accessed: 2024-02-20. https://www.library.cmu.edu/about/news/2022-04/data-literacy-education-program.
13. NLM’s Short-Term Research Education Experiences to Attract Talented Students to Biomedical Informatics/Data Science Careers and Enhance Diversity. National Library of Medicine. Accessed: 2024-02-04. https://grants.nih.gov/grants/guide/rfa-files/RFA-LM-22-001.html.
14. IU-MSI STEM Initiative. Indiana University. Accessed: 2024-02-04. https://iustem.sitehost.iu.edu/index.html.
15. Sarraju A, Ngo S, Rodriguez F. The leaky pipeline of diverse race and ethnicity representation in academic science and technology training in the United States, 2003–2019. PLoS ONE. 2023;18(4):e0284945.
16. Hofstra B, McFarland DA, Smith S, Jurgens D. Diversifying the professoriate. Socius. 2022;8:23780231221085120.
17. Rawlings-Goss R. Data science careers, training, and hiring. Springer; 2019.
18. Indiana University. Grant helps Luddy Indianapolis prepare students to shape future of biomedical research. Accessed: 2024-02-04. https://luddy.iupui.edu/news-events/news/grant-helps-luddy-indianapolis-prepare-students.html.
19. Luddy School of Informatics, Computing, and Engineering at IUPUI. Datawiz-IN Scholars Program - NLM funded R25. Accessed: 2024-02-04. https://datawiz.iupui.edu/index.html.
20. Chattopadhyay D. Exploring Effects of Institutional, Interpersonal, & Individual Communication on University Students’ Attitudes about Diversity and Institutional Belongingness. Intercultural Communication Education. 2022;5(2):39–58.
21. Fuesting MA, Diekman AB. Not by success alone: Role models provide pathways to communal opportunities in STEM. Pers Soc Psychol Bull. 2017;43(2):163–76.
22. Baldwin A, Bunting B, Daugherty D, Lewis L, Steenbergh T. Promoting belonging, growth mindset, and resilience to foster student success. The National Resource Center for The First-Year Experience; 2020.
23. Zeeb H, Ostertag J, Renkl A. Towards a growth mindset culture in the classroom: Implementation of a lesson-integrated mindset training. Education Research International. 2020;2020:1–13.
24. Feig AL, Robinson L, Yan S, Byrd M, Mathur A. Using longitudinal data on career outcomes to promote improvements and diversity in graduate education. Change: The Magazine of Higher Learning. 2016;48(6):42–49.
25. Russomanno D, Ivey S, Campbell JT, Robinson A, Stewart CO, Alfrey K, et al. Urban STEM collaboratory after two years: A multi-institutional approach to the success of financially disadvantaged students. Journal of STEM Education: Innovations and Research. 2023;24(2).

Acknowledgements

We acknowledge the support from IU-MSI, in particular, Howard Simms—Assistant Dean, Diversity and Inclusion, and Graduate Assistants—Ciara Thomas and Kerry Guest. We also thank numerous peer mentors and program faculty, without whom the Datawiz program would not be possible.

Funding

The work on this project is funded by the National Library of Medicine of the US National Institutes of Health through an R25 award (#1R25LM014209).

Author information

Authors and Affiliations

Department of Biomedical Engineering and Informatics, Indiana University Indianapolis, Indianapolis, IN, 46202, USA
Sadia Afreen, Alexander Krohannon, Saptarshi Purkayastha & Sarath Chandra Janga

Contributions

Program Design and Implementation, S.P. and S.C.J.; Data Analysis, S.A. and X.K.; Data Curation, S.P., S.C.J., X.K., and S.A.; Writing (original draft), S.A.; Writing (review and editing), S.P., S.C.J., and X.K.; Funding Acquisition, S.P. and S.C.J.

Corresponding author

Correspondence to Sadia Afreen.

Ethics declarations

Ethics approval and consent to participate

We received approval for Expedited Human Subjects research from the Indiana University Institutional Review Board (#16288) for this study.

Consent for publication

Informed consent was obtained from all subjects involved in the study.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Afreen, S., Krohannon, A., Purkayastha, S. et al. Datawiz-IN: fostering representative innovation in health data science—outcomes from a summer research experience. BMC Med Educ 25, 793 (2025). https://doi.org/10.1186/s12909-025-07298-1

Received: 19 March 2024
Accepted: 06 May 2025
Published: 28 May 2025
DOI: https://doi.org/10.1186/s12909-025-07298-1
