A PhD student interested in exploring the intersection of natural language processing (NLP), public forums as a data source, and substance use disorder (SUD).

Contact Info

Skills

Disclaimer

Last updated on 2022-04-16.

Layla Bouzoubaa

Public Health - data science, visualization, engineering

Education

Drexel University

PhD Information Science

Philadelphia, PA

2021–Present

University of Miami Miller School of Medicine

M.S. Public Health, concentration in Biostatistics

Miami, FL

2015–2016

Georgia Institute of Technology

B.S. Industrial & Systems Engineering

Atlanta, GA

2009–2014

Professional/Research Experience

Graduate Assistant

Drexel University

Philadelphia, PA

2021–

  • Research Assistant: NIH NCATS Biomedical Translator
    • Wrote biomedical publications database SOP
    • Managed aspects of Drexel team projects: took all meeting notes, managed tickets in Jira and issues in GitHub, and ran weekly team stand-ups
    • Performed unit testing on our component of the Translator (explainable autonomous relay agent)
    • Redesigned how multihop queries can be interpreted by our agent
    • Developed an R package (PMCfetchR) that retrieves full-text, machine-readable articles as a tidy data frame for text analytics/NLP tasks. The package uses the AWS S3 REST API made available by PMC. The resulting datasets are then used for named entity recognition (NER) with BioBERT to formulate our agent’s explanations.
    • Maintained the case base: checked for possible bugs, e.g., duplication
  • Teaching Assistant: DSCI 501 - Math/Programming Foundations in Data Science
    • Graded student Python programming assignments
    • Provided 3 hours a week of dedicated office hours to answer student questions, plus answered ad hoc questions over email or the course Discord channel
    • Assisted professor in designing final exam
    • Prepared supplemental lecture material on probability theory and basic statistics

Lead Research Analyst

University of Miami Miller School of Medicine

Miami, FL

2019–2021

  • HEALing Communities Study Analyst - address opioid use disorder (OUD)
  • Build Shiny app to visualize community characteristics and a map of naloxone facilities
  • Design and disseminate statistical reports and visualizations by community with R and R Markdown
  • Process population data for NY communities from the US Census Bureau via its API
  • Create maps from shapefiles (tigris) and overlay various population health risk factors
  • CTN0094 Analyst - Predict response to OUD treatments
  • Spell Checker Shiny App to harmonize and standardize drug names
  • SCAN360.com Project Manager
  • University of Miami COVID-19 Contact Tracking and Tracing Committee Principal Analyst
  • Observing COVID-19 policy adoption in collaboration with the University of Miami Institute for Advanced Study of the Americas, led by Dr. Felicia Knaul, and Universidad Anáhuac in Mexico

Biostatistician Programmer

University of Miami Miller School of Medicine - Sylvester Comprehensive Cancer Center

Miami, FL

2017–2019

  • Acquire & process population data from various sources
  • Calculate various measures of cancer burden, including age-adjusted incidence, mortality, and survival rates
  • Prototype new data sources via Shiny App and R package development

Epidemiology Researcher

Université Sidi Mohamed Ben Abdellah Faculté de Médecine et de Pharmacie

Fes, Morocco

2014–2015

  • Statistical consulting
  • Literature reviews for a multitude of research initiatives

Content Developer

Great Parents Academy

Atlanta, GA

2013–2014

  • Developed digital lessons in XML that were integrated into the supplemental educational software

Selected Teaching Experience

BST 692 - Data Science & Machine Learning for Health Research

University of Miami Department of Public Health Sciences

Miami, FL

2019–2021

  • Co-Instructor
  • Developed the curriculum for a biostatistics course that teaches computational data science methods to graduate students in a modular format following the basic components of a data pipeline. Primary topics taught: version control with Git and GitHub, building Shiny applications (theory and practice), building interactive visualizations, and working with geospatial data.

BST 625 - Statistical Computing

University of Miami Department of Public Health Sciences

Miami, FL

2016

  • Teaching Assistant

BST 601 - Introduction to Biostatistics

University of Miami Department of Public Health Sciences

Miami, FL

2016

  • Teaching Assistant

BST 601 - Computing Lab Instructor

University of Miami Department of Public Health Sciences

Miami, FL

2016

  • Re-introduced a SAS procedure covered in lecture and assisted students with code-based assignments. This lab met once a week for 1.5 hours throughout the fall semester.

Refereed International Presentations

{DOPE}: An R Package for Processing & Classifying Drug Names

Virtual conference lightning talk demonstrating the functionality of DOPE to international members of the R in Medicine community.

Online

August 2021

L Bouzoubaa, G Odom, R Balise

DOPE: A Package for the R Language to Process/Classify Drug Names

Virtual Poster/Q&A at the annual CPDD Conference. This poster will introduce members of the drug dependence scientific community to DOPE, an R package to parse out drug information from a corpus of text. This package assists in identifying potential patterns of drug dependency.

Online

June 2021

R Balise, G Odom, L Bouzoubaa, S Luo, D Feaster

Machine Learning to Describe Polysubstance Abuse in CTN Opioid Trials

Online presentation at the College on Problems of Drug Dependence

Online

June 2020

R Balise, L Bouzoubaa, E Nunes, S Luo, D Feaster

Invited Presentations & Lectures

What Goes Around Comes Back Around: Benefit of Community Involvement

Guest Lecturer for University of Miami EPH 607 - Health Communications.
Description: You are surrounded by potential opportunities just waiting around the corner. Can you find them? This talk will provide tips on how to harness the power of community for scientific or career-focused aspirations. This talk will hopefully motivate students to get a jump (if they haven’t already) on establishing an online and physical presence to help them build a fruitful network of their peers. This talk will take students through several examples based on personal experiences as well as engage students to share their own experiences to develop a collective knowledge base of tips and tricks.

Online

April 2022

L Bouzoubaa

Introduction to Public Health Data Science with the R Programming Language

A workshop given to the attendees of the NSF Re-enter STEM Through Emerging Technology (RESET). The workshop was open to participants of all levels and was given twice, for 1.5 hours each. It employed computational data science techniques covering the first four main steps of a typical data science pipeline: acquisition, tidying/manipulation, transformation, and visualization/communication. All materials are accessible on GitHub, and workspace environments were created on RStudio Cloud for participants who were unfamiliar with the RStudio IDE.

Online

March 2021

L Bouzoubaa

Introduction to R Markdown

This workshop was given to the R-Ladies Miami community to demonstrate how to create and prepare presentation-ready R Markdown files within RStudio. It used socially relevant and publicly accessible data and focused on output that can be used for presentations and publications. The workshop was free and open to all levels of R users. All materials are available on GitHub.

Miami, FL

March 2020

L Bouzoubaa

Tearing Down Barriers to Publication with R Markdown

A 3-hour comprehensive seminar presented to Florida International University graduate students on how to use R for publications, dissertations, and blogs using packages such as rmarkdown, knitr, bookdown, pagedown, and xaringan.

Miami, FL

Feb 2020

L Bouzoubaa, R Balise

A Web App to Visualize Cancer Risk Factors for 900+ Neighborhoods in Florida - ShinyNeighborhood

A presentation given at the annual R/Medicine Conference on a web-based application that visualizes sociodemographic data for over 900 neighborhoods in Florida, the geospatial challenges posed by data aggregation, and the R packages used to solve them (including a self-developed package).

New Haven, CT

Sept 2018

SCAN 360

A poster presented at the NAACCR Annual Meeting on SCAN360.com, a web platform that visualized multiple measures of cancer burden for several population subgroups across the state of Florida.

Pittsburgh, PA

June 2018

L Bouzoubaa, R Balise

BST 601: Medical Biostatistics I Guest Lecturer

University of Miami Department of Public Health Sciences - Medical Biostatistics
Fall 2017 - 34 graduate students

Miami, FL

Sept 2017

L Bouzoubaa, D Dave

  • Topics:
    • Methods in Non-Experimental Epidemiological Studies (September 8, 2017)
    • Methods in Experimental Studies (September 8, 2017)

BST 601: Medical Biostatistics I Guest Lecturer

University of Miami Department of Public Health Sciences - Medical Biostatistics
Spring 2016 - 29 graduate students

Miami, FL

Feb 2016

L Bouzoubaa, A Gauri

  • Topics:
    • Methods in Non-Experimental Epidemiological Studies (February 5, 2016)
    • Methods in Experimental Studies (February 5, 2016)

Selected Publications

Gender disparities in lung cancer survival from an enriched Florida population-based cancer registry

Annals of Medicine and Surgery (2020). doi:10.1016/j.amsu.2020.11.081

N/A

2020

A Elkbuli, MM Byrne, W Zhao, M Sutherland, M McKenney, Y Godinez, DJ Dave, L Bouzoubaa, T Koru-Sengul

SCAN360: A Resource for a 360-Degree View of Cancer Prevention, Risk, and Survival

Preventing Chronic Disease (2020). doi:10.5888/pcd17.200263

N/A

2020

Z Bailey, R Balise, L Bouzoubaa, E Kobetz

Emergent high fatality lung disease in systemic juvenile arthritis

Annals of the Rheumatic Diseases (2019). doi:10.1136/annrheumdis-2019-216040

N/A

2019

V Saper, G Chen, G Deutsch, R Guillerman, J Birgmeier, K Jagadeesh, S Canna, G Schulert, R Deterding, J Xu, AN Leung, L Bouzoubaa, …, P Khatri, ED Mellins

Risk of Cancer Death Among White, Black, and Hispanic Populations in South Florida

Prev Chronic Dis (2019). doi:10.5888/pcd16.180529

N/A

2019

P Pinheiro, K Callahan, T Koru-Sengul, J Ransdell, L Bouzoubaa, CP Brown, E Kobetz

The Impact of Immediate Salvage Surgery on Corporeal Length Preservation in Patients Presenting with Penile Implant Infections

J. Urol. (2018). doi:10.1016/j.juro.2018.01.082

N/A

2018

D Lopategui, R Balise, L Bouzoubaa, S Wilson, B Kava

Cervical cancer sociodemographic and diagnostic disparities in Florida: a population-based study (1981–2013) by stage at presentation

Ethnicity & Health (2018). doi:10.1080/13557858.2018.1471669

N/A

2018

A Gauri, SE Messiah, L Bouzoubaa, KJ Moore, T Koru-Sengul

Professional Development

Text Mining with R Bookclub

N/A

Online

August 2021

Tidy Modeling with R Bookclub

N/A

Online

March 2021

Advanced R Bookclub

N/A

Online

March 2021

Tidy Text Mining Workshop

Part of rstudio::conf. This workshop taught how to manipulate, summarize, and visualize the characteristics of text using tidy data principles and R packages from the tidy tool ecosystem.

San Francisco, CA

Jan 2020

Tidyverse Development Day

As part of rstudio::conf, Tidyverse Development Day (TDD) is a full day dedicated to learning and coding. It is an opportunity for select conference participants to contribute to Tidyverse packages by working on GitHub issues and submitting pull requests.

San Francisco, CA

Jan 2020

R at Sea Workshop

This was an immersive 3-day workshop focused on helping attendees transition from writing individual functions to writing families of functions that work well together and fit seamlessly into the tidyverse. The workshop also covered how to take dependencies on other packages (including when to do so) and the basics of tidy evaluation that a package author needs to know.

Miami, FL

Jan 2020

Software

PMCfetchR

An R package to retrieve full-text articles from the NCBI PMC Articles Dataset on AWS. https://github.com/labouz/PMCfetchR

N/A

2021

DOPE

Drug Ontology Parsing Engine.
https://cran.r-project.org/web/packages/DOPE/index.html

N/A

2020

tidyREDCap

tidyREDCap is an R package with functions for processing REDCap data.
https://cloud.r-project.org/web/packages/tidyREDCap/index.html

N/A

2020

Issued Patents

U.S. Patent PCT/US2019/050098

“SYSTEM AND METHOD FOR ANALYZING AND DISPLAYING STATISTICAL DATA GEOGRAPHICALLY”

N/A

Sept 6, 2019

Certifications

Google Analytics

Google Academy

N/A

2020

Associate Cloud Engineer

Google Cloud

N/A

2019

Applied Statistics

Pennsylvania State University

N/A

2017

Community Engagement

Social Justice in Data Science Bookclub

Organizer

Online

Jan 2021-

GitHub Campus

Advisor

Online

Jan 2020-

Google #IamRemarkable

Facilitator

Various

April 2019-

Google WomenTechMakers

Ambassador

Various

Feb 2019-

R-Ladies Miami

Organizer

Miami, FL

Jan 2019-

Google Developer Groups Miami

Organizer

Miami, FL

Jan 2019-