About me

Who Am I?

I am Sabrina, a data enthusiast with majors in Health Informatics, Statistics, and Information Science.

As a creative self-starter and passionate fast learner, I am highly dedicated to developing data-driven insights, strategies, and solutions for companies and research projects. I am interested in a career in data science or data analytics within the healthcare, tech, and business industries.

Please take a look at my portfolio and feel free to reach out if you have questions about my past projects.

Data Science

Database

Web Design

NLP

My Specialty

Some of My Skills

Innovative Ideas

First Prize, Second Prize, 23rd International ICT Innovative Services Awards— Open Data, ICT Application

Programming Languages

Python, R, SQL, Java, C++

Packages

Data Science and Machine Learning: pandas, NumPy, scikit-learn, Spark, PyTorch

Natural Language Processing: NLTK, Gensim, Topic Models (LDA)

Web Development

HTML, JavaScript, Django, GeoJSON

Database

MySQL, DB2, Neo4j

Tools and Environment

Microsoft Office (including VBA in Excel), VS Code, Anaconda, GitHub, WEKA

With a great interest in technology and programming, I have dedicated myself to learning multiple tools and developing my skills in various programming languages. The ones I am most proficient in are highlighted below.

R

85%

Python

90%

SQL

80%

Java

75%

Visual Basic for Applications (VBA)

70%

HTML

70%

Education

Key Courses: Computational Methods for Informatics, Computational Tools for Data Science, Applied Data Mining and Machine Learning, Topics in Deep Learning, Clinical Database Management Systems

Proficiency/Modules: Marketing Analysis, Data Management & Decision, and Big Data Industrial Intelligence Program.

Honors: Research Project Contest Excellence Award, Taipei Chung Cheng Scholarship, Academic Excellence.

Experience

Work Experience

Yale University, New Haven, Connecticut

Research Assistant, School of Management, Sept 2019 - Present

• Analyzed large-scale Uber driver data, performed Latent Dirichlet Allocation (LDA) topic modeling on Zappos product reviews, and manipulated the Chain Store Guide (CSG) database using R and Python.

• Supported a research paper on prosocial behavior change by exploring and analyzing data and building probit models.


Dr. Advice Technology Co. Ltd., Taipei, Taiwan

Data Analyst Intern, Department of Data Strategy, Apr 2019 - Jul 2019

• Delivered precision marketing for a health product startup using K-means clustering and a Light Gradient Boosting Machine (LightGBM) model, which captured 80% of potential customers by targeting only 16% of the people on the list.

• Performed data ETL using a web crawler to support the company’s in-house Health Information System (HIS).

• Organized the data architecture and helped build the foundation of the company’s Clinic Location Selection System product by creating data pipeline scripts to fetch and preprocess updated data.


Didi Chuxing Technology Co. Ltd., Beijing, China

Research Intern, Data Warehouse, Department of Foundational Platform, Jul 2018 - Aug 2018

• Participated in the development of Didi’s in-house analytics services by researching Google Analytics, Facebook Analytics, and similar tools, and interpreting product flows with them.

• Compiled a database in Hadoop Hive for the International Sales team, accessed by approximately 2,000 colleagues each day, by writing metadata in English, which supported the development of its market in South America.


TutorABC, Taipei, Taiwan

Demo Consultant, Nov 2016 - Jun 2019

• Business Development: Provided online English demo lessons for potential clients from Asia and effectively converted them into online course purchases. (Average rating by clients: 9.8/10)

• Evaluated clients’ performance and provided progress reports and learning advice.

Research Work

Research Projects

Sentiment Analysis on Twitter during COVID-19

• Assessed differential effects between US cities across the progression of the crisis through sentiment analysis of tweets.

• Built LDA topic models and observed topic dynamics across time and location.


Airbnb New York City Hospitality Visualization Website

• Built a website with Python and Django, with interactive graphs created using D3.js and JavaScript, that provides hospitality and analytical information on Airbnb listings in New York.

• Implemented a linear regression model to predict Airbnb listing prices in New York and achieved an MSE of 3866.7.


Picture Your Way — Picture-based Attraction Recommendation System Development

• Proposed an innovative algorithm combining LDA topic modeling and k-means clustering within Scatter/Gather, a document-clustering structure, using Python and VBA.

• Developed a system in JavaScript that recommends personalized attractions in Taiwan to users, achieving PR 72 system effectiveness and 74% precision.



Besides Data...

HTML5 Bootstrap Template by colorlib.com
July 11, 2020 | Travel

Diving

July 11, 2020 | Travel

Orchid Island, Taiwan

June 25, 2019 | Travel

Copenhagen, Denmark

July 14, 2018 | Travel

Beijing, China

June 6, 2017 | Remote Teaching

Rural-caring program