
EXPERIENCE


Jan 2024 - Present

Big Brothers Big Sisters of America

 Data Analyst 

(Volunteer - NGO Social Service Project)

 •System Development & Architecture: Spearheaded the development of a comprehensive digital platform for tracking and analyzing youth center attendance, mentoring sessions, and user engagement, leveraging Node.js, MySQL, and Flutter to create a scalable and user-friendly system.


•Database Management & Security: Led the design, implementation, and management of a secure database infrastructure using MySQL and AWS RDS, ensuring the integrity, security, and compliance of sensitive youth data in alignment with CCPA guidelines.


•Backend Development: Engineered robust backend services utilizing Node.js and integrated with a mobile web interface for seamless login, scheduling, and data management functionalities, enhancing operational efficiency.


•User Interface Integration: Implemented a mobile web interface with QR code functionality for efficient youth center login and attendance tracking, ensuring a smooth and accessible user experience across devices.


•Mentorship Scheduling System: Developed a dynamic scheduling system for students to book 1:1 mentoring sessions, integrating real-time availability via the Google Sheets API and Flutter to provide an intuitive scheduling experience (see the availability-lookup sketch after this list).


•Data Analytics & Reporting: Applied data analytics to generate insightful reports on youth attendance, mentoring sessions, and engagement metrics using SQL and Tableau, enabling data-driven decision-making for center administrators.


•Mobile App Development: Contributed to the development of a cross-platform mobile app using Flutter, incorporating features such as auto-login/logout, geofencing, and push notifications to enhance user engagement and streamline center operations.

 

•Security & Compliance: Ensured the secure handling of user data by implementing encryption in transit and at rest, and managing developer credentials with best practices in code security within the GitHub repository.


•Collaboration & Leadership: Worked closely with a cross-functional volunteer team, leading the backend development efforts and aligning the technical solutions with the overall mission of the NGO to provide a safe and supportive environment for the youth.
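
The scheduling bullet above describes pulling real-time mentor availability from Google Sheets. The following is a minimal Python sketch of that lookup, offered as an illustration only: the credential file, spreadsheet title, worksheet name, and column headers (mentor, slot, booked) are assumptions, and the actual system described above was built with Node.js and Flutter.

```python
# Minimal sketch: read open mentoring slots from a Google Sheet via the Sheets API.
# All names (credential file, spreadsheet title, worksheet, column headers) are
# assumptions for illustration, not the actual project configuration.
import gspread

def open_slots(credentials_path: str = "service_account.json"):
    gc = gspread.service_account(filename=credentials_path)
    sheet = gc.open("Mentor Availability").worksheet("Slots")
    rows = sheet.get_all_records()  # list of dicts keyed by the header row
    # Keep only slots that have not been booked yet.
    return [r for r in rows if str(r.get("booked", "")).strip().lower() != "yes"]

if __name__ == "__main__":
    for slot in open_slots():
        print(slot["mentor"], slot["slot"])
```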


Feb 2022 - Dec 2023

Data Engineer


University of North Texas

AIM-AHEAD Pilot Project: Integrated SEER and Medicare datasets for cancer survival analysis in women, laying the foundation for comprehensive analysis and further research.

Role: Spearheaded data management and analysis initiatives aimed at improving health outcomes through advanced analytics.

Enhanced Data Quality: Addressed data quality by handling missing values, removing duplicates, and standardizing column names, ensuring a robust dataset for accurate and reliable analysis of breast cancer stages and survival times.
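
As an illustration of the cleaning steps just described, here is a minimal pandas sketch; the file name and column names (survival_months, tumor_size) are placeholders, not the actual SEER/Medicare schema.

```python
# Illustrative pandas cleaning sketch; the file name and columns are assumptions.
import pandas as pd

df = pd.read_csv("seer_extract.csv")

# Standardize column names: trim, lower-case, underscores instead of spaces.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Remove exact duplicate records.
df = df.drop_duplicates()

# Handle missing values: drop rows missing the outcome, fill a numeric field with its median.
df = df.dropna(subset=["survival_months"])
df["tumor_size"] = df["tumor_size"].fillna(df["tumor_size"].median())
```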

Data Merging Precision: Merged datasets using ZIP and FIPS codes as foreign keys, integrating diverse data sources to create a unified dataset that offers a complete view of the patient journey, from diagnosis through treatment and outcomes.
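
A hedged pandas sketch of the ZIP/FIPS-keyed merge described above; the file and column names are placeholders for the real SEER, Medicare, and county-level sources.

```python
# Sketch of joining patient-level records to county- and ZIP-level data.
# File and column names (zip_code, fips_code) are illustrative placeholders.
import pandas as pd

seer = pd.read_csv("seer_cases.csv", dtype={"zip_code": str, "fips_code": str})
county_health = pd.read_csv("county_health.csv", dtype={"fips_code": str})
zip_claims = pd.read_csv("medicare_claims.csv", dtype={"zip_code": str})

# County-level attributes join on FIPS; claim-level records join on ZIP.
merged = (
    seer.merge(county_health, on="fips_code", how="left")
        .merge(zip_claims, on="zip_code", how="left")
)
```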

Data Visualization: Employed Matplotlib and other plotting techniques, such as scatter plots, bar charts, and heatmaps, to visualize variable distributions, patterns, and trends within the breast cancer study. This facilitated a deeper understanding of data characteristics and aided in the identification of key insights related to cancer stages and patient survival.
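
For illustration, a short matplotlib sketch producing the three plot types mentioned (scatter plot, bar chart, heatmap); the data file and column names are assumed.

```python
# Illustrative matplotlib sketch of the plot types mentioned above.
# The data file and column names are placeholders.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("merged_cancer_data.csv")

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Scatter plot: age at diagnosis vs. survival months.
axes[0].scatter(df["age_at_diagnosis"], df["survival_months"], s=5, alpha=0.4)
axes[0].set(xlabel="Age at diagnosis", ylabel="Survival (months)")

# Bar chart: case counts by cancer stage.
stage_counts = df["stage"].value_counts().sort_index()
axes[1].bar(stage_counts.index.astype(str), stage_counts.values)
axes[1].set(xlabel="Stage", ylabel="Cases")

# Heatmap: correlation matrix of the numeric variables.
corr = df.select_dtypes("number").corr()
im = axes[2].imshow(corr, cmap="viridis")
axes[2].set_xticks(range(len(corr.columns)))
axes[2].set_xticklabels(corr.columns, rotation=90)
axes[2].set_yticks(range(len(corr.columns)))
axes[2].set_yticklabels(corr.columns)
fig.colorbar(im, ax=axes[2])

plt.tight_layout()
plt.show()
```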

Actionable Insights: Analyzed and interpreted visualized data results using SQL, Tableau, Excel, and machine learning tools, enabling the identification of critical insights for further research, reporting, and strategic decision-making in oncology.

Machine Learning Application: Applied a variety of machine learning algorithms, including Gradient Boosting Classifier (GBC) and Ordinary Least Squares (OLS) regression, to evaluate and mitigate biases within large-scale cancer patient datasets, enhancing model fairness.
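
To make the modeling step concrete, here is a hedged sketch fitting the two model families named above (a Gradient Boosting Classifier and an OLS regression) and comparing held-out accuracy across a demographic group as one simple bias probe; the data file, feature names, and outcome columns are assumptions rather than the project's actual variables.

```python
# Sketch: Gradient Boosting Classifier and OLS regression on placeholder columns,
# with a simple accuracy-by-group comparison as a crude fairness check.
import pandas as pd
import statsmodels.api as sm
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("merged_cancer_data.csv").dropna()
features = ["age_at_diagnosis", "tumor_size", "stage_code"]

# Gradient Boosting Classifier for a binary five-year survival outcome.
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["survived_5yr"], test_size=0.2, random_state=0
)
gbc = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Compare held-out accuracy across a demographic group (crude bias probe).
held_out = X_test.assign(
    y=y_test, pred=gbc.predict(X_test), race=df.loc[X_test.index, "race"]
)
print(held_out.groupby("race").apply(lambda g: accuracy_score(g["y"], g["pred"])))

# OLS regression of survival months on the same features.
ols = sm.OLS(df["survival_months"], sm.add_constant(df[features])).fit()
print(ols.summary())
```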

Data Integration and Harmonization: Orchestrated the integration of diverse datasets, including nationwide cancer registry data (SEER), medical claims (Medicare), state-level health data (HIMSS), and datasets from CMS, optimizing data accessibility and analytical depth.

Predictive Modeling: Developed and implemented predictive models using state-of-the-art machine learning methodologies to forecast cancer outcomes and treatment responses.

Explainable AI/ML: Implemented explainable AI/ML approaches to provide transparent insights into model predictions, supporting clinicians and researchers in decision-making processes.
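
One widely used way to provide the transparency described above is SHAP; the snippet below is a hedged sketch of that approach applied to a tree model, not necessarily the project's exact XAI tooling, and the data file and column names are placeholders.

```python
# Sketch: SHAP-based explanation of a gradient-boosting model's predictions.
# Data file and column names are illustrative placeholders.
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

df = pd.read_csv("merged_cancer_data.csv").dropna()
features = ["age_at_diagnosis", "tumor_size", "stage_code"]
model = GradientBoostingClassifier().fit(df[features], df["survived_5yr"])

explainer = shap.TreeExplainer(model)              # tree-specific explainer
shap_values = explainer.shap_values(df[features])  # per-feature contribution per prediction
shap.summary_plot(shap_values, df[features])       # global importance and direction of effects
```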

Quality Assurance and Validation: Conducted rigorous quality assurance and data validation processes to ensure the accuracy, reliability, and consistency of data used for analysis and modeling tasks.

Advanced Data Scraping: Employed advanced data scraping techniques to procure extensive datasets from SEER, Medicaid, AHD, USDA, and CMS, crucial for robust analysis and informed decision-making in health research.

Research Contributions: Contributed valuable insights to research papers, effectively communicating findings derived from complex data analysis to stakeholders and the scientific community.

Collaboration: Collaborated closely with cross-functional teams to align data initiatives with broader research goals and ensure the successful implementation of data-driven solutions.

May 2019 - Dec 2021

Data Analyst 

Edgerock Software Solutions

Skills: Product/Service Knowledge, Market Research, Strategic Thinking, Analytical and Networking Abilities

Developed SSIS packages using a Foreach Loop container in the Control Flow to process all Excel files within a folder, a File System Task to move each file into an archive after processing, and an Execute SQL Task to insert transaction log data into a SQL table.
Created package-level and task-level logs for ETL loads in SSIS to record the number of records processed by each package and task.
Used Power BI for ETL processes to perform data transformations, event joins, and some pre-aggregations before storing the data in HDFS.
Used Azure Functions to write Python code that cleanses CSV files of special characters and performs data validation; the code also combines multiple small incoming files into compressed Parquet format and converts XML data into JSON using Python's XML package (see the sketch following this list).
Converted existing ADF V1 pipelines to ADF V2 and implemented SSIS packages in the Azure environment using the Integration Runtime (IR).
Worked with Azure Blob and Data Lake Storage, loaded data into Azure Synapse Analytics, and maintained and updated the metadata repository with details on the nature and use of applications and data transformations to facilitate impact analysis.
Monitored resources using Azure Automation and created alerts for VMs, Blob Storage, ADF, Databricks, and Synapse Analytics based on different events.
Used DAX functions to create measures and calculated fields to analyze data for visualization, and used Power Query to transform the data.
Extracted data from the data warehouse server by developing complex SQL statements using stored procedures and CTEs to support report building.
Created and administered workspaces for each project on the Power BI service and published reports from Power BI Desktop to the Power BI service workspaces.
Used JIRA for project tracking and participated in daily scrum meetings.
Involved in troubleshooting, resolving, and validating data to improve data quality and developed analysis documentation for future developers.
Created numerous efficient stored procedures to perform data cleansing and loading tasks when data is loaded into the staging area, created views to facilitate easy user interface implementation, and added triggers on them to ensure consistent data entry into the database.
Wrote stored procedures and UDFs for use in SSIS packages and SQL scripts, and used joins and CTEs to simplify complex queries involving multiple tables.
Developed complex views and generated drill-through reports, parameterized reports and linked reports using SSRS.
Created different Power Pivot reports per client requirements, implemented row-level security, used Power BI's Power Query functionality to transform and model the data, and developed Power BI graphical solutions aligned with the business objectives.
Created visualizations using treemaps and heat maps and provided the ability to perform ad-hoc reporting.
Generated reports using SSRS, MS Excel spreadsheets, and Power Pivot.
Worked extensively on ADF, including data transformations, Integration Runtimes, Azure Key Vault, triggers, and migrating ADF pipelines to higher environments using ARM templates.
Created use case, activity, sequence, and class diagrams per UML.
Created UNIX scripts for file transfer and file manipulation.
Engaged with the QA and UAT teams to validate all deliverable dashboards and reports and to close defects raised during UAT.
Worked in each phase of the SDLC and provided post-production support.
Worked in the Agile methodology, leveraging the Scrum process to finish tasks within the sprint time frame.
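
The Azure Functions item above describes Python code that strips special characters from CSVs, consolidates many small files into compressed Parquet, and converts XML to JSON. The sketch below shows only that core transformation logic under assumed paths and patterns; the Azure Functions trigger and binding wiring is omitted.

```python
# Sketch of the transformation logic described in the Azure Functions item:
# cleanse CSVs of special characters, combine small files into one compressed
# Parquet file, and convert XML to JSON. Paths and patterns are assumptions.
import glob
import json
import xml.etree.ElementTree as ET

import pandas as pd

def cleanse(df: pd.DataFrame) -> pd.DataFrame:
    """Remove special characters from all string columns."""
    for col in df.select_dtypes(include="object"):
        df[col] = df[col].str.replace(r"[^\w\s.,@-]", "", regex=True)
    return df

def combine_to_parquet(csv_glob: str, out_path: str) -> None:
    """Combine many small CSV files into a single compressed Parquet file."""
    frames = [cleanse(pd.read_csv(path)) for path in glob.glob(csv_glob)]
    pd.concat(frames, ignore_index=True).to_parquet(out_path, compression="snappy")

def xml_to_json(xml_path: str) -> str:
    """Flatten one level of an XML document into JSON records."""
    root = ET.parse(xml_path).getroot()
    records = [{child.tag: child.text for child in record} for record in root]
    return json.dumps(records, indent=2)

if __name__ == "__main__":
    combine_to_parquet("incoming/*.csv", "curated/combined.parquet")
    print(xml_to_json("incoming/sample.xml"))
```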
