Data Scientist
Julia Fliorko
SQL
Python
Tableau
ML
About Me
Education
- Bachelor’s in Software Engineering
- Master’s in Data Science (in progress)
Professional Background
- Data Scientist with a background in Business Analysis, experienced in translating ambiguous requirements into well-defined analytical questions and actionable insights.
- Strong at bridging stakeholders and data: turning business needs into clean datasets, measurable metrics, and interpretable models.
- Specialized in working with messy, real-world data: sourcing from APIs, cleaning and merging multiple datasets, and documenting assumptions and limitations.
- Build end-to-end analytical workflows in Python, from raw data to machine learning models and decision-ready outputs.
- Recent projects focus on research-oriented datasets in environmental and food-access domains, where data quality and methodology matter more than visuals.
Certifications
- CareerFoundry: Data Analysis
Data analysis is an art, and I’m just the tool that helps reveal the picture hidden beneath the numbers.
Tools & Skills
Programming & Data Manipulation
Python (pandas, numpy, scipy)
- Data cleaning, transformation, merging large datasets
- Feature engineering, vectorized operations, performance-aware workflows
SQL (PostgreSQL / SQLite)
- Joins, subqueries, CTEs, aggregations
- Data validation, exploratory querying, schema understanding
Data Collection & Integration
APIs (REST)
- Data ingestion, pagination handling, authentication
- JSON normalization, schema alignment across sources
Web data handling (structured extraction)
- Dealing with inconsistent formats and missing fields
Data Cleaning & Preparation
- Handling missing data (imputation, exclusion with justification)
- Deduplication and record linkage
- Data type standardization and normalization
- Outlier detection and treatment
- Data quality checks and validation rules
- Documentation of assumptions and limitations
Machine Learning
Supervised
- Classification & Regression
- Logistic Regression, Decision Trees, Random Forest, K-Nearest Neighbors
- Model evaluation: precision, recall, F1-score, ROC-AUC
- Confusion matrix analysis
- Feature scaling: StandardScaler, MinMaxScaler
- Hyperparameter tuning: GridSearchCV, RandomizedSearchCV
- Train / validation / test design (time-aware splits when applicable)
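As a small illustration of the evaluation metrics listed above, here is a minimal sketch of how precision, recall, and F1-score follow from confusion-matrix counts. The function names and the toy labels are hypothetical and chosen only for this example; in real projects a library such as scikit-learn would normally compute these.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Derive precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Toy labels, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Writing the metrics out this way makes the trade-off visible: precision is penalized by false positives and recall by false negatives, which is why the two are reported together rather than relying on accuracy alone.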
Unsupervised
- Clustering: K-Means, Hierarchical clustering
- Dimensionality reduction: PCA
- Cluster validation and interpretation
Neural Networks (Applied)
- ANN (feedforward)
- CNN (spatial data only)
- RNN / LSTM (sequential data only)
- TensorFlow / Keras or PyTorch
Time Series & Temporal Analysis
- Time-based feature engineering (lags, rolling windows)
- Trend and seasonality analysis
- Baseline forecasting methods
- Time-aware validation strategies
Data Visualization & Communication
- Tableau: interactive dashboards, KPI reporting, geo/time visuals
- Python visualization: matplotlib, seaborn
- Translating results into non-technical insights
Geospatial & Structured Data
- Geographic identifiers (ZIP, census tract, region)
- Spatial joins and aggregation
- Distance-based metrics and density calculations
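To make the distance-based metrics above concrete, here is a minimal sketch of a great-circle (haversine) distance and a simple "how many points fall within a radius" count. The helper names, the coordinates, and the 100 km radius are assumptions for illustration, and the formula treats the Earth as a sphere, so results are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_within(origin, points, radius_km):
    """Count how many points lie within radius_km of origin (a simple density measure)."""
    return sum(1 for point in points if haversine_km(*origin, *point) <= radius_km)


# Hypothetical coordinates, made up for illustration only.
origin = (0.0, 0.0)
nearby_points = [(0.0, 0.5), (0.0, 2.0)]
n_close = count_within(origin, nearby_points, radius_km=100)
```

For large datasets a spatial index or a dedicated geospatial library would usually replace the linear scan in `count_within`; the loop here is only meant to show the idea.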
Workflow & Tooling
- Git / GitHub (version control)
- Reproducible project structure
- Jupyter Notebooks / Python scripts
- Modular, readable code practices
- Dataset versioning (raw / interim / final)
Portfolio
Contact
Download Portfolio (PDF)