About
I am a Machine Learning Engineer specialized in Neural Networks, Computer Vision, Natural Language Processing, and the application of Machine Learning in healthcare.
I have worked at IBM Research as a Machine Learning Research Engineer, primarily collaborating with IBM’s Yorktown Heights research lab. I also co-founded a start-up that develops research-backed cognitive games for the elderly, which was a provider for an Uruguayan government program. Additionally, I have consulted as a Machine Learning Engineer for dozens of companies, ranging from start-ups to Fortune 500 companies.
I hold a graduate certificate in Data Science from Harvard University - Extension School and a Master’s Degree in Data Science from Universidad Austral, with my thesis focused on Computer Vision. I possess advanced skills in NLP, Computer Vision, and various Medical EHR standards such as HL7, FHIR, and SMART on FHIR.
Machine Learning Engineer
- Degree: Master's Degree in Data Science
- Email: matias.aiskovich@gmail.com
- City: Buenos Aires, Argentina (GMT-3)
Skills
Natural Language Processing (NLP):
- Large Language Model (LLM)
- Generative Pre-trained Transformers (GPT)
- BERT
- Transformers
- Word2Vec
- Hugging Face
- Natural Language Toolkit (NLTK)
- spaCy
- LangChain
Computer Vision (CV):
- Stable Diffusion
- Point Clouds
- Object Detection
- Object Tracking
- Semantic Segmentation
- 3D Reconstruction
- Image Processing
- Image Recognition
- 3D Image Processing
- LiDAR
- Depth Prediction
- Facial Recognition
- Detectron2
Programming Languages
- Python
- Java
- R
- JavaScript
Frameworks/Libraries
- PyTorch
- TensorFlow
- Flask
- Pandas
- NumPy
- Keras
- Scikit-learn
- XGBoost
- Apache Beam
- Apache Spark
Healthcare:
- Fast Healthcare Interoperability Resources (FHIR)
- HL7
- DICOM
- NIfTI images
- Picture Archiving & Communication Systems (PACS)
- OpenEMR
- Genomics
- Medical Imaging
- Spatial Transcriptomics
- Biology
- Biopython
- Mirth Connect
Cloud:
- Amazon Web Services (AWS)
- AWS SageMaker
- Google Cloud Platform (GCP)
- GCP AI Platform
- GCP BigQuery
- GCP Dataflow
- GCP Pub/Sub
Infra:
- Linux
- Data Warehousing
- ETL
- Machine Learning Operations (MLOps)
- Docker
- Kubernetes
- CI/CD
- Jenkins
Others:
- MySQL
- PostgreSQL
- SQL
- Artificial Neural Networks (ANN)
- Data Analysis
- Data Analytics
- Data Mining
- Data Modeling
- Data Science
- Deep Neural Networks
- Machine Learning
- Convolutional Neural Networks
- Data Engineering
- Association Rule Learning
Resume
Professional Experience
Machine Learning Engineer Consultant
Mar. 2019 - Present
Multiple Clients
- Collaborated with the Computational Omics Huang Lab at the Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, where I contributed to the development of computer vision Variational Autoencoder (VAE) models. The primary objective of our research was to identify immunotherapy target genes within spatial transcriptomics data.
- As the NLP subject matter expert, led the creation of an ML pipeline for the automatic processing of legal documents for the IFF-DuPont R&D team; this project used both open-source (Google’s T5) and proprietary (GPT-3) LLMs.
- As a principal ML engineer, audited, improved, and coached team members on a computer vision pipeline for object detection and semantic segmentation to detect small defects in car manufacturing plants.
- In the same project, was responsible for an end-to-end Detectron2 ML training and deployment pipeline in Python, combined with DVC for data versioning and W&B for tracking and evaluation.
- Developed a SMART on FHIR app for a healthtech start-up, to be published in the Epic (EHR vendor) app store, and coached the client’s team members on FHIR and SMART on FHIR technologies.
- Integrated a telemedicine company with different EHRs, using Mirth Connect for routing HL7 messages, and built a custom Python FHIR API for interoperability between platforms.
- Developed XGBoost gradient boosting machine learning models for predicting DNA sequences’ manufacture timeline for a biotech start-up, and served them using Docker and Kubernetes.
- Developed NLP machine learning sentiment classification models based on Transformers and BERT for a marketing start-up, rebuilt the client’s platform architecture, and designed its long-term roadmap.
Machine Learning Research Engineer
Aug. 2020 - Feb. 2022
IBM
- Developed computer vision machine learning models (3D CNNs built in PyTorch) for brain age prediction (predicting age given an MRI image of the brain) and led the curation of a large dataset of brain MRI images for the Yorktown Heights IBM Exploratory Life Science Sector (neuroscience team).
- As the team's NLP subject matter expert, led machine learning experimentation in a natural language processing project for detecting security threats in software packages. The project was run by IBM Research in collaboration with the IBM TSS team, whose intended customers included several Fortune 500 companies, and used models such as LDA and BERT.
- Coached Software Engineers in ML and NLP.
- Co-authored two research papers, "Sparse Depth Completion with Semantic Mesh Deformation Optimization" and "Acoustic Sensing-based Hand Gesture Detection for Wearable Device Interaction" (patent pending).
Co-Founder and Full Stack Developer
Mar. 2013 - Dec. 2021
Caretronics
- Developed an app that was chosen to take part in the Uruguayan governmental project, Ibirapita. It was downloaded by more than 65,000 people.
- Developed a web platform based on medical research for improving the quality of life of people with cognitive diseases; it is currently being used by several patients with Alzheimer’s disease.
Machine Learning Engineer
Aug. 2019 - Jul. 2020
Wevat Tax Refund
- As the first hire in the Machine Learning team, led the planning of the machine learning roadmap, ensuring that stakeholders not familiar with ML capabilities were included in the decision-making process.
- Was responsible for end-to-end ML modeling: developed computer vision models with TensorFlow to confirm that receipt images were compliant with UK legal norms, and served them with Google Cloud ML Engine.
- Built machine learning models with XGBoost (gradient boosting) to predict the volume of customers.
Senior Data Engineer
Oct. 2017 - Aug. 2019
Morsum, LLC
- Designed and led the implementation of an ETL into Google Cloud Platform (Pub/Sub, Dataflow, BigQuery).
- Developed machine learning market basket analysis recommendation models for food ordering.
- Was responsible for the design and implementation of the inpatient food ordering project for hospitals (based on SMART on FHIR to connect with EHRs).
Data Engineer
Oct. 2016 - Oct. 2017
Morsum, LLC
- Developed a statistical tool for customers to get insights about their nutritional consumption.
- Developed Python APIs to act as an interface between web and mobile apps and machine learning models.
- Worked closely with Data Scientists, helping them optimize Python code, reviewing Machine Learning models, and providing data sources to be utilized in their products.
Full Stack Developer and Sysadmin
Jul. 2012 - Sept. 2016
Gumma SRL
- Improved the reliability and speed of company infrastructure with the virtualization of the physical servers in the regional branch offices in Argentina, Uruguay, and Brazil.
- Changed the company's internal processes by developing in-house software that simplified and automated several tasks, resulting in higher productivity, better communication between sectors, and precise business forecasts for the management department.
Publications
Sparse Depth Completion with Semantic Mesh Deformation Optimization
2021
Acoustic Sensing-based Hand Gesture Detection for Wearable Device Interaction
2021
Cognitive Stimulation of Autobiographic and Emotional Memory in a Patient with Alzheimer’s Disease
2020
Education
Master's Degree in Data Science
2020 - 2022
Universidad Austral, Buenos Aires, Argentina
Key areas of study: Data Mining, Natural Language Processing (NLP), Computer Vision (CV), Neural Networks, Descriptive & Inferential Statistics, Data Architecture
Thesis in Computer Vision: Controlling bias with explainability techniques.
Data Science Graduate Certificate
2016 - 2018
Harvard University - Extension School
Key areas of study: Data Science, Big Data in Healthcare
Commercial Pilot
2014 - 2015
ETAP, Buenos Aires, Argentina
Courses & Certifications
Introduction to the Biology of Cancer
Johns Hopkins University - Coursera (2023)
Human Research
CITI Program (2022)
Data or Specimens Only Research
CITI Program (2022)
Clinical Genomics
Universidad de Buenos Aires (2022)
Introduction to Biology - The Secret of Life
MIT - edX (2022)
NLP Specialization
DeepLearning.AI (2021)
Fundamentals of Reinforcement Learning
University of Alberta - Coursera (2020)
Deep Learning Specialization
DeepLearning.AI (2020)
Data Science in Stratified Healthcare and Precision Medicine
The University of Edinburgh - Coursera (2020)
Fundamentals of GIS
UC Davis - Coursera (2020)
Image Processing for Artificial Vision
Universidad CAECE (2019)
Deep Learning Nanodegree
Udacity (2019)
Understanding Clinical Research: Behind the Statistics
University of Cape Town - Coursera (2018)
Data Engineering for Google Cloud Platform
ROI Training, San Jose, California (2017)
Senior Web Developer Nanodegree
Udacity (2015)
Portfolio
Computer Vision Machine Learning (VAE): Find novel genes for immunotherapy treatment
Built Variational Autoencoder (VAE) models using several 10x Genomics Spatial Transcriptomics datasets for different cancer types/tissues. Compared the Euclidean distance between genes within each tissue's latent space, and between tissues' latent spaces for each gene.
- Programming Languages & Software: Python, OpenCV, NumPy, PyTorch, Git.
- Machine Learning: Tested different VAE architectures and hyperparameters.
- Data preprocessing: Reconstructed spatial data for each tissue as a tensor input for the network. Standardised dimensions across different datasets.
- Results processing: Aligned latent spaces across different cancer types and identified the novel genes closest to known genes targeted by immunomodulators.
Large Language Model (ChatGPT) end-to-end pipeline: Automation of document processing
Deployment of a Python Web-App to aid in the processing, metadata extraction, and creation of legal documents.
- Programming Languages & Software: Python, HuggingFace, Git.
- Machine Learning: Explored different LLMs, including open-source ones (using HuggingFace for Google T5) and proprietary ones (ChatGPT, GPT-3).
- Web-App Design: Designed the Web-App completely from scratch using the Streamlit Python package.
Object Detection Computer Vision pipeline: Detection of defects in car manufacturing production lines
Creation of an object detection pipeline.
- Programming Languages & software: Python, Detectron2, PyTorch, Weights & Biases (W&B), GitHub Actions, DVC, Kubernetes, Docker.
- Machine Learning: Tested several object detection models, including YOLO (PyTorch implementation) and most models present in the Detectron2 library.
- MLOps: Created guidelines for artifacts and experiment tracking in W&B.
- Model deployment: Exported the selected model from Detectron2 to TorchScript in order to serve it with NVIDIA Triton.
- Leadership: Coached client's Machine Learning Engineers in model development and experimentation best practices.
Machine Learning (Decision Trees) and App Development: SMART on FHIR app for healthtech start-up
Development of a SMART on FHIR app with a Python back-end serving a decision tree classifier for a health tech start-up to be published in the Epic electronic health record (EHR) vendor app gallery.
- Programming Languages & Software: Python, Flask, XGBoost, JavaScript, Git.
- Machine Learning: Tested several boosting algorithms (AdaBoost, XGBoost, and LightGBM) and selected an XGBoost classifier, with its hyperparameters optimized using Optuna.
- App development: Built a SMART on FHIR front-end app using JavaScript.
- Back end: Built an API with Flask and Python to act as an interface between the front-end app and the classifier model.
- Leadership: Coached company's Software Engineers in SMART on FHIR technology.
Computer Vision Machine Learning: Brain MRI Age Prediction
Developed computer vision machine learning models (3D CNNs built in PyTorch) for brain age prediction (predicting age given an MRI image of the brain) and led the curation of a large dataset of brain MRI images.
- Programming Languages & Software: Python, PyTorch, FreeSurfer, Git, HPC cluster, PostgreSQL, Flask.
- Computer vision architectures: Tested several common convolutional neural network architectures (ResNet50, EfficientNet) and adapted them to process 3D images (MRI slices).
- Data preprocessing: Compiled various datasets from different institutions and standardized their terms and ontologies under a master structure that I created. Stored the standardized data in a PostgreSQL database and served it through a Flask API. Aligned brain MRI images using FreeSurfer.
- Results processing: Compared age prediction error for control individuals against individuals with neurodegenerative diseases.
NLP & Machine Learning: Early detection of security threats in software packages
Created different Machine Learning models for classifying security-related reports.
- Programming Languages & Software: Python, scikit-learn, Flask, PyTorch, HuggingFace, XGBoost, spaCy.
- Machine Learning: Tested several approaches: extracted embeddings using BERT and used them as input for Gradient Boosting classifiers, and applied LDA models to unlabeled data.
- Data preprocessing: Preprocessed text with spaCy.
- Leadership: Coached Software Engineers in Machine Learning and NLP.
Services
These are just some examples of areas in which I can help you in the Machine Learning and Data Engineering space. If you have a need that is not mentioned here, please feel free to reach out, and I will let you know if I can help; if not, I will do my best to put you in contact with someone who can.
Machine Learning Strategy
Help businesses define their machine learning goals, develop a roadmap, and create a strategy for implementing machine learning solutions.
Data Analysis and Modeling
Assist in data analysis, data preprocessing, feature engineering, and building predictive models using various machine learning algorithms.
Computer Vision
Develop computer vision solutions for tasks such as object detection, image classification, facial recognition, and video analysis.
Natural Language Processing (NLP)
Assist in the seamless integration and optimization of Large Language Models (LLMs) tailored to meet the specific requirements and workflows of your company. I can also construct Natural Language Processing (NLP) models for a wide range of tasks, including sentiment analysis, text classification, named entity recognition, text generation, and language translation, among others.
Healthcare ML pipelines
Develop a comprehensive healthcare data pipeline encompassing various stages. I am proficient in handling data extraction through HL7 or FHIR protocols, as well as managing images through PACS systems. Additionally, I excel at preprocessing data, constructing machine learning models, and integrating with electronic health record (EHR) systems using the SMART on FHIR framework.
Training and Workshops
Conduct training sessions and workshops to educate teams or individuals on machine learning concepts, best practices, and tools.
Contact
Location:
Buenos Aires, Argentina (GMT-3)
Email:
matias.aiskovich@gmail.com