Justin Clancy

Data Scientist | Astrophysicist

Transforming massive astronomical datasets into actionable insights through advanced statistical modeling, signal processing, machine learning, and scalable data pipelines. Astrophysics PhD specializing in large-scale data analysis and scientific computing.

Technical Skills

Expertise in data science, signal processing, machine learning, and high-performance scientific computing

Programming & Data Languages

Python, Julia, R, SQL, Git

Machine Learning & Analytics

scikit-learn, TensorFlow, PyTorch, PySpark, Pandas, NumPy, SciPy

Tools & Infrastructure

Jupyter, Conda, HPC, Linux/Unix/macOS

Statistical Methods

Bayesian Inference, Time Series Data, Signal Processing, Monte Carlo

Core Competencies

Pipeline Development, ETL, Data Modeling, Optimization

Featured Projects

Data-driven solutions and research

Rapid Transient Detection

Transient Detection Pipeline for the Simons Observatory

Developed an automated pipeline that processes time-ordered data from the Simons Observatory, achieving 95% detection accuracy for rapid astrophysical transients using custom matched filtering and machine learning algorithms.

Python, scikit-learn, SOTODLIB, SciPy & NumPy
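
As a rough illustration of the matched-filtering idea behind this project (not the pipeline's actual code), here is a minimal NumPy sketch. It assumes white Gaussian noise with a known standard deviation and a hypothetical Gaussian pulse as the transient template; the function names, the 5-sigma threshold, and the simulated data are all made up for the example.

```python
import numpy as np


def gaussian_template(width, n_sigma=4):
    """Hypothetical transient shape: a unit-peak Gaussian pulse (odd length)."""
    half = int(n_sigma * width)
    t = np.arange(-half, half + 1)
    return np.exp(-0.5 * (t / width) ** 2)


def matched_filter_snr(data, template, noise_sigma):
    """Signal-to-noise time series for a known template in white noise."""
    # Cross-correlate the data stream with the template.
    corr = np.correlate(data, template, mode="same")
    # Normalize so that pure noise produces unit-variance output.
    return corr / (noise_sigma * np.sqrt(np.sum(template ** 2)))


def detect(data, template, noise_sigma, threshold=5.0):
    """Return the peak sample, its SNR, and whether it crosses the threshold."""
    snr = matched_filter_snr(data, template, noise_sigma)
    peak = int(np.argmax(snr))
    return peak, float(snr[peak]), bool(snr[peak] > threshold)


# Simulated time stream: white noise plus one injected pulse (assumed values).
rng = np.random.default_rng(0)
n_samples, true_index, amplitude, sigma = 2000, 1200, 5.0, 1.0
template = gaussian_template(width=10)
data = rng.normal(0.0, sigma, n_samples)
half = len(template) // 2
data[true_index - half : true_index + half + 1] += amplitude * template

peak, snr, found = detect(data, template, sigma)
print(f"peak at sample {peak}, SNR = {snr:.1f}, detected = {found}")
```

Real detector time streams have correlated noise, so a production filter would typically weight by the noise power spectrum in the frequency domain; the white-noise case is used here only to keep the sketch short.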

Galactic Cold Clumps

Statistical Analysis of Polarized Galactic Cold Clumps and Mask Generation

Developed stacking-analysis code that measures the polarization fraction of Planck Galactic cold clumps and generates full-sky source masks in HEALPix format for CMB surveys.

Python, SciPy & NumPy, Healpy & Astropy, Matplotlib
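
To give a flavor of what a stacking measurement involves, the sketch below averages simulated cutouts of the Stokes I, Q, and U maps and computes the polarization fraction p = sqrt(Q^2 + U^2) / I at the stack center. It is an illustrative toy, not the project code: the cutouts are synthetic, Q and U are assumed to already share a common reference frame, no noise-bias correction is applied, and every name and number in it is hypothetical.

```python
import numpy as np


def stack_cutouts(cutouts):
    """Average an (n_sources, ny, nx) array of cutouts into one stacked map."""
    return cutouts.mean(axis=0)


def polarization_fraction(i_map, q_map, u_map):
    """p = sqrt(Q^2 + U^2) / I, with no noise-bias correction."""
    return np.hypot(q_map, u_map) / i_map


# Synthetic sources: identical Gaussian profiles with an assumed 5% polarization.
rng = np.random.default_rng(1)
n_sources, size = 500, 15
true_p, psi, noise = 0.05, 0.3, 0.05  # fraction, angle [rad], per-pixel noise

y, x = np.mgrid[:size, :size] - size // 2
profile = np.exp(-0.5 * (x ** 2 + y ** 2) / 2.0 ** 2)


def noisy(signal):
    """Replicate a signal map n_sources times with independent white noise."""
    return signal + rng.normal(0.0, noise, (n_sources, size, size))


i_stack = stack_cutouts(noisy(profile))
q_stack = stack_cutouts(noisy(true_p * profile * np.cos(2 * psi)))
u_stack = stack_cutouts(noisy(true_p * profile * np.sin(2 * psi)))

c = size // 2
p_center = polarization_fraction(i_stack[c, c], q_stack[c, c], u_stack[c, c])
print(f"stacked polarization fraction at the center: {p_center:.3f}")
```

Averaging N cutouts lowers the noise by roughly a factor of sqrt(N), which is why stacking can recover a polarization signal that is too faint to measure in any single source.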

Stock ETL

Stock ETL Pipeline

A self-directed project demonstrating common data engineering practices with Python, Apache Spark, Apache Airflow, SQL, and Flask. It implements a complete ETL pipeline for stock market data.

Python, Apache Spark / PySpark, SQL, Apache Airflow, Flask

About Me

I'm a data scientist who recently completed a PhD in Astrophysics at The University of Melbourne, specializing in analyzing large-scale datasets from astrophysical instruments. My research with the Simons Observatory has given me extensive experience in statistical modeling, signal processing, machine learning, and building production-ready data pipelines.

I'm driven by diving into messy, real-world data and turning it into clear, insightful outputs. Whether it's developing algorithms to detect rare, rapid astrophysical transient events, or building systems to run statistical analysis of large data catalogues, I thrive on solving challenging data problems.

Beyond technical skills, I'm passionate about communicating my findings to diverse audiences, from academic presentations to public outreach events. I believe great data science should combine technical rigor with the ability to tell compelling stories from the data.

When I'm not coding or exploring the Universe, you can find me exploring the corners of the world, stargazing (naturally!), or sharing the wonders of science with the next generation. I'm a big sports fan, and I can't go past a good movie, TV show, or book as a way to wind down in the evening!

Let's Connect

Open to data science/analysis opportunities and collaborations