Emille E. O. Ishida, PhD

I am a Brazilian physicist who has been working in Astronomy and Cosmology for as long as I can remember.

My research focuses on machine learning applications to astronomy and on the development of sustainable interdisciplinary scientific environments.

I am co-founder of the Cosmostatistics Initiative (COIN), the Fink broker and the SNAD team.

Main scientific activities

An adaptable LSST community broker based on machine learning
The Cosmostatistics Initiative
SuperNova Anomaly Detection



Fink: early supernovae Ia classification using active learning

Leoni, Ishida et al., 2021

We describe how the Fink broker early supernova Ia classifier optimizes its ML classifications by employing an active learning (AL) strategy. We demonstrate the feasibility of implementing such strategies in the current Zwicky Transient Facility (ZTF) public alert data stream. We compare the performance of two AL strategies: uncertainty sampling and random sampling. Our pipeline consists of 3 stages: feature extraction, classification and learning strategy. Starting from an initial sample of 10 alerts (5 SN Ia and 5 non-Ia), we let the algorithm identify which alert should be added to the training sample. The system is allowed to evolve through 300 iterations. Our data set consists of 23 840 alerts from the ZTF with classifications confirmed via cross-match with the SIMBAD database and the Transient Name Server (TNS), 1 600 of which were SNe Ia (1 021 unique objects). The data configuration, after the learning cycle was completed, consists of 310 alerts for training and 23 530 for testing. Averaging over 100 realizations, the classifier achieved 89% purity and 54% efficiency. From 1 November 2020 to 31 October 2021, Fink applied its early supernova Ia module to the ZTF stream and communicated promising SN Ia candidates to the TNS. Of the 535 spectroscopically classified Fink candidates, 459 (86%) were proven to be SNe Ia. Our results confirm the effectiveness of active learning strategies for guiding the construction of optimal training samples for astronomical classifiers. They demonstrate, on real data, that the performance of learning algorithms can be highly improved without the need for extra computational resources or overwhelmingly large training samples. This is, to our knowledge, the first application of AL to real alert data.
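The uncertainty sampling loop described in this abstract can be illustrated with a minimal Python sketch. It is not the Fink implementation: the one-feature centroid classifier, the synthetic Gaussian data, the reduced sample sizes and all function names are hypothetical stand-ins chosen only to keep the example self-contained. At each iteration the classifier is retrained, the pool element whose predicted probability is closest to 0.5 is queried, and that element moves from the pool to the training sample.

```python
import math
import random


def fit_centroids(features, labels):
    """Toy classifier (assumption): store the mean feature value of each class."""
    means = {}
    for cls in (0, 1):
        values = [x for x, y in zip(features, labels) if y == cls]
        means[cls] = sum(values) / len(values)
    return means


def predict_proba(means, x):
    """Probability of class 1, from a logistic function of the distance difference."""
    score = abs(x - means[0]) - abs(x - means[1])
    return 1.0 / (1.0 + math.exp(-score))


def uncertainty_query(probs):
    """Index of the pool element whose class-1 probability is closest to 0.5."""
    return min(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))


def active_learning(pool_x, pool_y, train_x, train_y, n_iter):
    """Move one queried element per iteration from the pool to the training set."""
    pool_x, pool_y = list(pool_x), list(pool_y)
    train_x, train_y = list(train_x), list(train_y)
    for _ in range(n_iter):
        model = fit_centroids(train_x, train_y)
        probs = [predict_proba(model, x) for x in pool_x]
        i = uncertainty_query(probs)
        train_x.append(pool_x.pop(i))  # the label is revealed only for the queried element
        train_y.append(pool_y.pop(i))
    return train_x, train_y, pool_x, pool_y


rng = random.Random(0)


def draw(cls, n):
    """Synthetic one-dimensional feature: class 1 centered at +1, class 0 at -1."""
    center = 1.0 if cls == 1 else -1.0
    return [rng.gauss(center, 1.0) for _ in range(n)]


# Initial sample of 10 elements (5 per class), mirroring the 5 SN Ia + 5 non-Ia start.
init_x, init_y = draw(1, 5) + draw(0, 5), [1] * 5 + [0] * 5
pool_x, pool_y = draw(1, 100) + draw(0, 100), [1] * 100 + [0] * 100

train_x, train_y, test_x, test_y = active_learning(pool_x, pool_y, init_x, init_y, n_iter=30)
print(len(train_x), len(test_x))  # 40 170
```

The same bookkeeping gives the sample sizes quoted above: 10 initial alerts plus 300 queried alerts make 310 for training, leaving 23 840 - 310 = 23 530 for testing. A random sampling baseline would replace `uncertainty_query` with a uniformly drawn index.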


SNAD Transient Miner: Finding Missed Transient Events in ZTF DR4 using k-D trees

Aleo et al., 2021

We report the automatic detection of 11 transients (7 possible supernovae and 4 active galactic nuclei candidates) within the Zwicky Transient Facility fourth data release (ZTF DR4), all of them observed in 2018 and absent from public catalogs. Among these, three were not part of the ZTF alert stream. Our transient mining strategy employs 41 physically motivated features extracted from both real light curves and four simulated light curve models (SN Ia, SN II, TDE, SLSN-I). These features are input to a k-D tree algorithm, from which we calculate the 15 nearest neighbors. After pre-processing and selection cuts, our dataset contained approximately a million objects, among which we visually inspected the 105 closest neighbors from seven of our brightest, best-sampled simulations, comprising 92 unique ZTF DR4 sources. Our result illustrates the potential of coherently incorporating domain knowledge and automatic learning algorithms, which is one of the guiding principles directing the SNAD team. It also demonstrates that the ZTF DR is a suitable testing ground for data mining algorithms aiming to prepare for the next generation of astronomical data.
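The nearest-neighbor search at the core of this strategy can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SNAD pipeline: the feature vectors are random numbers rather than light curve features, the sample is tiny, and `build_kdtree`, `k_nearest` and `sq_dist` are hypothetical helper names. Only the number of features (41) and the number of neighbors (15) are taken from the abstract.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(items, depth=0):
    """Recursively build a k-d tree from (index, vector) pairs, cycling the split axis."""
    if not items:
        return None
    axis = depth % len(items[0][1])
    items = sorted(items, key=lambda item: item[1][axis])
    mid = len(items) // 2
    return {
        "item": items[mid],
        "axis": axis,
        "left": build_kdtree(items[:mid], depth + 1),
        "right": build_kdtree(items[mid + 1:], depth + 1),
    }


def k_nearest(tree, target, k):
    """Return the k nearest (squared_distance, index) pairs, closest first."""
    heap = []  # max-heap on distance, stored as negated values

    def visit(node):
        if node is None:
            return
        idx, vec = node["item"]
        d = sq_dist(vec, target)
        if len(heap) < k:
            heapq.heappush(heap, (-d, idx))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, idx))
        diff = target[node["axis"]] - vec[node["axis"]]
        near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
        visit(near)
        # Search the far side only if the splitting plane is closer than the k-th neighbor.
        if len(heap) < k or diff ** 2 < -heap[0][0]:
            visit(far)

    visit(tree)
    return sorted((-neg_d, idx) for neg_d, idx in heap)


rng = random.Random(42)
n_features, n_objects, k = 41, 300, 15

# Stand-ins for the real objects and for one simulated light curve (random values).
catalog = [[rng.random() for _ in range(n_features)] for _ in range(n_objects)]
simulated = [rng.random() for _ in range(n_features)]

tree = build_kdtree(list(enumerate(catalog)))
neighbors = k_nearest(tree, simulated, k)
print([idx for _, idx in neighbors])  # indices of the 15 candidates to inspect
```

Querying 15 neighbors for each of seven simulated light curves yields 7 × 15 = 105 candidates, which is consistent with the 105 neighbors inspected above; some real sources are neighbors of more than one simulation, hence the 92 unique objects.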