{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "- title: Finding a planet by subtracting starlight\n", "- author: Joseph Long\n", "- date: 2020-09-24\n", "- tags: download-notebook\n", "- slug: klip-in-numpy\n", "- summary: Implementing the KLIP algorithm of Soummer _et al._ (2012) for starlight subtraction using basic NumPy. (Basically, the document I wish I had when I started working on high-contrast imaging.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When I started graduate school, I had to get up to speed on high contrast imaging. As is typical, I had a bunch of different papers to read that—together—represented the state of the art. Turning that into a complete conceptual picture that I could implement myself took some time. (And, of course, once you 'get it' you wonder why it took you so long...)\n", "\n", "This is not quite an [explorable explanation](https://en.wikipedia.org/wiki/Explorable_explanation), but it's something I wish I had when I started graduate school. 
If you'd like to, you should be able to follow along with nothing more than this notebook, an internet connection, Python, NumPy, matplotlib, and scikit-image.\n", "\n", "**Table of contents**\n", "\n", "- [Introduction](#Introduction)\n", "- [Get the data](#Get-the-data)\n", "- [The preliminaries](#The-preliminaries)\n", "- [The algorithm](#The-algorithm)\n", " - [Step 1: subtract image means](#Step-1:-subtract-image-means)\n", " - [Step 2: compute the Karhunen-Loève transform](#Step-2:-compute-the-Karhunen-Lo%C3%A8ve-transform)\n", " - [Step 3: truncate the basis set](#Step-3:-truncate-the-basis-set)\n", " - [Step 4: estimate the PSF from its projection onto eigenimages](#Step-4:-estimate-the-PSF-from-its-projection-onto-eigenimages)\n", " - [Step 5: calculate the final image](#Step-5:-calculate-the-final-image)\n", "- [Apply KLIP and ADI](#Apply-KLIP-and-ADI)\n", "- [Optimizing $K_\\text{klip}$ and measuring signal-to-noise](#Optimizing-$K_\\text{klip}$-and-measuring-signal-to-noise)\n", "- [The Case of the Missing Precision](#The-Case-of-the-Missing-Precision)\n", "- [Conclusion](#Conclusion)\n", "\n", "# Introduction\n", "\n", "To image planets around other stars, we are usually impeded by the brightness of the star relative to the planet. Therefore, clever schemes to remove starlight have been the cornerstone of exoplanet direct imaging. The state of the art in starlight subtraction advanced dramatically in 2012 with the publication in the Astrophysical Journal Letters of [\"Detection and Characterization of Exoplanets and Disks Using Projections on Karhunen-Loève Eigenimages\"](https://ui.adsabs.harvard.edu/abs/2012ApJ...755L..28S/abstract) by Rémi Soummer (my [old boss](https://www.stsci.edu/stsci-research/research-topics-and-programs/russell-b-makidon-optics-laboratory/meet-the-team)!), Laurent Pueyo (my old colleague!), and James Larkin (never had the pleasure!). 
\n", "\n", "In their paper, they explain how the eigenvectors of the covariance matrix of a reference set of star images may be used to estimate (and subtract) starlight in a new observation. Almost simultaneously, Adam Amara and Sascha P. Quanz published their paper [\"PYNPOINT: an image processing package for finding exoplanets\"](https://ui.adsabs.harvard.edu/abs/2012MNRAS.427..948A/abstract) in the Monthly Notices of the Royal Astronomical Society, describing a closely related technique.\n", "\n", "Though the specific notation and algorithm proposed differed, the two papers applied the method of [principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis) to starlight subtraction. Principal component analysis goes by many names, as the following quote from the Wikipedia page demonstrates:\n", "\n", "> Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT) in signal processing, the Hotelling transform in multivariate quality control, proper orthogonal decomposition (POD) in mechanical engineering, singular value decomposition (SVD) of X (Golub and Van Loan, 1983), eigenvalue decomposition (EVD) of XTX in linear algebra, factor analysis (for a discussion of the differences between PCA and factor analysis see Ch. 
7 of Jolliffe's Principal Component Analysis), Eckart–Young theorem (Harman, 1960), or empirical orthogonal functions (EOF) in meteorological science, empirical eigenfunction decomposition (Sirovich, 1987), empirical component analysis (Lorenz, 1956), quasiharmonic modes (Brooks et al., 1988), spectral decomposition in noise and vibration, and empirical modal analysis in structural dynamics.\n", "\n", "These different names generally come with minor mathematical differences, such as whether means are subtracted along one dimension or another, or eigenimages are obtained from the eigenvectors of the image-to-image covariance matrix instead of from the singular vectors of the data matrix, but we will not attempt to cover all possible variants—just the description in Soummer *et al.*\n", "\n", "There are two key aspects to these algorithms:\n", "\n", " - identifying the eigenimages, and\n", " - truncation as noise suppression.\n", "\n", "As we will see, a set of images in an array of $K \\times n_\\text{pix} \\times n_\\text{pix}$ elements can be transformed into an array of $K \\times n_\\text{pix} \\times n_\\text{pix}$ **eigenimages**. These eigenimages are ordered such that the first image captures most of the variance in the dataset (i.e. 
the average image), with the successive components capturing variance in the residuals after previous components have been removed.\n", "\n", "The **truncation** step relies on the assumption (which can be checked empirically) that, when a set of $K$ noisy observations is transformed into $K$ corresponding eigenimages, only the first $K_\\text{klip} < K$ of those eigenimages will capture useful information for subtracting starlight from a new image outside the reference set, while the rest capture only noise.\n", "\n", "The code in this notebook implements the algorithm as laid out in Soummer *et al.* and uses their notation (within the constraints of Python).\n", "\n", "In what follows, $I_\\psi$ denotes a PSF intensity pattern realized for a telescope state $\\psi$. We're trying to compute $\\hat{I}_\\psi$, an approximation of the PSF intensity pattern, in hopes that subtracting it will reveal a faint astronomical signal (which they call $A$). Our observed frames $T$ are combinations of the PSF and an astronomical signal: $T = I_\\psi + A$. If our approximation is good, we subtract it off like so: $T - \\hat{I}_\\psi = A$, recovering the astronomical signal of interest. 
(This is a simplification of what is stated in Soummer *et al.* since, as we will see, $A$ will not be recovered perfectly in general.)\n", "\n", "To begin, we import our libraries in the usual fashion:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import skimage" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Get the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[These data](https://github.com/carlgogo/VIP_extras/tree/master/datasets) of $\\beta$ Pictoris from [NACO](http://www.eso.org/sci/facilities/paranal/decommissioned/naco.html) were made available by Carlos Gomez Gonzalez, author of [VIP](https://github.com/vortex-exoplanet/VIP) (an image processing package implementing this and other algorithms). Unlike most exoplanet direct imaging data, they have [a planet](https://en.wikipedia.org/wiki/Beta_Pictoris_b) in them, which makes them great for a demo.\n", "\n", "Download [naco_betapic_preproc.npz](https://github.com/carlgogo/VIP_extras/raw/master/datasets/naco_betapic_preproc.npz) and save it to the current directory, or just run this cell:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", "100 160 100 160 0 0 237 0 --:--:-- --:--:-- --:--:-- 0:-- 237\n", "100 2177k 100 2177k 0 0 770k 0 0:00:02 0:00:02 --:--:-- 1102k\n" ] } ], "source": [ "!curl -OL https://github.com/carlgogo/VIP_extras/raw/master/datasets/naco_betapic_preproc.npz" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the NumPy archive and assign the data cube to `frames_full`."
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "naco_betapic_preproc = np.load('./naco_betapic_preproc.npz')\n", "frames_full = naco_betapic_preproc['cube']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Peek at the mean frame. These data were taken with a [vortex coronagraph](https://www.eso.org/sci/publications/messenger/archive/no.152-jun13/messenger-no152-8-13.pdf), so don't be surprised that the star looks more like a donut." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "