### Zé Vinícius

I convert maths into code :)

# Finding periods of transit-like events

One of the prime works on finding a fast way to compute good enough estimates of periods of planet transit events was carried out by Kovács, Zucker, and Mazeh. Their algorithm, the Box-least Squares (or BLS for short), has been very successful in finding those periodic events (their paper has over 500 citations!).

This blog post has the goal of investigating the maths that underpin the BLS method from a different angle, and verify whether there are unexplored speed-ups or properties. More precisely, I derive the parameters of a box function that “best” fits (from a maximum likelihood perspective) a planet transit in time series brightness data of stars (these are often called “lightcurves” by astronomers).

Assume the following relation for the folded lightcurve $\boldsymbol{Y} \triangleq (Y_1, Y_2, ..., Y_n)$ \begin{align} Y_i = h - d\cdot \mathbb{I}(t_0 \leq t_i \leq t_0 + w) + \eta_i\nonumber, \end{align} where $\eta_i \sim \mathcal{N}(0, \sigma^2)$, $\sigma$ is known, and $\mathbb{I}$ is the indicator function. The parameters to be estimated on the basis of observations $\boldsymbol{Y} = \boldsymbol{y}$ are $\left\{h, d, t_0, w\right\}$, i.e. the baseline flux $h$, the depth of the planet transit $d$, the start time of the transit $t_0$, and the duration or width of the transit $w$.

Pictorially:

In this setting, $Y_i \sim \mathcal{N}(h - d\cdot \mathbb{I}(t_0 \leq t_i \leq t_0 + w), \sigma^2)$, and the joint probability density function of $\boldsymbol{Y}$ is given by \begin{align} p(\boldsymbol{y}) = \prod_{i=1}^{n} p(y_i) = \dfrac{1}{(2\pi\sigma^2)^{n/2}} \exp\left\{-\dfrac{1}{2\sigma^2}\left(\sum_{i \in I^c}(y_i - h)^2 + \sum_{i \in I}(y_i - (h - d))^2\right)\right\} \end{align} where $I \triangleq \{i | t_0 \leq t_i \leq t_0 + w\}$.

The log-likelihood function (up to an additive constant) may be written as \begin{align} \log p(\boldsymbol{y}) = -\dfrac{1}{2\sigma^2}\left(\sum_{i \in I^c}(y_i - h)^2 + \sum_{i \in I}(y_i - (h - d))^2\right) \end{align}

Therefore, the maximum likelihood estimator may be written as \begin{align} {\theta}^{\star}({\boldsymbol{y}}) &= \arg \min_{h, d, t_0, w} \sum_{i \in I^c}(y_i - h)^2 + \sum_{i \in I}(y_i - (h - d))^2 \end{align}

Note that, $\log p({\boldsymbol{y}})$ has continuous partial derivatives with respect to $h$ and $d$. And therefore, for fixed values of $t_0$ and $w$, \begin{align} {\theta}^{\star}({\boldsymbol{y}}) &= \arg \min_{h, d} \sum_{i \in I^c}(y_i - h)^2 + \sum_{i \in I}(y_i - (h - d))^2 \end{align}

Differentiating $\log p({\boldsymbol{y}})$ with respect to $h$, and solving $\dfrac{\partial}{\partial h}\log p({\boldsymbol{y}}) = 0$, it follows that \begin{align} h^{\star} = \dfrac{1}{n}\left(N(I)\cdot d + \sum_{i=1}^{n}y_i\right),\nonumber \end{align} where $N(I)$ is the number of elements in the set $I$.

Now, differentiating $\log p({\boldsymbol{y}})$ with respect to $d$, and solving $\dfrac{\partial}{\partial d}\log p({\boldsymbol{y}}) = 0$, it follows that \begin{align} d^{\star} = \dfrac{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}y_i - \dfrac{1}{N(I)}\sum_{i \in I}y_i}{1 - \dfrac{N(I)}{n}}\nonumber, \end{align}

The formulae for $h^{\star}$ and $d^{\star}$ presented here are precisely the same as the ones developed by Kovács, Zucker, and Mazeh, in case one assumes that the “weights” $w_i$, as presented in the aforementioned paper, are constant w.r.t. $i$, e.g., $w_i = 1$.

Hence, one can iterate between numerically optimizing $\log p({\boldsymbol{y}})$ for $t_0$ and $w$ and analytically computing $h^{\star}$ and $d^{\star}$. On my experiments, I found that two iterations of this procedure are enough for convergence.

In the BLS paper, it seems to me that they used a grid-search approach in order to find $t_0$ and $w$.

Interestingly, the maximum likelihood estimator for the transit depth, $d^{\star}$, is an unbiased estimator for any sample size $n$. To see that, let’s take the expected value of $d^{\star}$ with respect to the distribution of the data $\boldsymbol{Y}$ \begin{align} \mathbb{E}\left(d^{\star}\right) = \dfrac{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}\mathbb{E}\left(Y_i\right) - \dfrac{1}{N(I)}\sum_{i \in I}\mathbb{E}\left(Y_i\right)}{1 - \dfrac{N(I)}{n}}, \end{align} noting that $\mathbb{E}\left(Y_i\right) = h - d\cdot \mathbb{I}(t_0 \leq t_i \leq t_0 + w)$, and after doing the maths, one arrives at \begin{align} \mathbb{E}\left(d^{\star}\right) = d, \end{align} which proves that $d^{\star}$ is unbiased.

Another interesting statistical figure to look at is the variance of $d^{\star}$. Before proceeding, note that $d^{\star}$ can be rewritten as \begin{align} d^{\star} = \dfrac{\dfrac{1}{n}\displaystyle\sum_{i \in I^c}y_i + \left(\dfrac{1}{n} - \dfrac{1}{N(I)}\right)\sum_{i\in I}y_i}{1 - \dfrac{N(I)}{n}} \end{align} then, it follows that \begin{align} \mathrm{var}\left(d^{\star}\right) = \dfrac{\dfrac{1}{n^2}\displaystyle\sum_{i \in I^c}\mathrm{var}\left(Y_i\right) + \left(\dfrac{1}{n} - \dfrac{1}{N(I)}\right)^2\sum_{i\in I}\mathrm{var}\left(Y_i\right)}{\left(1 - \dfrac{N(I)}{n}\right)^2}, \end{align} after algebraic simplifications \begin{align} \mathrm{var}\left(d^{\star}\right) = \dfrac{\sigma^2}{N(I)\left(1 - \dfrac{N(I)}{n}\right)} \geq \dfrac{\sigma^2}{N(I)} \end{align}

The inequality shown above is particularly interesting specially if we rewrite it as \begin{align} N(I) \geq \left(\dfrac{\sigma}{\mathrm{std}\left(d^{\star}\right)}\right)^2 \end{align} which tells us the minimum number of samples in transit $N(I)$ that are needed in order to achieve a given uncertainty on the transit depth $\mathrm{std}\left(d^{\star}\right)$.

Another interesting figure of merit is the assymptotic ratio \begin{align} \lim_{n\rightarrow\infty} \dfrac{\mathbb{E}\left(d^{\star}\right)}{\sqrt{\mathrm{var}\left(d^{\star}\right)}} = \dfrac{d\sqrt{N(I)}}{\sigma} \end{align} usually called signal-to-noise ratio by astronomers.

More broadly, note that $d^{\star} \sim \mathcal{N}\left(d, \dfrac{\sigma^2}{N(I)\left(1 - \dfrac{N(I)}{n}\right)}\right)$. As a result, various others interesting quantities may be derived ;)

See the next blog post for an example of these maths. :)