Time Series Analysis in Stingray.jl

From Stargazing Dreams to Code Stargazing:)

Ever since childhood, I have been passionate about the mysteries of the universe. Over time, that passion grew into something more technical — not just looking at the stars, but wanting to understand their signals.

This summer, through Google Summer of Code 2025 (GSoC), I had the opportunity to contribute to Stingray.jl, a Julia library for time series analysis in astronomy. What follows is both a story of my personal journey and an exploration of the science and code I worked on.

The Beginning – Switching Languages and First Steps

When the organization was announced, I immediately started exploring Stingray (the Python version). I wanted to understand the structure, so I made my first PRs there.

Once I understood Stingray a bit, I started working on Stingray.jl, the Julia port of Stingray initiated by Aman Pandey in 2022. That meant learning Julia from scratch! Thankfully, Julia is similar to Python in many ways, so I picked it up quickly.

During my early testing, I noticed outdated dependencies. For example, some packages were only compatible with Julia 1.7, while the project was moving towards Julia 1.11. My first clumsy PR (on power_colors) wasn't great (more on that in a bit), but it taught me valuable lessons.

Later, I worked on updating fourier.jl. This was also my first time seriously using structs in Julia. My mentor @fjebaker guided me on abstract methods and struct design, while @matteobachetti helped me understand what kind of astronomical data the project needed.

At first, I was confused about telescopes and their instruments. To learn, I went to NASA HEASARC documentation, where I discovered FITS data (Flexible Image Transport System) — the standard format for astronomical data. That’s when things started making sense.


My First Big Task – Reading Event Data

@fjebaker opened a milestone and assigned me the readevent task. At the same time, I became good friends with fellow contributor Jawd Ahmad. We often helped each other understand the codebase.

I implemented the readevent function with a small struct called EventList, something like this:


struct DictMetadata
    headers::Vector{Dict{String,Any}}
end

struct EventList{T}
    filename::String
    times::Vector{T}
    energies::Vector{T}
    metadata::DictMetadata
end


Initially, it was just reading .fits files (not .evt yet). After mentor suggestions and fixes, my first PR was finally merged into Stingray.jl. That moment felt amazing: it was my first PR ever merged into an open source project.


Recipes and Lightcurves

The next milestone was implementing recipes (plotting methods). Earlier, my worst PR, power_colors, involved directly plotting arrays inside functions (yes, it was a mess 😅). This time, I studied RecipesBase.jl and realized how it allows clean plotting through micro-functions.

I implemented lightcurve recipes, so one can now simply call plot(lc) in the Julia REPL.

At this stage, I was still generating my own FITS files with FITSIO.jl. While testing, I hit a table HDU error, which led me to file issues (and even attempt fixes) in that library.

Proposal Time

By then, I had collected dozens of reference links in my notepad and shaped them into my official GSoC proposal.

On May 8th, I got selected. That was one of the happiest moments of my life! 🎉

We had our first meeting with mentors, where we discussed our next plans. From then on, meetings were held almost every Friday at 3 PM (Rome time).


Building Functions Step by Step

Event Functions

I refined readevent and created better struct definitions. @fjebaker even made a PR to demonstrate how things are usually structured in Julia — it was extremely helpful.
The final version included these functions:

event1 = readevents("ni1050360108_0mpu7_cl.evt", load_gti=true, sort=true)
ev_lowE  = filter_energy(x -> x >= 20 && x < 500, event1)
ev_highE = filter_energy(x -> x >= 500 && x <= 1500, event1)

println("Low energy events: ", length(ev_lowE))
println("High energy events: ", length(ev_highE))

#Output:
Low energy events: 28841910
High energy events: 2941370

From here, as you can see, readevents takes several parameters and reads .evt, .fits, and similar files. We also have handy functions like filter_energy, get_gti_info, and many more (reference PRs: readevent evolved into eventlist, which evolved into events).

Finally, we have a function that reads GTI info and all other telescope metadata.

For reference, the file used above is ni1050360108_0mpu7_cl.evt. In the NICER archive, you will see two types of event files:

.ufa Files are unfiltered, calibrated NICER event files containing all raw events, including bad or non-X-ray events, mainly for initial analysis.

.cl Files are clean, calibrated event files that have been screened to remove bad or non-X-ray events, ready for scientific analysis.

For more detailed information on NICER data processing and file formats, you can refer to the official HEASARC documentation.

Lightcurve Functions:

We then moved to filtering and error functions inside Lightcurves. I really loved working on these — it felt like building a scientific toolkit piece by piece.
I was able to create many functions, most of which can be called with a single line in the Julia REPL after reading the events:

LightCurve{T}

Main structure for binned time series with uncertainties and properties

lc = create_lightcurve(eventlist, 1.0)  # 1-second bins
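To make this concrete, here is a much-simplified sketch of what such a structure might hold. The field names below are illustrative only, not the actual Stingray.jl definition:

```julia
# Illustrative sketch only -- NOT the actual Stingray.jl struct.
# A binned light curve pairs bin centers with counts and their uncertainties.
struct SimpleLightCurve{T<:Real}
    time::Vector{T}      # bin centers (seconds)
    counts::Vector{Int}  # events per bin
    errors::Vector{T}    # per-bin uncertainty, e.g. sqrt(counts)
    dt::T                # bin size (seconds)
end

lc_sketch = SimpleLightCurve([0.5, 1.5, 2.5], [4, 9, 16], [2.0, 3.0, 4.0], 1.0)
```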

Key Functions:

Light Curve Creation

# Basic usage
lc = create_lightcurve(eventlist, binsize)

# With filtering
lc = create_lightcurve(eventlist, 0.1; 
                      tstart=1000.0, tstop=2000.0,
                      energy_filter=(0.5, 10.0))

# Custom filtering (recommended)
filtered_events = eventlist |>
    ev -> filter_time(t -> t > 1000.0, ev) |>
    ev -> filter_energy(e -> e < 10.0, ev)
lc = create_lightcurve(filtered_events, 1.0)

Error Handling

# Set Poisson errors (default)
set_errors!(lc)

# Set custom Gaussian errors
set_errors!(lc, [0.5, 1.2, 0.8, 1.1])

# Recalculate errors after modifying counts
lc.counts .+= background_correction
calculate_errors!(lc)
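For intuition: Poisson errors on binned counts are just the square root of the counts in each bin. A minimal sketch of the idea (the floor of 1 for empty bins is a common convention I'm assuming here; this is my illustration, not the Stingray.jl internals):

```julia
# Sketch: the 1-sigma Poisson (counting) uncertainty on a bin with N events
# is sqrt(N); max(N, 1) floors empty bins at an error of 1 (assumed convention).
poisson_errors(counts) = sqrt.(max.(counts, 1))

poisson_errors([0, 4, 9, 16])   # [1.0, 2.0, 3.0, 4.0]
```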

Rebinning

# Rebin to larger time bins
lc_10s = rebin(lc_1s, 10.0)
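Under the hood, rebinning to a coarser resolution amounts to summing counts over groups of consecutive bins. A rough sketch with an integer rebin factor (the real API above takes a target bin size instead):

```julia
# Sketch: combine `factor` consecutive bins into one by summing their counts.
function rebin_counts(counts::Vector{Int}, factor::Int)
    n = div(length(counts), factor)   # any incomplete trailing bin is dropped
    [sum(counts[(i - 1) * factor + 1 : i * factor]) for i in 1:n]
end

rebin_counts([1, 2, 3, 4, 5, 6], 2)   # [3, 7, 11]
```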


Example light-curve of NICER X-ray data (counts vs time), 
produced by the new Stingray.jl routines. 
The simple command plot(events) now generates a basic light curve plot, and built-in options allow filtering and annotating the data (e.g., highlighting Good Time Intervals or applying energy/time filters). The snippet below shows a NICER light curve with peaks at regular intervals.

Using Plots (plotting examples for lightcurves)

using Plots
events = readevents("ni1200120104_0mpu7_cl.evt", load_gti=true, sort=true)

Output:

Found GTI data: 16 intervals
GTI time range: 1.3253976595089495e8 to 1.3261337476368216e8
EventList with 21244574 times and energies

Simple plot:

plot(events)

image

Light curve with time range filtering

plot(events, tstart=1.32540e8, tstop=1.32546e8)

image

Lightcurves with show_gtis and show_btis:

plot(events, show_btis=true, bti_alpha=0.4, show_gtis=true, gti_alpha=0.3)

image

These capabilities come directly from the new lightcurve recipes. The examples show how a few keywords (like axis_limits, show_gtis, show_btis, etc.) can dramatically change the plot appearance without extra coding, making Stingray.jl much more user-friendly for exploratory plotting.

Midterm Success

By midterm evaluation, I had multiple PRs merged, including the lightcurve recipes. I even optimized performance: initially, plotting a huge event file (34 million events, ~1.1 GB) took 1 minute 33 seconds. Using Julia broadcasting and matrix operations (suggested by @fjebaker), I reduced it to 11 seconds.
And I passed the midterm evaluation!
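The core of that speedup was replacing per-event work with broadcasting. A toy sketch of the pattern (my illustration, not the actual Stingray.jl code): compute all bin indices in one broadcast expression, then accumulate:

```julia
# Sketch: vectorized event binning. The broadcast line maps every event time
# to a bin index at once, instead of branching per event.
function bin_events(times::Vector{Float64}, dt::Float64)
    t0 = minimum(times)
    idx = floor.(Int, (times .- t0) ./ dt) .+ 1   # one bin index per event
    counts = zeros(Int, maximum(idx))
    for i in idx
        counts[i] += 1                            # accumulate events per bin
    end
    counts
end

bin_events([0.05, 0.15, 0.17, 0.95, 1.2], 0.5)   # [3, 1, 1]
```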

Moving towards GTI (Good Time Intervals) and BTI (Bad Time Intervals)
Refer to my previous blog.

After discussions with @matteobachetti, I implemented GTI interfaces:

apply_gtis() (LightCurve version)- Segments light curves by GTIs, creating independent LightCurve objects suitable for:

  • Independent periodogram calculations
  • Variability analysis per observation segment
  • Maintaining uniform sampling requirements

create_filtered_lightcurve() - Internal function that creates filtered LightCurve objects while preserving metadata and recalculating statistical properties.

Data Gap Filling Function

fill_bad_time_intervals!() - Fills short gaps between GTIs with synthetic events to maintain temporal sampling continuity. This is used when:

  • Periodogram analysis requires uniform sampling
  • Short data gaps would introduce artifacts
  • Statistical methods assume continuous time series

These functions ensure we only analyze valid parts of the data, avoiding bad intervals.
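The underlying idea of GTI filtering is simple: keep only the events whose times fall inside some good interval. A minimal sketch (the helper names are illustrative, not the Stingray.jl API):

```julia
# Sketch: GTIs as (start, stop) pairs; keep events inside any of them.
in_gti(t, gtis) = any(g -> g[1] <= t <= g[2], gtis)
filter_by_gtis(times, gtis) = filter(t -> in_gti(t, gtis), times)

gtis = [(0.0, 10.0), (20.0, 30.0)]
filter_by_gtis([5.0, 15.0, 25.0, 35.0], gtis)   # [5.0, 25.0]
```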

Diving into Spectral Analysis :)

Next came the Power Spectrum and Averaged Power Spectrum implementations.

Power Spectrum

This decomposes lightcurves into frequencies (like listening to the rhythm of a star).

Averaged Power Spectrum

This method cancels noise and highlights true signals.
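To see what these compute, here is a toy Leahy-normalized periodogram with segment averaging. It uses a naive DFT for clarity (real code would use an FFT), and it is my sketch of the standard formula P(ν) = 2|F(ν)|² / N_photons, not the Stingray.jl implementation:

```julia
# Sketch: Leahy-normalized periodogram of one segment of binned counts.
function leahy_periodogram(counts)
    n = length(counts)
    nphot = sum(counts)                          # total photons in the segment
    [2 * abs(sum(counts[t + 1] * cis(-2π * k * t / n) for t in 0:n-1))^2 / nphot
     for k in 1:div(n, 2)]                       # positive frequencies only
end

# Averaging periodograms over segments suppresses noise fluctuations.
average_periodogram(segments) = sum(leahy_periodogram.(segments)) / length(segments)

# A sinusoidal count rate shows up as a peak at its frequency bin:
seg = [10 + 5 * cos(2π * 2 * t / 16) for t in 0:15]
argmax(leahy_periodogram(seg))   # bin 2
```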

Functions:
Events (unbinned):

events = readevents("ni1200120104_0mpu7_cl.evt", load_gti=true, sort=true)
ps = Powerspectrum(events, norm="leahy")
avg_ps = AveragedPowerspectrum(events, 1024.0, norm="leahy", dt=0.1)

Output (I know, redundant :)):

image image

Lightcurve (binned):

lc = create_lightcurve(events, 0.1)
ps_lc = Powerspectrum(lc, norm="leahy")
ps_avg_lc = AveragedPowerspectrum(lc, 1024.0, norm="leahy")

The output is the same type as above. Inspecting the frequency resolution and the number of averaged segments:

println(ps.df)
println(avg_ps.df)
println(ps_lc.df)
println(ps_avg_lc.df)

Output:

1.3662135391761732e-5
0.0009765625
1.3662135391761732e-5
0.0009765625
println(ps.m)
println(avg_ps.m)
println(ps_lc.m)
println(ps_avg_lc.m)

Output:

1
7
1
7

Functions with plots:

Basic setup:

using Plots
events = readevents("ni1200120104_0mpu7_cl.evt", sort=true)
lc = create_lightcurve(events, 0.1)

Plotting:

plot(events, segment_size=256.0, dt=0.01, norm="frac", show_noise=true)

image

plot(events, segment_size=256.0, show_noise=true, axis_limits=[1,1/10])

image

# Create the power spectra objects
ps2 = AveragedPowerspectrum(events, 256.0; dt=0.01, norm="frac") 
ps3 = AveragedPowerspectrum(events, 256.0; dt=0.01, norm="leahy")
ps4 = AveragedPowerspectrum(events, 256.0; dt=0.01, norm="abs")
plot([ps2, ps3, ps4], 
     labels=[ "Fractional", "Leahy", "Absolute"], 
     colors=[ :red, :green, :orange], 
     linewidth=2.0,
     show_noise=true)

image

ps = AveragedPowerspectrum(events, 256.0, norm="frac")
plot(ps, show_noise=true, meanrate=ps.mean_rate)

image

plot(ps, show_noise=true, show_errors=true, subtract_noise=true, meanrate=ps.mean_rate)
plot(ps, freq_mult=true, show_noise=true, meanrate=ps.mean_rate)

image image

ps1 = AveragedPowerspectrum(events, 128.0, norm="frac")
ps2 = AveragedPowerspectrum(events, 256.0, norm="frac") 
ps3 = AveragedPowerspectrum(events, 512.0, norm="frac")

plot([ps1, ps2, ps3], labels=["128s", "256s", "512s"])

image



While building these functions, I ran into trouble with GTIs: if you read .evt files using the earlier functions, edge GTIs were being excluded. @matteobachetti suggested using generate_indices_of_segment_boundaries_binned and generate_indices_of_segment_boundaries_unbinned.
This was key to handling GTIs correctly. Thanks, Matteo!

Cross Spectrum – A Big Milestone

The most challenging task was implementing Cross-Spectrum, which compares two lightcurves to study phase lags and time lags.
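The core math is compact: the cross spectrum is C(ν) = F₁(ν)·conj(F₂(ν)), its phase angle is the phase lag, and dividing by 2πν converts that to a time lag. A toy sketch with a naive DFT (sign conventions vary between implementations; this is my illustration, not the Stingray.jl code):

```julia
# Sketch: cross spectrum and time lag at a single frequency bin k.
dft(x, k) = sum(x[t + 1] * cis(-2π * k * t / length(x)) for t in 0:length(x)-1)

function time_lag(x1, x2, k, dt)
    c = dft(x1, k) * conj(dft(x2, k))   # cross spectrum at bin k
    ν = k / (length(x1) * dt)           # frequency of bin k in Hz
    angle(c) / (2π * ν)                 # phase lag -> time lag in seconds
end

# x2 is x1 delayed by 2 bins, and the recovered time lag is 2*dt:
x1 = [cos(2π * t / 16) for t in 0:15]
x2 = [cos(2π * (t - 2) / 16) for t in 0:15]
time_lag(x1, x2, 1, 1.0)   # ≈ 2.0
```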

At first, I faced countless issues: GTI misalignment, event loss, and overflow errors. My mentor reminded me: “You are not reading complete .evt files”. Once I corrected that, things started working.

Extended FITSMetadata struct with timing keyword fields:

  • mjd_ref: MJD reference time (supports both MJDREF and MJDREFI+MJDREFF)
  • time_zero: Time zero offset (TIMEZERO)
  • time_unit: Time unit conversion factor (TIMEUNIT)
  • time_sys: Time system (TIMESYS)
  • time_pixr: Time pixel reference (TIMEPIXR)
  • time_del: Time resolution/bin size (TIMEDEL)

The implementation follows the HEASARC timing tutorial specifications and handles various edge cases commonly found in X-ray astronomy FITS files.
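As a sanity check on how these keywords combine, here is my sketch of the usual conversion from mission elapsed time (seconds) to an absolute MJD, following the HEASARC convention. The default values are illustrative NICER-style numbers, not something Stingray.jl exports:

```julia
# Sketch: MET (seconds) -> MJD, using MJDREFI + MJDREFF and TIMEZERO.
# Default keyword values are NICER-like, shown here purely for illustration.
met_to_mjd(t; mjdrefi=56658, mjdreff=0.000777592592592593, timezero=0.0) =
    mjdrefi + mjdreff + (timezero + t) / 86400.0

met_to_mjd(86400.0)   # one day after the reference epoch
```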

After implementing events with these extensions, I also thought the units were mismatched, but it turned out Julia handled them automatically. Lesson learned!

Once I got past that misunderstanding, finally, things were aligning, and the cross-spectrum started working properly. The alignment was correct, our filter function worked, and my mentor suggested something super smart: use the same .evt file, filter it out, and then use both filtered versions.

This was a relief, especially because I couldn’t find matching .evt files from different detectors. That trick really saved me. 🙌

Along the way, I started developing a secret interest in telescope observatories. Working on real telescope data made me curious about how things actually operate behind the scenes.

Finally, I submitted my biggest PR (~5.7k lines of code). Sorry, mentors 😅. It includes:

  • Cross-Spectrum implementation

  • Averaged Cross-Spectrum and single Cross-Spectrum

  • Recipes for plotting

At first, I was building separate structures for the cross-spectrum and the averaged cross-spectrum. But my mentor asked: “Why not just make it one?” @matteobachetti then suggested I could detect whether it was a single or averaged cross-spectrum from "m" and handle both within the same structure. That idea really reduced the code size and made things much cleaner.

The best part? The recipes now work automatically, which feels amazing. (Okay, I'll admit it: I've used a lot of if-else statements, but hey, it works!)

Methods and Plots:

lc1 = create_lightcurve(ev_lowE,0.01)
lc2 = create_lightcurve(ev_highE,0.01)
cs = CrossSpectrum(ev_lowE, ev_highE, 64.0, 0.01; norm="frac")
acs = AveragedCrossSpectrum(ev_lowE, ev_highE, 64.0, 0.01)

Output:

CrossSpectrum{Float64} (Single)
  Frequencies: 3199
  Normalization: frac
CrossSpectrum{Float64} (Averaged)
  Frequencies: 3199
  Segments averaged: 53
  Segment size: 64.0
  Mean rates: 8205.026827830188, 828.4949882075472
  Normalization: frac

I am using these functions to determine whether it's an averaged or a single cross-spectrum:

is_averaged(cs::CrossSpectrum) = cs.m > 1
is_single(cs::CrossSpectrum) = cs.m == 1

Now plotting:

Simple plot, directly on cs and acs:

plot(cs)
plot(acs)

or

plot(acs, plot_type=:amplitude)
plot(cs, plot_type=:amplitude)

image image

You can also add a frequency range:

plot(cs, plot_type=:amplitude, freq_range=(0.1, 10.0), color=:green)

image

Time lag spectrum

plot(cs, plot_type=:time_lag, color=:orange)
plot(acs, plot_type=:time_lag, color=:orange)

image image

Coherence function

plot(cs, plot_type=:coherence, color=:green)
plot(acs, plot_type=:coherence, color=:green)

image image

Coherence with confidence levels

plot(cs, Val(:coherence_confidence))
plot(acs, Val(:coherence_confidence))

image image

Time lag with error bars and reference values.

plot(cs, Val(:lag_frequency_errors), 
     freq_range=(1.0, 5.0),
     reference_lag=Dict(:frequency => 3.0, :expected_lag => 0.15))
plot(acs, Val(:lag_frequency_errors), 
     freq_range=(1.0, 5.0),
     reference_lag=Dict(:frequency => 3.0, :expected_lag => 0.15))

image image

Phase lag with error bars and reference values

plot(cs, Val(:phase_frequency_errors),
     freq_range=(2.0, 5.0), 
     reference_phase=Dict(:frequency => 3.0, :expected_phase => π/3))
plot(acs, Val(:phase_frequency_errors),
     freq_range=(2.0, 5.0), 
     reference_phase=Dict(:frequency => 3.0, :expected_phase => π/3))

image image

Time lag with rebinning.

plot(cs, Val(:lag_frequency_errors), rebin_factor=4, freq_range=(1.0, 10.0))
plot(acs, Val(:lag_frequency_errors), rebin_factor=4, freq_range=(1.0, 10.0))

image image

Compare two light curves.

plot(lc1, lc2)

image

Normalization comparison (requires an array of differently normalized CrossSpectra)

cs_leahy = CrossSpectrum(lc1, lc2, norm="leahy")
cs_frac = CrossSpectrum(lc1, lc2, norm="frac") 
cs_abs = CrossSpectrum(lc1, lc2, norm="abs")

plot([cs_leahy, cs_frac, cs_abs], Val(:normalization_comparison))

plot([cs_leahy, cs_frac, cs_abs], Val(:normalization_comparison),
     norm_labels=["Leahy Norm", "Fractional RMS", "Absolute RMS"])

acs_leahy = AveragedCrossSpectrum(lc1, lc2, 62, norm="leahy")
acs_frac = AveragedCrossSpectrum(lc1, lc2, 62, norm="frac")
acs_abs = AveragedCrossSpectrum(lc1, lc2, 62, norm="abs")

plot([acs_leahy, acs_frac, acs_abs], Val(:normalization_comparison))

plot([acs_leahy, acs_frac, acs_abs], Val(:normalization_comparison),
     norm_labels=["Leahy Norm", "Fractional RMS", "Absolute RMS"])

image image

Rebinning comparison (original vs rebinned)

cs_rebinned = rebin(cs, 0.25)  # Your rebinning function
plot(cs, cs_rebinned, Val(:rebinning_comparison), rebin_type="Linear")
acs_rebinned = rebin(acs, 0.25)  # Your rebinning function
plot(acs, acs_rebinned, Val(:rebinning_comparison), rebin_type="Linear")

image image

Log rebinning comparison

cs_log_rebinned = rebin_log(cs, f=0.02)
plot(cs, cs_log_rebinned, Val(:rebinning_comparison), rebin_type="Logarithmic")
acs_log_rebinned = rebin_log(acs, f=0.02)   
plot(acs, acs_log_rebinned, Val(:rebinning_comparison), rebin_type="Logarithmic")
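For context, logarithmic rebinning grows each frequency bin by a factor (1 + f), keeping fine resolution at low frequencies while averaging heavily at high frequencies. A sketch of how the bin edges are laid out (illustrative only, not the rebin_log internals):

```julia
# Sketch: geometric bin edges for logarithmic rebinning with growth factor f.
function log_bin_edges(fmin, fmax, f)
    edges = [fmin]
    while edges[end] < fmax
        push!(edges, edges[end] * (1 + f))   # each bin is (1+f) times wider
    end
    edges
end

log_bin_edges(1.0, 2.0, 0.25)   # [1.0, 1.25, 1.5625, 1.953125, 2.44140625]
```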

image image

Noise comparison

plot(cs, Val(:noise_comparison))
plot(acs, Val(:noise_comparison))

image image

White noise

plot(cs, Val(:white_noise), high_freq_fraction=0.3)
plot(acs, Val(:white_noise), high_freq_fraction=0.3)

Output: same as the amplitude plots.

Phase lag errors

plot(cs, Val(:phase_lag_errors))
plot(acs, Val(:phase_lag_errors))

image image

significant_detections

plot(acs, Val(:significant_detections), threshold=3.0)
plot(cs, Val(:significant_detections), threshold=3.0)

image image


Yeah, I have used a lot of if-else in the plot titles too, hehehe :)

What’s Next:

After GSoC, I plan to continue working on:

  • Dynamic Power Spectrum

  • Cross-Spectrum improvements

  • Combining with Power Colors

I also want to become a watcher of Stingray.jl and contribute to astronomy-related open source projects long term.

One of my mentors, @stefanocovino, also suggested improving the documentation. Since I haven't created proper docs yet (I have many notebooks, but they are all scattered), I will make an unstable branch that includes all the changes I have made, and then document it, solving the puzzles one by one.


Summing Up :)

Looking back, my journey was full of:

  • Struggles (debugging metadata, bad first PRs, HDU errors).

  • Learning (structs in Julia, broadcasting, telescope instruments, GTIs).

  • Wins (my first merge, lightcurve recipes, big Cross Spectrum PR).

Most importantly, it connected my childhood passion for astronomy with my programming skills.

Working with @fjebaker and @matteobachetti, and collaborating with peers like Jawd Ahmad, made this journey unforgettable.

PR    Title / Description                        Status
#65   Cross-spectrum & recipes                   Open
#60   Lightcurve recipes (GTI/BTI, plotting)     Open
#59   Power spectrum recipes                     Open
#58   Spectra adjustments                        Open
#57   Powerspectrum & AveragedPowerspectrum      Open
#53   GTI interface methods                      Closed
#52   Good Time Interval enhancements            Closed
#51   Lightcurve creation & rebinning            Closed
#49   Mission support additions                  Drafted
#47   `EventList` read event implementation      Closed
#30   Initial `readevent` groundwork             Closed
#61   Events feature                             Closed
