Overview

Why should the metallicity evolution be constrained? While the above methods work well for optimizing the per-template $r_j$ as a means for fitting SFHs, these methods can produce metallicity evolutions that could be considered unphysical, with large changes in the mean metallicity over small changes in time. An example of this type of behavior is shown in the SFH fit below.

Example of a SFH fit with variations in the metallicity evolution.

While some metallicity variation in the star-forming gas is to be expected, these variations in the SFH fit can end up being quite large depending on the data and isochrone grid adopted. A solution is to construct a more physically-motivated model.

We can do this using a hierarchical model with a parameterized metallicity evolution where the the $r_j$ are not the parameters directly optimized. Rather, we can optimize one stellar mass (or star formation rate) parameter per age bin, and then a number of metallicity evolution parameters that determine how that stellar mass is split between models with different metallicities at fixed age.

In most star formation history analyses, the metallicities are constrained through age-metallicity relations (AMRs), where the mean metallicity at time $t$ is a function of time and a small set of metallicity evolution parameters. A popular AMR model is the linear age-metallicity relation $\langle [\text{M}/\text{H}] \rangle (t) = \alpha \, \left( T_\text{max} - t \right) + \beta$ with a Gaussian distribution in metallicity at fixed age. $T_\text{max}$ here is the earliest lookback time under consideration such that $\langle [\text{M}/\text{H}] \rangle (T_\text{max}) = \beta$. This model is described in more detail here.

AMRs have historically been popular because they are generally capable of producing reasonable fits to observed data and it is relatively easy to derive the gradient of the objective function with respect to the AMR parameters analytically. However, in AMR models there is no direct link between the SFRs being fit and the metallicity evolution as a function of time, even though the two should in principle have some correlation as stellar processes are responsible for enriching the ISM.

A promising avenue of research involves fitting mass-metallicity relations (MZRs) rather than AMRs. In these models, the mean metallicity of stars forming at time $t$ is a function of the total stellar mass of the population at that time – therefore, the mean metallicity evolution changes self-consistently with the SFRs during the fitting process, resulting in a metallicity evolution that is meaningfully coupled to the star formation history. Additionally, AMRs can be difficult to compare between different galaxies because they do not reflect the different SFHs of the galaxies, whereas MZRs can be compared between galaxies much more easily. Our methods for MZR fitting are described in more detail here.

Generic Methods

While there are some methods in this package that are unique to AMR or MZR models, we present a minimal unified interface that can be used to fit SFHs under both types of models. To support multiple dispatch, we define AbstractMetallicityModel as the abstract supertype of AbstractAMR and AbstractMZR, which are each the supertypes for AMR and MZR types, respectively.

StarFormationHistories.AbstractMetallicityModel — Type

AbstractMetallicityModel{T <: Real} is the abstract supertype for all hierarchical metallicity models. Abstract subtypes are AbstractAMR for age-metallicity relations and AbstractMZR for mass-metallicity relations.

source

The generic methods that can be used for both AMRs and MZRs are described here. The main method for obtaining best-fit star formation histories is fit_sfh.

StarFormationHistories.fit_sfh — Function

fit_sfh(MH_model0::AbstractMetallicityModel,
        disp_model0::AbstractDispersionModel,
        models::AbstractMatrix{<:Number},
        data::AbstractVector{<:Number},
        logAge::AbstractVector{<:Number},
        metallicities::AbstractVector{<:Number};
        x0::AbstractVector{<:Number} = <...>
        kws...)
fit_sfh(MH_model0::AbstractMetallicityModel,
        disp_model0::AbstractDispersionModel,
        models::AbstractVector{<:AbstractMatrix{<:Number}},
        data::AbstractMatrix{<:Number},
        logAge::AbstractVector{<:Number},
        metallicities::AbstractVector{<:Number};
        x0::AbstractVector{<:Number} = <...>
        kws...)

Returns a CompositeBFGSResult instance that contains the maximum a posteriori (MAP) and maximum likelihood estimates (MLE) obtained from fitting the provided simple stellar population (SSP) templates models (with logarithmic ages logAge = log10(age [yr]) and metallicities metallicities) to the provided data. The metallicity evolution is modelled using the provided MH_model0, whose parameters can be free or fixed, with metallicity dispersion at fixed time modelled by disp_model0, whose parameters can be free or fixed.

This method is designed to work best with a grid of stellar models, defined by the outer product of N unique entries in logAge and M unique entries in metallicities. See the examples for more information on usage.

We provide several options for age-metallicity relations and mass-metallicity relations that can be used for MH_model0 and define APIs for users to create new models that will integrate with this function. Similar flexibility is allowed for the metallicity dispersion model disp_model0.

The primary method signature uses flattened formats for models and data. See the notes for the flattened call signature of StarFormationHistories.composite! for more details, as well as stack_models that facilitates rearranging the models into this flattened format.

Arguments

MH_model0 is an instance of AbstractMetallicityModel that defines how the average metallicity stars being formed in the population changes over time. The fittable parameters contained in this instance are used as the initial values to start the optimization.
disp_model0 is an instance of AbstractDispersionModel that defines the distribution of metallicities of stars forming in a fixed time bin (i.e., the dispersion in metallicity around the mean at fixed time). The fittable parameters contained in this instance are used as the initial values to start the optimization.
models are the template Hess diagrams for the SSPs that compose the observed Hess diagram.
data is the Hess diagram for the observed data.
logAge::AbstractVector{<:Number} is the vector containing the effective ages of the stellar populations used to create the templates in models, in units of log10(age [yr]). For example, if a population has an age of 1 Myr, its entry in logAge should be log10(10^6) = 6.0.
metallicities::AbstractVector{<:Number} is the vector containing the effective metallicities of the stellar populations used to create the templates in models. These should be logarithmic abundances like [M/H] or [Fe/H]. There are some notes on the Wikipedia that might be useful.

Keyword Arguments

x0 is the vector of initial guesses for the stellar mass coefficients per unique entry in logAge. We try to set reasonable defaults, but in most cases users should be calculating and passing this keyword argument. We provide StarFormationHistories.construct_x0_mdf to prepare x0 assuming a constant star formation rate and total stellar mass, which is typically a good initial guess.
kws... are passed to Optim.Options and can be used to control tolerances for convergence.

Returns

This function returns a CompositeBFGSResult that contains the output from both MLE and MAP optimizations, accessible via result.mle and result.map. These are each instances of BFGSResult. See the docs for these structs for more information.

source

This function returns an instance of CompositeBFGSResult.

StarFormationHistories.CompositeBFGSResult — Type

CompositeBFGSResult(map::BFGSResult, mle::BFGSResult)

Type for containing the maximum a posteriori (MAP) AND maximum likelihood estimate (MLE) results from BFGS optimizations that use Optim.jl, which are individually accessible via the :mle and :map properties (i.e., for an instance of this type t, t.mle or getproperty(t, :mle) and t.map or getproperty(t, :map)).

Random samples can be drawn from an instance t as rand(t, N::Integer). This will return a size length(μ) x N matrix. This will use the MLE result for the best-fit values and the inverse Hessian approximation to the covariance matrix from the MAP result, which is more robust when best-fit values that are constrained to be positive approach 0.

Per-SSP coefficients can be calculated with calculate_coeffs(result::CompositeBFGSResult, logAge::AbstractVector{<:Number}, metallicities::AbstractVector{<:Number}), which uses the MLE result (see these docs).

source

StarFormationHistories.BFGSResult — Type

BFGSResult(μ::AbstractVector{<:Number},
           σ::AbstractVector{<:Number},
           invH::AbstractMatrix{<:Number},
           result,
           MH_model::AbstractMetallicityModel,
           disp_model::AbstractDispersionModel)

Type for containing the maximum likelihood estimate (MLE) or maximum a posteriori (MAP) results from BFGS optimizations that use Optim.jl. Fields are as follows:

μ contains the final values of the fitting parameters. The mode and median methods will both return μ, but the mean of samples is not always equal to μ due to the variable transformations we perform.
σ contains the standard errors estimated for the parameters and is returned by the std method.
invH is the BFGS approximation to the inverse Hessian, which is an estimator for the covariance matrix of the parameters if the objective function is approximately Gaussian near the best-fit μ.
result is the full result object returned by Optim.jl.
MH_model is the best-fit metallicity model.
disp_model is the best-fit metallicity dispersion model.

This type is implemented as a subtype of Distributions.Sampleable{Multivariate, Continuous} to enable sampling from an estimate of the likelihood / posterior distribution constructed from the invH. You can obtain N::Integer samples from the distribution with rand(R, N) where R is an instance of this type. This will return a size length(μ) x N matrix.

You can also directly obtain the per-SSP template coefficients ($r_{j,k}$ in the derivation) using the optimization results stored in a BFGSResult with calculate_coeffs.

See also

CompositeBFGSResult is a type that contains two instances of BFGSResult, one for the MAP and one for the MLE.

source

This can be used to obtain random samples under a multivariable Normal approximation to the posterior or used to initialize a Hamiltonian Monte Carlo (HMC) sampling process to obtain more accurate posterior samples with sample_sfh and its multi-threaded alternative tsample_sfh.

StarFormationHistories.sample_sfh — Function

sample_sfh(bfgs_result::CompositeBFGSResult, 
           models::AbstractMatrix{<:Number},
           data::AbstractVector{<:Number},
           logAge::AbstractVector{<:Number},
           metallicities::AbstractVector{<:Number},
           Nsteps::Integer;
           ϵ::Real = 0.05, # HMC step size
           reporter = DynamicHMC.ProgressMeterReport(),
           show_convergence::Bool = true,
           rng::AbstractRNG = default_rng())
sample_sfh(bfgs_result::CompositeBFGSResult, 
           models::AbstractVector{<:AbstractMatrix{<:Number}},
           data::AbstractMatrix{<:Number},
           logAge::AbstractVector{<:Number},
           metallicities::AbstractVector{<:Number},
           Nsteps::Integer;
           kws...)

Takes the SFH fitting result in bfgs_result and uses it to initialize the Hamiltonian Monte Carlo (HMC) sampler from DynamicHMC.jl to sample Nsteps independent draws from the posterior.

Arguments

models, data, logAge, metallicities are as in fit_sfh.
Nsteps is the number of Monte Carlo samples you want to draw.

Keyword Arguments

ϵ is the HMC step size. Convergence of the HMC samples is checked after sampling and if a convergence warning is issued, you should decrease this value.
reporter is a valid reporter type from DynamicHMC.jl, either NoProgressReport, ProgressMeterReport for a basic progress meter, or LogProgressReport for more detailed reporting.
show_convergence if true, will send sample convergence statistics to the default display.
rng is a Random.AbstractRNG sampler instance that will be used when generating the random samples.

Returns

A NamedTuple with two elements:

posterior_matrix is a Matrix with dimensions (npar, Nsteps) where npar is the number of fitting variables in the problem and is npar = length(bfgs_result.mle.μ). Each column is one independent sample.
tree_statistics contains convergence statistics that can be viewed with DynamicHMC.Diagnostics.summarize_tree_statistics.

See also

tsample_sfh for multi-threaded version.

source

StarFormationHistories.tsample_sfh — Function

tsample_sfh(bfgs_result::CompositeBFGSResult, 
            models::AbstractMatrix{<:Number},
            data::AbstractVector{<:Number},
            logAge::AbstractVector{<:Number},
            metallicities::AbstractVector{<:Number},
            Nsteps::Integer;
            ϵ::Real = 0.05, # HMC step size
            show_convergence::Bool=true,
            show_progress::Bool=true,
            rng::AbstractRNG=default_rng(),
            chain_length::Integer=100)
tsample_sfh(bfgs_result::CompositeBFGSResult, 
            models::AbstractVector{<:AbstractMatrix{<:Number}},
            data::AbstractMatrix{<:Number},
            logAge::AbstractVector{<:Number},
            metallicities::AbstractVector{<:Number},
            Nsteps::Integer;
            kws...)

Multi-threaded version of sample_sfh; see that method's documentation for details.

Implementation

This method splits the requested number of samples Nsamples into a number of independent HMC chains, each of which has length chain_length. Initial positions for each chain are randomly drawn from the multivariate Gaussian approximation to the objective function stored in bfgs_result, approximating a warm start. Smaller values of chain_length achieve better load balancing while larger values of chain_length allow each chain more time to mix (see also [5]). The default value of 100 results in good mixing with 24 fitting variables and a well-scaled step length ϵ – higher dimensional problems should increase chain_length. The downside to large chain_length is poor load balancing across available threads resulting in longer runtimes.

Notes

if show_progress is true, we will show a progress bar that updates when individual chains complete. Currently this is not terribly useful unless the total number of chains is much greater than the number of available threads.

source

The per-SSP stellar mass coefficients ($r_{j,k}$ in the derivation) can be derived from a metallicity model, a metallicity dispersion model, the per-unique-log(age) stellar mass coefficients ($R_j$ in the derivation), and the set of SSP logarithmic ages logAge = log10(age [yr]) and metallicites using calculate_coeffs. Alternatively a CompositeBFGSResult can be fed into this method and the first three arguments will be read from the result object.

StarFormationHistories.calculate_coeffs — Function

calculate_coeffs(MH_model::AbstractMetallicityModel,
                 disp_model::AbstractDispersionModel,
                 mstars::AbstractVector{<:Number}, 
                 logAge::AbstractVector{<:Number},
                 metallicities::AbstractVector{<:Number})

Returns per-SSP stellar mass coefficients ($r_{j,k}$ in the derivation) using the provided metallicity model MH_model and metallicity dispersion model disp_model for the set of SSPs with logarithmic ages logAge and metallicities metallicities.

Examples

julia> n_logage, n_mh = 10, 20; # Number of unique logAges, MHs

julia> coeffs = calculate_coeffs(PowerLawMZR(1.0, -1.0),
                                 GaussianDispersion(0.2),
                                 rand(n_logage),
                                 repeat(range(7.0, 10.0; length=n_logage); inner=n_mh),
                                 repeat(range(-2.0, 0.0; length=n_mh); outer=n_logage));

julia> coeffs isa Vector{Float64}
true

julia> length(coeffs) == n_logage * n_mh
true

source

We can use the results of our fit to estimate uncertainties on the cumulative SFH, SFRs, and mean metallicity as a function of time with StarFormationHistories.cum_sfr_quantiles – this is an extension of the more basic StarFormationHistories.calculate_cum_sfr.

StarFormationHistories.cum_sfr_quantiles — Function

(cum_sfh, sfr, mean_MH, samples) = 
    cum_sfr_quantiles(result::Union{CompositeBFGSResult,BFGSResult},
                      logAge::AbstractVector{<:Number},
                      MH::AbstractVector{<:Number}, T_max::Number,
                      Nsamples::Integer, q; kws...)

Draws Nsamples independent star formation histories from the solution result and calculates quantiles q across the samples in each unique time bin (i.e., unique(logAge)) for the cumulative star formation histories, star formation rates, and mean metallicities. Also returns the drawn samples.

Arguments

result::Union{CompositeBFGSResult,BFGSResult} is a BFGS result object as returned by fit_sfh, for example, whose contents will be used to sample random, independent star formation histories. If this is a CompositeBFGSResult containing both the MLE and MAP solutions, then the MLE solution is used for the best-fit values (SFRs and metallicity parameters) and the MAP solution is used for the uncertainty estimate.
logAge::AbstractVector is a vector giving the log10(age [yr]) of the stellar populations that were used to derive result. For the purposes of calculating star formation rates, these are assumed to be left-bin edges.
MH::AbstractVector is a vector giving the metallicities of the stellar populations that were used to derive result.
T_max::Number is the rightmost final bin edge for calculating star formation rates in units of Gyr. For example, you might have logAge=[6.6, 6.7, 6.8] in which case a final logAge of 6.9 would give equal bin widths. In this case you would set T_max = exp10(6.9) / 1e9 ≈ 0.0079 so that the width of the final bin for the star formation rate calculation has the same log10(Age [yr]) step as the other bins.
Nsamples::Integer is the number of random, independent star formation histories to draw from result to use when calculating quantiles.
q are the quantiles you wish to have calculated. Can be one number (i.e., 0.5 for median), or most iterable types (e.g., (0.16, 0.5, 0.84) for median and 1-σ range).

Keyword Arguments

kws... are passed to calculate_cum_sfr, see that method's documentation for more information.

Returns

cum_sfh::Matrix has size (length(unique(logAge)), length(q)) and contains the normalized cumulative SFH. This is ~1 at the most recent time in logAge and decreases as logAge increases.
sfr::Matrix has size (length(unique(logAge)), length(q)) and gives the star formation rate across each bin in unique(logAge).
mean_MH::Matrix has size (length(unique(logAge)), length(q)) and gives the stellar-mass-weighted mean metallicity of the stellar population as a function of unique(logAge).
samples::Matrix has size (length(unique(logAge)), Nsamples) and contains the unique samples drawn from result that were used to derive the quantiles.

Examples

julia> result = fit_sfh(...)
CompositeBFGSResult{...} ...

julia> q = (0.16, 0.5, 0.84) # quantiles we want

julia> result = cum_sfr_quantiles(result, logAge, MH, 13.7, 10_000, q);

julia> length(result) == 4
true

julia> all(size(result[i]) == (length(unique(logAge)), length(q)) for i in 1:3)
true

julia> size(result[4], 2) == 10_000

source

We can calculate $\tau$ statistics with StarFormationHistories.tau – for example, $\tau_{90}$ indicates the time at which the the galaxy formed 90% of its present-day stellar mass.

StarFormationHistories.tau — Function

tau(τ, unique_logAge, max_logAge, cum_sfh [, lower, upper])

Returns the lookback time (in Gyr) at which τ percent of the total stellar mass of a galaxy formed. See tau_interp for the underlying interpolator function.

Arguments

τ: Fraction of a galaxy's total stellar mass for which you wish to know when the galaxy had that much mass (e.g., 0.50 for τ_50).
unique_logAge should contain the list of unique log10(age [yr]) for which the SFH solution was derived.
max_logAge should be the maximum log10(age [yr]) that you wish to consider – this sets when the stellar mass of the galaxy was 0.
cum_sfh is an array of the cumulative stellar mass of the galaxy corresponding to the lookback times in unique_logAge; this can be calculated with calculate_cum_sfr or cum_sfr_quantiles.

Optional Arguments

lower is an array of the cumulative stellar mass of the galaxy at some lower confidence level (e.g., 16% for 1-σ); if provided, the function will also return the lower bound on t.
upper is an array of the cumulative stellar mass of the galaxy at some higher confidence level (e.g., 84% for 1-σ); if provided, the function will also return the upper bound on t.

Returns

t::Number the lookback time in Gyr when the galaxy's stellar mass was τ times its total birth stellar mass.

If lower and upper are provided, a tuple of three numbers is returned: (t_lower, t_mle, t_upper).

Examples

First we will verify that tau returns the correct lookback time when the requested τ value corresponds exactly to a value in cum_sfh:

julia> unique_logAge = [8.0, 8.5, 9.0, 9.5, 10.0];

julia> max_logAge = 10.13;

julia> cum_sfh = [1.0, 0.8, 0.5, 0.2, 0.1];

julia> StarFormationHistories.tau(0.5, unique_logAge, max_logAge, cum_sfh) # Exact result
1.0

Now we will show an example where τ does not correspond exactly to a value in cum_sfh, in which case the function will interpolate between the two nearest values in cum_sfh to return the lookback time corresponding to the requested τ. The interpolation is performed in linear age space.

julia> cum_sfh = [1.0, 0.8, 0.4, 0.2, 0.1]; # Interpolation between log(age) = 8.5 and 9.0

julia> StarFormationHistories.tau(0.5, unique_logAge, max_logAge, cum_sfh) ≈ 0.8290569415042095
true

We can also pass multiple τ values at once:

julia> StarFormationHistories.tau([0.5, 0.75], unique_logAge, max_logAge, cum_sfh) ≈ [0.8290569415042095, 0.40169929526473325]
true

This will also work with reversed cum_sfh and unique_logAge:

julia> StarFormationHistories.tau(0.5, reverse(unique_logAge), max_logAge, reverse(cum_sfh)) ≈ 0.8290569415042095
true

Now we can pass in lower and upper estimates on the cumulative stellar mass to get confidence intervals on τ; the result is returned as a size (length(τ), 3) matrix where the first column is the lower bound on τ, the second column is the best estimate of τ, and the third column is the upper bound on τ:

julia> upper = [1.0, 0.85, 0.6, 0.3, 0.15];

julia> lower = [1.0, 0.75, 0.3, 0.1, 0.05];

julia> StarFormationHistories.tau(0.5, unique_logAge, max_logAge, cum_sfh, lower, upper) ≈ [0.6961012293408169  0.8290569415042095  1.7207592200561264]
true

And calling with a vector of τ values also works in this case:

julia> StarFormationHistories.tau([0.5, 0.75], unique_logAge, max_logAge, cum_sfh, lower, upper) ≈ [0.6961012293408169  0.8290569415042095  1.7207592200561264; 0.31622776601683794  0.40169929526473325  0.5897366596101027]
true

source

tau(result, τ, logAge, MH, max_logAge; Nsamples=10_000, q=(0.16, 0.5, 0.84), kws...)

Returns the lookback time (in Gyr) at which τ percent of the total stellar mass of a galaxy formed with upper and lower uncertainty estimates. Uses the solution result to draw samples of the cumulative star formation history and then calculates quantiles across those samples.

Arguments

result::Union{CompositeBFGSResult, BFGSResult} is a BFGS result object as returned by fit_sfh, for example, whose contents will be used to sample random, independent star formation histories via cum_sfr_quantiles.
τ: Fraction of a galaxy's total stellar mass for which you wish to know when the galaxy had that much mass (e.g., 0.50 for τ_50). Can also be a vector of multiple τ values (e.g., [0.5, 0.75, 0.9]).
logAge::AbstractVector is a vector giving the log10(age [yr]) of the stellar populations that were used to derive result. For the purposes of calculating star formation rates, these are assumed to be left-bin edges.
MH::AbstractVector is a vector giving the metallicities of the stellar populations that were used to derive result.
max_logAge should be the maximum log10(age [yr]) that you wish to consider – this sets when the stellar mass of the galaxy was 0.

Keyword Arguments

Nsamples::Integer is the number of random, independent star formation histories to draw from result to use when calculating quantiles.
q is passed to cum_sfr_quantiles to calculate quantiles on the cumulative star formation history samples. This must be a length 3 tuple of quantiles (e.g., (0.16, 0.5, 0.84)) to get lower, best, and upper estimates on τ. It is not recommended that you change this.
kws... are passed to calculate_cum_sfr, see that method's documentation for more information.

Returns

t::Matrix, a size (length(τ), 3) matrix where the first column is the lower bound on τ, the second column is the best estimate of τ, and the third column is the upper bound on τ.

source

StarFormationHistories.tau_interp — Function

tau_interp(unique_logAge, max_logAge, cum_sfh)

Returns an interpolator for the lookback time (in Gyr) at which a given fraction of the total stellar mass of a galaxy formed. This is a helper function for tau.

source