Overview
Why should the metallicity evolution be constrained? While the above methods work well for optimizing the per-template $r_j$ as a means for fitting SFHs, these methods can produce metallicity evolutions that could be considered unphysical, with large changes in the mean metallicity over small changes in time. An example of this type of behavior is shown in the SFH fit below.

While some metallicity variation in the star-forming gas is to be expected, these variations in the SFH fit can end up being quite large depending on the data and isochrone grid adopted. A solution is to construct a more physically-motivated model.
We can do this using a hierarchical model with a parameterized metallicity evolution where the the $r_j$ are not the parameters directly optimized. Rather, we can optimize one stellar mass (or star formation rate) parameter per age bin, and then a number of metallicity evolution parameters that determine how that stellar mass is split between models with different metallicities at fixed age.
In most star formation history analyses, the metallicities are constrained through age-metallicity relations (AMRs), where the mean metallicity at time $t$ is a function of time and a small set of metallicity evolution parameters. A popular AMR model is the linear age-metallicity relation $\langle [\text{M}/\text{H}] \rangle (t) = \alpha \, \left( T_\text{max} - t \right) + \beta$ with a Gaussian distribution in metallicity at fixed age. $T_\text{max}$ here is the earliest lookback time under consideration such that $\langle [\text{M}/\text{H}] \rangle (T_\text{max}) = \beta$. This model is described in more detail here.
AMRs have historically been popular because they are generally capable of producing reasonable fits to observed data and it is relatively easy to derive the gradient of the objective function with respect to the AMR parameters analytically. However, in AMR models there is no direct link between the SFRs being fit and the metallicity evolution as a function of time, even though the two should in principle have some correlation as stellar processes are responsible for enriching the ISM.
A promising avenue of research involves fitting mass-metallicity relations (MZRs) rather than AMRs. In these models, the mean metallicity of stars forming at time $t$ is a function of the total stellar mass of the population at that time – therefore, the mean metallicity evolution changes self-consistently with the SFRs during the fitting process, resulting in a metallicity evolution that is meaningfully coupled to the star formation history. Additionally, AMRs can be difficult to compare between different galaxies because they do not reflect the different SFHs of the galaxies, whereas MZRs can be compared between galaxies much more easily. Our methods for MZR fitting are described in more detail here.
Generic Methods
While there are some methods in this package that are unique to AMR or MZR models, we present a minimal unified interface that can be used to fit SFHs under both types of models. To support multiple dispatch, we define AbstractMetallicityModel as the abstract supertype of AbstractAMR and AbstractMZR, which are each the supertypes for AMR and MZR types, respectively.
StarFormationHistories.AbstractMetallicityModel — Type
AbstractMetallicityModel{T <: Real} is the abstract supertype for all hierarchical metallicity models. Abstract subtypes are AbstractAMR for age-metallicity relations and AbstractMZR for mass-metallicity relations.
The generic methods that can be used for both AMRs and MZRs are described here. The main method for obtaining best-fit star formation histories is fit_sfh.
StarFormationHistories.fit_sfh — Function
fit_sfh(MH_model0::AbstractMetallicityModel,
disp_model0::AbstractDispersionModel,
models::AbstractMatrix{<:Number},
data::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number};
x0::AbstractVector{<:Number} = <...>
kws...)
fit_sfh(MH_model0::AbstractMetallicityModel,
disp_model0::AbstractDispersionModel,
models::AbstractVector{<:AbstractMatrix{<:Number}},
data::AbstractMatrix{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number};
x0::AbstractVector{<:Number} = <...>
kws...)Returns a CompositeBFGSResult instance that contains the maximum a posteriori (MAP) and maximum likelihood estimates (MLE) obtained from fitting the provided simple stellar population (SSP) templates models (with logarithmic ages logAge = log10(age [yr]) and metallicities metallicities) to the provided data. The metallicity evolution is modelled using the provided MH_model0, whose parameters can be free or fixed, with metallicity dispersion at fixed time modelled by disp_model0, whose parameters can be free or fixed.
This method is designed to work best with a grid of stellar models, defined by the outer product of N unique entries in logAge and M unique entries in metallicities. See the examples for more information on usage.
We provide several options for age-metallicity relations and mass-metallicity relations that can be used for MH_model0 and define APIs for users to create new models that will integrate with this function. Similar flexibility is allowed for the metallicity dispersion model disp_model0.
The primary method signature uses flattened formats for models and data. See the notes for the flattened call signature of StarFormationHistories.composite! for more details, as well as stack_models that facilitates rearranging the models into this flattened format.
Arguments
MH_model0is an instance ofAbstractMetallicityModelthat defines how the average metallicity stars being formed in the population changes over time. The fittable parameters contained in this instance are used as the initial values to start the optimization.disp_model0is an instance ofAbstractDispersionModelthat defines the distribution of metallicities of stars forming in a fixed time bin (i.e., the dispersion in metallicity around the mean at fixed time). The fittable parameters contained in this instance are used as the initial values to start the optimization.modelsare the template Hess diagrams for the SSPs that compose the observed Hess diagram.datais the Hess diagram for the observed data.logAge::AbstractVector{<:Number}is the vector containing the effective ages of the stellar populations used to create the templates inmodels, in units oflog10(age [yr]). For example, if a population has an age of 1 Myr, its entry inlogAgeshould belog10(10^6) = 6.0.metallicities::AbstractVector{<:Number}is the vector containing the effective metallicities of the stellar populations used to create the templates inmodels. These should be logarithmic abundances like [M/H] or [Fe/H]. There are some notes on the Wikipedia that might be useful.
Keyword Arguments
x0is the vector of initial guesses for the stellar mass coefficients per unique entry inlogAge. We try to set reasonable defaults, but in most cases users should be calculating and passing this keyword argument. We provideStarFormationHistories.construct_x0_mdfto preparex0assuming a constant star formation rate and total stellar mass, which is typically a good initial guess.kws...are passed toOptim.Optionsand can be used to control tolerances for convergence.
Returns
- This function returns a
CompositeBFGSResultthat contains the output from both MLE and MAP optimizations, accessible viaresult.mleandresult.map. These are each instances ofBFGSResult. See the docs for these structs for more information.
This function returns an instance of CompositeBFGSResult.
StarFormationHistories.CompositeBFGSResult — Type
CompositeBFGSResult(map::BFGSResult, mle::BFGSResult)Type for containing the maximum a posteriori (MAP) AND maximum likelihood estimate (MLE) results from BFGS optimizations that use Optim.jl, which are individually accessible via the :mle and :map properties (i.e., for an instance of this type t, t.mle or getproperty(t, :mle) and t.map or getproperty(t, :map)).
Random samples can be drawn from an instance t as rand(t, N::Integer). This will return a size length(μ) x N matrix. This will use the MLE result for the best-fit values and the inverse Hessian approximation to the covariance matrix from the MAP result, which is more robust when best-fit values that are constrained to be positive approach 0.
Per-SSP coefficients can be calculated with calculate_coeffs(result::CompositeBFGSResult, logAge::AbstractVector{<:Number}, metallicities::AbstractVector{<:Number}), which uses the MLE result (see these docs).
StarFormationHistories.BFGSResult — Type
BFGSResult(μ::AbstractVector{<:Number},
σ::AbstractVector{<:Number},
invH::AbstractMatrix{<:Number},
result,
MH_model::AbstractMetallicityModel,
disp_model::AbstractDispersionModel)Type for containing the maximum likelihood estimate (MLE) or maximum a posteriori (MAP) results from BFGS optimizations that use Optim.jl. Fields are as follows:
μcontains the final values of the fitting parameters. Themodeandmedianmethods will both returnμ, but the mean of samples is not always equal toμdue to the variable transformations we perform.σcontains the standard errors estimated for the parameters and is returned by thestdmethod.invHis the BFGS approximation to the inverse Hessian, which is an estimator for the covariance matrix of the parameters if the objective function is approximately Gaussian near the best-fitμ.resultis the full result object returned by Optim.jl.MH_modelis the best-fit metallicity model.disp_modelis the best-fit metallicity dispersion model.
This type is implemented as a subtype of Distributions.Sampleable{Multivariate, Continuous} to enable sampling from an estimate of the likelihood / posterior distribution constructed from the invH. You can obtain N::Integer samples from the distribution with rand(R, N) where R is an instance of this type. This will return a size length(μ) x N matrix.
You can also directly obtain the per-SSP template coefficients ($r_{j,k}$ in the derivation) using the optimization results stored in a BFGSResult with calculate_coeffs.
See also
CompositeBFGSResultis a type that contains two instances ofBFGSResult, one for the MAP and one for the MLE.
This can be used to obtain random samples under a multivariable Normal approximation to the posterior or used to initialize a Hamiltonian Monte Carlo (HMC) sampling process to obtain more accurate posterior samples with sample_sfh and its multi-threaded alternative tsample_sfh.
StarFormationHistories.sample_sfh — Function
sample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractMatrix{<:Number},
data::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
ϵ::Real = 0.05, # HMC step size
reporter = DynamicHMC.ProgressMeterReport(),
show_convergence::Bool = true,
rng::AbstractRNG = default_rng())
sample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractVector{<:AbstractMatrix{<:Number}},
data::AbstractMatrix{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
kws...)Takes the SFH fitting result in bfgs_result and uses it to initialize the Hamiltonian Monte Carlo (HMC) sampler from DynamicHMC.jl to sample Nsteps independent draws from the posterior.
The primary method signature uses flattened formats for models and data. See the notes for the flattened call signature of StarFormationHistories.composite! for more details, as well as stack_models that facilitates rearranging the models into this flattened format.
Arguments
models, data, logAge, metallicitiesare as infit_sfh.Nstepsis the number of Monte Carlo samples you want to draw.
Keyword Arguments
ϵis the HMC step size. Convergence of the HMC samples is checked after sampling and if a convergence warning is issued, you should decrease this value.reporteris a valid reporter type from DynamicHMC.jl, eitherNoProgressReport,ProgressMeterReportfor a basic progress meter, orLogProgressReportfor more detailed reporting.show_convergenceiftrue, will send sample convergence statistics to the default display.rngis aRandom.AbstractRNGsampler instance that will be used when generating the random samples.
Returns
A NamedTuple with two elements:
posterior_matrixis aMatrixwith dimensions(npar, Nsteps)wherenparis the number of fitting variables in the problem and isnpar = length(bfgs_result.mle.μ). Each column is one independent sample.tree_statisticscontains convergence statistics that can be viewed withDynamicHMC.Diagnostics.summarize_tree_statistics.
See also
tsample_sfhfor multi-threaded version.
StarFormationHistories.tsample_sfh — Function
tsample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractMatrix{<:Number},
data::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
ϵ::Real = 0.05, # HMC step size
show_convergence::Bool=true,
show_progress::Bool=true,
rng::AbstractRNG=default_rng(),
chain_length::Integer=100)
tsample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractVector{<:AbstractMatrix{<:Number}},
data::AbstractMatrix{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
kws...)Multi-threaded version of sample_sfh; see that method's documentation for details.
Implementation
This method splits the requested number of samples Nsamples into a number of independent HMC chains, each of which has length chain_length. Initial positions for each chain are randomly drawn from the multivariate Gaussian approximation to the objective function stored in bfgs_result, approximating a warm start. Smaller values of chain_length achieve better load balancing while larger values of chain_length allow each chain more time to mix (see also [5]). The default value of 100 results in good mixing with 24 fitting variables and a well-scaled step length ϵ – higher dimensional problems should increase chain_length. The downside to large chain_length is poor load balancing across available threads resulting in longer runtimes.
Notes
- if
show_progressistrue, we will show a progress bar that updates when individual chains complete. Currently this is not terribly useful unless the total number of chains is much greater than the number of available threads.
The per-SSP stellar mass coefficients ($r_{j,k}$ in the derivation) can be derived from a metallicity model, a metallicity dispersion model, the per-unique-log(age) stellar mass coefficients ($R_j$ in the derivation), and the set of SSP logarithmic ages logAge = log10(age [yr]) and metallicites using calculate_coeffs. Alternatively a CompositeBFGSResult can be fed into this method and the first three arguments will be read from the result object.
StarFormationHistories.calculate_coeffs — Function
calculate_coeffs(MH_model::AbstractMetallicityModel,
disp_model::AbstractDispersionModel,
mstars::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number})Returns per-SSP stellar mass coefficients ($r_{j,k}$ in the derivation) using the provided metallicity model MH_model and metallicity dispersion model disp_model for the set of SSPs with logarithmic ages logAge and metallicities metallicities.
Examples
julia> n_logage, n_mh = 10, 20; # Number of unique logAges, MHs
julia> coeffs = calculate_coeffs(PowerLawMZR(1.0, -1.0),
GaussianDispersion(0.2),
rand(n_logage),
repeat(range(7.0, 10.0; length=n_logage); inner=n_mh),
repeat(range(-2.0, 0.0; length=n_mh); outer=n_logage));
julia> coeffs isa Vector{Float64}
true
julia> length(coeffs) == n_logage * n_mh
trueWe can use the results of our fit to estimate uncertainties on the cumulative SFH, SFRs, and mean metallicity as a function of time with StarFormationHistories.cum_sfr_quantiles – this is an extension of the more basic StarFormationHistories.calculate_cum_sfr.
StarFormationHistories.cum_sfr_quantiles — Function
(cum_sfh, sfr, mean_MH, samples) =
cum_sfr_quantiles(result::Union{CompositeBFGSResult,BFGSResult},
logAge::AbstractVector{<:Number},
MH::AbstractVector{<:Number}, T_max::Number,
Nsamples::Integer, q; kws...)Draws Nsamples independent star formation histories from the solution result and calculates quantiles q across the samples in each unique time bin (i.e., unique(logAge)) for the cumulative star formation histories, star formation rates, and mean metallicities. Also returns the drawn samples.
Arguments
result::Union{CompositeBFGSResult,BFGSResult}is a BFGS result object as returned byfit_sfh, for example, whose contents will be used to sample random, independent star formation histories. If this is aCompositeBFGSResultcontaining both the MLE and MAP solutions, then the MLE solution is used for the best-fit values (SFRs and metallicity parameters) and the MAP solution is used for the uncertainty estimate.logAge::AbstractVectoris a vector giving thelog10(age [yr])of the stellar populations that were used to deriveresult. For the purposes of calculating star formation rates, these are assumed to be left-bin edges.MH::AbstractVectoris a vector giving the metallicities of the stellar populations that were used to deriveresult.T_max::Numberis the rightmost final bin edge for calculating star formation rates in units of Gyr. For example, you might havelogAge=[6.6, 6.7, 6.8]in which case a final logAge of 6.9 would give equal bin widths. In this case you would setT_max = exp10(6.9) / 1e9 ≈ 0.0079so that the width of the final bin for the star formation rate calculation has the samelog10(Age [yr])step as the other bins.Nsamples::Integeris the number of random, independent star formation histories to draw fromresultto use when calculating quantiles.qare the quantiles you wish to have calculated. Can be one number (i.e.,0.5for median), or most iterable types (e.g.,(0.16, 0.5, 0.84)for median and 1-σ range).
Keyword Arguments
kws...are passed tocalculate_cum_sfr, see that method's documentation for more information.
Returns
cum_sfh::Matrixhas size(length(unique(logAge)), length(q))and contains the normalized cumulative SFH. This is ~1 at the most recent time inlogAgeand decreases aslogAgeincreases.sfr::Matrixhas size(length(unique(logAge)), length(q))and gives the star formation rate across each bin inunique(logAge).mean_MH::Matrixhas size(length(unique(logAge)), length(q))and gives the stellar-mass-weighted mean metallicity of the stellar population as a function ofunique(logAge).samples::Matrixhas size(length(unique(logAge)), Nsamples)and contains the unique samples drawn fromresultthat were used to derive the quantiles.
Examples
julia> result = fit_sfh(...)
CompositeBFGSResult{...} ...
julia> q = (0.16, 0.5, 0.84) # quantiles we want
julia> result = cum_sfr_quantiles(result, logAge, MH, 13.7, 10_000, q);
julia> length(result) == 4
true
julia> all(size(result[i]) == (length(unique(logAge)), length(q)) for i in 1:3)
true
julia> size(result[4], 2) == 10_000We can calculate $\tau$ statistics with StarFormationHistories.tau – for example, $\tau_{90}$ indicates the time at which the the galaxy formed 90% of its present-day stellar mass.
StarFormationHistories.tau — Function
tau(τ, unique_logAge, max_logAge, cum_sfh [, lower, upper])Returns the lookback time (in Gyr) at which τ percent of the total stellar mass of a galaxy formed. See tau_interp for the underlying interpolator function.
Arguments
τ: Fraction of a galaxy's total stellar mass for which you wish to know when the galaxy had that much mass (e.g., 0.50 forτ_50).unique_logAgeshould contain the list of uniquelog10(age [yr])for which the SFH solution was derived.max_logAgeshould be the maximumlog10(age [yr])that you wish to consider – this sets when the stellar mass of the galaxy was 0.cum_sfhis an array of the cumulative stellar mass of the galaxy corresponding to the lookback times inunique_logAge; this can be calculated withcalculate_cum_sfrorcum_sfr_quantiles.
Optional Arguments
loweris an array of the cumulative stellar mass of the galaxy at some lower confidence level (e.g., 16% for 1-σ); if provided, the function will also return the lower bound ont.upperis an array of the cumulative stellar mass of the galaxy at some higher confidence level (e.g., 84% for 1-σ); if provided, the function will also return the upper bound ont.
Returns
t::Numberthe lookback time in Gyr when the galaxy's stellar mass wasτtimes its total birth stellar mass.
If lower and upper are provided, a tuple of three numbers is returned: (t_lower, t_mle, t_upper).
Examples
First we will verify that tau returns the correct lookback time when the requested τ value corresponds exactly to a value in cum_sfh:
julia> unique_logAge = [8.0, 8.5, 9.0, 9.5, 10.0];
julia> max_logAge = 10.13;
julia> cum_sfh = [1.0, 0.8, 0.5, 0.2, 0.1];
julia> StarFormationHistories.tau(0.5, unique_logAge, max_logAge, cum_sfh) # Exact result
1.0Now we will show an example where τ does not correspond exactly to a value in cum_sfh, in which case the function will interpolate between the two nearest values in cum_sfh to return the lookback time corresponding to the requested τ. The interpolation is performed in linear age space.
julia> cum_sfh = [1.0, 0.8, 0.4, 0.2, 0.1]; # Interpolation between log(age) = 8.5 and 9.0
julia> StarFormationHistories.tau(0.5, unique_logAge, max_logAge, cum_sfh) ≈ 0.8290569415042095
trueWe can also pass multiple τ values at once:
julia> StarFormationHistories.tau([0.5, 0.75], unique_logAge, max_logAge, cum_sfh) ≈ [0.8290569415042095, 0.40169929526473325]
trueThis will also work with reversed cum_sfh and unique_logAge:
julia> StarFormationHistories.tau(0.5, reverse(unique_logAge), max_logAge, reverse(cum_sfh)) ≈ 0.8290569415042095
trueNow we can pass in lower and upper estimates on the cumulative stellar mass to get confidence intervals on τ; the result is returned as a size (length(τ), 3) matrix where the first column is the lower bound on τ, the second column is the best estimate of τ, and the third column is the upper bound on τ:
julia> upper = [1.0, 0.85, 0.6, 0.3, 0.15];
julia> lower = [1.0, 0.75, 0.3, 0.1, 0.05];
julia> StarFormationHistories.tau(0.5, unique_logAge, max_logAge, cum_sfh, lower, upper) ≈ [0.6961012293408169 0.8290569415042095 1.7207592200561264]
trueAnd calling with a vector of τ values also works in this case:
julia> StarFormationHistories.tau([0.5, 0.75], unique_logAge, max_logAge, cum_sfh, lower, upper) ≈ [0.6961012293408169 0.8290569415042095 1.7207592200561264; 0.31622776601683794 0.40169929526473325 0.5897366596101027]
truetau(result, τ, logAge, MH, max_logAge; Nsamples=10_000, q=(0.16, 0.5, 0.84), kws...)Returns the lookback time (in Gyr) at which τ percent of the total stellar mass of a galaxy formed with upper and lower uncertainty estimates. Uses the solution result to draw samples of the cumulative star formation history and then calculates quantiles across those samples.
Arguments
result::Union{CompositeBFGSResult, BFGSResult}is a BFGS result object as returned byfit_sfh, for example, whose contents will be used to sample random, independent star formation histories viacum_sfr_quantiles.τ: Fraction of a galaxy's total stellar mass for which you wish to know when the galaxy had that much mass (e.g., 0.50 forτ_50). Can also be a vector of multipleτvalues (e.g.,[0.5, 0.75, 0.9]).logAge::AbstractVectoris a vector giving the log10(age [yr]) of the stellar populations that were used to deriveresult. For the purposes of calculating star formation rates, these are assumed to be left-bin edges.MH::AbstractVectoris a vector giving the metallicities of the stellar populations that were used to deriveresult.max_logAgeshould be the maximumlog10(age [yr])that you wish to consider – this sets when the stellar mass of the galaxy was 0.
Keyword Arguments
Nsamples::Integeris the number of random, independent star formation histories to draw fromresultto use when calculating quantiles.qis passed tocum_sfr_quantilesto calculate quantiles on the cumulative star formation history samples. This must be a length 3 tuple of quantiles (e.g.,(0.16, 0.5, 0.84)) to get lower, best, and upper estimates onτ. It is not recommended that you change this.kws...are passed tocalculate_cum_sfr, see that method's documentation for more information.
Returns
t::Matrix, a size(length(τ), 3)matrix where the first column is the lower bound onτ, the second column is the best estimate ofτ, and the third column is the upper bound onτ.
StarFormationHistories.tau_interp — Function
tau_interp(unique_logAge, max_logAge, cum_sfh)Returns an interpolator for the lookback time (in Gyr) at which a given fraction of the total stellar mass of a galaxy formed. This is a helper function for tau.