Overview
Why should the metallicity evolution be constrained? While the above methods work well for optimizing the per-template $r_j$ as a means for fitting SFHs, these methods can produce metallicity evolutions that could be considered unphysical, with large changes in the mean metallicity over small changes in time. An example of this type of behavior is shown in the SFH fit below.
While some metallicity variation in the star-forming gas is to be expected, these variations in the SFH fit can end up being quite large depending on the data and isochrone grid adopted. A solution is to construct a more physically-motivated model.
We can do this using a hierarchical model with a parameterized metallicity evolution where the the $r_j$ are not the parameters directly optimized. Rather, we can optimize one stellar mass (or star formation rate) parameter per age bin, and then a number of metallicity evolution parameters that determine how that stellar mass is split between models with different metallicities at fixed age.
In most star formation history analyses, the metallicities are constrained through age-metallicity relations (AMRs), where the mean metallicity at time $t$ is a function of time and a small set of metallicity evolution parameters. A popular AMR model is the linear age-metallicity relation $\langle [\text{M}/\text{H}] \rangle (t) = \alpha \, \left( T_\text{max} - t \right) + \beta$ with a Gaussian distribution in metallicity at fixed age. $T_\text{max}$ here is the earliest lookback time under consideration such that $\langle [\text{M}/\text{H}] \rangle (T_\text{max}) = \beta$. This model is described in more detail here.
AMRs have historically been popular because they are generally capable of producing reasonable fits to observed data and it is relatively easy to derive the gradient of the objective function with respect to the AMR parameters analytically. However, in AMR models there is no direct link between the SFRs being fit and the metallicity evolution as a function of time, even though the two should in principle have some correlation as stellar processes are responsible for enriching the ISM.
A promising avenue of research involves fitting mass-metallicity relations (MZRs) rather than AMRs. In these models, the mean metallicity of stars forming at time $t$ is a function of the total stellar mass of the population at that time – therefore, the mean metallicity evolution changes self-consistently with the SFRs during the fitting process, resulting in a metallicity evolution that is meaningfully coupled to the star formation history. Additionally, AMRs can be difficult to compare between different galaxies because they do not reflect the different SFHs of the galaxies, whereas MZRs can be compared between galaxies much more easily. Our methods for MZR fitting are described in more detail here.
Generic Methods
While there are some methods in this package that are unique to AMR or MZR models, we present a minimal unified interface that can be used to fit SFHs under both types of models. To support multiple dispatch, we define AbstractMetallicityModel
as the abstract supertype of AbstractAMR
and AbstractMZR
, which are each the supertypes for AMR and MZR types, respectively.
StarFormationHistories.AbstractMetallicityModel
— TypeAbstractMetallicityModel{T <: Real}
is the abstract supertype for all hierarchical metallicity models. Abstract subtypes are AbstractAMR
for age-metallicity relations and AbstractMZR
for mass-metallicity relations.
The generic methods that can be used for both AMRs and MZRs are described here. The main method for obtaining best-fit star formation histories is fit_sfh
.
StarFormationHistories.fit_sfh
— Functionfit_sfh(MH_model0::AbstractMetallicityModel,
disp_model0::AbstractDispersionModel,
models::AbstractMatrix{<:Number},
data::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number};
x0::AbstractVector{<:Number} = <...>
kws...)
fit_sfh(MH_model0::AbstractMetallicityModel,
disp_model0::AbstractDispersionModel,
models::AbstractVector{<:AbstractMatrix{<:Number}},
data::AbstractMatrix{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number};
x0::AbstractVector{<:Number} = <...>
kws...)
Returns a CompositeBFGSResult
instance that contains the maximum a posteriori (MAP) and maximum likelihood estimates (MLE) obtained from fitting the provided simple stellar population (SSP) templates models
(with logarithmic ages logAge = log10(age [yr])
and metallicities metallicities
) to the provided data
. The metallicity evolution is modelled using the provided MH_model0
, whose parameters can be free or fixed, with metallicity dispersion at fixed time modelled by disp_model0
, whose parameters can be free or fixed.
This method is designed to work best with a grid of stellar models, defined by the outer product of N
unique entries in logAge
and M
unique entries in metallicities
. See the examples for more information on usage.
We provide several options for age-metallicity relations and mass-metallicity relations that can be used for MH_model0
and define APIs for users to create new models that will integrate with this function. Similar flexibility is allowed for the metallicity dispersion model disp_model0
.
The primary method signature uses flattened formats for models
and data
. See the notes for the flattened call signature of StarFormationHistories.composite!
for more details, as well as stack_models
that facilitates rearranging the models
into this flattened format.
Arguments
MH_model0
is an instance ofAbstractMetallicityModel
that defines how the average metallicity stars being formed in the population changes over time. The fittable parameters contained in this instance are used as the initial values to start the optimization.disp_model0
is an instance ofAbstractDispersionModel
that defines the distribution of metallicities of stars forming in a fixed time bin (i.e., the dispersion in metallicity around the mean at fixed time). The fittable parameters contained in this instance are used as the initial values to start the optimization.models
are the template Hess diagrams for the SSPs that compose the observed Hess diagram.data
is the Hess diagram for the observed data.logAge::AbstractVector{<:Number}
is the vector containing the effective ages of the stellar populations used to create the templates inmodels
, in units oflog10(age [yr])
. For example, if a population has an age of 1 Myr, its entry inlogAge
should belog10(10^6) = 6.0
.metallicities::AbstractVector{<:Number}
is the vector containing the effective metallicities of the stellar populations used to create the templates inmodels
. These should be logarithmic abundances like [M/H] or [Fe/H]. There are some notes on the Wikipedia that might be useful.
Keyword Arguments
x0
is the vector of initial guesses for the stellar mass coefficients per unique entry inlogAge
. We try to set reasonable defaults, but in most cases users should be calculating and passing this keyword argument. We provideStarFormationHistories.construct_x0_mdf
to preparex0
assuming a constant star formation rate and total stellar mass, which is typically a good initial guess.kws...
are passed toOptim.Options
and can be used to control tolerances for convergence.
Returns
- This function returns a
CompositeBFGSResult
that contains the output from both MLE and MAP optimizations, accessible viaresult.mle
andresult.map
. These are each instances ofBFGSResult
. See the docs for these structs for more information.
This function returns an instance of CompositeBFGSResult
.
StarFormationHistories.CompositeBFGSResult
— TypeCompositeBFGSResult(map::BFGSResult, mle::BFGSResult)
Type for containing the maximum a posteriori (MAP) AND maximum likelihood estimate (MLE) results from BFGS optimizations that use Optim.jl, which are individually accessible via the :mle
and :map
properties (i.e., for an instance of this type t
, t.mle
or getproperty(t, :mle)
and t.map
or getproperty(t, :map)
).
Random samples can be drawn from an instance t
as rand(t, N::Integer)
. This will return a size length(μ) x N
matrix. This will use the MLE result for the best-fit values and the inverse Hessian approximation to the covariance matrix from the MAP result, which is more robust when best-fit values that are constrained to be positive approach 0.
Per-SSP coefficients can be calculated with calculate_coeffs(result::CompositeBFGSResult, logAge::AbstractVector{<:Number}, metallicities::AbstractVector{<:Number})
, which uses the MLE result (see these docs).
StarFormationHistories.BFGSResult
— TypeBFGSResult(μ::AbstractVector{<:Number},
σ::AbstractVector{<:Number},
invH::AbstractMatrix{<:Number},
result,
MH_model::AbstractMetallicityModel,
disp_model::AbstractDispersionModel)
Type for containing the maximum likelihood estimate (MLE) or maximum a posteriori (MAP) results from BFGS optimizations that use Optim.jl. Fields are as follows:
μ
contains the final values of the fitting parameters. Themode
andmedian
methods will both returnμ
, but the mean of samples is not always equal toμ
due to the variable transformations we perform.σ
contains the standard errors estimated for the parameters and is returned by thestd
method.invH
is the BFGS approximation to the inverse Hessian, which is an estimator for the covariance matrix of the parameters if the objective function is approximately Gaussian near the best-fitμ
.result
is the full result object returned by Optim.jl.MH_model
is the best-fit metallicity model.disp_model
is the best-fit metallicity dispersion model.
This type is implemented as a subtype of Distributions.Sampleable{Multivariate, Continuous}
to enable sampling from an estimate of the likelihood / posterior distribution constructed from the invH
. You can obtain N::Integer
samples from the distribution with rand(R, N)
where R
is an instance of this type. This will return a size length(μ) x N
matrix.
You can also directly obtain the per-SSP template coefficients ($r_{j,k}$ in the derivation) using the optimization results stored in a BFGSResult
with calculate_coeffs
.
See also
CompositeBFGSResult
is a type that contains two instances ofBFGSResult
, one for the MAP and one for the MLE.
This can be used to obtain random samples under a multivariable Normal approximation to the posterior or used to initialize a Hamiltonian Monte Carlo (HMC) sampling process to obtain more accurate posterior samples with sample_sfh
and its multi-threaded alternative tsample_sfh
.
StarFormationHistories.sample_sfh
— Functionsample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractMatrix{<:Number},
data::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
ϵ::Real = 0.05, # HMC step size
reporter = DynamicHMC.ProgressMeterReport(),
show_convergence::Bool = true,
rng::AbstractRNG = default_rng())
sample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractVector{<:AbstractMatrix{<:Number}},
data::AbstractMatrix{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
kws...)
Takes the SFH fitting result in bfgs_result
and uses it to initialize the Hamiltonian Monte Carlo (HMC) sampler from DynamicHMC.jl to sample Nsteps
independent draws from the posterior.
The primary method signature uses flattened formats for models
and data
. See the notes for the flattened call signature of StarFormationHistories.composite!
for more details, as well as stack_models
that facilitates rearranging the models
into this flattened format.
Arguments
models, data, logAge, metallicities
are as infit_sfh
.Nsteps
is the number of Monte Carlo samples you want to draw.
Keyword Arguments
ϵ
is the HMC step size. Convergence of the HMC samples is checked after sampling and if a convergence warning is issued, you should decrease this value.reporter
is a valid reporter type from DynamicHMC.jl, eitherNoProgressReport
,ProgressMeterReport
for a basic progress meter, orLogProgressReport
for more detailed reporting.show_convergence
iftrue
, will send sample convergence statistics to the default display.rng
is aRandom.AbstractRNG
sampler instance that will be used when generating the random samples.
Returns
A NamedTuple
with two elements:
posterior_matrix
is aMatrix
with dimensions(npar, Nsteps)
wherenpar
is the number of fitting variables in the problem and isnpar = length(bfgs_result.mle.μ)
. Each column is one independent sample.tree_statistics
contains convergence statistics that can be viewed withDynamicHMC.Diagnostics.summarize_tree_statistics
.
See also
- [
tsample_sfh
(@ref StarFormationHistories.tsample_sfh) for multi-threaded version.
StarFormationHistories.tsample_sfh
— Functiontsample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractMatrix{<:Number},
data::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
ϵ::Real = 0.05, # HMC step size
show_convergence::Bool=true,
show_progress::Bool=true,
rng::AbstractRNG=default_rng(),
chain_length::Integer=100)
tsample_sfh(bfgs_result::CompositeBFGSResult,
models::AbstractVector{<:AbstractMatrix{<:Number}},
data::AbstractMatrix{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number},
Nsteps::Integer;
kws...)
Multi-threaded version of sample_sfh
; see that method's documentation for details.
Implementation
This method splits the requested number of samples Nsamples
into a number of independent HMC chains, each of which has length chain_length
. Initial positions for each chain are randomly drawn from the multivariate Gaussian approximation to the objective function stored in bfgs_result
, approximating a warm start. Smaller values of chain_length
achieve better load balancing while larger values of chain_length
allow each chain more time to mix (see also Chen et al. 2020). The default value of 100 results in good mixing with 24 fitting variables and a well-scaled step length ϵ – higher dimensional problems should increase chain_length
. The downside to large chain_length
is poor load balancing across available threads resulting in longer runtimes.
Notes
- if
show_progress
istrue
, we will show a progress bar that updates when individual chains complete. Currently this is not terribly useful unless the total number of chains is much greater than the number of available threads.
The per-SSP stellar mass coefficients ($r_{j,k}$ in the derivation) can be derived from a metallicity model, a metallicity dispersion model, the per-unique-log(age) stellar mass coefficients ($R_j$ in the derivation), and the set of SSP logarithmic ages logAge = log10(age [yr])
and metallicites using calculate_coeffs
. Alternatively a CompositeBFGSResult
can be fed into this method and the first three arguments will be read from the result object.
StarFormationHistories.calculate_coeffs
— Functioncalculate_coeffs(MH_model::AbstractMetallicityModel,
disp_model::AbstractDispersionModel,
mstars::AbstractVector{<:Number},
logAge::AbstractVector{<:Number},
metallicities::AbstractVector{<:Number})
Returns per-SSP stellar mass coefficients ($r_{j,k}$ in the derivation) using the provided metallicity model MH_model
and metallicity dispersion model disp_model
for the set of SSPs with logarithmic ages logAge
and metallicities metallicities
.
Examples
julia> n_logage, n_mh = 10, 20; # Number of unique logAges, MHs
julia> coeffs = calculate_coeffs(PowerLawMZR(1.0, -1.0),
GaussianDispersion(0.2),
rand(n_logage),
repeat(range(7.0, 10.0; length=n_logage); inner=n_mh),
repeat(range(-2.0, 0.0; length=n_mh); outer=n_logage));
julia> coeffs isa Vector{Float64}
true
julia> length(coeffs) == n_logage * n_mh
true