In general, a 'trait' is a phenotype (a characteristic that the plant exhibits). The traits that we are primarily interested in collecting data for are listed in Table \ref{tab:traits}. Before adding trait data, it is necessary to have the citation, treatments, and site information already entered. If the correct citation is not identified at the top of the page Figure 8. To add a new Trait, go to the new trait page: `Trait`

→ `new`

.

**Key Traits Stored in BETYdb**

Presently, we are also using the Trait table to record ecosystem level measurements other than Yield. Such ecosystem level measurements can include leaf area index or net primary productivity, but are only collected when required for a particular project. Most of the fields in the Traits table are also used in the Yields table. Here is a list of the fields with a brief description, followed by more thorough explanations:

**Species**: Search for species in the database using the search box; if speciesis not found, then the new species should be added to the database.

**Cultivar**: primarily used for crops; If the cultivar being used is not found indrop-down box

**DateLOC**: Date Level of confidence. See for values.**TimeLOC**: Time Level of confidence. See for values.**Mean**: For yield, mean is in units of tons per hectare per year (t/ha)**Stat name**: is the name of the statistical method used (usually one of SE, SD, MSE,CI, LSD, HSD, MSD). See for more details.

**Statistic**: is the value of the statistic associated with Stat name.**N**: Always record N if provided. N is the number of experimentalreplicates, often referred to as the sample size; N represents the

number of independent units within each treatment: in a field

setting, this is often the number of plots in each treatment, but in

a greenhouse, growth chamber, or pot-study this may be the number of

chambers, pots, or individual plants. Sometimes this value is not

clearly stated.

Uncertainty in Date or Time

DateLOC

The date level of confidence (DateLOC, Table \ref{tab:dateloc}) provides an indication of how accurately the date associated with the trait or yield observation is known. It provides the values that should be entered in this field. If the event occurred at a level of precision not defined by an integer in this table, then use fractions. For example, we commonly use 5.5 to indicate a one week level of precision. If the exact year is not known, but the time of year is, then use 91 to 97, with the second digit to indicate the information known within the year.

**Table Date level of confidence (DateLOC) field** Numbering convention for the DateLOC (Date level of confidence) and TimeLOC (Time level of confidence) field, used in managements, traits, and yields table.

TimeLOC

The time level of confidence (TimeLOC) provides an indication of how accurately the time associated with the trait or yield observation is known. It provides the values that should be entered in this field.

Statistics

Our goal is to record statistics that can be used to estimate standard deviation or standard error (https://www.authorea.com/users/5574/articles/6811/). Many different methods can be used to summarize data, and this is reflected in the diversity of statistics that are reported. An overview of these methods is given in a description below.

Where available, direct estimates of variance are preferred, including Standard Error (SE), sample Standard Deviation (SD), or Mean Squared Error (MSE). SE is usually presented in the format of mean (±SE). MSE is usually presented in a table. When extracting SE or SD from a figure, measure from the mean to the upper or lower bound. This is different than confidence intervals and range statistics (described below), for which the entire range is collected.

If MSE, SD, or SE are not provided, it is possible that LSD, MSD, HSD, or CI will be provided. These are range statistics and the most frequently found range statistics include a Confidence Interval (95%CI), Fisher’s Least Significant Difference (LSD), Tukey’s Honestly Significant Difference (HSD), and Minimum Significant Difference (MSD). Fundamentally, these methods calculate a range that indicates whether two means are different or not, and this range uses different approaches to penalize multiple comparisons. The important point is that these are ranges and that we record the entire range.

Another type of statistic is a “test statistic”; most frequently there will be an F-value that can be useful, but this should not be recorded if MSE is available. Only if there is no other information available should you record the P-value.

The protocol for entering yield data is identical to entering data for a trait, with a few exceptions:

There are no covariates associated with yield data

Yield data is always the dry harvestable biomass; if necessary, moisture content can be added as a trait

Yield is equivalent to aboveground biomass on a per-area basis, and has units of Mg ha^-1 y^-1

Covariates are required for many of the traits. Covariates generally indicate the environmental conditions under which a measurement was made. Without covariate information, the trait data will have limited value.

A complete list of required covariates can be found in Table \ref{tab:covariates}. For all respiration rates and photosynthetic parameters, temperature is recorded as a covariate. Soil moisture, humidity, and other such variables that were measured at the time of the measurement may be required in order to standardize across studies.

When root data is recorded, the root size class needs to be entered as a covariate. The term ’fine root’ often refers to the (<)2mm size class, and in this case, the covariate `root_maximum_diameter`

would be set to 2. If the size class is a range, then the `root_minimum_diameter`

can also be used.

**Table \ref{tab:covariates}: Traits with required covariates** \label{tab:covariates} A list of traits and the covariates that must be recorded along with the trait value in order to be converted to a constant scale from across studies.*notes:* stomatal conductance (`gs`

) is only useful when reported in conjunction with other photosynthetic data, such as `Amax`

. Specifically, if we have `Amax`

and `gs`

, then estimation of `Vcmax`

only covaries with `dark_respiration_factor`

and atmospheric CO2 concentration.
We also now have information to help constrain `stomatal_slope`

. If we have `Amax`

but not `gs`

, then our estimate of `Vcmax`

will covary with: `dark_respiration_factor`

, `CO2`

, `stomatal_slope`

, `cuticular_conductance`

, and vapor-pressure deficit `VPD`

(which is more difficult to estimate than CO2, but still possible given lat, lon, and date). Most important, there will be a strong covariance between `Vcmax`

and `stomatal_slope`

.

Overview

There are three phases for a basic bulk upload of data:

Use the web interface

to enter metadata pertaining to your data set (new sites, species, cultivars, citations, or treatments);

to obtain a template appropriate for your data set.

Fill in the template with your data. There are four templates to choose from:

yields.csv — Use this template if you are uploading yield data and you wish to specify the citation in the file by author, year, and title.

If your data includes standard error and cultivar information and you do not plan to specify any of the required information interactively, you will be able to use this template “as-is”. Otherwise, you will need to delete one or more columns:

If your data has no standard error information, delete both the

`SE`

and the`n`

column.If your data set has a single uniform value for the site, species, cultivar, treatment, access_level, or date, then these values may be entered interactively through the web interface; in this case you should delete the corresponding column(s) from the template.

Note that cultivar information can’t be specified interactively unless species information is as well; delete the

`cultivar`

column if and only if you either have no cultivar information or you are specifying both the species and the cultivar interactively.