
Extracting Data From Figures

The easiest program to use for extracting points is WebPlotDigitizer. WebPlotDigitizer is free, browser-based, and cross-platform, and it extracts data from images. A demo is available online.

  1. Identify the data that is associated with each treatment

    note: If the experiment has many factors, the paper may not report the mean and statistics for each treatment. Often, the reported data will reflect the results of more than one treatment (for example, if there was no effect of the treatment on the quantity of interest). In some cases it will be possible to obtain the values for each treatment, e.g. if there are n-1 values and n treatments. If this is not the case, the treatment names and definitions should be changed to indicate the data reflect the results of more than one experimental treatment.

  2. Enter the mean value of the trait

  3. Enter the statname, stat, and number of replicates (n) associated with the mean

    • stat is the value of the statname (e.g. statname might be ’standard deviation’ (SD) and the stat is the numerical value of the statistic)

    • Always measure the size of an error bar from the mean to the end of the bar. This is the value when presented as (X ± SE) or X(SE), and it may be found in a table or on a graph.

    • Sometimes CI and LSD are presented as the entire range from the lower to the upper end of the confidence interval. In this case, take 1/2 of the interval, which represents the distance from the mean to the upper or lower bound.

For more information:

  • "Extracting Data From Graphs", a related question on Stats.stackexchange
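
The two error-bar rules in the list above can be sketched in R. This is a minimal illustration, not part of BETYdb or PEcAn; the function names and the example numbers are hypothetical, and the inputs are assumed to be values already read off the figure in the units of its y-axis.

```r
# Hypothetical helpers for values digitized from a figure.

# Error bar drawn as mean +/- stat: measure from the mean to one end of the bar.
stat_from_bar_end <- function(mean_value, bar_end) {
  abs(bar_end - mean_value)
}

# Interval drawn as a full lower-to-upper range: take half of that distance.
stat_from_full_range <- function(lower, upper) {
  (upper - lower) / 2
}

# Example with made-up digitized values
stat_from_bar_end(mean_value = 12.4, bar_end = 13.1)
stat_from_full_range(lower = 10.9, upper = 13.9)
```

Taking the absolute value means either the upper or the lower end of the bar can be digitized.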

Estimating SE from Summary Statistics

When conducting a meta-analysis that includes previously published data, differences between treatments reported with P-values, least significant differences (LSD), and other statistics provide no direct estimate of the variance.

In the context of the statistical meta-analysis models that we use, overestimates of variance are acceptable: an overestimate reduces the weight of a study in the overall analysis relative to an exact estimate, but it still provides more information than either excluding the study or excluding any estimate of uncertainty (though there are limits to this assumption).
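
To see why an overestimate only down-weights a study, consider simple inverse-variance weighting. This is an assumed, simplified stand-in for illustration and not the meta-analysis model used in PEcAn; the function name and SE values are hypothetical.

```r
# Inverse-variance weight of a study mean with standard error `se`.
# Assumption: simple fixed-effect weighting, used here only for illustration.
inverse_variance_weight <- function(se) 1 / se^2

se_exact    <- 0.5  # hypothetical exact SE
se_inflated <- 1.0  # hypothetical conservative (overestimated) SE

w <- inverse_variance_weight(c(exact = se_exact, inflated = se_inflated))

# Relative weight of the same study under each SE
w / sum(w)
```

Doubling the SE cuts the weight to a quarter, so the study still contributes, just less strongly.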

Where available, direct estimates of variance are preferred, including Standard Error (SE), sample Standard Deviation (SD), or Mean Squared Error (MSE). SE is usually presented in the format of mean (±SE). MSE is usually presented in a table. When extracting SE or SD from a figure, measure from the mean to the upper or lower bound. This is different from confidence intervals and range statistics (described below), for which the entire range is collected.
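
The direct estimates relate to SE through standard formulas. The sketch below assumes equal replication, with `n` replicates behind each treatment mean; the function names are hypothetical and are not PEcAn functions.

```r
# SE of a treatment mean from the sample standard deviation
se_from_sd <- function(sd, n) sd / sqrt(n)

# SE of a treatment mean from the ANOVA mean squared error
se_from_mse <- function(mse, n) sqrt(mse / n)

# Example with made-up values: SD = 2 and MSE = 4 describe the same spread
se_from_sd(sd = 2, n = 4)
se_from_mse(mse = 4, n = 4)
```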

If MSE, SD, or SE are not provided, LSD, MSD, HSD, or CI may be provided instead. These are range statistics; the most frequently found are a Confidence Interval (95% CI), Fisher’s Least Significant Difference (LSD), Tukey’s Honestly Significant Difference (HSD), and Minimum Significant Difference (MSD). Fundamentally, these methods calculate a range that indicates whether two means are different or not, and they differ in how they penalize multiple comparisons. The important point is that these are ranges and that we record the entire range.
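
When the experimental design is known, range statistics can be converted back to an SE using their textbook definitions. The sketch below is not the PEcAn implementation. It assumes equal replication, a two-sided 5% level, a CI recorded as its full width, and known error degrees of freedom (`df`) and number of means compared (`k`); all function names are hypothetical.

```r
# 95% CI recorded as the full width (lower bound to upper bound)
se_from_ci <- function(ci_width, df, level = 0.95) {
  (ci_width / 2) / qt(1 - (1 - level) / 2, df)
}

# Fisher's LSD = t * sqrt(2 * MSE / n), so SE = sqrt(MSE / n) = LSD / (t * sqrt(2))
se_from_lsd <- function(lsd, df, alpha = 0.05) {
  lsd / (qt(1 - alpha / 2, df) * sqrt(2))
}

# Tukey's HSD = q * sqrt(MSE / n), so SE = HSD / q
se_from_hsd <- function(hsd, k, df, alpha = 0.05) {
  hsd / qtukey(1 - alpha, nmeans = k, df = df)
}

# Round trip with made-up values: MSE = 4, n = 4 gives SE = 1
mse <- 4; n <- 4; df <- 12; k <- 4
lsd <- qt(0.975, df) * sqrt(2 * mse / n)
hsd <- qtukey(0.95, nmeans = k, df = df) * sqrt(mse / n)
c(lsd = se_from_lsd(lsd, df), hsd = se_from_hsd(hsd, k, df))
```

If `df` is not reported, any value substituted for it is an additional assumption and should be recorded as such.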

Another type of statistic is a “test statistic”; most frequently this will be an F-value, which can be useful but should not be recorded if MSE is available. Record the P-value only if no other information is available.
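
An F-value is useful because, in a one-way ANOVA, F is the ratio of the treatment mean square to the MSE. The sketch below recovers MSE from F under the assumption that the paper also reports the treatment mean square; the function name and data are hypothetical.

```r
# F = MS_treatment / MSE, so MSE = MS_treatment / F
mse_from_f <- function(f_value, ms_treatment) ms_treatment / f_value

# Check against a small made-up data set with three treatments
y <- c(1, 2, 3,  2, 3, 4,  5, 6, 7)
g <- factor(rep(c("a", "b", "c"), each = 3))
tab <- anova(lm(y ~ g))

f_value      <- tab[["F value"]][1]
ms_treatment <- tab[["Mean Sq"]][1]
ms_residual  <- tab[["Mean Sq"]][2]

# The recovered MSE should match the residual mean square
all.equal(mse_from_f(f_value, ms_treatment), ms_residual)
```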

Further Reading

Many statistical transformations are implemented in the transformstats function within the PEcAn.utils package. However, these transformations make conservative (variance-inflating) assumptions about study-specific experimental design (especially degrees of freedom) that are not captured in the BETYdb schema, for example for HSD, LSD, and P.

More accurate estimates of SE can be obtained at the time of data entry using the formulas in "Transforming ANOVA and Regression statistics for Meta-analysis".

Quality Assurance

Quality assurance and quality control (QA/QC) is a critical step that is used to ensure the validity of data in the database and of the analyses that use these data. When conducting QA/QC, your data access level needs to be elevated to “manager”.

  1. Open citation in Mendeley

  2. Locate citation in BETYdb

    • Select Use

    • Select Show

    • Check that author, year, title, journal, volume, and page information is correct

    • Check that links to URL and PDF are correct, using DOI if available

    • If any information is incorrect, click ’edit’ to correct

  3. Check that site(s) at bottom of citation record match site(s) in paper

    • Click on site name to verify any additional site information that is present

    • Check that latitude and longitude are consistent with manuscript, are in decimals not degrees, and have appropriate level of precision

    • Enter any additional site level information that is found

  4. Select treatments from menu bar

    • Check that there is a control treatment

    • Ensure that treatment name and definition are consistent with information in the manuscript

    • Under “treatments from all citations associated with associated sites”, ensure that there is no redundancy (i.e. if another citation uses the same treatments, it should not be listed separately)

    • If managements are listed, make sure that management-treatment associations are correct

  5. Check managements if there are any listed on the treatments page.

    • If yield data has been collected, ensure that required managements have been entered

    • If managements have been entered, ensure that they are associated with the correct treatments

  6. Click Yields or Traits to check data.

    • Check that means, sample size, and statistics have been entered correctly

    • If data has been transformed, check that transformation was correct in the associated google spreadsheet (or create a new google spreadsheet following instructions)

    • For any trait data that requires a covariate
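
Part of the coordinate check in step 3 can be automated. The function below is a hypothetical sketch, not a BETYdb feature: it only flags values that are not numeric (for example, a degrees-minutes-seconds string) or that fall outside the valid range. Consistency with the manuscript and the level of precision still need to be checked by hand.

```r
# Hypothetical QA helper: returns a character vector of problems (empty if none).
check_coordinates <- function(lat, lon) {
  problems <- character(0)
  if (!is.numeric(lat) || !is.numeric(lon)) {
    # e.g. "40 06 36 N" was entered instead of 40.11
    problems <- c(problems, "coordinates must be numeric decimal degrees")
  } else {
    if (abs(lat) > 90)  problems <- c(problems, "latitude outside [-90, 90]")
    if (abs(lon) > 180) problems <- c(problems, "longitude outside [-180, 180]")
  }
  problems
}

# Example with made-up site coordinates
check_coordinates(40.11, -88.21)
check_coordinates("40 06 36 N", -88.21)
```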

Appendices

Common Unit Conversions

For many transformations, particularly when automated, please use the udunits2 software where possible. For example, in R, you can use

    library(udunits2)
    ## transform meters to mm
    ud.convert(10, "m", "mm")
    ## equivalently, via the udunits synonym database
    ud.convert(10, "meters", "millimeters")
    ## it can also handle more complex units
    ud.convert(10, "m/s", "mm/d")

NB: Many of these conversions have been automated within PEcAn.

Useful conversions for entering site, management, yield, and trait data

| From ($X$) | to ($Y$) | Conversion | Notes |
|---|---|---|---|
| root production ($X_2$) and root biomass ($X_1$) | root turnover rate | $Y = X_2/X_1$ | Gill [2000] |
| DD$^{\circ}$ MM'SS | XX.ZZZZ | $\textrm{XX.ZZZZ} = \textrm{DD} + \textrm{MM}/60 + \textrm{SS}/3600$ | to convert latitude or longitude from degrees, minutes, seconds to decimal degrees |
| lb | kg | $Y = X/2.2$ | |
| mm/s | mol m$^{-2}$ s$^{-1}$ | $Y = X\times 0.04$ | |
| m$^2$ | ha | $Y = X/10^4$ | |
| g/m$^2$ | kg/ha | $Y = X\times 10$ | |
| US ton/acre | Mg/ha | $Y = X\times 2.24$ | |
| m$^3$/ha | cm | $Y = X/100$ | units used for irrigation and rainfall |
| % roots | root:shoot ($q$) | $Y = \frac{X}{1-X}$ | $\% \text{roots} = \frac{\text{root biomass}}{\text{total biomass}}$, with $X$ expressed as a fraction |
| $\mu$mol cm$^{-2}$ s$^{-1}$ | mmol m$^{-2}$ s$^{-1}$ | $Y = X\times 10$ | |
| mol m$^{-2}$ s$^{-1}$ | mmol m$^{-2}$ s$^{-1}$ | $Y = X\times 10^3$ | |
| mol m$^{-2}$ s$^{-1}$ | $\mu$mol cm$^{-2}$ s$^{-1}$ | $Y = X\times 10^2$ | |
| mm s$^{-1}$ | mmol m$^{-2}$ s$^{-1}$ | $Y = X\times 41$ | Korner et al. [1988] |
| mg CO$_2$ g$^{-1}$ h$^{-1}$ | $\mu$mol kg$^{-1}$ s$^{-1}$ | $Y = X\times 6.31$ | used for root_respiration_rate |
| $\mu$mol | mol | $Y = X/10^6$ | |
| julian day (1-365) | date | | see ref: NASA Julian Calendar, http://disc.gsfc.nasa.gov/julian_calendar.shtml |
| spacing (m) | density (plants m$^{-2}$) | $Y = \frac{1}{\textrm{row spacing}\times\textrm{plant spacing}}$ | |
| kg ha$^{-1}$ y$^{-1}$ | Mg ha$^{-1}$ y$^{-1}$ | $Y = X/1000$ | |
| g m$^{-2}$ y$^{-1}$ | Mg ha$^{-1}$ y$^{-1}$ | $Y = X/100$ | |
| kg | mg | $Y = X\times 10^6$ | |
| cm$^2$ | m$^2$ | $Y = X/10^4$ | |
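
Some rows of the table are not simple unit changes and cannot be handled by udunits2. A minimal base-R sketch of those rows follows. The function names are hypothetical, the hemisphere argument is an added convenience, and the day-of-year conversion assumes the year is known.

```r
# Degrees, minutes, seconds -> decimal degrees (S and W are negative)
dms_to_decimal <- function(deg, min, sec, hemisphere = c("N", "E", "S", "W")) {
  hemisphere <- match.arg(hemisphere)
  sgn <- if (hemisphere %in% c("S", "W")) -1 else 1
  sgn * (deg + min / 60 + sec / 3600)
}

# Fraction of total biomass in roots (0 to 1) -> root:shoot ratio
root_shoot_from_fraction <- function(root_fraction) {
  root_fraction / (1 - root_fraction)
}

# Row and plant spacing in m -> planting density in plants per m^2
density_from_spacing <- function(row_spacing_m, plant_spacing_m) {
  1 / (row_spacing_m * plant_spacing_m)
}

# Day of year (1 = January 1) -> calendar date
date_from_doy <- function(doy, year) {
  as.Date(doy - 1, origin = paste0(year, "-01-01"))
}

# Examples with made-up values
dms_to_decimal(40, 6, 36, "N")
root_shoot_from_fraction(0.2)
density_from_spacing(0.75, 0.2)
date_from_doy(60, 2011)
```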