
An ecogeographic study comprises three main phases: project design, data collection/analysis and the ecogeographic products. The project design includes:

     1. Identification of target taxa expertise.

     2. Selection of target taxa taxonomy.

     3. Delimitation of target area.

     4. Design and creation of the database structure (optional).

The data collection and analysis phase includes:

     5. Survey of occurrence data, as well as passport, site and environment, and existing characterization and evaluation data.

     6. Collation of occurrence data into the database.

     7. Data verification.

     8. Data analysis.

The ecogeographic products include:

     9. A CWR occurrence database (which contains raw data).

     10. A conspectus (that summarizes the taxonomic, geographical and ecological data for the target taxa).

     11. A report (which interprets the data and results obtained).

1. Identification of taxa expertise

Taxon experts and experts in the flora of a target area may provide accurate species location and ecological information. They may also be able to recommend relevant grey literature, Floras, monographs and taxonomic databases, as well as the appropriate herbaria and genebanks to visit, and could put the conservationist in contact with other specialists. Experts to contact may include:

  • Researchers in botany, agrobiodiversity and biodiversity conservation, taxonomy, genetics, geography and plant breeding.
  • Herbaria and genebank curators.
  • NGOs working in conservation in the target region or on target crops.

2. Selection of target taxa taxonomy

The most widely accepted taxonomic classification can be determined with the aid of:

  • Target taxon experts.
  • National or global Floras.
  • Taxonomic monographs.
  • Recent taxonomic revisions.
  • Taxonomic databases etc.

It is important to detect existing synonyms to avoid missing specimens that may be identified under synonymous names and to prevent separate treatments of the same taxon. In the context of the development of a National Strategic Action Plan for CWR Conservation (NSAP), this step would already have been undertaken as part of the creation of the CWR checklist.
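
Synonym handling of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the taxon names are invented placeholders, the `SYNONYMS` table stands in for whatever checklist or taxonomic database is actually used, and the `taxon` field name is hypothetical.

```python
from collections import defaultdict

# Invented placeholder names (not real taxa): synonym -> accepted name.
# In practice this mapping would be derived from the CWR checklist.
SYNONYMS = {
    "Exemplum falsum": "Exemplum verum",
    "Alterum verum": "Exemplum verum",
}


def normalise(name):
    """Collapse extra whitespace and standardise capitalisation of a name."""
    parts = name.split()
    if not parts:
        return ""
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def resolve(name, synonyms):
    """Return the accepted name for a synonym, or the cleaned name itself."""
    cleaned = normalise(name)
    return synonyms.get(cleaned, cleaned)


def group_by_accepted(records, synonyms):
    """Group occurrence records under their accepted taxon name."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[resolve(rec["taxon"], synonyms)].append(rec)
    return dict(grouped)


records = [
    {"id": 1, "taxon": "exemplum  falsum"},  # synonym with untidy formatting
    {"id": 2, "taxon": "Exemplum verum"},
    {"id": 3, "taxon": "Tertium quid"},
]
grouped = group_by_accepted(records, SYNONYMS)
```

Grouping by accepted name before any analysis means that specimens filed under a synonym are counted with the taxon they belong to, rather than being missed or treated separately.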

3. Delimitation of the target area

Normally an ecogeographic study should include the whole range of a species’ distribution so as to avoid the problem of non-compatible datasets that can be inherent in multiple surveys of the same taxon. However, given that conservation planning included in an NSAP is undertaken at national level, the whole country should be the target area.

4. (Design and creation of the occurrence database structure)

This step is optional as the use of the existing Occurrence data collation template is recommended (see step 6 for more details).

5. Survey of occurrence data

Sources of data are likely to include:

  • Scientific and ‘grey’ literature: Floras, monographs, recent taxon studies, reports of Environmental Impact Assessment studies (see footnote 1), databases, gazetteers, scientific papers, soil maps, vegetation maps, atlases etc., available both in paper form and as digital files.
  • Existing GIS layers illustrating species distribution.
  • Expert knowledge: contact with taxonomic or geographic experts is likely to provide significant additional data to facilitate the analysis and will also provide an opportunity to gain feedback on the analysis results.
  • Field survey data: where ecogeographic data is too scarce to support a meaningful ecogeographic analysis, it will be necessary to collect fresh data from field observations of the target taxa.
  • For examples of data sources, click here.

In addition, have a careful look at the recommendations listed in Annex A of Castañeda et al. (2011), which aims to facilitate the recording of passport data.

Ideally, occurrence data should be available for every CWR included in the study, though it should be stressed that georeferencing is often required to ensure the necessary data is complete. The broader the sampling of occurrence data the more geographically and ecologically representative the data, and ultimately the results, will be.

6. Collation of occurrence data into database

The use of the existing Occurrence data collation template is recommended. The template caters for different types of data (genebank accessions, herbarium specimens, bibliographic references, internet references, biodiversity or botanical databases (e.g. GBIF), personal communications from experts and field observations) and, if the data is to be used for ecogeographic diversity analyses, the template also helps the user to prepare the data for use in the CAPFITOGEN tools (Parra-Quijano et al. 2016).

Schematic representation of occurrence data verification

7. Data verification

  • Check for duplicates. There are several types of duplicates:

    • Duplicate records: occurrence records that refer to the exact same record but the information came from different sources or was reported twice from the same source. These should be removed from the dataset.
    • Duplicate accessions/herbarium vouchers: genebank accessions or herbarium vouchers that were collected in the same locality, by the same collectors on the same date but held in different institutions. Usually these refer to collections that were divided by the collectors to be distributed among different institutions. These should not be removed from the dataset but should be tagged as duplicates. They may be useful, for example, to give an idea of the amount of seed available; however, they are not relevant for the diversity, gap or climate change analyses.
    • Duplicate populations: populations (with the exact same coordinates) that have been sampled more than once at different dates by the same, or different, collectors. These are neither duplicate records nor duplicate accessions/vouchers and should not be removed from the dataset. They give an indication of how intensively that particular species is being collected, but they are likewise not relevant for diversity, gap or climate change analyses.

  • Check for spelling errors and standardize data format.
  • Georeference all the entries, if possible, using (online) gazetteers, maps, Google Earth etc. (see here for georeferencing resources).
  • Assign a level of geographic precision. Different levels of precision can be assigned to each record. The appropriate level of precision to be considered for each type of data analysis can then be decided upon. For example, to locate areas for active in situ conservation, only very accurate data (levels 1, 2 and 3) might be used, but to map hotspots, both accurate data and coarser scale data might be used (levels 1, 2, 3 and 4).
Examples of location data and their corresponding level of geographic precision (from Maxted et al. 2013).
  • Check for outlier locations. Distribution maps should be created (using GIS if possible) to look for outlier collection sites. All outlying individual records should then be corrected or, if correction is not possible, tagged and not used in the analysis.
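
The duplicate checks and the precision filter described above can be sketched as follows. This is a minimal illustration, not part of the Occurrence data collation template: all field names (`institution`, `accession_id`, `taxon`, `lat`, `lon`, `collector`, `date`, `precision_level`) are hypothetical, a duplicate record is assumed to be one that repeats the same institution and accession identifier, and lower precision levels are assumed to mean more accurate locations.

```python
from collections import defaultdict


def verify(records):
    """Remove duplicate records and tag the other two kinds of duplicates."""
    # 1. Duplicate records: the same record reported more than once -> drop.
    seen = set()
    kept = []
    for rec in records:
        key = (rec["institution"], rec["accession_id"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(dict(rec, duplicate_accession=False,
                         duplicate_population=False))

    # 2. Duplicate accessions/vouchers: same taxon, site, collector and date
    #    (typically held in different institutions) -> keep the first, tag the rest.
    by_event = defaultdict(list)
    for rec in kept:
        event = (rec["taxon"], rec["lat"], rec["lon"],
                 rec["collector"], rec["date"])
        by_event[event].append(rec)
    for group in by_event.values():
        for rec in group[1:]:
            rec["duplicate_accession"] = True

    # 3. Duplicate populations: same taxon and exact coordinates, sampled
    #    again on another occasion -> keep the first, tag the rest.
    by_site = defaultdict(list)
    for rec in kept:
        if not rec["duplicate_accession"]:
            by_site[(rec["taxon"], rec["lat"], rec["lon"])].append(rec)
    for group in by_site.values():
        for rec in group[1:]:
            rec["duplicate_population"] = True
    return kept


def for_analysis(records, max_precision_level):
    """Select untagged records at or below the chosen precision level."""
    return [
        r for r in records
        if not r["duplicate_accession"]
        and not r["duplicate_population"]
        and r["precision_level"] <= max_precision_level
    ]
```

With this layout, tagged records stay in the database (for example, to estimate the amount of seed available) but are left out of the diversity, gap or climate change analyses, and a call such as `for_analysis(records, 3)` keeps only the most accurately located records.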

8. Diversity analyses of collated occurrence data

Data analyses may include:

  • Distribution maps and assessment of sampling bias: these analyses will provide an understanding of how CWR taxa are distributed throughout the target region and will reveal any bias in the occurrence data.

    • Species distribution maps. These may include observed distribution maps (generally a point map based on occurrence data) and predicted distribution maps (which require species distribution modelling). Predicted distribution maps are often used when the occurrence dataset is incomplete for a particular species. Datasets are often incomplete because little survey work has been carried out for a species whose distribution is largely unknown. Species distribution maps are generally created using a Geographic Information System (GIS). Several software packages can be used (click here for examples), but we recommend DIVA-GIS, which is freely available online, or a more sophisticated package (e.g. ArcGIS) for more options for analysis and map presentation. Predicted distribution maps can be produced using species distribution modelling software. We recommend the use of MaxEnt (Phillips et al. 2006) due to its performance when compared with other modelling approaches and its widespread use in conservation analyses (Elith et al. 2006).
    • Assessment of sampling bias. Analysis of bias determines whether species or certain geographic areas are over- or under-represented in an occurrence dataset. The results of this analysis will provide an understanding of how the collation of data has impacted the results of an ecogeographic analysis. In addition, it will reveal under-represented taxa and/or areas that have not been sampled and where the taxon is likely to occur, therefore highlighting the need for further field survey work to obtain a more balanced occurrence dataset (for more information, refer to Hijmans et al. 2001). Regarding geographic sampling bias in particular, Scheldeman and van Zonneveld (2010) present methods, including species accumulation curves and the rarefaction method, that can be undertaken using DIVA-GIS.

  • Ecogeographic diversity analysis: this analysis will provide an understanding of the patterns of ecogeographic diversity across the distribution of each priority CWR taxon. The results can then be used to identify appropriate areas for active in situ conservation or populations suitable for collection and ex situ conservation. An ecogeographic diversity study may involve the production of ecogeographic land characterization maps (ELC maps) (Parra-Quijano et al. 2008, 2012b). These maps identify various ecogeographic scenarios in which a species occurs, which—assuming they are a good proxy for genetic diversity—reflect the adaptations of the studied species that enable it to thrive in that particular set of ecological conditions. As a result, ELC maps help to identify important diversity within a taxon, which links to the potential utilization of intra-CWR diversity for crop improvement. Ecogeographic diversity analysis can be undertaken using the CAPFITOGEN tools (Parra-Quijano et al. 2016) and in general it involves:

    • Deciding whether a species-specific or a generalist ELC map is to be produced. A species-specific ELC map is produced based on the selection of appropriate geophysical/bioclimatic/edaphic variables that are most likely to determine the shape of the geographical distribution of that particular species. The map should reflect the ecogeographical diversity within one species across the target geographic area. It is a more complex and time-consuming approach than a generalist ELC map but is also more biologically meaningful from the species point of view as it is a better reflection of the potential adaptive scenarios for that species.
      In contrast, a generalist map characterizes the target geographic area from an ecogeographic perspective rather than reflecting the potential adaptive scenarios of a specific CWR. Therefore, the selection of variables must be derived from an analysis of the environmental factors that are most likely to limit or condition plant life in that area, rather than those that are likely to be relevant in shaping the distribution of a single species. It is a simple approach and less time-consuming than the former, but it must be acknowledged that different CWR taxa are likely to respond differently to the range of environments in the geographic area, and that the ELC map is a general approximation to the selective pressures that may be generating local adaptation (Iriondo pers. comm. 2016).
    • Selecting ecogeographic variables. If a species-specific ELC map is to be created, then the variables selected must be those that are most likely to define the species’ distribution/adaptive scenarios in a geographic area (Parra-Quijano et al. 2016). On the other hand, if a generalist approach is undertaken, then variables that limit plant life in general across the target geographic area are preferred (Iriondo pers. comm. 2016). Appropriate ecogeographic variables can be selected based on knowledge gained through bibliographic surveys, expert knowledge and statistical analyses. Parra-Quijano et al. (2016) provide some guidelines on how to select the ecogeographic variables, including a detailed explanation of the SelecVar tool from the CAPFITOGEN tool set (Parra-Quijano et al. 2016) that can be used to select appropriate variables. It includes a total of 105 variables (67 bioclimatic, 31 edaphic and 7 geophysical).
    • Creating the ELC map using the selected ecogeographic variables. ELCmapas from the CAPFITOGEN tool set (Parra-Quijano et al. 2016) can be used to create an ELC map.

  • Hotspot and complementarity analyses: these analyses provide an understanding of the patterns of diversity within and among priority CWR taxa. They can be undertaken at species level and at ecogeographic diversity level (and even at genetic level; click here to find how you can obtain CWR genetic data).

    • Hotspot analysis. A ‘hotspot’ refers to an area with a high concentration of CWR or high concentration of ecogeographic/genetic diversity. At species level, a hotspot map based on an occurrence dataset can easily be produced using DIVA-GIS. To produce hotspot maps based on predicted species distribution models, more sophisticated software will need to be used (e.g. ArcGIS) (see here for examples). DIVA-GIS can also be used to produce hotspot maps at the ecogeographical level. This involves overlaying the species’ distribution onto an ELC map and extracting the ecogeographic category for each species occurrence data point. The binomial species-ecogeographic categories produced using this method are then used to produce the hotspot map, rather than using the species itself as the unit.
    • Complementarity analysis. This analysis uses an iterative process of selecting grid squares (or other defined geographic units) to identify the minimum number of sites in the target geographic area that are needed to conserve all priority CWR (Rebelo and Siegfried 1992; 1994a,b). The analysis can be based on a grid (i.e. the target area is divided into grid squares and the grid square size is manually set) or based on an existing network of protected areas (to assess the representativeness of CWR in the network). The first selected grid square/area is the one that contains the highest concentration of the target CWR, and the second is the one with the highest concentration of CWR excluding the taxa already in the first selected grid square/area. This selection process is repeated until the selection of further grid squares/areas would only duplicate taxa already included in the previously selected ones (Rebelo 1992; 1994a,b). The complementarity analysis approach is generally preferred to the hotspot approach because it identifies a network of in situ conservation sites that covers most (if not all) target CWR.

    Like the hotspot analysis, the complementarity analysis can be undertaken at species, ecogeographic or genetic diversity level. At ecogeographic level, the binomial species-ecogeographic category is used as the unit instead of the species name. This involves overlaying the species distribution onto the ELC map and extracting the ecogeographic category for each occurrence data point. Complementarity analysis can be carried out using the tool Complementa from the CAPFITOGEN tool set (Parra-Quijano et al. 2016) for both grid and protected area analysis, or DIVA-GIS can be used for grid analysis only. More information on the establishment of genetic reserves to actively conserve CWR based on these analyses is provided here.

  • Other data analyses:

    • Ecogeographic characterization of the CWR taxon (e.g. using the ecogeographic categories in which the taxon occurs) or of ex situ collections or in situ populations. ECOGEO from the CAPFITOGEN tool set (Parra-Quijano et al. 2016) can be used for this purpose.
    • Mapping and detection of ecogeographic patterns (e.g. identifying whether a CWR occurs on a particular soil type, or whether the frequency of a character state changes along an environmental gradient).
    • Gap analysis, an important step in any conservation planning process (click here for details on how to undertake a gap analysis).
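
The grid-based hotspot and complementarity analyses described above can be illustrated with a short Python sketch. It is not how DIVA-GIS or the CAPFITOGEN Complementa tool are implemented: the function names are hypothetical, occurrences are assumed to be `(unit, latitude, longitude)` tuples, and grid squares are plain latitude/longitude cells of a manually set size. The selection step is a greedy heuristic, so it approximates the minimum number of grid squares rather than guaranteeing it.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, cell_size):
    """Return the (row, column) index of the grid square containing a point."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def units_per_cell(occurrences, cell_size):
    """Map each grid square to the set of units (e.g. taxa) recorded in it."""
    cells = defaultdict(set)
    for unit, lat, lon in occurrences:
        cells[grid_cell(lat, lon, cell_size)].add(unit)
    return dict(cells)


def hotspots(cells):
    """Rank grid squares from highest to lowest number of units."""
    return sorted(cells, key=lambda c: (-len(cells[c]), c))


def complementarity(cells):
    """Greedily select grid squares until every unit is covered.

    Each step picks the square adding the most units not yet covered;
    ties are broken by the smallest (row, column) index.
    """
    remaining = set().union(*cells.values()) if cells else set()
    selected = []
    while remaining:
        best = max(sorted(cells), key=lambda c: len(cells[c] & remaining))
        selected.append(best)
        remaining -= cells[best]
    return selected
```

Passing a `(taxon, ecogeographic category)` pair as the unit instead of the taxon name runs the same analysis at ecogeographic level, and replacing grid squares with protected-area identifiers would give the protected-area variant.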

9. Data synthesis

After the data has been collated and analyzed, the following products should be produced: an occurrence database (which contains raw data after verification and standardization), an (optional) conspectus (which summarizes all of the data collated for each CWR) and a final report (which interprets the data obtained and is usually a part of a conservation planning report or a CWR National Strategic Action Plan/National Strategy).



1 Environmental Impact Assessment (EIA) has been defined by the IAIA and IEA (1999) as “the process of identifying, predicting, evaluating and mitigating the biophysical, social, and other relevant effects of development proposals prior to major decisions being taken and commitments made.” In other words, EIAs assess the possible negative and positive impacts that a project (e.g. highway, dam, building) may have on natural, social and economic aspects. Regarding the biophysical aspect, EIA reports generally provide lists of the flora (and fauna) species that occur in the area where the project is to be developed, and are therefore important sources of species distribution data.

The Interactive Toolkit for Crop Wild Relative Conservation Planning was developed within the framework of the SADC CWR project www.cropwildrelatives.org/sadc-cwr-project (2014-2016),
which was co-funded by the European Union and implemented through ACP-EU Co-operation Programme in Science and Technology (S&T II) by the African, Caribbean and Pacific (ACP) Group of States.
Grant agreement no FED/2013/330-210.