####################
NDVI datasets
####################

Updated: 2026-02-21

.. contents::
   :local:
   :depth: 2

================================================
Introduction: What is NDVI?
================================================

The Normalized Difference Vegetation Index (NDVI) is one of the most widely used remote sensing indicators for monitoring vegetation conditions and dynamics. It is derived from satellite imagery using the contrasting spectral reflectance of vegetation in the red (RED) and near-infrared (NIR) bands, according to the formula:

.. math::

   NDVI = \frac{NIR - RED}{NIR + RED}

NDVI values range from −1 to +1, where:

* Values close to **+1** indicate dense, healthy green vegetation,
* Values around **0** indicate sparse or absent vegetation (e.g., bare soil, urban areas),
* Negative values typically indicate water bodies, snow or clouds.

.. note::

   Some software and visualization workflows may store or display NDVI in rescaled integer ranges (e.g., 0–200 or 0–255) for convenience. In this project, NDVI is reported in the standard **−1 to +1** floating-point scale.

NDVI is widely applied in:

* Monitoring seasonal and inter-annual vegetation dynamics,
* Assessing the effects of land use change, deforestation and drought on vegetation cover,
* Supporting agricultural monitoring and food security analyses,
* Integrating vegetation exposure metrics into environmental epidemiology and public health studies.

In this project, NDVI time series were derived from three satellite products:

* **MODIS MOD13Q1** – 250 m spatial resolution, 16-day composites, from 2000 onwards.
* **Landsat (TM, ETM+, OLI, OLI-2)** – 30 m spatial resolution, annual composites, from 1985 onwards.
* **Sentinel-2 MSI** – 10 m spatial resolution, annual composites, from 2017 onwards.

All three products were processed to generate mean NDVI values at the municipal level for all of continental Brazil: per 16-day composite for MODIS, and per year for Landsat and Sentinel-2.

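
As a minimal illustration of the NDVI formula given in this introduction, the sketch below computes NDVI for three reflectance pairs. The function name ``ndvi`` and the reflectance values are hypothetical and chosen only to land in the three value ranges described above; they are not taken from the project data.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Return (NIR - RED) / (NIR + RED), or NaN when the denominator is zero."""
    denominator = nir + red
    if denominator == 0:
        return math.nan
    return (nir - red) / denominator


# Hypothetical surface reflectance pairs (NIR, RED), for illustration only.
samples = {
    "dense vegetation": (0.50, 0.05),
    "bare soil": (0.30, 0.25),
    "water": (0.02, 0.05),
}

for label, (nir, red) in samples.items():
    print(f"{label}: NDVI = {ndvi(nir, red):.2f}")
```

Returning NaN for a zero denominator is one possible convention; a workflow could equally treat such pixels as missing and drop them before aggregation.
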

================================================
How to access the source imagery
================================================

The satellite imagery used to derive NDVI in this project is freely available through the following platforms:

* The `NASA Earthdata portal `_ provides access to MODIS products, including MOD13Q1, via the LP DAAC (Land Processes Distributed Active Archive Center).
* The `Google Earth Engine Data Catalog `_ provides access to the complete Landsat Collection 2 Level-2 and Sentinel-2 Level-2A surface reflectance collections used in this project.
* The `USGS Earth Explorer `_ portal provides direct access to Landsat Collection 2 Level-2 products.
* The `Copernicus Data Space Ecosystem `_ provides access to Sentinel-2 products (the legacy SciHub endpoint has been discontinued).

================================================
MODIS NDVI (250 m)
================================================

Overview
--------

The MODIS MOD13Q1 product provides vegetation index data at 250 m spatial resolution in 16-day composites, derived from the MODIS sensor onboard NASA's Terra satellite. The product is generated by compositing all daily surface reflectance acquisitions within each 16-day window, selecting the best available pixel observations based on criteria such as low cloud contamination and high vegetation signal.

NDVI values in the MOD13Q1 product are stored as 16-bit signed integers and must be rescaled by a factor of **0.0001** to convert to standard floating-point NDVI values in the −1 to +1 range.

Data Downloading
----------------

MODIS NDVI data were downloaded from the NASA Earthdata portal using the MODIS product **MOD13Q1 (Version 061)**. The data acquisition process:

* Identifies the set of MODIS granules in the sinusoidal grid that cover continental Brazil (South American granules), using a predefined granule reference file for the highest spatial resolution MODIS products.
* Downloads, for each granule and each 16-day acquisition period in the time series (from **2000 to the present**), the corresponding HDF file from NASA's Earthdata system, unless it is already available locally.
* Organizes the data by granule index (horizontal and vertical tile identifiers), facilitating modular retrieval and local storage management.

This procedure ensures that the local archive is complete and avoids redundant downloads of previously acquired granules.

Data Processing
---------------

The MODIS NDVI processing comprises three sequential steps: format and projection conversion, mosaic creation and municipal-level extraction.

1. Format and projection conversion
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each downloaded HDF file is in the native MODIS format, with data stored in the **Sinusoidal** cartographic projection, which is not directly compatible with most geospatial workflows. The pre-processing converts each granule to a standard format:

* **Format conversion** – The NDVI band is extracted from the HDF file and saved as a **GeoTIFF** file, which is widely supported by geospatial tools and libraries.
* **Reprojection** – The data are reprojected from the MODIS Sinusoidal projection to the **WGS 84 Geographic Coordinate System (EPSG:4326)**, ensuring spatial consistency with the Brazilian municipal boundary layer and other datasets.

These conversions are performed using the Geospatial Data Abstraction Library (GDAL).

2. National mosaic creation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After conversion, the individual GeoTIFF granule files covering Brazil are merged into a **single raster mosaic** for each 16-day acquisition date. This mosaic provides wall-to-wall NDVI coverage of continental Brazil for each available date in the time series, and serves as the input for municipal-level extraction.

3. Municipal-level extraction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For each acquisition date, the mean NDVI value is extracted for every municipality in continental Brazil using the official Brazilian municipal boundary layer (IBGE). The extraction applies zonal statistics to the national NDVI mosaic, computing the **mean pixel value** within each municipal polygon.

A **scale factor of 0.0001** is applied to the extracted mean values to convert the raw 16-bit integer NDVI data to floating-point values in the standard −1 to +1 range.

For further details, please refer to the MODIS NDVI section of the `CIDACS GitHub repository `_.

Processing results
------------------

The main output of the MODIS NDVI processing is a tabular dataset in CSV format, where:

* Each **row** corresponds to a Brazilian municipality and a specific 16-day acquisition date.
* Each **column** holds either an identification field (e.g., municipal code, acquisition date) or the mean NDVI value for that municipality and date (after scaling, in the −1 to +1 range).

The dataset covers the period from **2000 to the present** and provides one record per municipality per 16-day composite period.

================================================
Landsat NDVI (30 m)
================================================

Overview
--------

The Landsat-based NDVI product is derived from the Landsat Collection 2 Tier 1 Level-2 surface reflectance archive, processed via Google Earth Engine (GEE). Four Landsat sensor generations are combined into a single harmonized annual time series covering **1985 to 2024**:

* **Landsat 5 TM** (1985–2012) – RED: SR_B3, NIR: SR_B4,
* **Landsat 7 ETM+** (1999–2022) – RED: SR_B3, NIR: SR_B4 (note: SLC-off gaps after 2003 may affect spatial completeness),
* **Landsat 8 OLI** (2013–2024) – RED: SR_B4, NIR: SR_B5,
* **Landsat 9 OLI-2** (2021–2024) – RED: SR_B4, NIR: SR_B5.

Data Processing
---------------

1. Image collection and filtering
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For each sensor, the corresponding surface reflectance image collection hosted in GEE is filtered by:

* **Spatial extent** – Only images intersecting the bounding box of continental Brazil (approximately 74°W to 34°W, 34°S to 5.5°N) are retained.
* **Temporal range** – Each sensor is filtered to its operational period (see above).
* **Cloud cover** – Scenes with overall cloud cover greater than **80%** are excluded at the scene level.

2. Cloud masking
^^^^^^^^^^^^^^^^^

Pixel-level cloud masking is applied using the **QA_PIXEL** quality band, which is part of the Landsat Collection 2 Level-2 product. Pixels flagged as cloud or cloud shadow are excluded from the analysis.

3. Radiometric scaling
^^^^^^^^^^^^^^^^^^^^^^^

Landsat Collection 2 Level-2 surface reflectance values are stored as scaled integers. A linear scaling (multiplicative factor plus additive offset) is applied to both the RED and NIR bands before NDVI calculation, where :math:`DN` is the stored integer value and :math:`\rho` the resulting surface reflectance:

.. math::

   \rho = DN \times 0.0000275 - 0.2

4. NDVI calculation and annual compositing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

NDVI is calculated for each cloud-free pixel in each image. All valid NDVI observations within a given calendar year are then composited by computing the **pixel-wise annual mean** across all available observations, producing a single spatially continuous annual NDVI layer per year.

5. Municipal-level aggregation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For each annual NDVI composite, the mean NDVI value is extracted for every municipality in continental Brazil using the official 2022 IBGE municipal boundary layer (5,570 municipalities), applying a mean reducer at 30 m resolution.

For further details, please refer to the Landsat NDVI section of the `CIDACS GitHub repository `_.

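
The per-pixel logic of the Landsat steps above (cloud masking, radiometric scaling, NDVI calculation and annual mean compositing) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's Earth Engine code: the function names are hypothetical, the sample values are invented, and the QA_PIXEL bit positions (bit 3 for cloud, bit 4 for cloud shadow) follow the Collection 2 bit layout as commonly documented by USGS and should be checked against the product guide.

```python
from statistics import mean
from typing import Iterable, Optional, Tuple

SCALE = 0.0000275      # multiplicative factor from the scaling formula above
OFFSET = -0.2          # additive offset from the scaling formula above
CLOUD_BIT = 3          # assumed QA_PIXEL bit for cloud
CLOUD_SHADOW_BIT = 4   # assumed QA_PIXEL bit for cloud shadow


def to_reflectance(dn: int) -> float:
    """Convert a stored integer value to surface reflectance."""
    return dn * SCALE + OFFSET


def is_clear(qa_pixel: int) -> bool:
    """True when neither the cloud nor the cloud-shadow bit is set."""
    mask = (1 << CLOUD_BIT) | (1 << CLOUD_SHADOW_BIT)
    return (qa_pixel & mask) == 0


def annual_mean_ndvi(
    observations: Iterable[Tuple[int, int, int]]
) -> Optional[float]:
    """Mean NDVI of one pixel over one year.

    Each observation is (red_dn, nir_dn, qa_pixel). Returns None when no
    valid observation remains after masking.
    """
    values = []
    for red_dn, nir_dn, qa_pixel in observations:
        if not is_clear(qa_pixel):
            continue  # drop cloud and cloud-shadow pixels
        red = to_reflectance(red_dn)
        nir = to_reflectance(nir_dn)
        if nir + red == 0:
            continue  # NDVI undefined for a zero denominator
        values.append((nir - red) / (nir + red))
    return mean(values) if values else None


# Invented observations for a single pixel: the second one is flagged as cloud.
year_obs = [(8000, 20000, 64), (9000, 18000, 8), (8200, 19000, 64)]
print(annual_mean_ndvi(year_obs))
```

Note that the scaling must be applied before the ratio is taken: because of the additive offset, NDVI computed directly on the stored integers would differ from NDVI computed on reflectance.
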

Processing results
------------------

The main output is a set of CSV files, one per year (1985–2024), named ``NDVI_por_municipio_{year}.csv``, containing:

* ``CD_MUN`` – Municipal code (IBGE),
* ``NM_MUN`` – Municipality name,
* ``mean`` – Mean annual NDVI value for the municipality (dimensionless, −1 to +1),
* ``year`` – Reference year.

================================================
Sentinel-2 NDVI (10 m)
================================================

Overview
--------

The Sentinel-2-based NDVI product is derived from the Sentinel-2 MSI Level-2A (surface reflectance) collection, also processed via Google Earth Engine (GEE). It provides the highest spatial resolution among the three NDVI products (10 m). NDVI is computed from bands **B8** (NIR, 10 m) and **B4** (RED, 10 m).

Data Processing
---------------

1. Image collection and filtering
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Sentinel-2 SR collection (``COPERNICUS/S2_SR``) is filtered by:

* **Spatial extent** – Continental Brazil bounding box.
* **Temporal range** – From 2017 onwards.
* **Cloud cover** – Scenes with cloudy pixel percentage greater than **80%** are excluded at the scene level.

2. Cloud masking
^^^^^^^^^^^^^^^^^

Pixel-level cloud masking uses the **Scene Classification Layer (SCL band)**, which assigns each pixel to a surface or atmospheric class. Pixels classified as cloud shadow (SCL = 3), medium-probability cloud (SCL = 8), high-probability cloud (SCL = 9) and thin cirrus (SCL = 10) are excluded.

3. NDVI calculation and annual compositing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

NDVI is computed from bands B8 and B4 for each cloud-free pixel.

.. note::

   Sentinel-2 surface reflectance bands may be stored with a scale factor in some archives/platforms. For NDVI specifically, explicit rescaling is typically **not required**: a common multiplicative scale factor cancels out in the ratio, provided RED and NIR share the same scaling and no additive offset is involved.

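
The SCL-based masking and the cancellation of a common scale factor can be checked with a short sketch. The function name ``masked_ndvi`` is hypothetical, the band values are invented, and both the factor 0.0001 and the offset of 1000 are illustrative assumptions rather than values stated by this project.

```python
from typing import Optional

# SCL classes excluded in the cloud-masking step: cloud shadow,
# medium-probability cloud, high-probability cloud, thin cirrus.
EXCLUDED_SCL = {3, 8, 9, 10}


def masked_ndvi(b8: float, b4: float, scl: int) -> Optional[float]:
    """NDVI from B8 (NIR) and B4 (RED), or None for masked/undefined pixels."""
    if scl in EXCLUDED_SCL or b8 + b4 == 0:
        return None
    return (b8 - b4) / (b8 + b4)


# Invented stored values for one vegetated pixel (SCL = 4).
raw_b8, raw_b4 = 4000, 500
factor = 0.0001  # assumed multiplicative scale factor

from_raw = masked_ndvi(raw_b8, raw_b4, 4)
from_scaled = masked_ndvi(raw_b8 * factor, raw_b4 * factor, 4)
print(from_raw, from_scaled)  # equal up to floating-point rounding

# An additive offset does NOT cancel: subtracting a hypothetical offset
# from both bands changes the ratio.
offset = 1000
print(masked_ndvi(raw_b8 - offset, raw_b4 - offset, 4))

# A pixel classified as high-probability cloud is dropped.
print(masked_ndvi(raw_b8, raw_b4, 9))
```

The second call shows why the cancellation argument holds only for purely multiplicative scaling: if an archive applies an additive offset to the stored values, that offset has to be removed before NDVI is computed.
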

All valid observations within the target period are composited by computing the **pixel-wise mean**, producing one annual NDVI layer per year.

4. Municipal-level aggregation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For each annual NDVI composite, the mean NDVI value is extracted for every municipality in continental Brazil using the official 2022 IBGE municipal boundary layer, applying a mean reducer at **10 m** resolution.

For further details, please refer to the Sentinel-2 NDVI section of the `CIDACS GitHub repository `_.

Processing results
------------------

The main output is a set of CSV files, one per year (2017 onwards), named ``NDVI_Sentinel2_{year}_Annual.csv``, containing:

* ``CD_MUN`` – Municipal code (IBGE),
* ``NM_MUN`` – Municipality name,
* ``mean`` – Mean annual NDVI value for the municipality (dimensionless, −1 to +1),
* ``year`` – Reference year,
* ``period`` – Period label (e.g., ``Annual``).

The three NDVI products available in this project differ in spatial resolution, temporal coverage and temporal resolution, reflecting trade-offs between spatial detail, historical depth and acquisition frequency. The table below summarizes the key differences:

.. list-table::
   :header-rows: 1
   :widths: 25 25 25 25
   :align: center

   * - Field
     - MODIS-based NDVI (250 m)
     - Landsat-based NDVI (30 m)
     - Sentinel-2-based NDVI (10 m)
   * - Product name / collection
     - MOD13Q1 – MODIS/Terra Vegetation Indices 16-Day L3 Global 250m
     - Landsat Collection 2, Tier 1, Level-2 Surface Reflectance
     - COPERNICUS/S2_SR (Google Earth Engine)
   * - Sensor(s)
     - MODIS (Terra satellite)
     - Landsat 5 TM (1985–2012), Landsat 7 ETM+ (1999–2022), Landsat 8 OLI (2013–2024), Landsat 9 OLI-2 (2021–2024)
     - Sentinel-2 MSI (Level-2A Surface Reflectance)
   * - Spatial resolution
     - 250 m
     - 30 m
     - 10 m
   * - Temporal coverage
     - 2000 to present
     - 1985 to 2024
     - 2017 onwards
   * - Native temporal resolution
     - 16-day composites
     - Varies by sensor and acquisition (scene-based)
     - Varies by acquisition (scene-based)
   * - Output temporal resolution
     - Per acquisition date (16-day) at municipal level
     - Annual mean NDVI per municipality
     - Annual mean NDVI per municipality
   * - Geographic coverage
     - Continental Brazil (via South American MODIS granules)
     - Continental Brazil
     - Continental Brazil
   * - Native file format / platform
     - HDF (Hierarchical Data Format), Sinusoidal projection
     - Google Earth Engine (image collections)
     - Google Earth Engine (image collections)
   * - Cloud cover filter (scene-level)
     - Not applicable (quality/compositing embedded in product)
     - Scenes with cloud cover > 80% excluded
     - Scenes with cloudy pixel percentage > 80% excluded
   * - Cloud masking (pixel-level)
     - Embedded in product compositing / QA layers
     - QA_PIXEL band
     - SCL band
   * - Output format
     - CSV
     - CSV (one file per year)
     - CSV (one file per year)

================================================
Conclusion
================================================

The three NDVI products, MODIS (250 m, 16-day), Landsat (30 m, annual) and Sentinel-2 (10 m, annual), together provide a comprehensive, multi-scale characterization of vegetation dynamics across all Brazilian municipalities.
Their combination enables long-term trend analysis (Landsat, from 1985), high-frequency monitoring of seasonal cycles (MODIS, from 2000) and fine-scale spatial characterization of recent years (Sentinel-2).

The workflows described here ensure:

* Transparent and reproducible derivation of NDVI from standardized surface reflectance products.
* Consistent cloud masking and compositing procedures across sensors.
* Harmonized spatial units (municipalities) for direct integration with health, environmental and socio-economic datasets.

These NDVI products can be used to characterize vegetation exposure, monitor land cover change and support environmental epidemiology studies at the municipal level across Brazil.

.. rubric:: References

.. [1] Didan K. MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061 [Internet]. NASA EOSDIS Land Processes DAAC; 2021 [cited 2025 Feb 15]. doi:10.5067/MODIS/MOD13Q1.061

.. [2] USGS. Landsat Collection 2 Level-2 Science Product Guide [Internet]. US Geological Survey; [cited 2025 Feb 15]. Available from: https://www.usgs.gov/landsat-missions/landsat-collection-2-level-2-science-products

.. [3] European Space Agency (ESA). Sentinel-2 MSI – Level-2A Product [Internet]. ESA Copernicus; [cited 2025 Feb 15]. Available from: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-2

.. [4] Gorelick N, Hancher M, Dixon M, et al. Google Earth Engine: Planetary-scale geospatial analysis for everyone. *Remote Sensing of Environment*. 2017;202:18–27. doi:10.1016/j.rse.2017.06.031

**Contributors**

.. list-table::
   :header-rows: 1
   :widths: 25 75
   :align: center

   * - Name
     - Affiliation
   * - Henrique Ferreira dos Santos
     - Center for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
   * - José Vinicius Alves
     - Center for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil