Overview

Acute malnutrition remains a critical public health concern, particularly in fragile and humanitarian settings, and often requires targeted interventions. Its spatial distribution is often uneven, with clusters of elevated burden. Understanding where these unusually high-burden areas are is crucial for effective planning, resource allocation, early warning systems, and timely response to alleviate the plight of the affected population.

In this article, you will learn how to use the wowi R package to understand the spatial distribution of acute malnutrition in your dataset. Before we start the walkthrough, it is important to grasp a few key concepts.

The spatial scan approach

To identify statistically significant spatial clusters, analysts frequently rely on SaTScan, a powerful tool for spatial and spatio-temporal scan statistics. SaTScan’s Bernoulli model is particularly suited for binary outcomes such as:

  • “Case” = child is acutely malnourished
  • “Control” = child is not acutely malnourished

It evaluates whether the observed clustering of cases exceeds what would be expected by chance (Kulldorff 2022).
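To make "exceeds what would be expected by chance" concrete, here is a minimal sketch of the standard Bernoulli scan statistic for one candidate circle. It compares the likelihood under separate case proportions inside and outside the circle with the likelihood under one common proportion. The helper names xlogxy() and bernoulli_scan() are hypothetical and are not part of wowi, rsatscan or SaTScan. The input numbers are those of the first cluster in the Kotido output shown later in this article.

```r
# x * log(x / y), with the convention that 0 * log(0) = 0
xlogxy <- function(x, y) ifelse(x == 0, 0, x * log(x / y))

# Hypothetical helper: Bernoulli scan statistics for one candidate cluster.
#   c_in  = cases inside the cluster,  n_in  = children inside the cluster
#   c_tot = cases in the whole area,   n_tot = children in the whole area
bernoulli_scan <- function(c_in, n_in, c_tot, n_tot) {
  c_out <- c_tot - c_in
  n_out <- n_tot - n_in
  # Log likelihood with separate case proportions inside and outside
  ll_alt <- xlogxy(c_in, n_in) + xlogxy(n_in - c_in, n_in) +
    xlogxy(c_out, n_out) + xlogxy(n_out - c_out, n_out)
  # Log likelihood with a single case proportion everywhere
  ll_null <- xlogxy(c_tot, n_tot) + xlogxy(n_tot - c_tot, n_tot)
  expected <- n_in * c_tot / n_tot
  list(
    expected = expected,
    obs_exp = c_in / expected,
    relative_risk = (c_in / n_in) / (c_out / n_out),
    llr = ll_alt - ll_null
  )
}

# 6 cases among 25 children inside the circle,
# against 26 cases among 333 children in the whole district
cluster1 <- bernoulli_scan(c_in = 6, n_in = 25, c_tot = 26, n_tot = 333)
round(unlist(cluster1), 2)
```

The rounded values should agree with the expected cases, observed / expected ratio, relative risk and log likelihood ratio that SaTScan reports for that cluster. SaTScan additionally derives the p-value through Monte Carlo replications, which this sketch does not attempt.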

Using wowi to identify spatial clusters of high rates of acute malnutrition

The wowi R package provides a tidy, reproducible interface for running Bernoulli spatial scan analysis via SaTScan. It depends on the rsatscan R package, which enables the use of SaTScan software from within R. While rsatscan offers general-purpose functionality, wowi was specifically developed for acute malnutrition analysis, tailoring the tools to the needs of nutrition-focused spatial investigations.

What outputs does wowi provide?

The package returns the standard set of outputs generated by SaTScan:

  1. An interactive HTML map displaying the detected clusters. It can be opened in any web browser, where you can zoom in and out, click on the bubbles to see their statistics, and more.
  2. A text-based output with extension .txt containing the results.
  3. Additional GIS-based files (e.g., shapefiles), which can be useful for further geospatial manipulation or integration into other mapping workflows.
  4. In addition to the SaTScan outputs, a tidy data frame summarising the detected clusters, parsed from the text-based file described in point 2. This summary enables easier manipulation and integration into an analyst’s downstream workflow.

Tip 💡

The output described in point 4 offers an insight tailored to IPC Acute Malnutrition users. It indicates whether both the number of survey clusters or enumeration area IDs included in the detected cluster and the total number of children meet the IPC Acute Malnutrition criteria for disaggregated analysis, as recommended when the design effect exceeds 1.3 (IPC Global Partners 2021).

The results are presented in the final column of the table and take one of two possible values:

  • “yes” – the number of cluster IDs is ≥ 5 and the number of children with valid measurements is ≥ 100.

  • “no” – the criteria above are not met.
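The rule above can be sketched in base R as follows. This is an illustration only, not wowi's internal code: the helper name flag_ipc_amn() is hypothetical, and its inputs are modelled on the location_ids and children columns of the tidy data frame.

```r
# Hypothetical helper (not part of wowi): flag whether a detected cluster
# meets the two IPC Acute Malnutrition thresholds described above.
flag_ipc_amn <- function(location_ids, children) {
  # Count the comma-separated cluster IDs in each string, e.g. "10,9" -> 2
  n_ids <- lengths(strsplit(location_ids, ",", fixed = TRUE))
  ifelse(n_ids >= 5 & children >= 100, "yes", "no")
}

# The first cluster has 2 IDs and 25 children; the second is made up
# for illustration and has 6 IDs and 120 children
flag_ipc_amn(
  location_ids = c("10,9", "1,2,3,4,5,6"),
  children = c(25, 120)
)
```

Both conditions must hold at once, so a cluster with many children but fewer than five cluster IDs is still flagged "no".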

From data to clusters: the wowi analysis workflow

The analysis workflow with wowi begins with the standard anthropometric data processing steps: data wrangling and quality checks. For this purpose, wowi relies on the mwana package, which will be installed or updated automatically when you install wowi.

Further downstream in the analysis, the sf package is required to generate shapefile-related outputs. Note that sf is not installed with wowi, so you must install it separately.

Note ⓘ

wowi will not function unless the SaTScan software is installed on your machine. You can download it from: https://www.satscan.org.

Once installed, copy the path to the SaTScan executable on your machine. This is needed for R to locate and run it.

We will now begin the demonstration using the package’s built-in dataset, anthro.

head(anthro)
#> # A tibble: 6 × 11
#>   district cluster   sex   age weight height oedema  muac     y     x precision
#>   <chr>      <dbl> <dbl> <dbl>  <dbl>  <dbl> <chr>  <dbl> <dbl> <dbl>     <dbl>
#> 1 Kotido        15     2  48.9   15.9  108.  n        128  2.93  34.1         8
#> 2 Kaabong      161     1  43.8   15    106.  n        128  3.65  34.1         4
#> 3 Kaabong      161     1  37.0   13.6   91.9 n        149 NA     NA          NA
#> 4 Kaabong      161     1  NA      7    100   n        150 NA     NA          NA
#> 5 Kaabong      160     1  22.5    8.6   74.2 n        119  3.65  34.1         4
#> 6 Kaabong      160     1  32.1   13.4   88.1 n        163 NA     NA          NA

This is a SMART survey dataset, which includes geographical coordinates: the x and y columns, along with the corresponding precision recorded in the precision column.

Note ⓘ

Geographical coordinates are indispensable for spatial cluster detection — without precise location data, spatial analysis simply cannot be performed.
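Before scanning, it can help to see how many records actually carry coordinates. The sketch below re-enters the six rows printed by head(anthro) by hand, reduced to the relevant columns, and counts the usable ones in base R. How wowi itself treats records without coordinates is not covered here, and the object names are illustrative.

```r
# The six rows printed by head(anthro) above, reduced to the relevant columns
coords <- data.frame(
  cluster = c(15, 161, 161, 161, 160, 160),
  y = c(2.93, 3.65, NA, NA, 3.65, NA),
  x = c(34.1, 34.1, NA, NA, 34.1, NA)
)

# TRUE where both coordinates are present
has_coords <- complete.cases(coords[, c("x", "y")])

# Number of records with and without usable coordinates
table(has_coords)

# Keep only the records that can be placed on a map
usable <- coords[has_coords, ]
```

In these six rows, half of the records have no coordinates, which illustrates why this check is worth running on the full dataset.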

Data wrangling

You can scan for spatial clusters using acute malnutrition case definitions based on weight-for-height z-scores (WFHZ) and/or oedema, mid-upper arm circumference (MUAC) and/or oedema, or a combined definition. The data wrangling workflow should be configured accordingly. In this article, we will demonstrate using the first of these: WFHZ and/or oedema.

library(wowi) # provides the built-in anthro dataset
library(mwana)

a <- anthro |>
  mw_wrangle_wfhz(
    sex = sex,
    weight = weight,
    height = height,
    .recode_sex = FALSE
  ) |>
  define_wasting(
    zscores = wfhz,
    edema = oedema,
    .by = "zscores"
  ) |>
  dplyr::rename(
    longitude = x,
    latitude = y
  )
#> ================================================================================
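To show what a WFHZ-and/or-oedema case definition amounts to, here is a minimal base R sketch that uses the conventional threshold of WFHZ below -2. It illustrates the general idea rather than mwana's implementation. The values are made up, and the column name gam is an assumption used for illustration.

```r
# Illustrative values only; not taken from the anthro dataset
children <- data.frame(
  wfhz = c(-2.4, -1.1, 0.3, -3.2, -0.5),
  oedema = c("n", "n", "y", "n", "n")
)

# Case = WFHZ below -2 and/or oedema present (assumed column name: gam)
children$gam <- as.integer(children$wfhz < -2 | children$oedema == "y")

# Cases and controls, as a Bernoulli model would see them
n_cases <- sum(children$gam == 1)
n_controls <- sum(children$gam == 0)
```

Note that the third child counts as a case on oedema alone, despite a positive z-score.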

At this point, you can check the quality of the data. The mwana package documentation explains how to do so.

Running the spatial scan

This is handled by the ww_run_satscan() function. Read its full documentation by typing ?ww_run_satscan in your console or by visiting the package's reference page.

You can run the analysis on either a single-area dataset or a multiple-area dataset. We will begin with the former, followed by the latter.

Single-area data set

## Keep only one district from the data set -----
d <- subset(a, district == "Kotido")
## Load rsatscan ----
library(rsatscan) #<1>

## Set up the parameters ----
results <- ww_run_satscan(
  .data = d,
  filename = "Kotido",
  dir = directory, #<2>
  sslocation = "/Applications/SaTScan.app/Contents/app", #<3>
  ssbatchfilename = "satscan", #<4>
  satscan_version = "10.3.2",
  .by_area = FALSE,
  .scan_for = "high-low-rates",
  .gam_based = "wfhz",
  area = NULL,
  cleanup = TRUE,
  verbose = FALSE
)
  1. Although you will never use the rsatscan package explicitly, you must make it available in your environment. This is because it creates a specific environment object, ssenv, required for wowi to function. If this is not loaded, wowi will not run.

  2. Specify the folder where you would like the output results to be saved for your analysis.

  3. Provide the path to your SaTScan GUI installation. This varies depending on your operating system (OS). For macOS, it is typically “/Applications/SaTScan.app/Contents/app”; for Windows, “C:/Program Files/SaTScan”.

  4. Indicate the name of the SaTScan batch file. For macOS, use “satscan”; for Windows, use “SaTScanBatch64”.
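If the same script has to run on both macOS and Windows, the two OS-specific values from points 3 and 4 can be picked programmatically. The helper below is a hypothetical sketch, not part of wowi or rsatscan. It only knows the typical paths quoted above, so adjust them if SaTScan is installed elsewhere.

```r
# Hypothetical helper (not part of wowi or rsatscan): return the typical
# SaTScan location and batch file name for the current operating system.
satscan_paths <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
    Darwin = list( # macOS
      sslocation = "/Applications/SaTScan.app/Contents/app",
      ssbatchfilename = "satscan"
    ),
    Windows = list(
      sslocation = "C:/Program Files/SaTScan",
      ssbatchfilename = "SaTScanBatch64"
    ),
    # Any other OS: no default is given in this article
    stop("No default SaTScan path known for '", sysname, "'; set it manually.")
  )
}

paths <- satscan_paths("Windows")
paths$sslocation
paths$ssbatchfilename
```

The returned values can then be passed to the sslocation and ssbatchfilename arguments of ww_run_satscan().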

Getting acquainted with the files

The code above will generate nine files in the folder you specified using the dir argument. These files will share a common base name, as defined by the string provided in the filename argument.

#> [1] "Kotido.cas"             "Kotido.clustermap.html" "Kotido.col.prj"        
#> [4] "Kotido.ctl"             "Kotido.geo"             "Kotido.gis.prj"        
#> [7] "Kotido.gis.shp"         "Kotido.gis.shx"         "Kotido.prm"
  • The .cas, .ctl and .geo files are SaTScan input datasets representing cases, controls and geographical coordinates, respectively.
  • The .clustermap.html file is an interactive HTML visualisation with an embedded Google Map displaying the detected clusters.
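  • The .prm file is the SaTScan parameter file for the run.
  • All other files (.col.prj, .gis.prj, .gis.shp and .gis.shx) are related to shapefiles and can be used for further GIS analysis and manipulation.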

Furthermore, we can view the results in a text-based format. To do this, we access the object to which we assigned the results of ww_run_satscan(). In our demo, this object is called results.

To see the text-based results, we use results$.txt. As this file is quite long, for the purposes of this demo we will only display the first 48 rows.

#>  [1] "                                 _____________________________"                                 
#>  [2] ""                                                                                               
#>  [3] "                                        SaTScan v10.3.2"                                        
#>  [4] "                                 _____________________________"                                 
#>  [5] ""                                                                                               
#>  [6] "results"                                                                                        
#>  [7] ""                                                                                               
#>  [8] "Program run on: Sat Aug  2 16:20:17 2025"                                                       
#>  [9] ""                                                                                               
#> [10] "Purely Spatial analysis"                                                                        
#> [11] "scanning for clusters with high or low rates"                                                   
#> [12] "using the Bernoulli model."                                                                     
#> [13] "_______________________________________________________________________________________________"
#> [14] ""                                                                                               
#> [15] "SUMMARY OF DATA"                                                                                
#> [16] ""                                                                                               
#> [17] "Study period.......................: 2025/08/02 to 2025/08/02"                                  
#> [18] "Number of locations................: 36"                                                        
#> [19] "Total population...................: 333"                                                       
#> [20] "Total number of cases..............: 26"                                                        
#> [21] "Percent cases......................: 7.8"                                                       
#> [22] "_______________________________________________________________________________________________"
#> [23] ""                                                                                               
#> [24] "CLUSTERS DETECTED"                                                                              
#> [25] ""                                                                                               
#> [26] "1.Location IDs included.: 10, 9"                                                                
#> [27] "  Coordinates / radius..: (34.113909 N, 3.087933 E) / 1.20 km"                                  
#> [28] "  Span..................: 1.20 km"                                                              
#> [29] "  Population............: 25"                                                                   
#> [30] "  Number of cases.......: 6"                                                                    
#> [31] "  Expected cases........: 1.95"                                                                 
#> [32] "  Observed / expected...: 3.07"                                                                 
#> [33] "  Relative risk.........: 3.70"                                                                 
#> [34] "  Percent cases in area.: 24.0"                                                                 
#> [35] "  Log likelihood ratio..: 3.458213"                                                             
#> [36] "  P-value...............: 0.55"                                                                 
#> [37] ""                                                                                               
#> [38] "2.Location IDs included.: 2, 3, 1"                                                              
#> [39] "  Coordinates / radius..: (33.930353 N, 3.172303 E) / 4.00 km"                                  
#> [40] "  Span..................: 5.88 km"                                                              
#> [41] "  Population............: 35"                                                                   
#> [42] "  Number of cases.......: 0"                                                                    
#> [43] "  Expected cases........: 2.73"                                                                 
#> [44] "  Observed / expected...: 0"                                                                    
#> [45] "  Relative risk.........: 0"                                                                    
#> [46] "  Percent cases in area.: 0"                                                                    
#> [47] "  Log likelihood ratio..: 3.013494"                                                             
#> [48] "  P-value...............: 0.68"

This file isn’t very convenient if we want to perform further manipulation. Don’t worry — ww_run_satscan() has got you covered ☂️. It parses the text-based results into a tidy data frame that you can easily work with 😎. To access it, simply subset the results object like this: results$.df:

#> # A tibble: 4 × 18
#>   survey_area nr_EAs total_children total_cases `%_cases` location_ids geo      
#>   <chr>        <int>          <int>       <int>     <dbl> <chr>        <chr>    
#> 1 Kotido          36            333          26         7 10,9         34.11390…
#> 2 Kotido          36            333          26         7 2,3,1        33.93035…
#> 3 Kotido          36            333          26         7 15           34.11670…
#> 4 Kotido          36            333          26         7 28,30        34.05993…
#> # ℹ 11 more variables: radius <chr>, span <chr>, children <int>, n_cases <int>,
#> #   expected_cases <dbl>, observedExpected <dbl>, relative_risk <dbl>,
#> #   `%_cases_in_area` <dbl>, log_lik_ratio <dbl>, pvalue <dbl>, ipc_amn <chr>
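A typical next step is to keep only the statistically significant clusters. The sketch below re-enters by hand the two clusters printed in the text output above and filters them in base R. The 0.05 threshold is an assumption, not a wowi default. In practice, the same calls would be applied to results$.df.

```r
# The two clusters printed in the text output above, entered by hand
clusters <- data.frame(
  location_ids = c("10,9", "2,3,1"),
  children = c(25L, 35L),
  n_cases = c(6L, 0L),
  relative_risk = c(3.70, 0),
  pvalue = c(0.55, 0.68)
)

# Keep clusters below an assumed significance threshold of 0.05
significant <- subset(clusters, pvalue < 0.05)
nrow(significant)

# Order the clusters from the smallest to the largest p-value
ranked <- clusters[order(clusters$pvalue), ]
```

Neither of these two clusters passes the threshold, so significant is empty here.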

Multiple-area data set

Here, the dataset contains multiple areas. In our example dataset, there are several districts, so we want to run the spatial scan district by district. To do this, we set the filename argument to NULL, set .by_area = TRUE, and pass the column that stores the areas (in this case, district) to the area argument.

For this demo, we will keep just two districts.

## Keep only two districts -----
ds <- subset(a, district == "Kotido" | district == "Abim")
## Set up the parameters ----
multiple_area <- ww_run_satscan(
  .data = ds,
  filename = NULL,
  dir = directory,
  sslocation = "/Applications/SaTScan.app/Contents/app",
  ssbatchfilename = "satscan",
  satscan_version = "10.3.2",
  .by_area = TRUE,
  .scan_for = "high-low-rates",
  .gam_based = "wfhz",
  area = district,
  cleanup = TRUE,
  verbose = FALSE
)

This returns the following outputs:

#>  [1] "Abim.cas"               "Abim.clustermap.html"   "Abim.col.prj"          
#>  [4] "Abim.ctl"               "Abim.geo"               "Abim.gis.prj"          
#>  [7] "Abim.gis.shp"           "Abim.gis.shx"           "Abim.prm"              
#> [10] "Kotido.cas"             "Kotido.clustermap.html" "Kotido.col.prj"        
#> [13] "Kotido.ctl"             "Kotido.geo"             "Kotido.gis.prj"        
#> [16] "Kotido.gis.shp"         "Kotido.gis.shx"         "Kotido.prm"

We can also obtain the tidy data frame:

#> # A tibble: 8 × 19
#>   survey_area nr_EAs total_children total_cases `%_cases` location_ids     geo  
#>   <chr>        <int>          <int>       <int>     <dbl> <chr>            <chr>
#> 1 Kotido          36            333          26         7 10,9             34.1…
#> 2 Kotido          36            333          26         7 2,3,1            33.9…
#> 3 Kotido          36            333          26         7 15               34.1…
#> 4 Kotido          36            333          26         7 28,30            34.0…
#> 5 Abim            31            223          15         6 111,116,117,112… 33.6…
#> 6 Abim            31            223          15         6 135,136,137,133  33.9…
#> 7 Abim            31            223          15         6 127,129,128,130  33.8…
#> 8 Abim            31            223          15         6 126,143,131      33.7…
#> # ℹ 12 more variables: radius <chr>, span <chr>, children <int>, n_cases <int>,
#> #   expected_cases <dbl>, observedExpected <dbl>, relative_risk <dbl>,
#> #   `%_cases_in_area` <dbl>, log_lik_ratio <dbl>, pvalue <dbl>, ipc_amn <chr>,
#> #   area <chr>
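With several areas in one data frame, it is often convenient to summarise or split the results by area. The sketch below re-enters by hand the survey_area, total_children and total_cases columns printed above. The object names are illustrative, and in practice the same calls would be applied to multiple_area$.df.

```r
# Columns re-entered from the printout above: four clusters per district
detected <- data.frame(
  survey_area = rep(c("Kotido", "Abim"), each = 4),
  total_children = rep(c(333L, 223L), each = 4),
  total_cases = rep(c(26L, 15L), each = 4)
)

# Number of detected clusters per district
table(detected$survey_area)

# One list element per district, for district-specific follow-up
by_area <- split(detected, detected$survey_area)
names(by_area)
```

split() orders the list elements alphabetically by area name, so by_area$Abim comes before by_area$Kotido even though Kotido appears first in the data frame.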

References

IPC Global Partners. 2021. Integrated Food Security Phase Classification Technical Manual Version 3.1: Evidence and Standards for Better Food Security and Nutrition Decisions. https://www.ipcinfo.org/ipcinfo-website/resources/ipc-manual/en/.

Kulldorff, Martin. 2022. SaTScan User Guide for Version 10.1. SaTScan: Software for the spatial, temporal, and space-time scan statistics. http://www.satscan.org/.