Using SWAT Model and Field Data to Determine Potential of NASA-POWER Data for Modelling Rainfall-Runoff in Incalaue River Basin

Incalaue is a tributary of Lugenda River in NSR (Niassa Special Reserve) in North-Eastern Mozambique. NSR is a remote, data-poor area, and rainfall-runoff data and scientific methods are needed to inform water resources management decisions across this wide expanse of land. This study assessed the potential of a combination of NASA-POWER (National Aeronautics and Space Administration Prediction of Worldwide Energy Resources) remotely sensed rainfall data and FAO (Food and Agriculture Organization of the United Nations) soil and land use/cover data for modelling rainfall-runoff in Incalaue river basin. The inputs were a DEM (Digital Elevation Model) of 1:250,000 scale and a grid resolution of 30 m × 30 m downloaded from the USGS (United States Geological Survey) website; FAO digital soil and land use/cover maps clipped to the river basin; and field-collected data. The SWAT (Soil and Water Assessment Tool) model was used to assess rainfall-runoff data generated using the NASA-POWER dataset and gauged rainfall and river flow data collected during fieldwork. FAO soil and land use/cover datasets, which are globally available and widely used in the region, were used for comparison with soil data collected during fieldwork. Field-collected data showed that soil in the area is predominantly sandy loam and that only sand content and bulk density were comparable across all sampling sites. Rainfall and river flow trends observed in the field and modelled were confirmed by residents as the trend in the area. This approach was used because there was no historical rainfall and river flow data, since the river basin is ungauged for hydrologic data. The study showed that NASA-POWER data has potential for modelling rainfall-runoff in the basin. The difference in the rainfall-runoff relationship compared with field-collected data could be because of landscape characteristics or a topsoil layer not catered for in the FAO soil data.


Introduction
Modelling of landscape rainfall-runoff to determine amounts and contributing areas is important for land use/cover planning and environmental management, as it offers information on river water source areas [1] [2] [3]. Knowledge of land use/cover (LULC) variations and changes is important in rainfall-runoff studies to determine factors affecting overland flow and water losses. The quantity and characteristics of rainfall-runoff in a landscape are affected by a combination of LULC, slope and soil characteristics, which are unique to each landscape. Modelling landscape hydrology with distributed models is important for understanding river flow changes at spatial and temporal scales [4] [5].
Impacts of climate variability and land use/cover change on landscape hydrology are difficult to determine in ungauged river basins because of the difficulty of estimating meteorological parameters and their surface rainfall-runoff effects [6]. The lack of river flow data is one of the major challenges in river basin hydrology studies, and Predictions in Ungauged Basins (PUB) should carefully limit uncertainty in assessments [7]. The commonly used regionalization approach can be erroneous and should be attempted only with great care, and it is important to use reliable, proven, site-specific online datasets [8]. Global meteorology, surface solar energy and climatology datasets provide important parameters, but these are usually overestimated because their dynamics are resolved only at a broad landscape scale. This challenge in the hydrological sciences was recognized in the International Association of Hydrological Sciences (IAHS) initiative aimed at achieving advances in PUB [9].
Soil water influences vegetation patterns and stands in landscapes, and these are important determinants of rainfall-runoff generation in a river basin [10]. Understanding the factors that influence rainfall-runoff in river basins is important for estimating the environmental management needed to sustain water availability [11]. Conservation ecologists in wildlife areas require knowledge of the spatial distribution of factors that influence rainfall-runoff and of their impacts on water availability in habitats [12].
The SWAT model has been applied in many parts of the world, at various spatial and temporal scales and under various environmental conditions, to predict the impacts of land use/cover and its change on water availability [13] [14] [15] [16]. SWAT is a physically-based, semi-distributed model that can be used at the watershed scale to predict water yields in river basins with different LULC and soils. The model was chosen for this study because of its high adaptability for investigating a wide range of related parameters in river basin rainfall-runoff assessments and its flexibility in ungauged basins [15] [17] [18] [19]. Understanding rainfall-runoff relationships, river flow trends and prediction is necessary to support decision making for sustainable water resources management in river basins [15]. The SWAT model is useful for investigating hydrological processes for water resources planning and management [13] [14] [15]. The objective of this paper was to run a SWAT model and assess the relationship between gauged data and NASA-POWER data using FAO soil data, and to test the potential of this remotely sensed data for river flow prediction in the absence of intensive hydro-meteorological monitoring.

Study Area
Incalaue river basin (695.5 km²) is located in Niassa Special Reserve (NSR), which lies partly in both Cabo Delgado and Niassa Provinces in Northern Mozambique (Figure 1). NSR is a wildlife reserve that hosts scattered human settlements. It is the country's largest protected area, spanning 42,300 km². The reserve is the largest and best-preserved tract of Miombo woodland left in Africa [20]. The region has mean annual rainfall ranging from 800 mm to 1450 mm, and the climate is strongly seasonal, with the annual rainfall occurring over 4-5 months between December and April [21]. In the dry season, rivers have little or no flow, and the variation between seasons creates uncertainty not only for local communities but also for tourism in NSR [20] [22].
Incalaue is a tributary of the Lugenda River whose basin covers a wide expanse in the reserve. Soils are dominated by shallow layers on granite rock which makes them well-drained [23]. Vegetation in the area has broadly been classified as dry woodland [24].
This northern Mozambique region is particularly data-poor, and most research there has addressed only land use/cover, carbon and fire dynamics [20] [25]. There is a mixture of LULC classes dominated by woodland vegetation interspersed with rock inselbergs. River flow levels reduce drastically, and the river usually dries up, during the dry season; the area has few groundwater points; vegetation sheds leaves in the dry season; and river flows in the rainy season sometimes overtop the river banks (Figure 2). There are human settlements at Lisongole and Ntimbo 1 on opposite sides of the river (both ≤10 km away from the nearest river bank), and there is also Mbatamila camp (the administrative field office for reserve management) upstream in the basin. Communities in the basin depend on landscape ecosystem services and biodiversity for their livelihoods [20].

Materials and Methods
Successful prediction of river flows and scenarios requires reducing a wide range of predictive uncertainties arising from rainfall products and from challenges in land use/cover mapping [26]. This study attempts to address this challenge by using the SWAT model to assess rainfall-runoff simulation with field-collected data, to test the reliability of NASA-POWER meteorological data for simulating river flow.
The SWAT model represents natural hydrological processes at the watershed scale using a combination of land use, soil and slope parameters. In this paper, we assess the potential of NASA-POWER data to model rainfall-runoff using its relationship with gauged data when run in the SWAT model.
ing and correction was done using the artificial "sinks" method. In this method, flow direction and flow accumulation grids were used to determine the accumulated weight of each pixel in the downslope direction, and a threshold value of 500 pixels was selected, beyond which all grid pixels were considered to be stream pixels. This approach was also used to map the catchment boundary using the contributing up-slope area method. The model catchment boundaries and stream networks were both generated from the DEM using ArcGIS 10.4.1.
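The stream-definition step described above can be illustrated with a minimal sketch. It assumes a sink-free (already filled) DEM held in a NumPy array and uses a simple D8 steepest-descent rule; the function names `d8_flow_accumulation` and `stream_mask` are hypothetical and this is not the ArcGIS implementation used in the study.

```python
import numpy as np

# The eight D8 neighbour offsets as (row, column) steps.
D8_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def d8_flow_accumulation(dem, cell_size=30.0):
    """Count the cells draining through each cell (itself included).

    Assumes the DEM has already been filled, so flow is strictly downhill.
    """
    rows, cols = dem.shape
    receiver = {}
    for r in range(rows):
        for c in range(cols):
            best_slope, best = 0.0, None
            for dr, dc in D8_OFFSETS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are further away than orthogonal ones.
                    dist = cell_size * (2 ** 0.5 if dr and dc else 1.0)
                    slope = (dem[r, c] - dem[nr, nc]) / dist
                    if slope > best_slope:
                        best_slope, best = slope, (nr, nc)
            receiver[(r, c)] = best  # None marks an outlet or pit

    acc = np.ones(dem.shape, dtype=int)
    # Visit cells from highest to lowest so each upstream total is final
    # before it is passed to the downstream receiver.
    for cell in sorted(receiver, key=lambda rc: dem[rc], reverse=True):
        down = receiver[cell]
        if down is not None:
            acc[down] += acc[cell]
    return acc

def stream_mask(acc, threshold=500):
    """Flag cells whose accumulated upslope count exceeds the threshold."""
    return acc > threshold

# Toy example: a single row sloping to the right.
toy_dem = np.array([[5.0, 4.0, 3.0, 2.0, 1.0]])
toy_acc = d8_flow_accumulation(toy_dem)
toy_streams = stream_mask(toy_acc, threshold=3)
```

With a 30 m grid, a threshold of 500 pixels corresponds to a contributing area of 500 × 900 m² = 0.45 km² before a cell is treated as part of the stream network.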

Land Use and Soil Data
The study area was clipped from the FAO Digital Soil Map of Mozambique. Average physical properties for water holding capacity; hillslope length; hillslope; upslope contributing area; and maximum impervious land cover were assumed to depend mainly on the slope of the basin and were derived automatically from the DEM (Figure 2).
The study used a Landsat multispectral image with a 30-meter resolution for land use classification. The Landsat satellite image scene was obtained from the USGS archives (https://ers.cr.usgs.gov/) (Table 1). The basin has a sharp elevation gradient (799 m.asl to 277 m.asl), and the Incalaue River drains the catchment with its tributaries Nipatembe, Lulo and Manyanganya (Figure 3). There are 6 vegetation classes: high-density woodland, medium-density woodland, low-density woodland, wooded grassland, mountain forests, and wetland (Figure 4). The rest of the basin is built-up area, burned vegetation and inselbergs. Soil samples were taken at the corners of 5 square meter sampling plots, and the samples were uniformly mixed for classification. Soil depth pits were dug in the center of each plot, as deep as possible until hard rock was reached. The soil sampling plots were placed randomly in accessible vegetation classes in the area (Table 2).
Vegetation and soil sampling were done simultaneously at the same location during fieldwork. The vegetation class at each location was confirmed against the mapping of known classes [27] [28]. Soil pits could not be dug in the mountain forest and wetland vegetation classes, which are not accessible because of the hazard posed by the concentration of wildlife in these habitats. In the area, grass, shrub and bush vegetation have been reported to grow roots to at least 1 m depth [29].

Meteorological Data
NASA-POWER satellite-based weather data was used to calibrate the SWAT model for comparison with field data. POWER provides a freely available gridded database of global meteorology and surface solar energy climatology data. The data can be downloaded at a resolution of 0.5° × 0.5° longitude and latitude, making it potentially suitable for hydrometeorological studies. Data generation is funded through the NASA Earth Science Directorate Applied Science Program. NASA-POWER data has largely been used in agroclimatology modelling [16] [30]-[36]. The dataset has also been used for estimating the renewable energy potential in Africa [37]. NASA-POWER data was downloaded for a center point of the catchment at latitude −12.333 and longitude 37.8125.
SWAT model-generated data was used as the historical meteorological record for the basin because the basin is ungauged. The SWAT model includes the WXGEN weather generator, which is used to generate acceptable climatic data for modelling purposes [18] [38]. Where the Green & Ampt method is used for infiltration, the weather generator also generates the distribution of rainfall within the day, in addition to maximum temperature, minimum temperature, solar radiation and relative humidity; wind speed is generated independently. This tool was downloaded from the SWAT website (http://www.brc.tamus.edu/swat/soft_links.html). SWAT WXGEN data was available for the area only up to 2014, and we relied on its relationship with NASA-POWER to adopt the latter for modelling the remaining period. Missing weather data were given a negative value (−99.0), which instructs the weather generator of the model to generate weather data for that day.
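The paper does not state how the point download was made. As a minimal sketch, a request for the catchment center point could be assembled as below. The endpoint, the `AG` community setting and the parameter codes reflect the public POWER daily point API as we understand it and should be checked against the current POWER documentation; the helper name `build_power_request` and the date range are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the POWER daily point service (verify before use).
POWER_DAILY_POINT_URL = "https://power.larc.nasa.gov/api/temporal/daily/point"

# Assumed POWER parameter codes: rainfall, max/min temperature,
# relative humidity, wind speed and surface solar radiation.
DEFAULT_PARAMETERS = (
    "PRECTOTCORR", "T2M_MAX", "T2M_MIN", "RH2M", "WS2M", "ALLSKY_SFC_SW_DWN",
)

def build_power_request(lat, lon, start, end, parameters=DEFAULT_PARAMETERS):
    """Build the query URL for one point; dates are YYYYMMDD strings."""
    query = {
        "parameters": ",".join(parameters),
        "community": "AG",
        "latitude": lat,
        "longitude": lon,
        "start": start,
        "end": end,
        "format": "JSON",
    }
    # Keep commas unescaped so the parameter list stays readable.
    return POWER_DAILY_POINT_URL + "?" + urlencode(query, safe=",")

# Catchment center point from the text; the date range is illustrative only.
url = build_power_request(-12.333, 37.8125, "20010101", "20211231")
```

The resulting URL can be opened with any HTTP client; the response would still need to be reshaped into the daily weather input files that SWAT expects.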
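The missing-value convention described above can be sketched as follows. The sketch assumes observations are held in a dictionary keyed by date and only prepares a gap-free daily series; it does not write any SWAT input file, and the name `fill_missing_days` is hypothetical.

```python
from datetime import date, timedelta

# Flag value that tells the SWAT weather generator to simulate that day.
MISSING = -99.0

def fill_missing_days(observations, start, end):
    """Return (date, value) pairs for every day from start to end inclusive.

    Days without an observation receive the MISSING flag.
    """
    series = []
    day = start
    while day <= end:
        value = observations.get(day)
        # A recorded 0.0 (a dry day) is a valid observation, not a gap.
        series.append((day, MISSING if value is None else value))
        day += timedelta(days=1)
    return series

# Illustrative values only, not data from the study.
example_obs = {date(2020, 1, 1): 5.2, date(2020, 1, 3): 0.0}
example_series = fill_missing_days(example_obs, date(2020, 1, 1), date(2020, 1, 3))
```

Keeping a dry day (0.0 mm) distinct from a missing day matters here, because only the flagged days are handed to the weather generator.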

River Flow Data
Flow data is one of the major challenges to water modelling in ungauged catchments. In this study, the rainfall-runoff model built into SWAT was used to obtain the flow for use in modelling because there was no nearby gauged catchment. There was a good correlation coefficient (0.8) for rainfall-runoff modelled using the FAO dataset for the period 1980-2014, and thus SWAT WXGEN data was adopted for calibration (Figure 5). The coefficient of determination, commonly known as R-squared (or R²), is a measure of the amount of variance in the dependent variable that is explained by the independent variable. It shows the strength of a linear relationship between two variables and examines how differences in one variable can be explained by differences in a second variable. SWAT data available for the area for the years 1980 to 2014 showed rainfall-runoff with flow peaks at the end of one year and the start of the next (October to April). The minimum flow levels, when averaged, showed that monthly river flow can be zero during the dry season, with annual river flow peaks from January to April and minimum flows from June to October. The strong coefficient of determination (R² = 0.8) shows a good water balance in the basin. It indicates that the variance in river flow is explained by rainfall variation, which makes the relationship reliable for generating river flow from rainfall data.
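As an illustration of the statistic defined above, the following minimal sketch computes R² for a simple linear fit of runoff on rainfall using NumPy. The analysis in the study was done in Microsoft Excel; the function name `r_squared` and the sample values are hypothetical.

```python
import numpy as np

def r_squared(rainfall, runoff):
    """Coefficient of determination for a least-squares line runoff ~ rainfall."""
    x = np.asarray(rainfall, dtype=float)
    y = np.asarray(runoff, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)       # fit y = slope * x + intercept
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)        # variance left unexplained
    ss_tot = np.sum((y - y.mean()) ** 2)         # total variance in runoff
    return 1.0 - ss_res / ss_tot

# Illustrative monthly values only, not data from the study.
sample_rain = [10.0, 40.0, 120.0, 200.0, 60.0]
sample_flow = [0.0, 1.5, 6.0, 9.5, 2.0]
sample_r2 = r_squared(sample_rain, sample_flow)
```

For a simple linear fit, R² equals the square of Pearson's correlation coefficient r, so the two measures are not interchangeable: r = 0.8 corresponds to R² = 0.64.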

Community Consultations and Experiences
Household interviews were used to collect data on local community knowledge and experiences of seasonal rainfall-runoff, water availability, and trends in and threats to water availability. This approach was used to gather information to verify modelled data, since the basin is ungauged. Community consultations were held in the settlements of Ntimbo 1 and Lisongole in the downstream area. Sampling was done by randomly selecting household heads or adult family members who had stayed in the area for >20 years. This was used to back up the remotely sensed data on trends and the field-collected data.
The estimated number of households was 123, of which 56 were in Ntimbo 1 and 67 in Lisongole. Interviews were held in the dry season, in the afternoons when people are not in their gardens. This open approach to selecting household interviewees was used to avoid bias, since members were selected randomly depending on availability in homesteads, while ensuring efficiency by sampling adults with experience of the area [39]. The closeness of communities using the river at similar points supports the reliability of the data, including the historical data.

Statistical Data Analyses
Data analysis was done in Microsoft Excel.

Field Collected Data
Soils data showed that wooded grasslands had the most uniformly mixed soil among sampling sites in areas with less stony compacted soil (Table 3). In the area, grass, shrub and bush vegetation have been reported to grow roots to at least one meter depth [29].
The shallow stony soil, which was difficult to dig through in the wet soil zone, is a sign of compaction that leads to higher rainfall-runoff in this hilly landscape. Soil samples were taken to the laboratory at Eduardo Mondlane University in Maputo for analysis, and the results showed the soil groups in each vegetation class (Table 4).
The soils were predominantly sandy, with sand-sized particles dominating (Table 5). This kind of soil in a sloping landscape means more rainfall-runoff and sedimentation.
Soils were mainly composed of particles in the <2 mm size classes, which indicates sandy soil with smaller granules; this makes the soil prone to erosion that can cause dense sedimentation in the river channel. The uniformity of samples from different sampling sites was tested using log-log plots to estimate deviations in distribution across the landscape (Figure 6). The results showed that only sand content and bulk density can be related across all the sampling sites. The results also show that the FAO characterization of the soil misses the topsoil layers and their characteristics, which are the most important for river flow generation, composition and quality. The study thus shows a need for soil characterization to support rainfall-runoff studies. There is no data available on groundwater abstraction in the area; only one borehole was observed within the catchment, at Lisongole village, and another in the nearby town of Mecula. This means that groundwater springs and their contribution to river flow cannot be completely ruled out. However, in the wider landscape and within the basin itself there are dambos, which are shallow vegetated areas with wetter vegetation in the dry season, and micro-dambos were observed to
Computational Water, Energy, and Environmental Engineering

Model Results
NASA-POWER data shows seasonal similarity with gauged rainfall but with much higher values (Figure 7). NASA-POWER data additionally confirmed that June, July, August and September are the dry months, which could imply the reduced river flow and river channel sedimentation in this season, as was confirmed by the SWAT model. The similarity in rainfall trends but weaker rainfall-runoff relationship for gauged data can be attributed to landscape characteristics such as water losses from catchment storage. The NASA-POWER meteorological data shows a good seasonal trend for temperature, relative humidity and rainfall, again supporting its reliability for the study (Figure 8).
SWAT model WXGEN rainfall data generates a closer rainfall-runoff relationship than when NASA-POWER is used (Figure 9). Data collected by this study also showed a positive trend and a good pattern for rainfall and river flow. The peaks for the two years of fieldwork are in December and April, in each case with the dry season starting in May (Figure 10).
The rainfall relationship over the two years of fieldwork shows a positive trend but with a stronger relationship than observed in SWAT model generated WXGEN data and NASA-POWER data ( Figure 11).
Rainfall and river flow data recorded during fieldwork also showed a positive rainfall-runoff relationship, with periods of no rainfall and no river flow, and periods of little rainfall and no river flow because of the sandy soil and catchment water storage. The short data record collected cannot effectively explain the lower coefficient of variation and the relationship.

Field Data Collected from the Community and Observations
Data from the community showed reliance on groundwater springs and confirmed the modelled trends. The study showed that climate variations in the area, more than human factors, influence river water availability and trends (Table 6).
The field data collected and observations were similar to community reports, and the situation means vulnerability for the human settlement community about water availability and sustainability, which requires integrated water resources management and water supply investment.

Figure 10. Field collected rainfall over the study time.
Figure 11. Rainfall-runoff model using gauged data.

Table 6. Community collected data on river flow, water use and land use/cover threats.

| Potential activity | Observed in the field/reported | Potential impact on the environment |
| --- | --- | --- |
| Water availability in the river and trends | Yes (observed sharp reduction in seasonal water availability instream for two years for dry and wet seasons). | Very high flows in the rainy season; in the dry season no water is available for flow in the river, existing only in small pools. |
| Rainfall-runoff patterns and nature | Yes (observed higher rainfall-runoff peak for second monitoring season). Reported rainfall increase and higher flow peaks but longer low flow and no flow seasons. | High hydrologic response slope curve can mean sharp rainfall-runoff hydrograph curve and dry season low water availability instream. |
| Rainfall trends | Change in timing of the rainy season towards an earlier start (October to December), away from December to January; increase in length of dry seasons; increase in amounts for rainfall events; increase in intensity of dry season rain days. | Seasonal changes can affect water availability instream. |
| River flow trends | | |

This study found a rainfall-runoff trend matching that reported by the community, further supporting the potential of the SWAT model to generate rainfall-runoff using NASA-POWER data (Figure 12). There is a small and more stable change in river flow over the peak seasons in the period 2001-2021 studied. This shows an increase in water loss, possibly caused by natural landscape processes and changes rather than human influences, because no rainfall-runoff storage was observed or reported in the catchment.