Open Journal of Statistics

Volume 7, Issue 5 (October 2017)

ISSN Print: 2161-718X   ISSN Online: 2161-7198

Google-based Impact Factor: 0.53

Using Boosted Regression Trees and Remotely Sensed Data to Drive Decision-Making

PP. 859-875
DOI: 10.4236/ojs.2017.75061

ABSTRACT

Challenges in Big Data analysis arise from the way the data are recorded, maintained, processed and stored. We demonstrate that a hierarchical, multivariate, statistical machine learning algorithm, namely the Boosted Regression Tree (BRT), can address Big Data challenges to drive decision-making. The challenge in this study is a lack of interoperability: the data, a collection of GIS shapefiles, remotely sensed imagery, and aggregated and interpolated spatio-temporal information, are stored in monolithic hardware components. The modelling process required one common input file, so the data sources were merged into a structured but noisy input file containing inconsistencies and redundancies. We show that BRT can process different data granularities, heterogeneous data and missingness. In particular, BRT deals with missing data by default, by allowing a split on whether or not a value is missing as well as on what the value is. Most importantly, BRT offers a wide range of possibilities for interpreting results, and variable selection is performed automatically by considering how frequently a variable is used to define a split in the tree. A comparison with two similar regression models (Random Forests and the Least Absolute Shrinkage and Selection Operator, LASSO) shows that BRT outperforms both in this instance. BRT can also be a starting point for sophisticated hierarchical modelling in real-world scenarios. For example, a single-model or ensemble BRT approach could be tested with existing models in order to improve results for a wide range of data-driven decisions and applications.
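The two mechanisms named in the abstract, splitting on missingness and selecting variables by split frequency, can be illustrated with a minimal sketch. This is not the authors' implementation or any library's API: it is a toy gradient-boosting loop over depth-1 trees (stumps) with squared-error loss, written in plain Python. The function names, the synthetic data, and the assumption that missingness is informative (a hypothetical sensor that records nothing for values of 8 or more) are all illustrative choices.

```python
import random

def go_left(x, t, missing_left):
    # A missing value (None) follows the stump's learned default direction.
    return missing_left if x is None else x <= t

def fit_stump(X, r):
    """Find the single split that minimises squared error on residuals r."""
    best = None
    for j in range(len(X[0])):
        observed = sorted({row[j] for row in X if row[j] is not None})
        # Threshold inf sends every observed value left, so with
        # missing_left=False the split is purely "is the value missing?".
        for t in observed[:-1] + [float("inf")]:
            for missing_left in (True, False):
                left = [ri for row, ri in zip(X, r) if go_left(row[j], t, missing_left)]
                right = [ri for row, ri in zip(X, r) if not go_left(row[j], t, missing_left)]
                if not left or not right:
                    continue  # not a real split
                ml, mr = sum(left) / len(left), sum(right) / len(right)
                sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
                if best is None or sse < best[0]:
                    best = (sse, j, t, missing_left, ml, mr)
    return best[1:]

def fit_brt(X, y, n_rounds=20, learning_rate=0.1):
    """Boosting: each stump is fitted to the residuals of the current model."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, t, missing_left, ml, mr = fit_stump(X, residuals)
        stumps.append((j, t, missing_left, ml, mr))
        pred = [p + learning_rate * (ml if go_left(row[j], t, missing_left) else mr)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(ml if go_left(row[j], t, m) else mr
                           for j, t, m, ml, mr in stumps)

def split_counts(model, n_features):
    """How often each variable defines a split: a crude importance measure."""
    counts = [0] * n_features
    for j, *_ in model[2]:
        counts[j] += 1
    return counts

# Synthetic data (assumption): x0 drives y but is unrecorded when >= 8;
# x1 is pure noise and should rarely, if ever, be chosen for a split.
random.seed(0)
X, y = [], []
for _ in range(200):
    x0_true = random.randint(0, 9)
    x0 = None if x0_true >= 8 else x0_true
    x1 = random.randint(0, 9)
    X.append([x0, x1])
    y.append((3.0 if x0_true > 4 else 0.0) + random.gauss(0, 0.1))

model = fit_brt(X, y)
counts = split_counts(model, 2)
print("splits per variable:", counts)
print("prediction with x0 missing:", predict(model, [None, 5]))
```

Because rows with a missing `x0` are routed to whichever side of the split fits them best, the model can predict for them without imputation, and the per-variable split counts give a rough ranking of which inputs the ensemble relied on.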

Share and Cite:

Colin, B. , Clifford, S. , Wu, P. , Rathmanner, S. and Mengersen, K. (2017) Using Boosted Regression Trees and Remotely Sensed Data to Drive Decision-Making. Open Journal of Statistics, 7, 859-875. doi: 10.4236/ojs.2017.75061.

Cited by

[1] Diversity of global fisheries governance: Types and contexts
Fish and …, 2023
[2] Evaluation of the metabolomic profile through 1H-NMR spectroscopy in ewes affected by postpartum hyperketonemia
Scientific reports, 2022
[3] Driving forces of forest expansion dynamics across the Iberian Peninsula (1987–2017): a spatio-temporal transect
Iglesias, M Ninyerola, P Serra… - Forests, 2022
[4] Comparison of Classification Performances of MARS and BRT Data Methods: ABİDE-2016 Case
EGITIM VE BILIM-EDUCATION AND …, 2022
[5] Using Machine Learning to Predict the Risk of Human-Elephant Conflict in the Nepal-India Transboundary Region
2022
[6] MARS ve BRT Veri Madenciliği Yöntemlerinin Sınıflama Performanslarının Karşılaştırılması: ABİDE-2016 Örneği
EĞİTİM VE BİLİM, 2022
[7] Smart Environment Monitoring Models Using Cloud‐Based Data Analytics: A Comprehensive Study
… Approach for Cloud Data Analytics in IoT, 2021
[8] Patterns and drivers of rodent abundance across a South African multi-use landscape
Animals, 2021
[9] Environmental drivers of reef manta ray (Mobula alfredi) visitation patterns to key aggregation habitats in the Maldives
2021
[10] Identifying predictors of international fisheries conflict
2021
[11] Fine‐scale oceanographic drivers of reef manta ray (Mobula alfredi) visitation patterns at a feeding aggregation site
2021
[12] Large-scale High-resolution Coastal Mangrove Forests Mapping across West Africa with Machine Learning Ensemble and Satellite Big Data
2021
[13] Application of machine learning algorithms and their ensemble for landslide susceptibility mapping
2020
[14] Remote islands are vulnerable to non-indigenous species: Utilization of data analytics to investigate potential modes of introduction and pest interceptions
2020
[15] ABİDE 2016 fen başarısının yordanmasında MARS ve BRT veri madenciliği yöntemlerinin karşılaştırılması
2020
[16] Reef manta rays, Mobula alfredi, of the Chagos Archipelago: Habitat use and the effectiveness of the region's marine protected area
2019
[17] Estimating Spatial and Temporal Trends in Environmental Indices Based on Satellite Data: A Two-Step Approach
2019
[18] Serum proteomic analysis of melanoma patients with immunohistochemical profiling of primary melanomas and cultured cells: Pilot study
2019
[19] Data-Driven Decision Making in Precision Agriculture: The Rise of Big Data in Agricultural Systems
2019
[20] Performance indicators in football: The importance of actual performance for the market value of football players
SCIAMUS – Sport und Management, 2019
[22] Relationships in the data
2018
[23] Sam Clifford-Bayesian Statistics
2018
[24] Influence of Spatial Aggregation on Prediction Accuracy of Green Vegetation Using Boosted Regression Trees
Remote Sensing, 2018
[25] Education and Science
[26] Eğitim ve Bilim

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.