Pixel-by-Pixel Analysis of Soil and Leaf Coverage in Purslane: A CIELAB Approach

Abstract

This study used a computer application developed in Visual Studio™ using C# to extract pixel samples (RGB) from multiple images of a purslane pot (26 images obtained from August 20, 2024, to September 22, 2024), taken from a top-down perspective at a distance of 30 cm. These samples were projected into the CIELAB color space, and the extracted pixels were plotted on the a*b* plane, excluding the luminance value. A polygon was then drawn around all the plotted pixels, defining the color to be identified. The application then analyzed another image to determine the number of pixels falling within the polygon. These identified pixels were changed to white, and their percentage relative to the total number of pixels in the image was calculated. This process yielded percentages for brown (soil), green (leaf cover), and pink (stem color). A single polygon was sufficient to accurately identify the green and brown colors in the images. However, due to varying lighting conditions, a customized polygon was necessary for each image to accurately identify the stem color. To validate the green polygon's accuracy in identifying purslane leaves, all leaves in each image were digitized in AutoCAD™, and the green area was compared to the total image area to obtain the observed green percentage. The green percentage obtained with the polygon was then compared to the observed green percentage, resulting in an R2 value of 0.8431. Similarly, an R2 value of 0.9305 was found for the brown color. The stem color was not subjected to this validation because it required multiple polygons. These R2 values were derived from percentage data obtained by analyzing all pixels in the images. When sampling to estimate the proportion and analyzing only the suggested sample size of pixels, R2 values of 0.93049 for brown and 0.8088 for green were obtained.
The average analysis time to determine the brown soil percentage using the polygon (BP) for 26 images with an average size of 1070 × 1210 pixels was 44 seconds. In contrast, sampling to estimate the proportion reduced the analysis time to 0.9 seconds for the same set of images, indicating that substantial time savings can be achieved while obtaining similar results.

Share and Cite:

Quevedo-Nolasco, A., Aguado-Rodríguez, G., Lara-Viveros, F. and Landero-Valenzuela, N. (2025) Pixel-by-Pixel Analysis of Soil and Leaf Coverage in Purslane: A CIELAB Approach. Agricultural Sciences, 16, 227-239. doi: 10.4236/as.2025.162015.

1. Introduction

In Mexico, there are more than 358 plant species known as “quelites”, of which only the tender leaves and stems are consumed; purslane is among them. Its yield is mainly determined by the development of stems and leaves [1]. It is worth mentioning that in purslane, some physicochemical properties of the stems differ from those of the leaves [2]. For this reason, segmentation of regions of interest is an important pre-processing step in many color image analysis procedures, and segmentation of plant objects in digital images is likewise an important pre-processing step for effective phenotyping by image analysis [3].

To classify colors in images, some studies have used the CIELAB color format. The CIELAB space, also written CIE L*a*b*, was established by the International Commission on Illumination [4]. It is a three-dimensional space with axes L*, a*, and b*. The coordinate a* defines the deviation from the achromatic point at a given lightness: towards red if a* > 0 and towards green if a* < 0. Similarly, the coordinate b* defines the deviation towards yellow if b* > 0 and towards blue if b* < 0 [5].
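The mapping from camera RGB values to this a*-b* plane follows standard colorimetry. A minimal Python sketch (assuming sRGB input and a D65 white point, details not specified in the paper) illustrates where a pixel lands:

```python
def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB pixel to CIELAB (assuming a D65 white point)."""
    # 1) Undo the sRGB gamma to get linear RGB in [0, 1].
    def linear(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    rl, gl, bl = linear(r), linear(g), linear(b)
    # 2) Linear RGB -> CIE XYZ (sRGB primaries).
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    # 3) XYZ -> L*a*b*, normalizing by the D65 reference white.
    def f(t):
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

# A leaf-green pixel falls in the a* < 0, b* > 0 quadrant:
L, a, b = srgb_to_lab(50, 160, 60)
```

Consistent with the sign conventions above, a green pixel yields a* < 0 and b* > 0, which is why leaf pixels cluster in that quadrant of the a*-b* plane.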

Reference [6] used the CIELAB color format without considering the effect of luminosity. In agriculture, this format has been used for tomatoes because classifying tomato ripeness levels manually has several drawbacks, namely requiring a long process, having a low level of accuracy, and being inconsistent [7]. Additionally, different strategies have been employed to evaluate color using the CIELAB format, such as [8], who used a color plane in their research utilizing the CIELAB color format. Furthermore, some studies have focused on soil color. This is because soil color is a key indicator of soil properties and conditions, influencing both agronomic and environmental variables. Conventional methods for soil color determination have come under scrutiny due to their limited accuracy and reliability [9].

Due to the above, this study proposes a strategy to estimate the color percentage of purslane (stems and leaves) and soil color using the CIELAB format for quick results via a computer application.

2. Materials and Methods

The research was conducted at the College of Postgraduates. A pot with purslane seeds was placed in unobstructed sunlight, with soil from a plot located at coordinates 19.4602165, −98.9032438 (latitude and longitude). The pot, with a volume of 5.225 dm³ (0.1845 ft³), contained loam soil classified using the texture-by-feel method proposed by the USDA-NRCS [10]. Watering was applied as needed and supplemented with rainwater.

2.1. Image Capture

Pictures were taken from the zenith of the pot, 30 cm above the soil and plants, looking downwards. The images had dimensions of 3060 pixels by 4080 pixels. The images were obtained between August 20, 2024, and September 22, 2024, at 7:00 a.m. (except on September 8th and 18th, when pictures were taken at 5:00 p.m.) to cover the complete cycle of purslane from the emergence of the first leaves to post-flowering. Around midday, taking the photo would result in the user's shadow appearing in the picture, so pictures were taken in the morning to avoid this issue. This allowed capturing different shades of green in the leaves and brown in the soil under various lighting conditions (cloudy and sunny days) and soil moisture levels. As the objective of this study was to identify the green color of the plant (to obtain the leaf area) and the brown color of the soil (to obtain the soil area), it was not necessary to take pictures daily; when there were no significant changes in the plant from one day to the next, no pictures were taken on those days.

2.2. Image Preparation

It is worth mentioning that the main interest in the images was to obtain the color percentages only within the pot and around the contour of the leaf area. To prevent brown pixels outside the pot or the contour of the leaf area from being analyzed, a blue background was added around them in all images (Figure 1). The blue background was chosen because it is clearly distinct from the brown color and the green of the purslane leaves. This procedure was applied to all the images.

Figure 1. Original image of the pot with purslane taken on September 1st, 2024 (left), and image modified with a blue color outside the pot and leaf area (right).

2.3. Process to Obtain the Percentage of Exposed Brown Soil from Images Using Polygons (BP)

A methodology similar to that described by [11] was used. An application programmed in C# in Visual Studio 2010™ was utilized, consisting of three menus. The first menu allows opening and displaying an image. Pixel sample extraction then begins by forming a square of Z × Z pixels, where Z is defined by the user. The user indicates the center of the square by double-clicking on the area of the image showing the brown color (the place from which the RGB pixel values are extracted). The extracted pixels are marked in white on the original image for visualization (Figure 2, left). After converting from RGB to CIELAB, they are projected onto the a*-b* plane, discarding luminosity.
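The sampling step above can be sketched as follows (in Python rather than the authors' C#; the image is assumed to be a 2-D list of (R, G, B) tuples, and clipping the square at the image borders is an assumption):

```python
def sample_square(image, cx, cy, z):
    """Return the RGB pixels inside a z-by-z square centered on (cx, cy).

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    For odd z the square is exactly z pixels on a side; it is clipped
    at the image borders.
    """
    half = z // 2
    height, width = len(image), len(image[0])
    samples = []
    for y in range(max(0, cy - half), min(height, cy + half + 1)):
        for x in range(max(0, cx - half), min(width, cx + half + 1)):
            samples.append(image[y][x])
    return samples
```

Each returned RGB triple would then be converted to CIELAB and plotted on the a*-b* plane as described.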

This process is repeated with several images to identify the different shades of brown under various lighting and soil moisture conditions. In the a* - b* plane, where all the pixels are projected, a polygon is drawn around the identified pixels and their coordinates are obtained (Figure 2, right). The resulting polygon for the brown color had the coordinates: 47, 19; 1, −13; −6, −8; −2, −2; 1, 5; 0, 10; 2, 20; 7, 34; 21, 50; 47, 19.

Figure 2. Areas of extracted pixels from the image marked in white (left), which were projected onto the a*-b* plane of the CIELAB color format (right).

The second menu allows opening an image. The previously constructed polygon, in this case for the brown color, is provided. The image is then analyzed with an algorithm that determines whether each pixel in the image falls within the polygon (for this, a prior transformation from the RGB color format to CIELAB is performed). Only the pixels within the polygon are considered brown. The percentage of exposed brown soil obtained using the polygon (BP) is calculated by dividing the number of brown pixels by the total number of pixels in the image and multiplying by 100. The image is modified by setting the pixels identified as brown to white (Figure 3).
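The membership test is a standard point-in-polygon problem. A ray-casting sketch in Python (the choice of algorithm is an assumption; the paper does not state which test the C# application uses), applied to (a*, b*) coordinates:

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count edge crossings of a horizontal ray from `point`."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def color_percentage(ab_pixels, polygon):
    """Percentage of pixels whose (a*, b*) coordinates fall inside the polygon."""
    hits = sum(point_in_polygon(p, polygon) for p in ab_pixels)
    return 100.0 * hits / len(ab_pixels)
```

An odd number of crossings means the point lies inside the polygon; the same test works for the brown, green, and pink polygons alike.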

Figure 3. Image without the identified brown color (left) and image with pixels identified as brown changed to white (right).

The third menu allows analyzing all the images contained in a previously created folder on the computer. The user provides the folder address, the polygon, and the image format (jpg or png). The program generates a text file with the image name, the total number of pixels, the number of brown pixels, and the percentage of brown color obtained using the polygon (BP). A pixel jump can be defined for faster analysis by examining only a fraction of the pixels. If a pixel jump (PJ) of 1 is indicated, the algorithm goes through all pixels in the image one by one, in both rows and columns, thus analyzing 100% of the pixels. If the pixel jump is 2, the algorithm visits every second pixel in both rows and columns, analyzing only 25% of the pixels in the image. Thus, increasing the pixel jump reduces the number of pixels analyzed.

Finally, in this menu, the percentage of exposed brown soil (BP) in each image within a folder can be estimated through proportion sampling. The sample size for estimating the proportion was calculated assuming a 95% confidence level (corresponding to a significance level of α = 0.05), as proposed by [12], a margin of error of 0.01% (some authors, such as [13], propose a permissible estimation error of around 5%; however, since there were no constraints on sampling cost or time, this stricter margin was used to ensure a larger sample size and greater precision), and an estimated proportion of 50% (0.5). This proportion was used because, when the sample size was estimated at various proportions (5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 95%) using the formula proposed by [14], the largest sample size was always obtained at the 50% proportion. Since each image size could vary (after coloring the background blue, the image size changed slightly), the sample size differed among images.
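The sample-size formula referenced from [14] is the usual one for estimating a proportion. A sketch follows; the finite-population correction is an assumption (included because the paper notes the sample size varied with image size), and the margin of error e = 0.01 is used purely for illustration:

```python
import math

def sample_size(n_pixels, p=0.5, z=1.96, e=0.01):
    """Sample size for estimating a proportion at confidence z and margin of
    error e, corrected for the finite population of n_pixels."""
    n0 = z ** 2 * p * (1 - p) / e ** 2    # infinite-population sample size
    n = n0 / (1 + (n0 - 1) / n_pixels)    # finite-population correction
    return math.ceil(n)

# For an image of 1070 x 1210 pixels (the study's average size):
n = sample_size(1070 * 1210)
```

Note that p = 0.5 maximizes p(1 − p) and therefore the sample size, which matches the paper's conservative choice of the 50% proportion.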
Using this sample size, pixel sampling in an image was done by applying a regular grid. This procedure involved overlaying a grid on the image, with equidistant lines horizontally and vertically, and selecting sampling points at the line intersections throughout the image, ensuring representative and equitable sampling.
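One way to lay out such a grid (a sketch under the assumption that the grid step is chosen so the intersections roughly match the required sample size; the paper does not give the exact construction):

```python
def grid_sample_points(width, height, n_samples):
    """Evenly spaced pixel coordinates whose count approximates n_samples."""
    # Choose the step so that (width/step) * (height/step) ~ n_samples.
    step = max(1, round((width * height / n_samples) ** 0.5))
    offset = step // 2  # start half a step in, so points stay inside the image
    return [(x, y)
            for y in range(offset, height, step)
            for x in range(offset, width, step)]
```

Only the pixels at these grid intersections are then converted to CIELAB and tested against the polygon, which is what makes the sampled analysis so much faster than visiting every pixel.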

2.4. Process to Obtain the Percentage of Green Leaf Coverage (GP) from Images

To obtain the percentage of green leaf coverage (GP), the polygon identified by [11] for this color was used. However, in the case of purslane, some shades of green were not identified. The same procedure used above for the brown color was therefore applied to the green color of the purslane leaves. The following polygon was obtained by adjusting three points of the original polygon from [11] to include all shades of green (Figure 4, left): −13, 85; −10, 67; −7, 26; −4, 14; −5, 0; −10, −6; −20, −8; −50, −10; −71, 5; −85, 32; −79, 65; −58, 75; −13, 85.

Figure 4. Polygon plotted in the a*-b* space to identify green pixels of the purslane leaf coverage (left) and polygons obtained to identify pink shades of the purslane stems (right).

2.5. Process to Obtain the Percentage of Exposed Pink Stems (PP) from Images

To determine the soil coverage percentage more accurately, the percentage of exposed pink stems (PP) inside and outside the pot was obtained. The same procedure described above was applied, this time for the colors of the purslane stems. However, no single polygon could identify all the stems, owing to their similarity to the soil color. The polygon was therefore adjusted for each image until the stems were visually identified. The user must ensure that the desired color pixels are correctly identified: a polygon is correct when stem pixels are detected without including non-stem pixels, whereas selecting too large an area detects soil pixels and introduces errors. In total, 11 different polygons were obtained (Figure 4, right) for the 26 images analyzed. Visually, the polygons correctly identified the stems in all the images, as shown in Figure 5. The stem color was the only color affected by lighting conditions throughout the day and by cloudiness; this is because, in the CIELAB color format, the color of the stems was very similar to the color of the soil used.

Figure 5. Color of purslane stems identified under low soil moisture conditions (left) and high soil moisture conditions (right).

2.6. Observed Percentages of Exposed Brown Soil (BO) and Green Leaf Coverage (GO)

The images were digitized at a 1:1 scale in centimeters using the AutoCAD™ program. The total area of each image was then calculated in cm². Subsequently, in each image, the contour of all the green purslane leaves was digitized (Figure 6, center) using the sketch command and a line size of 0.01 cm. The area of each digitized green surface was calculated, and the total green area of each image was determined. To obtain the observed percentage of green leaf coverage (GO), the total area identified as green in AutoCAD™ was divided by the total area of the image and multiplied by 100.

To obtain the observed percentage of brown soil (BO), the contour of the pot was digitized; where the leaf area protruded beyond the pot, it was also included in the contour (Figure 6, right). This yielded the area of the soil together with the leaf coverage area. This area was divided by the total area of the image and multiplied by 100 to obtain the percentage of soil plus leaves. The observed green leaf coverage percentage (GO) was then subtracted from this percentage to obtain the observed percentage of exposed brown soil (BO).
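The arithmetic of this section can be condensed as follows (the area values in the example are hypothetical, chosen only to illustrate the calculation):

```python
def observed_percentages(total_area, green_area, pot_plus_leaves_area):
    """Observed green coverage (GO) and exposed brown soil (BO)
    from areas digitized in cm^2."""
    go = 100.0 * green_area / total_area                   # observed green coverage
    soil_plus_leaves = 100.0 * pot_plus_leaves_area / total_area
    bo = soil_plus_leaves - go                             # observed exposed brown soil
    return go, bo

# Hypothetical example: a 1000 cm^2 image, 120 cm^2 of digitized leaves,
# and 400 cm^2 inside the pot contour (soil plus protruding leaves):
go, bo = observed_percentages(1000.0, 120.0, 400.0)
```

With these illustrative numbers, GO is 12% and BO is 28%: the green percentage is subtracted from the soil-plus-leaves percentage, never the other way around.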

Figure 6. Original image (left), digitized green areas with the pot circle marked as a reference (center), and pot contour encompassing brown soil and green leaf area (right).

3. Results

The program visually identified the green colors of the leaf area and the brown colors of the soil correctly, as shown for some images under different moisture conditions in Figure 7.

Figure 7. Images with the exterior of the pot covered in blue (A), pixels identified as brown changed to white (B), blue pixels changed to white (C), and pixels identified as green changed to white (D), measured under low (left), medium (center), and high soil moisture conditions (right).

To quantitatively validate the color selection process, a plot was created relating the percentage of brown soil color obtained with the polygon (BP) to the observed percentage of brown soil color (BO), obtained with the areas calculated in AutoCAD™. This plot is shown in Figure 8.

Figure 8. Relationship between the percentage of brown soil color estimated with AutoCAD (BO) and the percentage of brown soil color using the pixel recognition program with the polygon (BP).

The percentage of green leaf coverage estimated with the polygon (GP) was plotted against the percentage obtained with AutoCAD™ (GO) throughout the analyzed period (Figure 9).

Figure 9. Relationship between the percentage of green leaf coverage obtained directly with the areas digitized in AutoCAD (GO) and the percentage of green leaf coverage obtained with the polygon found for the green color (GP).

It is worth mentioning that the percentage of brown color was also estimated indirectly using the polygons (BPI). This percentage is obtained by starting from 100% and subtracting the percentages of blue color (Figure 1, left), green leaf coverage obtained with the polygon (GP), and pink stems (PP) in each image (Figure 5). The percentage of brown color estimated indirectly using the polygons (BPI) was plotted against the observed percentage of brown color obtained with AutoCAD™ (Figure 10).
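The indirect estimate is simply a complement of the other percentages; a one-line sketch with hypothetical values:

```python
def brown_indirect(blue_pct, green_pct, stem_pct):
    """Indirect brown-soil percentage (BPI): what remains after removing the
    blue background, green leaf coverage (GP), and pink stems (PP)."""
    return 100.0 - blue_pct - green_pct - stem_pct

# Hypothetical example: 55% blue background, 12% leaves, 3% stems:
bpi = brown_indirect(55.0, 12.0, 3.0)
```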

Figure 10. Relationship between the observed percentage of brown soil color (BO) and the percentage obtained indirectly using polygons (BPI).

Studies are conducted on samples because it is usually impossible to study an entire population. The sample must therefore be representative of the population. This is best ensured by using proper sampling methods. The sample must also be adequate in size, neither too large nor too small [12].

In this study, the average time to determine the percentage of brown soil color using the polygon (BP) by analyzing 100% of the pixels in 26 images, each averaging 1070 × 1210 pixels, was 44 seconds. In contrast, when sampling was performed to estimate the proportion (BP) using the formula proposed by Daniel [14], the analysis time for the same images was under 1 second (0.9 seconds). An R2 value of 0.93049 was obtained by relating the observed percentage of brown soil color (BO) with that obtained using the pixel recognition program and the polygon (BP). When the same procedure was repeated using sampling instead of all pixels to determine BP, the resulting R2 was 0.93048.

Performing the same procedure for the green color, an R2 value of 0.8431 was obtained by analyzing all the pixels in the image, and 0.8088 by sampling pixels.

It should be noted that in all the images, neither the brown nor the green color was distributed homogeneously; both were concentrated in the center. This can affect the sampling because, with most pixels of interest concentrated in the center of the image, the color percentage (brown or green) may be estimated with error. Reference [12] mentions that a sample smaller than necessary would have insufficient statistical power to answer the primary research question, and a statistically nonsignificant result could merely reflect inadequate sample size (Type 2, or false negative, error). To check for this, determination coefficients (R2) were obtained by relating observed data with polygon-derived data for the brown and green colors (Figure 11) at different pixel jumps.

Figure 11. Relationship between pixel jump (PJ) and determination coefficients (obtained between estimated observed brown color percentage, BO, and brown color percentage obtained with the polygon by sampling in the image with the indicated pixel jump) for brown color (B), and determination coefficients (obtained between estimated observed green color percentage and green color percentage obtained with the polygon by sampling in the image with indicated pixel jump) for green color (G).

4. Discussion

In this study, a methodology was developed and applied to identify and quantify the green and brown colors in images of purslane plants. It was found that the percentages of exposed brown soil (BP) and green leaf coverage (GP) obtained using the polygon in the CIELAB plane showed a high correlation with the values obtained using AutoCAD (observed), with an R2 of 0.93 and 0.84, respectively. This is consistent with Chen et al. [15], who found an r value of −0.84 when predicting soil total organic carbon using the L variable from the CIELAB color format. Using the same L variable, Baek et al. [16] found an R2 greater than 0.9 in their study. Similarly, Nuraini et al. [7] produced a model that detects the ripeness level of tomatoes with an accuracy of 88.194% using the same color format. It is worth mentioning that high R2 values were obtained despite different lighting and soil moisture conditions, which is consistent with Baek et al. [16], who found CIELAB suitable for dealing with irregular light conditions in the field. Other authors, such as Schmidt and Ahn [17], found L* to be one of the most useful variables in their study. The high correlation between the methods suggests that the polygon-based methodology is a reliable tool for identifying and quantifying colors in plant images.

A limitation of this study is that the methodology depends on the user's accuracy in selecting pixels of interest and correctly drawing the polygon in the a*-b* plane, which can introduce variability in results. Future research could focus on automating the pixel selection process and evaluating the methodology in different types of vegetation and environmental conditions. The key to ensuring the methodology works effectively in other crops will be to manage the background color of the image properly. In this study, the green and brown colors were easily identified because they occupied distinct areas of the a*-b* plane in the CIELAB color format. However, the stem (Figure 4, right) and soil (Figure 2, right) colors occupied closely overlapping regions of that plane, so identifying the stems with a single polygon was not possible.

Regarding the estimation of the brown and green color percentages using the polygon method with pixel jumps, the R2 values for the brown color remained constant up to a jump of 75 pixels between measurements, while for green they remained constant only up to a jump of just 10 pixels. This was expected because the brown area in the images varied from 19% to 38%, while the green area varied from only 3% to 13%, as shown in Figures 8 and 9, respectively. Since the green area is smaller, a larger sample is needed to locate green pixels reliably. When sampling to estimate the proportion with image pixels as proposed by Andrade [12], the R2 value was very similar to that obtained by analyzing all pixels of the image. This indicates that it is preferable to rely on sampling to estimate the proportion rather than choosing an arbitrary pixel jump.

According to the results, a practical application of the tool used in this study is the detection of water stress, identification of diseases, and evaluation of soil/weed coverage in a relatively short time for preventive decision-making.

5. Conclusions

In this study, the brown color of the soil and the green color of the purslane leaves were successfully identified with coefficients of determination (R2) of 0.9305 and 0.8431, respectively, using a single polygon for each color. In the case of the stem color, however, because the stems are pink and similar in color to the soil, this technique is not suitable, since a customized polygon is required for each image.

Additionally, the analyses suggest that sampling to estimate the proportion proved to be an adequate technique to obtain values for the desired color percentage, compared to those obtained by analyzing all the pixels in the image. This indicates that sampling can be an efficient and accurate alternative.

The proposed methodology proved to be a reliable and efficient tool for plant image analysis, allowing for precise results in a significantly reduced time when using proportion sampling. However, the accuracy of the method can be affected by variability in pixel selection and polygon drawing, suggesting the need for future research to automate these processes and evaluate the methodology in different types of vegetation and environmental conditions.

In summary, the polygon-based methodology in the CIELAB color space offers a promising solution for color quantification in agricultural images, contributing to the advancement of phenotyping techniques and image analysis in the field of precision agriculture.

Acknowledgements

The authors appreciate the support granted by the National Council of Science, Humanities, and Technology (CONAHCYT) through the project CIR/0027/2022: Strategies for Collecting Rainwater in Semi-Urban Conditions for Agricultural Use. We also thank the Campus Montecillo Postgraduate College for providing the necessary facilities for the development of this research. Additionally, we acknowledge the institutions of the collaborators for providing the conditions for the development of the research.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Lagunes-Fortiz, E., Villanueva-Verduzco, C., Lagunes-Fortiz, E.R., Zamora-Macorra, E.J., Ávila-Alistac, N. and Villanueva-Sánchez, E. (2021) La densidad de siembra en el crecimiento de la verdolaga. Revista Mexicana de Ciencias Agrícolas, 12, 317-329.
https://doi.org/10.29312/remexca.v12i2.2848
[2] Desta, M., Molla, A. and Yusuf, Z. (2020) Characterization of Physico-Chemical Properties and Antioxidant Activity of Oil from Seed, Leaf and Stem of Purslane (Portulaca oleracea L.). Biotechnology Reports, 27, e00512.
https://doi.org/10.1016/j.btre.2020.e00512
[3] Kumar, P. and Miklavcic, S.J. (2018) Analytical Study of Colour Spaces for Plant Pixel Detection. Journal of Imaging, 4, Article 42.
https://doi.org/10.3390/jimaging4020042
[4] Commission Internationale de l’Éclairage (CIE) (1978) Recommendations on Uniform Color Spaces, Color Difference and Psychometric Color Terms. CIE Central Bureau.
[5] Domínguez Soto, J.M., Román Gutiérrez, A.D., Prieto García, F. and Acevedo Sandoval, O. (2018) Sistema de Notación Munsell y CIELab como herramienta para evaluación de color en suelos. Revista Mexicana de Ciencias Agrícolas, 3, 141-155.
https://doi.org/10.29312/remexca.v3i1.1489
[6] Anagnostopoulos, C., Koutsonas, A., Anagnostopoulos, I., Loumos, V. and Kayafas, E. (2005) Tile Classification Using the CIELAB Color Model. In: Sunderam, V.S., van Albada, G.D., Sloot, P.M.A. and Dongarra, J.J., Eds., Computational Science - ICCS 2005, Springer, 695-702.
https://doi.org/10.1007/11428831_86
[7] Nuraini, R., Soares, T.G., Dayurni, P. and Mulyadi, M. (2023) Tomato Ripeness Detection Using Linear Discriminant Analysis Algorithm with CIELAB and HSV Color Spaces. Building of Informatics, Technology and Science (BITS), 5, 523-531.
https://doi.org/10.47065/bits.v5i2.4192
[8] Suzuki, T., Ito, C., Kitano, K. and Yamaguchi, T. (2024) CIELAB Color Space as a Field for Tracking Color-Changing Chemical Reactions of Polymeric pH Indicators. ACS Omega, 9, 36682-36689.
https://doi.org/10.1021/acsomega.4c05320
[9] Rizzo, R., Wadoux, A.M.J., Demattê, J.A.M., Minasny, B., Barrón, V., Ben-Dor, E., et al. (2023) Remote Sensing of the Earth’s Soil Color in Space and Time. Remote Sensing of Environment, 299, Article ID: 113845.
https://doi.org/10.1016/j.rse.2023.113845
[10] USDA-NRCS (2020) Guide to Texture by Feel.
https://www.nrcs.usda.gov/sites/default/files/2022-11/texture-by-feel.pdf
[11] Acuayte-Valdes, E., Barrientos-Priego, A.F., Duran-Peralta, E., Cabrera-Morales, M. and Aguado-Rodríguez, G.J. (2023) CIE L*a*b* Polygon for Quantifying Scab (Sphaceloma perseae) in Avocado Fruit Images. Revista Chapingo Serie Horticultura, 30, 3-15.
https://doi.org/10.5154/r.rchsh.2023.06.004
[12] Andrade, C. (2020) Sample Size and Its Importance in Research. Indian Journal of Psychological Medicine, 42, 102-103.
https://doi.org/10.4103/ijpsym.ijpsym_504_19
[13] Singh, A.S. and Masuku, M.B. (2014) Sampling Techniques and Determination of Sample Size in Applied Statistics Research: An Overview. International Journal of Economics, Commerce and Management, 2, 1-22.
[14] Daniel, J. (2012) Sampling Essentials: Practical Guidelines for Making Sampling Choices. SAGE Publications, Inc.
https://doi.org/10.4135/9781452272047
[15] Chen, Y., Zhang, M., Fan, D., Fan, K. and Wang, X. (2018) Linear Regression between Cie-Lab Color Parameters and Organic Matter in Soils of Tea Plantations. Eurasian Soil Science, 51, 199-203.
https://doi.org/10.1134/s1064229318020011
[16] Baek, S.H., Park, K.H., Jeon, J.S. and Kwak, T.Y. (2022) Using the CIELAB Color System for Soil Color Identification Based on Digital Image Processing. Journal of the Korean Geotechnical Society, 38, 61-71.
https://doi.org/10.7843/kgs.2022.38.5.61
[17] Schmidt, S.A. and Ahn, C. (2021) Analysis of Soil Color Variables and Their Relationships between Two Field-Based Methods and Its Potential Application for Wetland Soils. Science of The Total Environment, 783, Article ID: 147005.
https://doi.org/10.1016/j.scitotenv.2021.147005

Copyright © 2025 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.