A Bipolar Fuzzy Approach to Image Segmentation: Enhancing Similarity Measures and Entropy Computation
1. Introduction
Image segmentation plays a crucial role in digital image processing, enabling the extraction of meaningful structures from complex image data. Traditional segmentation methods, including thresholding, region-growing, and edge detection, often suffer from challenges in handling imprecise boundaries and noise.
The limitations of crisp set theory become evident when attempting to model real-world phenomena with inherent vagueness, ambiguity, and subjectivity. As [1] and [2] suggest, rigid, dichotomous logic often fails to capture the complexity of human thought, perception, and decision-making, which are rarely confined to absolute categories. Traditional mathematical models struggle to balance precision and interpretability, often requiring an unequivocal structure while also needing to reflect the fluid nature of real-world uncertainties.
To bridge this gap, fuzzy set theory offers a more adaptive and expressive framework by allowing for gradual membership, enabling models to handle partial truths and uncertainty more effectively. Unlike crisp logic, fuzzy logic preserves stability even under minor variations in assumptions, making it particularly suitable for complex systems such as artificial intelligence, decision support systems, and natural language processing. By incorporating fuzzy logic, mathematical models gain the ability to represent reality more accurately, ensuring that they remain both mathematically rigorous and practically relevant in domains where strict boundaries fail to capture nuanced interpretations.
In many real-life domains it is necessary to deal with bipolar information: information concerning both what is possible and what is not possible. It is noted in [3] that positive information represents what is granted to be possible, while negative information represents what is granted to be impossible. According to [3], dual Boolean logic lacks the representational and reasoning capabilities needed to model directly the coexistence and interaction of bipolar relationships. While fuzzy logic can model uncertainty, it is limited when it comes to describing polarity. To reconcile these two aspects, bipolar fuzzy set theory combines fuzziness and polarity into a unified model, providing a theoretical basis for bipolar clustering, conflict resolution and coordination [4].
This paper presents a bipolar fuzzy-based computational approach that enhances classical fuzzy segmentation models by incorporating dual membership functions that represent both the presence and absence of image features. The proposed method improves segmentation accuracy and potentially facilitates precise region classification and boundary detection outcomes. Additionally, the approach integrates Bipolar Fuzzy Jaccard Similarity to refine similarity assessments and Bipolar Rényi Entropy to quantify uncertainty, making it highly effective for complex image segmentation tasks. Experimental validation demonstrates its superior performance in applications such as medical imaging, remote sensing, and AI-driven pattern recognition, where precise segmentation is crucial for analysis and decision-making.
2. Theoretical Framework
This section presents the foundational concepts of bipolar fuzzy sets, similarity measures, and entropy functions, which form the basis of the proposed computational approach.
2.1. Fundamental Concepts of BFS
A few basic definitions are given here below as found in [5].

Let $X$ be a nonempty set. A pair $A = (\mu_A^+, \mu_A^-)$ is called a bipolar-valued fuzzy set (or bipolar fuzzy set) in $X$ if $\mu_A^+ : X \to [0, 1]$ and $\mu_A^- : X \to [-1, 0]$ are mappings.

In particular, the bipolar fuzzy empty set (resp. the bipolar fuzzy whole set), denoted by $0_{bp}$ (resp. $1_{bp}$), is a bipolar fuzzy set in $X$ defined by

$$0_{bp}(x) = (0, 0) \quad \text{(resp. } 1_{bp}(x) = (1, -1)\text{)} \quad \text{for each } x \in X.$$

The collection of all bipolar fuzzy sets of the set $X$ will be denoted by $BPF(X)$.
2.2. Entropy and Similarity Measures
A fundamental question in the realm of fuzzy sets involves the use of relative comparisons to calculate the similarities and distances between fuzzy sets [6] [7]. Similarity involves recognizing patterns and making associations which enable one to classify objects and concepts, particularly in the domain of classification and clustering. In these applications, an unknown object is assigned to a particular category if its similarity measure to objects within that category is higher than its similarity to objects in other categories [8].
2.2.1. Similarity
In essence, the similarity of two fuzzy sets is 1 if they are identical i.e., if they both contain the same values with the same degree of membership, and 0 if they have nothing in common, i.e., they do not contain any of the same values. These two properties are referred to as reflexivity and overlapping respectively.
Definition (Similarity Measure)
A function $S : F(X) \times F(X) \to [0, 1]$ is called a similarity measure if it satisfies:

1) Reflexivity: $S(A, A) = 1$, meaning an element always has full similarity to itself.

2) Symmetry: $S(A, B) = S(B, A)$. This property ensures that similarity is bidirectional.

3) Overlapping: $S(A, B) > 0$ whenever $A \cap B \neq \emptyset$. This property ensures that shared characteristics contribute strongly to the similarity score.

4) Transitivity: if $S(A, B) = 1$ and $S(B, C) = 1$, then $S(A, C) = 1$. This property is important for clustering and classification, ensuring that if $A$ is similar to $B$ and $B$ is similar to $C$, then $A$ should be similar to $C$.
It is not necessary for a similarity measure to have all these properties [9]. There are situations in which symmetry need not be satisfied, and it has been debated whether transitivity is necessary or even useful.
2.2.2. The Jaccard Similarity Measure
The Jaccard similarity/index compares two fuzzy sets by dividing the size of their intersection by the size of their union. According to [10], the fuzzy Jaccard similarity between two fuzzy sets $A$ and $B$ in a universal set $X$ is defined as

$$J(A, B) = \frac{\sum_{x \in X} \min\big(\mu_A(x), \mu_B(x)\big)}{\sum_{x \in X} \max\big(\mu_A(x), \mu_B(x)\big)}.$$
Instead of simply counting shared elements, fuzzy Jaccard similarity is computed using the ratio of the sum of the minimum membership values (fuzzy intersection) to the sum of the maximum membership values (fuzzy union).
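This computation can be sketched in a few lines. The function name and the sample membership values below are ours, for illustration only.

```python
# Minimal sketch of the fuzzy Jaccard similarity defined above; the function
# name and sample membership values are ours.

def fuzzy_jaccard(mu_a, mu_b):
    """Sum of pointwise minima (intersection) over sum of maxima (union)."""
    inter = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    union = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    return inter / union if union else 1.0  # two empty fuzzy sets are identical

A = [0.2, 0.7, 1.0, 0.5]
B = [0.4, 0.7, 0.8, 0.1]
# intersection = 0.2 + 0.7 + 0.8 + 0.1 = 1.8, union = 0.4 + 0.7 + 1.0 + 0.5 = 2.6
print(round(fuzzy_jaccard(A, B), 4))  # ≈ 0.6923
print(fuzzy_jaccard(A, A))            # 1.0 (reflexivity)
```

Note that the measure satisfies reflexivity and symmetry directly from the symmetry of min and max.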
2.3. Entropy
Entropy, a fundamental concept in information theory, measures the uncertainty or randomness of a system. Shannon entropy, introduced in 1948, is the most well-known measure of information uncertainty and forms the basis of modern information theory [2]. However, Shannon entropy has limitations when dealing with higher-order moments of probability distributions, long-tailed distributions, or multi-fractal systems. In such scenarios, a more generalized entropy measure is required.
To address these limitations, Alfréd Rényi introduced the Rényi entropy in 1961 as a one-parameter generalization of Shannon entropy [11]. Rényi entropy provides flexibility in quantifying information by introducing an order parameter, α, which allows fine-tuning sensitivity to different probabilities within a distribution. This feature makes it highly useful in fields like machine learning, cryptography, statistical mechanics, quantum computing, and image processing.
2.3.1. Classical Shannon and Rényi Entropies
The Rényi entropy of order $\alpha$ ($\alpha > 0$, $\alpha \neq 1$) for a probability distribution $P = (p_1, p_2, \ldots, p_n)$ is defined as

$$H_{\alpha}(P) = \frac{1}{1 - \alpha} \log \left( \sum_{i=1}^{n} p_i^{\alpha} \right),$$

where $\alpha$ is the order of the entropy, controlling the weighting of probabilities.

Note

1) As $\alpha \to 1$, Rényi entropy converges to Shannon entropy.

2) As $\alpha \to \infty$, Rényi entropy focuses only on the most probable event.
The motivation behind Rényi entropy arises from its ability to:
1) Extend Shannon Entropy: While Shannon entropy assumes an equal weighting of probabilities, Rényi entropy allows adjustable sensitivity to rare or dominant events based on the value of α.
2) Enhance Information Divergence Measures: The Rényi divergence, a generalization of Kullback-Leibler (KL) divergence, provides a more robust way to measure differences between probability distributions [12].
3) Improve Uncertainty Quantification in Fuzzy and Quantum Systems: Many real-world problems involve uncertainty beyond probabilistic randomness—for instance, fuzzy logic and quantum mechanics, where entropy measures must be adapted accordingly [13].
4) Enable Robust Feature Selection and Learning Algorithms: In machine learning and deep learning, Rényi entropy helps in feature selection, adversarial learning, and information-theoretic optimization [12].
5) Enhance Image Processing Methods: Rényi entropy is particularly effective in image segmentation and edge detection, where different levels of entropy sensitivity are required to enhance contrast and extract meaningful structures [14].
Given these advantages, Rényi entropy has become a fundamental tool in information science, artificial intelligence, and complex system analysis.
2.3.2. Entropy on Fuzzy Sets
In the context of fuzzy sets, the concept of the entropy of a fuzzy set was first introduced by [6] as a means of quantifying the degree of fuzziness within a fuzzy set. This laid the foundation for measuring the uncertainty or ambiguity present in fuzzy systems. The formal axiomatization of fuzzy entropy was later developed by [15], who drew parallels between fuzzy entropy and Shannon probability entropy, interpreting it as a measure of the amount of information contained in a fuzzy set.
Further advancements were made by [16] who proposed that the entropy of a fuzzy set could be calculated based on its distance from the nearest crisp (non-fuzzy) set, providing a geometric perspective on fuzziness measurement. Yager [17] [18] offered an alternative interpretation, defining entropy as the distance between a fuzzy set and its complement. According to [16], any meaningful measure of fuzziness should reflect the lack of distinction between a fuzzy set and its negation, leading to the proposal of a metric for quantifying the distance between a fuzzy set and its complement.
2.3.3. Formulation of Entropy on Fuzzy Sets
Formally, in [8] the entropy $E$ of a fuzzy set $A$ defined on a universal set $X = \{x_1, x_2, \ldots, x_n\}$ with membership function $\mu_A$ is expressed as:

$$E(A) = -\frac{1}{n} \sum_{i=1}^{n} \Big[ \mu_A(x_i) \log \mu_A(x_i) + \big(1 - \mu_A(x_i)\big) \log\big(1 - \mu_A(x_i)\big) \Big],$$

where $\mu_A(x_i)$ represents the membership degree of element $x_i$ in the fuzzy set $A$.
Illustrative example

Define $C$ to be the crisp set nearest to $A$, i.e. $\mu_C(x_i) = 0$ if $\mu_A(x_i) \leq 0.5$ and $\mu_C(x_i) = 1$ otherwise. Then the measure of fuzziness of $A$ can be defined as

$$f_p(A) = \frac{2}{n^{1/p}} \left( \sum_{i=1}^{n} \left| \mu_A(x_i) - \mu_C(x_i) \right|^{p} \right)^{1/p}.$$

For $p = 1$, $f_p$ yields the Hamming metric. Since $|\mu_A(x_i) - \mu_C(x_i)| = \min\big(\mu_A(x_i), 1 - \mu_A(x_i)\big)$, this becomes

$$f_1(A) = \frac{2}{n} \sum_{i=1}^{n} \min\big(\mu_A(x_i), 1 - \mu_A(x_i)\big).$$

For $p = 2$ we have the Euclidean metric, and since the same identity holds, this becomes

$$f_2(A) = \frac{2}{\sqrt{n}} \left( \sum_{i=1}^{n} \min\big(\mu_A(x_i), 1 - \mu_A(x_i)\big)^{2} \right)^{1/2}.$$
2.4. Min-Max Normalization in Image Processing
Min-max normalization is a widely used technique in image processing, machine learning, and data preprocessing for transforming data into a predefined range, typically [0, 1] or [−1, 1]. This method is particularly effective in scenarios where pixel intensity values or feature distributions need to be scaled for consistency across different datasets.
This sub-section reviews min-max normalization from the perspective of fuzzy set theory, highlighting its applications, limitations, and recent advancements.
2.4.1. Grayscale Images
In digital topology, a grayscale image can be represented as a digital image where pixel intensities are defined on a discrete grid, and adjacency relations determine connectivity. According to [19], a grayscale image is a function

$$I : D \to \{0, 1, \ldots, L\},$$

where

1) $D \subset \mathbb{Z}^2$ is a finite subset of the integer lattice representing the image domain.

2) $L$ is the maximum intensity level.

3) $I(x, y)$ represents the grayscale intensity value at pixel $(x, y)$, where 0 corresponds to black, $L$ corresponds to white, and intermediate values represent shades of gray [20].
2.4.2. Graph-Based Representation of a Grayscale Image
In the graph-based approach, a grayscale image can be modeled as an undirected graph $G = (V, E)$, where the vertices $V$ represent pixels in the image and the edges $E$ define the adjacency relations. Each vertex $v \in V$ possesses a grayscale intensity value $I(v) \in \{0, 1, \ldots, 255\}$, with 0 corresponding to a low intensity pixel (black) and 255 corresponding to the highest intensity pixel (white). All intermediate values represent shades of gray [20].
Illustrative example

A small grayscale image has a graph representation in which, under the 4-adjacency relation, each pixel is joined to its horizontal and vertical neighbours; under the 8-adjacency relation, the diagonal neighbours are joined as well. (The figures showing the example image and its two adjacency graphs are omitted here.)
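The two adjacency relations can be made concrete with a small sketch that enumerates the edge set of the pixel graph; the function name and grid sizes are ours.

```python
# Hypothetical sketch: 4- and 8-adjacency edge sets for an m x n pixel grid,
# as in the graph-based representation above. Function name is ours.

def grid_edges(m, n, connectivity=4):
    """Undirected edges between adjacent pixels, each pixel keyed (row, col)."""
    steps = [(0, 1), (1, 0)]          # right and down neighbours
    if connectivity == 8:
        steps += [(1, 1), (1, -1)]    # add the two diagonal directions
    edges = set()
    for r in range(m):
        for c in range(n):
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < m and 0 <= cc < n:
                    edges.add(((r, c), (rr, cc)))
    return edges

# A 3x3 image: 12 edges under 4-adjacency, 20 under 8-adjacency.
print(len(grid_edges(3, 3, 4)), len(grid_edges(3, 3, 8)))  # 12 20
```

Listing only the "right/down" (and diagonal) steps from each pixel guarantees each undirected edge is generated exactly once.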
2.4.3. Mathematical Formulation
The min-max normalization ensures that data is linearly scaled within [0, 1], allowing direct interpretation as fuzzy membership values [21]. This transformation is defined by

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$$

where:

1) $x$ is the original data value,

2) $x'$ is the normalized value in the range [0, 1] and,

3) $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the dataset.
2.4.4. Applications in Fuzzy-Based Image Processing
Min-max normalization is widely applied in fuzzy-based image processing, where it ensures that data is transformed into a standardized range suitable for fuzzy logic operations. By normalizing image intensities, feature values, or clustering inputs, it enhances the effectiveness of fuzzy techniques in image segmentation, similarity measurement, and contrast enhancement. Two of the areas where this formulation finds application are:
1) Fuzzy clustering techniques, such as Fuzzy C-Means (FCM), rely heavily on min-max normalization to standardize input features before assigning fuzzy memberships to clusters. Without proper normalization, feature scales may distort clustering results, leading to misclassification [21] [22].
2) In fuzzy-based image retrieval, similarity measures compute distances between images in a feature space. Min-max normalization ensures that feature values fall within a comparable range, enhancing the accuracy of fuzzy histogram similarity measures and fuzzy correlation techniques [23].
2.4.5. Limitations of Min-Max Normalization in Applications
Despite its effectiveness in fuzzy-based image processing, min-max normalization has several limitations that can affect its performance in certain applications. One major drawback is its sensitivity to outliers, as extreme values in the dataset disproportionately influence the scaling range, leading to distorted fuzzy membership values [24]. This can be particularly problematic in medical imaging or low-contrast image analysis, where small variations in intensity carry significant meaning. Additionally, min-max normalization assumes a linear transformation, which may not always be suitable for complex or non-linearly distributed data, such as images with varying lighting conditions or textures [23]. The technique also depends on the fixed minimum and maximum values within the dataset, which can make it ineffective for dynamically changing environments, such as real-time image processing or adaptive fuzzy systems [25]. Moreover, by strictly mapping values between 0 and 1, min-max normalization can lead to loss of contrast and information, particularly when applied to images with already narrow intensity distributions [26].
As a result, alternative normalization methods are often preferred for applications requiring greater robustness and flexibility: sigmoid-based transformations, which use a logistic function to transform the data while preserving fine-grained differences in middle-range values [27], and adaptive fuzzy normalization, which dynamically adjusts min-max scaling based on the data distribution [28].
3. Results
3.1. Conceptual Framework for Bipolar Fuzzy Similarity Measures
In the context of fuzzy image representation, each pixel intensity $I(x)$ in an image is transformed into a single-valued membership function. To extend this idea to the bipolar fuzzy context, each pixel is assigned two membership values, $\big(\mu^+(x), \mu^-(x)\big)$, where $\mu^+(x) \in [0, 1]$ and $\mu^-(x) \in [-1, 0]$ represent the degree of relevance to the desired feature and to undesired features, respectively.

For grayscale images, the bipolar fuzzy membership functions may be defined as

$$\mu^+(x) = \frac{I(x) - I_{\min}}{I_{\max} - I_{\min}},$$

where:

$I(x)$ is the grayscale intensity of the pixel $x$.

$I_{\min}$ is the lowest intensity in the image (typically 0 for black).

$I_{\max}$ is the highest intensity value in the image (usually 255 for white).

This proposed transformation has the property that $\mu^+$ increases with $I(x)$: a pixel with higher intensity will have a higher positive membership.
The negative membership function $\mu^-$ determines how strongly a pixel belongs to the background, noise, or undesired features. By definition this is given by

$$\mu^-(x) = -\frac{I_{\max} - I(x)}{I_{\max} - I_{\min}} = \mu^+(x) - 1.$$

This formulation ensures that brighter pixels have higher $\mu^+$-values and negative memberships closer to 0, and vice versa. In particular, if $I(x) = I_{\max}$ then $\mu^+(x) = 1$ and $\mu^-(x) = 0$. This corresponds perfectly with the fact that, since the positive membership measures the presence of a desired feature, the negative membership naturally represents the presence of the counter property of that feature. Therefore, a pixel of higher intensity will have a negative membership of smaller magnitude, representing a lower presence of the low-intensity characteristic.
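The bipolar membership assignment can be sketched as follows. The helper name and the sample pixel values are ours; $I_{\min}$ and $I_{\max}$ default to the usual 0 and 255.

```python
# Sketch of the bipolar membership assignment above. The helper name and the
# sample pixel values are ours; i_min/i_max default to the usual 0 and 255.

def bipolar_memberships(image, i_min=0, i_max=255):
    """Return (mu_plus, mu_minus) for a flat list of pixel intensities."""
    span = i_max - i_min
    mu_plus = [(i - i_min) / span for i in image]   # in [0, 1]
    mu_minus = [(i - i_max) / span for i in image]  # in [-1, 0]; equals mu_plus - 1
    return mu_plus, mu_minus

mp, mn = bipolar_memberships([0, 51, 204, 255])
print(mp)  # [0.0, 0.2, 0.8, 1.0]
print(mn)  # [-1.0, -0.8, -0.2, 0.0]
```

Brighter pixels receive $\mu^+$ closer to 1 and $\mu^-$ closer to 0, matching the discussion above.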
3.1.1. Bipolar Fuzzy Similarity Measures
To compare two bipolar fuzzy images $A$ and $B$, we extend classical fuzzy similarity measures to account for both positive and negative memberships. The general form of a bipolar fuzzy similarity measure may be expressed as:

$$S_B(A, B) = w^+ S^+(A, B) + w^- S^-(A, B),$$

where:

$S^+(A, B)$ is the similarity between the positive memberships.

$S^-(A, B)$ is the similarity between the negative memberships.

$w^+$ and $w^-$ are weighting factors that determine the relative importance of the positive and negative similarities.
We now propose a characterization of the Jaccard similarity for bipolar fuzzy sets. For two bipolar fuzzy sets $A = (\mu_A^+, \mu_A^-)$ and $B = (\mu_B^+, \mu_B^-)$ on $X$, it may be defined as

$$J_B(A, B) = \frac{\sum_{x \in X} \left[ \min\big(\mu_A^+(x), \mu_B^+(x)\big) + \max\big(\mu_A^-(x), \mu_B^-(x)\big) \right]}{\sum_{x \in X} \left[ \max\big(\mu_A^+(x), \mu_B^+(x)\big) - \min\big(\mu_A^-(x), \mu_B^-(x)\big) \right]}.$$

In this formulation, the numerator captures the overlap in positive memberships while accounting for dissimilarities in negative memberships, and the denominator normalizes the measure to ensure that $-1 \leq J_B(A, B) \leq 1$.
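The measure can be sketched directly from this formulation. Each bipolar fuzzy set is represented as a list of $(\mu^+, \mu^-)$ pairs; the function name and the sample values are ours.

```python
# Sketch of the bipolar Jaccard characterization above. Each bipolar fuzzy
# set is a list of (mu_plus, mu_minus) pairs; names and sample values are ours.

def bipolar_jaccard(a, b):
    """Bipolar Jaccard similarity; the result lies in [-1, 1]."""
    num = sum(min(ap, bp) + max(am, bm)
              for (ap, am), (bp, bm) in zip(a, b))
    den = sum(max(ap, bp) - min(am, bm)
              for (ap, am), (bp, bm) in zip(a, b))
    return num / den if den else 0.0

A = [(0.8, -0.2), (0.3, -0.7)]
B = [(0.6, -0.4), (0.4, -0.6)]
s = bipolar_jaccard(A, B)
# num = (0.6 - 0.2) + (0.3 - 0.6) = 0.1; den = (0.8 + 0.4) + (0.4 + 0.7) = 2.3
print(round(s, 4))       # ≈ 0.0435
print(-1.0 <= s <= 1.0)  # True
```

Because the negative memberships enter the numerator with their (non-positive) values, dissimilar regions can pull the score below zero, which is how negative similarity values arise later in the paper.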
3.1.2. Bipolar Jaccard Similarity for Grayscale Images Using Min-Max Normalization
In this section, we present a systematic step-by-step approach for computing the bipolar Jaccard similarity between two grayscale images for experimental validation. The utility of this approach can be seen in domains where intensity differences carry critical information and therefore the bipolar approach is potentially more effective in smoothening out minor variations thereby making similarity computation more robust against unwanted variations in pixel intensity.
This step-by-step approach is presented as a means of validating the effectiveness of the proposed Bipolar Fuzzy Jaccard Similarity, demonstrating its ability to enhance similarity computations.
Suppose we have two grayscale images, as below.
Image 1
Image 2
Step 1: Compute the minimum and maximum pixel intensities

Find the minimum and maximum pixel values, $I_{\min}$ and $I_{\max}$, across both images in order to normalize them on a common scale.

Step 2: Compute the positive membership function $\mu^+$ using min-max normalization

For every pixel $x$, min-max normalization gives

$$\mu^+(x) = \frac{I(x) - I_{\min}}{I_{\max} - I_{\min}}.$$

This is applied to Image 1 and, similarly, to Image 2.

Step 3: Compute the negative membership function $\mu^-$

The negative membership function is defined by $\mu^-(x) = \mu^+(x) - 1$, again applied to both images.

Step 4: Compute the bipolar Jaccard similarity

The bipolar Jaccard similarity is then computed from the two bipolar fuzzy images. It can be verified that, for this pair of images, the result is approximately $-0.077$.
The Bipolar Jaccard Similarity value of −0.077 suggests that the images have slightly more dissimilarities than similarities. This means that while some pixels have similar intensity values, there are still noticeable variations. The negative membership values indicate a higher degree of dissimilarity than similarity (if the similarity score were closer to 1, it would suggest that the images are nearly identical, whereas a score approaching −1 would indicate significant differences between them).
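The four steps can be run end-to-end as a single routine. The pixel values below are hypothetical stand-ins (the paper's example image matrices are not reproduced here), and Step 4 follows the bipolar Jaccard characterization used in this paper.

```python
# The four steps above, end-to-end, for two small hypothetical images given as
# flat lists of pixel intensities (the paper's example matrices are not
# reproduced here; the values below are stand-ins).

def bipolar_jaccard_minmax(img1, img2):
    # Step 1: minimum and maximum intensity across BOTH images.
    lo, hi = min(img1 + img2), max(img1 + img2)
    # Step 2: positive memberships by min-max normalization.
    p1 = [(v - lo) / (hi - lo) for v in img1]
    p2 = [(v - lo) / (hi - lo) for v in img2]
    # Step 3: negative memberships, mu_minus = mu_plus - 1.
    n1 = [m - 1 for m in p1]
    n2 = [m - 1 for m in p2]
    # Step 4: bipolar Jaccard similarity, in [-1, 1].
    num = sum(min(a, b) for a, b in zip(p1, p2)) + \
          sum(max(a, b) for a, b in zip(n1, n2))
    den = sum(max(a, b) for a, b in zip(p1, p2)) - \
          sum(min(a, b) for a, b in zip(n1, n2))
    return num / den

img1 = [40, 120, 200, 90]    # hypothetical 2x2 image, row-major
img2 = [60, 110, 180, 100]
print(round(bipolar_jaccard_minmax(img1, img2), 3))  # ≈ -0.086
```

As with the paper's example, two broadly similar images of middling brightness produce a small negative score, since the negative memberships penalize the mismatched regions.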
3.1.3. Bipolar Fuzzy Jaccard Similarity Using Sigmoid-Based Membership
Instead of using linear min-max normalization, we define sigmoid-based fuzzy membership functions by

$$\mu^+(x) = \frac{1}{1 + e^{-\lambda\,(I(x) - c)}}, \qquad \mu^-(x) = \mu^+(x) - 1,$$

where:

$I(x)$ is the pixel intensity.

$c$ is the threshold intensity.

$\lambda$ controls the steepness of the function (chosen as 0.1 for this example).
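The sigmoid memberships can be sketched as follows. The threshold $c = 128$ is our illustrative assumption; the steepness $\lambda = 0.1$ follows the text, and $\mu^- = \mu^+ - 1$ mirrors the min-max construction.

```python
import math

# Sketch of the sigmoid-based memberships above. The threshold c = 128 is our
# illustrative assumption; the steepness 0.1 follows the text, and
# mu_minus = mu_plus - 1 mirrors the min-max construction.

def sigmoid_memberships(image, c=128, steep=0.1):
    mu_plus = [1 / (1 + math.exp(-steep * (v - c))) for v in image]
    mu_minus = [m - 1 for m in mu_plus]
    return mu_plus, mu_minus

mp, mn = sigmoid_memberships([98, 128, 158])
print([round(m, 3) for m in mp])  # [0.047, 0.5, 0.953]
```

Intensities far from $c$ saturate toward 0 or 1, compressing extreme values, which is the contrast-compression effect noted in the comparison that follows.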
Suppose we have two grayscale images, as below.
Image 1
Image 2
Then, computing this in Python using the bipolar fuzzy Jaccard formula, we obtain a Bipolar Fuzzy Jaccard Similarity of −0.1120.
See Appendix xx for the Full Python Implementation of the Bipolar Fuzzy Jaccard Similarity Algorithm.
The BJS with the sigmoid transformation (−0.112) is lower than with the min-max transformation (−0.077), implying that the two functions assign bipolar fuzzy values differently. This difference could result from:
1) Min-max transformation preserves relative pixel differences, making it more sensitive to slight variations.
2) Sigmoid-based transformation compresses extreme values, emphasizing mid-range intensities, which affects how differences in brightness levels contribute to similarity.
3) The penalty for mismatches (negative membership function) behaves differently in sigmoid-based transformation, leading to a more negative similarity score in some cases.
Key differences between the two approaches

| Normalization Method | Effect on Membership Values | Effect on Similarity Score |
| --- | --- | --- |
| Min-Max Normalization (linear) | Preserves relative intensity differences | More sensitive to small pixel changes, leading to lower similarity if pixel values differ |
| Sigmoid-Based Membership (non-linear) | Compresses extreme values, emphasizing mid-range intensities | Less sensitive to small differences but may yield negative similarity scores due to contrast compression |
3.2. Extending Rényi Entropy to Bipolar Fuzzy Sets
To extend Rényi entropy to bipolar fuzzy sets, we define a new entropy function that considers both the positive and negative membership degrees of elements in a BFS.
3.2.1. Bipolar Fuzzy Probability Distribution
Given a bipolar fuzzy set $A = (\mu_A^+, \mu_A^-)$ on $X = \{x_1, \ldots, x_n\}$, the bipolar fuzzy probability distribution may be defined as

$$p_i^+ = \frac{\mu_A^+(x_i)}{\sum_{j=1}^{n} \mu_A^+(x_j)},$$

where $p_i^+$ represents the positive probability contribution, and

$$p_i^- = \frac{\left| \mu_A^-(x_i) \right|}{\sum_{j=1}^{n} \left| \mu_A^-(x_j) \right|},$$

where $p_i^-$ represents the negative probability contribution.

Using these probability distributions, we can define the bipolar Rényi entropy (BRE) by

$$H_{\alpha}^{B}(A) = \frac{1}{2(1 - \alpha)} \left[ \log \left( \sum_{i=1}^{n} (p_i^+)^{\alpha} \right) + \log \left( \sum_{i=1}^{n} (p_i^-)^{\alpha} \right) \right].$$

This formulation ensures that the definition encapsulates both the supporting and opposing membership degrees.
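A minimal sketch of this construction, assuming the positive and negative membership profiles are normalized into two probability distributions whose Rényi entropies are averaged; names and sample values are ours.

```python
import math

# Minimal sketch of the bipolar Rényi entropy, assuming (as above) that the
# positive and negative membership profiles are normalized into probability
# distributions whose Rényi entropies are averaged. Names and values are ours.

def renyi(p, alpha):
    return math.log(sum(x ** alpha for x in p)) / (1 - alpha)

def bipolar_renyi_entropy(mu_plus, mu_minus, alpha):
    p_pos = [m / sum(mu_plus) for m in mu_plus]
    p_neg = [abs(m) / sum(abs(x) for x in mu_minus) for m in mu_minus]
    return 0.5 * (renyi(p_pos, alpha) + renyi(p_neg, alpha))

mu_p = [0.9, 0.6, 0.3]
mu_n = [-0.1, -0.4, -0.7]
h = bipolar_renyi_entropy(mu_p, mu_n, alpha=2)
print(h >= 0)  # True (non-negativity)

# When the negative profile mirrors the positive one, BRE collapses to the
# classical Rényi entropy of the positive distribution.
h_same = bipolar_renyi_entropy(mu_p, [-m for m in mu_p], alpha=2)
print(abs(h_same - renyi([m / sum(mu_p) for m in mu_p], 2)) < 1e-12)  # True
```

The second check illustrates the reduction property discussed next: with mirrored profiles, $p_i^- = p_i^+$ and the average of two equal Rényi entropies is just the classical one.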
3.2.2. Properties of the Proposed BRE
1) If $\mu_A^-(x_i) = -\mu_A^+(x_i)$ for all $x_i$ (the negative profile mirrors the positive one), then $p_i^- = p_i^+$ and $H_\alpha^B(A)$ reduces to the classical Rényi entropy of $P^+$. The proposed BRE for bipolar fuzzy sets therefore generalizes the standard Rényi entropy.
2) The parameter $\alpha$ enables dynamic control over the entropy computation, with higher $\alpha$ emphasizing dominant elements.
To rigorously examine the proposed BRE, we analyze its mathematical properties, ensuring its validity as a generalization of classical Rényi entropy.
Non-negativity
A fundamental property of entropy measures is that they must be non-negative, i.e., $H_\alpha^B(A) \geq 0$.

Proof

Each $p_i^+ \in [0, 1]$ and each $p_i^- \in [0, 1]$, with $\sum_i p_i^+ = \sum_i p_i^- = 1$. For $\alpha > 1$ we have $\sum_i (p_i^{\pm})^{\alpha} \leq 1$, so both logarithms are non-positive, while $\frac{1}{1 - \alpha} < 0$; for $0 < \alpha < 1$ we have $\sum_i (p_i^{\pm})^{\alpha} \geq 1$, so both logarithms are non-negative, while $\frac{1}{1 - \alpha} > 0$. In either case each term of $H_\alpha^B(A)$ is non-negative. Therefore $H_\alpha^B(A) \geq 0$.
Monotonicity
BRE, like classical entropy measures, is expected to be monotonically decreasing in each probability contribution, meaning that as the probability contribution $p_i$ increases, the uncertainty associated with it decreases, signifying a reduction in uncertainty.

Proof

Consider the information content of a single outcome,

$$f(p) = -\log p.$$

Since $p_i^+ \in (0, 1)$ and $p_i^- \in (0, 1)$, their logarithms are negative, so $f(p) > 0$, and

$$f'(p) = -\frac{1}{p} < 0,$$

which implies $f$ is monotonically decreasing in $p$. Therefore the uncertainty contribution associated with each element is monotonically decreasing in its probability.
Regarding the monotonicity of entropy, it is important to note that maximum entropy occurs when the probability distribution is uniform, and the entropy decreases as the probability distribution becomes more skewed towards a few specific events. Therefore, intuitively, with more information about a system, the uncertainty about its state decreases, leading to a decrease in entropy.
Concavity
Entropy is generally expected to be concave. To establish that the BRE is built from concave components, we compute the second derivative of the component function and show that it is non-positive, i.e., $g''(p) \leq 0$.

Proof

Differentiating the component function $g(p) = \frac{p^{\alpha}}{1 - \alpha}$ a second time,

$$g''(p) = \frac{\alpha(\alpha - 1)}{1 - \alpha}\, p^{\alpha - 2} = -\alpha\, p^{\alpha - 2} \leq 0$$

for all $p \in (0, 1]$ and every admissible $\alpha > 0$, $\alpha \neq 1$, so each component is concave in $p$.
4. Conclusion
A key contribution of this work is the formulation of the Bipolar Fuzzy Jaccard Similarity, which refines similarity assessments by considering both shared and opposing characteristics in segmented regions. Additionally, the paper extends Rényi entropy to the bipolar fuzzy domain, enhancing the quantification of uncertainty in image segmentation tasks. The integration of bipolar fuzzy similarity and entropy measures provides a robust and interpretable computational framework for image segmentation, particularly in domains requiring comprehensive image analysis.