Explainable Deep Fake Framework for Images Creation and Classification
1. Introduction
Deep learning is a practical and efficient technique that has been widely applied in various domains, including computer vision and natural language processing (NLP). It has revolutionized these fields by enabling machines to learn and make predictions from large amounts of data [1]. As deepfakes improve, it becomes harder to distinguish between real and fake content. Despite the development of numerous deepfake detection and classification techniques, these systems frequently fail to identify deepfakes in practical scenarios, especially when images are altered with new approaches not included in the training set. Face recognition technology, one of the categories of biometric security, verifies a person’s identity against a face database [2]. It uses deep learning-based networks [3] [4] to recognize and learn certain face patterns, and a mathematical representation is created from the face-related data.
In recent years, deep learning-based techniques have been developed to identify and categorize deepfake images, and numerous studies have been conducted to better understand how deepfakes operate [5] [6]. These studies suggest that the Xception model, a convolutional neural network (CNN) [7] architecture introduced by François Chollet in the 2017 paper “Xception: Deep Learning with Depthwise Separable Convolutions”, performs best on datasets that have fewer elements and manipulation techniques, because it appears to be better at capturing particular anomalies. On the other hand, when trained on a wider variety of datasets, the Vision Transformer performs better. Deepfake generation is carried out using generative adversarial networks (GANs) based on artificial intelligence [8].
This paper introduces three models. The first, Instant ID, uses ID embedding, a way of keeping the identity of the reference image while letting it change styles easily. The second, Xception (Deep Learning with Depthwise Separable Convolutions), classifies real and deepfake images. The third, Local Interpretable Model-Agnostic Explanations (LIME), provides a method for interpreting the predictions of any machine learning model in a local and interpretable manner: it approximates the model locally around the prediction of interest using a simpler, interpretable model, such as a linear model or decision tree.
The study will contribute to the understanding of deep learning techniques [9] [10] [11] and their application in addressing the challenges posed by deepfakes, ultimately aiming to improve the ability to distinguish between real and manipulated content in images.
The research work is outlined as follows:
• The Xception model is used to classify images and to distinguish deepfake images from real ones.
• The Xception model is designed to achieve high accuracy on image classification tasks while being computationally efficient and requiring fewer parameters compared to other popular CNN models such as VGGNet and ResNet.
• The classification of the collected dataset is shown to be more accurate, reaching 100% accuracy, than other state-of-the-art deepfake studies.
• The LIME technique is used to visually interpret individual predictions generated by the model, emphasize important features, and provide explanations for the model’s predictions.
2. Related Work
Convolutional Neural Networks (CNNs) [12] [13] appear to be better at storing certain abnormalities and perform well when dealing with datasets that include fewer elements and manipulation techniques, according to an investigation of many deep learning architectures. On the other hand, training the Vision Transformer with a wider variety of datasets increases its effectiveness.
Rafique et al. [14] proposed a framework that combines error-level analysis and deep learning for deep fake detection and classification. The framework involves performing error-level analysis on the image to determine if it has been modified, followed by deep feature extraction using Convolutional Neural Networks (CNNs). The extracted features are then classified using Support Vector Machines (SVMs) and K-Nearest Neighbors (KNN) algorithms.
Sugant et al. [15] focused on deep fake face recognition using deep learning techniques. Their work implements deep fake face image analysis using the Fisherface algorithm with Local Binary Pattern Histogram (FF-LBPH). The proposed model includes the use of CNNs for deep fake detection and classification.
Silva et al. [16] proposed a hierarchical interpreting forensics algorithm that incorporates humans in the detection loop. The work curates data through a deep learning detection algorithm and shares an explainable decision with humans alongside forensic analyses on the decision region.
Other articles apply the variable analysis (VA) method, which identifies a small number of features for robust deep fake detection and classification [17]-[21]. Decision trees (DT) and logistic regression (LR) are used to illustrate the efficacy of the suggested model. The dataset in these studies was obtained from UCI and Kaggle, and the findings showed that logistic regression performed better than other classifiers.
Today, deep learning (DL) [22] [23] [24] algorithms play a significant role in deepfake detection and classification. Several studies have shown that DL is more effective in classifying ADS than ML [25] [26] [27]. Raj et al. [28] demonstrated that deep learning methods, specifically Convolutional Neural Networks (CNNs), outperformed traditional ML methods for deepfake detection. An LSTM-RNN model has also been proposed for automated deepfake detection [29] [30].
The field of image forensics develops techniques to detect manipulated images. This comprehensive review covers state-of-the-art methods and datasets, benefiting researchers in this field. These insights provide a glimpse into the diverse and evolving landscape of deepfake detection and classification using deep learning and explainable techniques [31] [32] [33] [34].
The main drawback of the previous research works is less accurate classification in most cases. Our work creates a deepfake image from the original one and classifies the real and deepfake images. Additionally, it provides a method for interpreting the predictions of any machine learning model in a local and interpretable manner.
3. Methodology
The purpose of this research is to introduce an explainable deepfake framework for image creation and classification. The framework consists of three main parts: the first, Instant ID, is used to create a deepfake image from the original one; the second, the Xception deep learning model, classifies the real and deepfake images; the third, LIME, provides a method for interpreting the predictions of any machine learning model in a local and interpretable manner.
Figure 1 explains the main three steps of the proposed model.
3.1. Data Preparation
This step includes two main phases: 1) Data Pre-processing and 2) Data Splitting. In the following paragraphs, each phase will be described in detail:
• Data pre-processing: This step involves identifying and handling any inconsistencies, errors, or missing values present in the collected data. Techniques such as removing duplicates, imputing missing values, or correcting inconsistencies are employed to ensure data integrity. Moreover, this step involves transforming the data to ensure compatibility with the chosen AI model: categorical variables are encoded and numerical features are scaled. Before a deep learning model is used, the images must be resized to a fixed size and normalized, because different attributes have different scales and values. All attribute data were normalized using the Z normalization approach, which removes the mean and scales the data to unit variance so that values are centered on zero and most fall roughly within [−1, 1], as represented in Equation 1:
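$$z = \frac{x - \mu}{\sigma} \tag{1}$$
where x is the original value, μ is the mean, and σ is the standard deviation of the attribute.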
• Data splitting: The pre-processed dataset is divided into subsets for training, testing, and validation. Common approaches include random splitting or stratified sampling to ensure representative subsets for each phase. The dataset was split into training and testing sets, with a ratio of approximately 70% training data and 30% testing data.
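The two phases above can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names `z_normalize` and `stratified_split`, the sample pixel values, and the toy label list are all hypothetical, and stratified sampling is only one of the splitting options the text mentions. The 70% training fraction follows the ratio stated above.

```python
import random
import statistics


def z_normalize(values):
    """Apply Equation 1: subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices), keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical pixel intensities, used only to illustrate the normalization.
pixels = [0.0, 64.0, 128.0, 255.0]
normalized = z_normalize(pixels)

# Hypothetical labels (0 = fake, 1 = real), used only to illustrate the split.
labels = [0] * 20 + [1] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Note that z-scores are not strictly bounded: in this toy example the largest normalized value exceeds 1, which is why the range is best read as approximate.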
3.2. Data Creation
These steps include three main phases: 1) Data Augmentation, 2) Data Acquisition, and 3) Instant ID. In the following paragraphs, each phase will be described in detail:
• Data augmentation: Creating new data from existing data, a technique known as “data augmentation”, artificially expands the size of a training dataset. This lessens the chance of overfitting and enhances a model’s capacity for generalization. The data augmentation pipeline is built with the Keras Sequential API and consists of two layers: 1) RandomFlip(“horizontal”), which randomly flips the input images horizontally; 2) RandomRotation(0.1), which randomly rotates the input images by up to 0.1 of a full turn (that is, up to ±36°), since Keras interprets the factor as a fraction of 2π. By applying these transformations to the training data, the pipeline generates new images that are slightly different from the original images. Figure 2 shows a sample of data augmentation.
• Data acquisition is the process of sampling signals to measure actual physical circumstances and translating the results into digital numerical values that a computer can control. It involves gathering data from various sensors or transducers and converting physical or electrical signals into digital data for further analysis and processing.
Figure 1. Main three steps for proposed model.
• Instant ID is a recent model in the field of identity generation. It offers a zero-shot approach: it can produce high-fidelity images of an individual without any prior training data for that individual. With just one reference image, Instant ID achieves ID-preserving generation in seconds without the need for tuning, and it supports a variety of downstream tasks.
However, there are some potential factors that may obscure the replicability of the Instant ID model. These factors include:
• Limited information on the model’s implementation: While the Instant ID model is described in research papers and code repositories, there may be limited information available on the specific implementation details and training procedures. This lack of detailed documentation may make it challenging for researchers to replicate the model accurately.
• Dependency on pre-trained text-to-image diffusion models: Instant ID is designed to integrate with popular pre-trained text-to-image diffusion models such as SD1.5 and SDXL. The replicability of Instant ID may therefore depend on the availability and compatibility of these pre-trained models.
• User responsibility: The developers of Instant ID emphasize that users are free to create images using this tool but are obligated to comply with local laws and use it responsibly. The developers do not assume any responsibility for potential misuse by users. This lack of control over how the model is used may affect its replicability in certain contexts.
It is important to note that these potential factors may not necessarily hinder the replicability of the Instant ID model in all cases. Researchers and practitioners interested in replicating the model should refer to the available documentation, research papers, and code repositories for more information on the implementation and usage of Instant ID [35] .
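To make the data augmentation phase of this section more concrete, the sketch below mimics the two transformations (random horizontal flip and random rotation) in plain Python on a small 2D image stored as a list of rows. It is an illustration only, not the Keras layers used in the study: the function names, the nearest-neighbour sampling, and the constant fill value for pixels rotated in from outside the image are all assumptions made here for simplicity.

```python
import math
import random


def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate(image, angle_rad, fill=0):
    """Rotate about the image centre using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: find the source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            src_x = round(cx + cos_a * dx + sin_a * dy)
            src_y = round(cy - sin_a * dx + cos_a * dy)
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


def augment(image, rng, rotation_factor=0.1):
    """Randomly flip, then rotate by a random fraction of a full turn."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    # The factor is treated as a fraction of 2*pi, as in Keras RandomRotation.
    angle = rng.uniform(-rotation_factor, rotation_factor) * 2 * math.pi
    return rotate(image, angle)


# Hypothetical 4x4 grayscale image, used only for illustration.
sample = [[r * 4 + c for c in range(4)] for r in range(4)]
augmented = augment(sample, random.Random(0))
max_rotation_degrees = 0.1 * 360
```

Each call to `augment` with a different random state yields a slightly different image of the same size, which is how the pipeline enlarges the effective training set.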
3.3. Xception Model
The Xception model is a convolutional neural network (CNN) architecture first presented by François Chollet in the 2017 publication “Xception: Deep Learning with Depthwise Separable Convolutions”. Compared to other well-known CNN models like VGGNet and ResNet, it is intended to produce high accuracy on image classification tasks while being computationally efficient and using fewer parameters.
The Xception model consists of 36 convolutional layers, organized into 14 modules. Each module contains depthwise separable convolution layers, batch normalization layers, and activation layers, and the input images are resized to a fixed size. The model also includes max pooling layers for downsampling and fully connected layers for classification.
The Xception model has been shown to achieve state-of-the-art results on various image classification datasets, including the ImageNet dataset. It is particularly well-suited for tasks where computational efficiency and low memory usage are important, such as mobile and embedded applications.
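The parameter savings of depthwise separable convolutions can be checked with simple arithmetic. The sketch below counts trainable parameters for a standard convolution and for a depthwise separable one, using the usual formulas (a k×k depthwise filter per input channel followed by a 1×1 pointwise convolution). The function names and the 256-channel example are chosen here for illustration; they are not taken from the original study.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Parameters of a standard k x k convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(kernel, c_in, c_out, bias=True):
    """Parameters of a depthwise separable convolution (depth multiplier 1)."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 convolution that mixes channels
    return depthwise + pointwise + (c_out if bias else 0)


# Illustrative layer: 3 x 3 kernel, 256 input channels, 256 output channels.
standard = conv_params(3, 256, 256)
separable = separable_conv_params(3, 256, 256)
print(standard, separable, round(standard / separable, 2))
```

For this illustrative layer the separable version needs roughly one ninth of the parameters, which is the source of the efficiency the model is designed around.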
The model starts with a Rescaling layer that scales the pixel values of the input images to the range [0, 1]. The next layer is a Conv2D layer with 128 filters, a kernel size of 3, and a stride of 2. This layer applies convolution operations to the input images, extracting features from them. The output of this layer is then passed through a Batch Normalization layer and an Activation layer with the “relu” activation function.
The model then enters a series of residual blocks. Each residual block consists of the following sequence of layers: 1) Activation layer with the “relu” activation function. 2) SeparableConv2D layer with a specified number of filters (either 256, 512, or 728), a kernel size of 3, and a padding of “same”. 3) Batch Normalization layer. 4) Activation layer with the “relu” activation function. 5) SeparableConv2D layer with the same number of filters as the previous layer, a kernel size of 3, and a padding of “same”. 6) Batch Normalization layer. 7) MaxPooling2D layer with a pool size of 3 and a stride of 2, used for downsampling.
After the residual blocks, the model includes a SeparableConv2D layer with 1024 filters, a kernel size of 3, and a padding of “same”. The output of this layer is then passed through a Batch Normalization layer and an Activation layer with the “relu” activation function.
The model then uses a GlobalAveragePooling2D layer to reduce the spatial dimensions of the feature map to a single vector.
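As a rough check of how the feature map shrinks through the layers described so far, the sketch below traces the spatial size and channel count stage by stage. Several details are assumptions made for illustration and are not stated in the text: a 180×180 RGB input, “same” padding on the first Conv2D and on the pooling layers (so a stride-2 layer maps a size n to ceil(n / 2)), and one downsampling step per residual block.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a layer with 'same' padding and the given stride."""
    return math.ceil(size / stride)


def trace_shapes(input_size, block_filters=(256, 512, 728)):
    """Return (stage, spatial_size, channels) tuples for the described model."""
    shapes = [("input", input_size, 3)]
    size = same_padding_output(input_size, 2)        # Conv2D, stride 2
    shapes.append(("conv2d_128", size, 128))
    for filters in block_filters:
        size = same_padding_output(size, 2)          # MaxPooling2D, stride 2
        shapes.append((f"block_{filters}", size, filters))
    shapes.append(("sepconv_1024", size, 1024))      # stride 1, size unchanged
    shapes.append(("global_avg_pool", 1, 1024))      # one value per channel
    return shapes


# Hypothetical input resolution, chosen only for illustration.
for stage, size, channels in trace_shapes(180):
    print(f"{stage:>16}: {size} x {size} x {channels}")
```

Under these assumptions the classifier head receives a single 1024-dimensional vector per image, regardless of the input resolution.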
Finally, the activation function for the output layer depends on the number of classes: “sigmoid” for binary classification (2 classes) and “softmax” for multi-class classification.
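The choice of output activation can be sketched as follows. The helper `output_head` is a hypothetical name, and using a single output unit for the binary case is an assumption (the text only states which activation is used); `sigmoid` and `softmax` are the standard definitions.

```python
import math


def sigmoid(z):
    """Map a single logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Map a list of logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def output_head(num_classes):
    """Return (units, activation name) for the final layer."""
    if num_classes == 2:
        return 1, "sigmoid"   # assumed: one unit giving the probability of class 1
    return num_classes, "softmax"


print(output_head(2), sigmoid(0.0))
print(output_head(5), softmax([1.0, 2.0, 3.0]))
```

For the two-class real-versus-fake task in this study, the sigmoid branch applies.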
3.4. Model Interpretation
The XAI framework is utilized to identify Deepfake and provide meaningful interpretations of the outcomes generated by the model. The framework leverages advanced AI techniques to analyze and interpret complex data patterns associated with Deepfake.
During the training phase, the model learns to recognize patterns and relationships within the data that are indicative of Deepfake. This process involves optimizing the model’s parameters in order to reduce the difference between its predictions and the actual labels of Deepfake. Once the model is trained, the XAI framework focuses on providing interpretability of the model’s outcomes. This is achieved through various techniques designed to shed light on the decision-making process of the model and the factors driving its predictions.
LIME (Local Interpretable Model-Agnostic Explanations) offers justifications for the predictions of any classifier or regressor. LIME creates a new dataset of perturbations around the instance that needs to be explained. Then, each instance in the newly created dataset is classified using the trained machine learning classifier. LIME has two main advantages: 1) versatility, since it can be used with a wide range of data formats, including text, images, and tabular data; 2) model independence, since LIME does not depend on the intricacies of the underlying model.
In this study, the LIME library provides a method for interpreting the predictions of any machine learning model in a local and interpretable manner. It approximates the model locally around the prediction of interest using a simpler, interpretable model, such as a linear model or decision tree.
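The local-approximation idea can be sketched without any library. The code below is a simplified illustration of the LIME procedure for images, not the API of the LIME library and not the authors' implementation: an image is represented only by a binary mask over its superpixels, `toy_model` is a hypothetical black-box classifier, and the exponential proximity kernel and its width are assumptions. The sketch perturbs the mask, queries the model, weights each sample by its proximity to the original image, and fits a weighted linear model whose coefficients serve as superpixel importances.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lime_explain(predict, n_segments, n_samples=300, kernel_width=0.75, seed=0):
    """Return one importance weight per superpixel for a black-box `predict`."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # 1 keeps a superpixel, 0 hides it; all ones is the original image.
        mask = [rng.randint(0, 1) for _ in range(n_segments)]
        distance = 1.0 - sum(mask) / n_segments  # fraction of hidden superpixels
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
        rows.append([1.0] + [float(v) for v in mask])  # leading 1 = intercept
        targets.append(predict(mask))
    # Weighted least squares via the normal equations (X^T W X) beta = X^T W y.
    d = n_segments + 1
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(d)] for i in range(d)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(d)]
    return solve_linear_system(xtwx, xtwy)[1:]  # drop the intercept


def top_positive(importances, k):
    """Indices of the k most important superpixels with positive weight."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return [i for i in ranked[:k] if importances[i] > 0]


def toy_model(mask):
    """Hypothetical classifier whose score depends on superpixels 2 and 0."""
    return 0.1 + 0.6 * mask[2] + 0.2 * mask[0]


importances = lime_explain(toy_model, n_segments=5)
print(top_positive(importances, k=2))
```

Because the toy classifier is itself linear in the mask, the surrogate recovers its coefficients almost exactly; for a real network the fitted weights are only a local approximation around the explained image.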
4. Experimental Results
4.1. Dataset Specification and Collection
The dataset consists of a total of 589 files, which are divided into two distinct classes. Among these files, 472 are allocated for training purposes, while the remaining 117 are used for validation. The table, labeled as “Data set counting 2 classes,” provides a comprehensive breakdown of the dataset. It includes information on the number of images in each category, namely Fake (0) and Real (1), for both the training and validation sets. The training set comprises 472 images, with 311 falling under the Fake category and 161 under the Real category. On the other hand, the validation set consists of 117 images, with 69 classified as Fake and 48 as Real. Overall, the dataset contains a total of 380 Fake images and 209 Real images as shown in Table 1.
Table 2 presents the performance evaluation metrics for the Xception model. The performance of the model is assessed on both the training and validation datasets. The table includes four metrics: accuracy, precision, recall, and F1-score. The model achieved perfect scores (1.00) for all metrics on both the training and validation datasets. This indicates that the model is highly effective in classifying the data and making accurate predictions.
Table 3 presents the classification report for the Xception model. The report includes four metrics: precision, recall, F1-score, and support. The model achieved perfect scores (1.00) for all metrics for both classes (0 and 1). This indicates that the model is highly effective in classifying the data and making accurate predictions. The accuracy of the model is also 1.00, which means that it correctly classified all 117 samples in the dataset (Figure 3).
4.2. Performance Evaluation
The proposed model’s performance is evaluated using classification metrics. Although classification accuracy is a commonly used metric, it may not be the most appropriate one for imbalanced datasets, where one class is much more represented than the others. Therefore, several other performance metrics are also used, including precision, recall (also known as sensitivity), and F1-score. The confusion matrix in Figure 4 presents the performance of the Xception model. The matrix shows the number of false positives (FP), false negatives (FN), true negatives (TN), and true positives (TP). The model classifies each image as “Real” or “Fake”. Taking “Fake” as the positive class, consistent with the validation counts in Table 1, the TP value is 69, which means that the model correctly classified 69 fake samples as fake. The FP value is 0, which means that the model did not incorrectly classify any real samples as fake. The FN value is 0, which means that the model did not incorrectly classify any fake samples as real. The TN value is 48, which means that the model correctly classified 48 real samples as real [30] [36].
Table 1. Data set counting 2 classes.
Table 2. Performance of Xception model.
Table 3. Xception model classification report.
Figure 4. Confusion matrix of Xception model.
(2)
(3)
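The metrics named in this section follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions of accuracy, precision, recall, and F1-score; the function name is chosen here for illustration. The first call uses the validation counts reported above (69 and 48 correct predictions, no errors). The second call is a purely hypothetical baseline that labels every image with the majority class, included only to show why accuracy alone can mislead on an imbalanced dataset.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Counts reported for the validation set: 117 samples, no misclassifications.
reported = classification_metrics(tp=69, fp=0, fn=0, tn=48)

# Hypothetical baseline: the minority class (48 samples) is the positive class
# and the classifier always predicts the majority class.
baseline = classification_metrics(tp=0, fp=0, fn=48, tn=69)

print(reported)
print(baseline)
```

The baseline still reaches an accuracy of 69/117 while its recall is zero, which is the situation the additional metrics are meant to expose.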
Figure 5 shows the training and validation accuracy of the Xception model. The model’s accuracy on the training set gradually increases with the number of epochs, while the validation accuracy plateaus after around 50 epochs. To improve the model’s performance, one could try reducing the number of epochs or using a regularization technique such as dropout.
Figure 6 shows the result of Xception model. The captions indicate the predicted probability of the image being fake, the true probability of the image being fake, and whether the image is actually fake or real. The model is able to correctly classify all of the real samples and all of the fake samples. This indicates that the model is highly effective in making accurate predictions.
To explain a prediction for a particular instance, LIME generates a set of synthetic data points by perturbing the original instance and observing how the model’s prediction changes. The synthetic data points are weighted by their proximity to the original instance, and a simple interpretable model is fitted to them. The weights of this fitted model are then used to determine the importance of each feature in the original instance for the model’s prediction.
LIME is particularly useful for explaining complex machine learning models, such as deep neural networks, which can be difficult to interpret directly. By approximating the model locally, LIME provides a simplified explanation that is easier to understand and interpret.
Figure 5. Accuracy curve of Xception model.
Figure 7 shows the explanation for the two classes using the LIME library. In this figure, the explanation displays the top 10 superpixels that contribute positively to the real class, while concealing the rest of the image. This visualization helps identify the specific visual features the model relies on for its prediction, shedding light on the decision-making process.
4.3. Comparison with Other Models
Table 4 presents a comparative analysis of deep fake detection approaches. The evaluated approaches include the proposed model and the VGG16 and CNN architecture [37] [38]. The dataset used for evaluation comprises real and fake faces, as well as photoshopped real and fake faces. The accuracy and F1 scores are reported as performance metrics. The proposed model achieved an accuracy and F1 score of 100%, indicating its effectiveness in detecting deep fake images. In comparison, the VGG16 and CNN architecture achieved an accuracy and F1 score of 94%, demonstrating relatively lower performance in identifying manipulated images. The results highlight the superior performance of the proposed model in deep fake creation and classification compared with state-of-the-art approaches.
(a) [Original Real] (b) [Real with lime]
Figure 7. The explanation for the two classes.
Table 4. The comparative analysis of deep fake detection approaches.
5. Conclusion
The proliferation of deep fake content in images poses a significant challenge in the digital landscape. The ease of access to advanced tools and computing infrastructure has facilitated the creation and dissemination of deepfakes, leading to the potential spread of disinformation, hoaxes, and panic. As a result, the need for robust deepfake classification using deep learning techniques has become increasingly critical. This study has examined various aspects of deep fake detection and classification, drawing on insights from state-of-the-art methods and datasets. It has highlighted the evolving landscape of deepfake detection and classification, including the use of the Xception Convolutional Neural Network (CNN) and other deep learning architectures for image analysis. Additionally, LIME provides a method for interpreting the model’s prediction of an image as real or fake. The research community’s efforts in this domain, as evidenced by numerous papers and code repositories dedicated to deepfake detection, reflect the urgency and importance of addressing the challenges posed by deep fakes. As the use of deep learning techniques to manipulate images continues to grow, the need for effective deepfake detection systems becomes even more pronounced. In conclusion, this study serves as a resource for researchers and practitioners in the field of deepfake detection and classification. It seeks to support ongoing attempts to counter the spread of deepfake content and its possible negative effects on society by illuminating the most recent methods and difficulties. The shortcomings of the existing approaches demonstrate the continued need for the development of a reliable and effective deepfake detection and classification solution based on ML and DL techniques.
Moreover, it is important to consider the diversity and representativeness of the datasets to ensure that the framework’s performance is not limited to specific scenarios or domains. Future work could focus on testing the model on larger image datasets.