TITLE:
Dual-Stream Detection of HTTP Injection Attacks: A Hybrid Architecture Combining ModernBERT and Character-Level CNNs
AUTHORS:
Emil Wangilisasi, Judith Leo, Anael Sam
KEYWORDS:
Web Application Security, HTTP Injection, Deep Learning, ModernBERT, Convolutional Neural Networks
JOURNAL NAME:
Journal of Intelligent Learning Systems and Applications, Vol.18 No.3, June 29, 2026
ABSTRACT: Web applications remain critically vulnerable to injection attacks such as SQL Injection (SQLi), OS Command Injection, and Cross-Site Scripting (XSS), which exploit the semantic gap between user-supplied input and executable code. Traditional Web Application Firewalls (WAFs) rely on signature-based pattern matching, which leaves them susceptible to evasion through payload obfuscation. Transformer-based language models capture long-range contextual dependencies and have shown promise in detecting malicious intent, but on their own they may not fully exploit the fine-grained character-level patterns that distinguish obfuscated attack payloads. This paper proposes a Dual-Stream Hybrid Architecture that combines the contextual reasoning of Modern Bidirectional Encoder Representations from Transformers (ModernBERT), a state-of-the-art encoder supporting sequences of up to 8,192 tokens, with a Character-Level 1D Convolutional Neural Network (CNN) optimized for morphological pattern recognition. The semantic stream captures high-level payload intent, while the syntactic stream detects low-level structural signatures of obfuscation techniques (e.g., URL encoding, hexadecimal evasion, comment injection). The composite dataset of ~93,500 HTTP requests is aggregated from four sources: public payload repositories; synthetic attack campaigns generated with SQLMap and XSStrike against vulnerable web applications; live honeypot traffic captured via T-Pot (SNARE/Tanner); and benign browsing sessions simulated with Playwright. On this dataset, the hybrid approach achieves 98.7% accuracy and a 98.3% F1-score, outperforming the standalone ModernBERT and CNN baselines by 2.4% and 8.6%, respectively.
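
To illustrate why a character-level stream can complement a semantic one, the following minimal sketch shows how URL encoding changes the raw character sequence of a payload while leaving its decoded meaning unchanged. It is not the authors' preprocessing: the vocabulary (printable ASCII), the sequence length `MAX_LEN`, the padding and unknown indices, and the helper names `encode_chars` and `percent_ratio` are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List
from urllib.parse import quote, unquote

# Hypothetical settings; the abstract does not state the real values.
MAX_LEN = 64   # assumed fixed input length for the character stream
PAD_ID = 0     # assumed padding index
UNK_ID = 1     # assumed index for characters outside printable ASCII

# Printable ASCII (code points 32..126) mapped to ids 2..96.
CHAR_TO_ID = {chr(c): c - 32 + 2 for c in range(32, 127)}
VOCAB_SIZE = len(CHAR_TO_ID) + 2  # plus the padding and unknown ids


def encode_chars(payload: str, max_len: int = MAX_LEN) -> List[int]:
    """Map a payload to a fixed-length list of character ids (truncate, then right-pad)."""
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in payload[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


def percent_ratio(payload: str) -> float:
    """Fraction of characters that are '%', a crude proxy for URL-encoding density."""
    return payload.count("%") / len(payload) if payload else 0.0


plain = "id=1' OR '1'='1"
obfuscated = quote(plain, safe="")  # percent-encode every reserved character

plain_ids = encode_chars(plain)     # input a character-level model would see
obf_ids = encode_chars(obfuscated)  # same attack, different surface form

print(obfuscated)
print(percent_ratio(plain), percent_ratio(obfuscated))
```

Both payloads decode to the same string, so a model that reasons only over decoded or tokenized intent sees the same attack, whereas the character id sequences differ in length and composition. A 1D CNN over such id sequences (after an embedding layer) is one plausible way to pick up the structural signature of the encoding itself.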