TITLE:
Layer Cake: On Language Representation and Compute Characteristics in Text Classification
AUTHORS:
Peter J. Worth Jr., Ionut Cardei
KEYWORDS:
Text Classification, Large Language Models, LLMs, Natural Language Processing, NLP, Language Representation, Word Embeddings, Vector Space Models, Semantic Spaces, Artificial Intelligence, AI, Computational Linguistics
JOURNAL NAME:
International Journal of Intelligence Science, Vol. 15, No. 4, September 9, 2025
ABSTRACT: Since transformer-based language models were introduced in 2017, they have proven extraordinarily effective across a variety of NLP tasks, including but not limited to language generation. The introduction and widespread adoption of these LLMs, which encode extremely high-dimensional semantic spaces, come at a significant cost in system and computational resource requirements, requirements that have reshaped the entire chip (GPU) and data center industries as hardware, cloud, and infrastructure providers try to keep up with demand. This has motivated the research community to develop a variety of design strategies that optimize the use of these resources; nonetheless, computational requirements continue to grow in proportion to model size and complexity. In this study, we introduce Layer Cake, a framework for precisely measuring the relative computational resource requirements of text classification across a variety of classifiers from the Machine Learning (ML) and Deep Learning (DL) landscape, leveraging different language model families and focusing on the form of language representation used in each test scenario. We find that while LLMs do yield the best results across classifiers on average, these improvements come at significant computational overhead. For example, in terms of Macro-F1 score, LLM-based classifiers outperform their static word embedding counterparts (Word2Vec, FastText, and GloVe) by 8.87% on average, even when the latter are encapsulated in DL architectures such as Convolutional Neural Networks or Long Short-Term Memory (LSTM) networks, and perform 12.73% better than ML classifiers such as Support Vector Machines and Logistic Regression models. However, this uptick in model performance comes at a computational overhead of 4398.07% relative to the GPU requirements of the static word embedding DL classifiers, and a 4126.02% increase in computation time relative to the ML classifiers, the latter of which are CPU- rather than GPU-bound.
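NOTE: As an illustrative aside, and not part of the paper's Layer Cake framework itself, the kind of paired measurement described in the abstract (Macro-F1, i.e. the unweighted mean of per-class F1 scores, alongside training time for a CPU-bound ML classifier) can be sketched in a few lines of Python. The corpus, feature extractor, and classifier below (20 Newsgroups, TF-IDF, scikit-learn LogisticRegression) are assumptions chosen purely for demonstration.

import time

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Small public corpus and a simple ML baseline, used purely for illustration.
categories = ["rec.autos", "sci.space"]
train = fetch_20newsgroups(subset="train", categories=categories)
test = fetch_20newsgroups(subset="test", categories=categories)

# TF-IDF features stand in here for the language representation under test.
vectorizer = TfidfVectorizer(max_features=20000)
X_train = vectorizer.fit_transform(train.data)
X_test = vectorizer.transform(test.data)

clf = LogisticRegression(max_iter=1000)

# Wall-clock training time for a CPU-bound classifier.
start = time.perf_counter()
clf.fit(X_train, train.target)
elapsed = time.perf_counter() - start

# Macro-F1: unweighted mean of per-class F1 scores.
macro_f1 = f1_score(test.target, clf.predict(X_test), average="macro")
print(f"training time: {elapsed:.2f}s  Macro-F1: {macro_f1:.4f}")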