AxLaM: energy-efficient accelerator design for language models for edge computing

dc.contributor.author Issac, Tom Glint
dc.contributor.author Mittal, Bhumika
dc.contributor.author Sharma, Santripta
dc.contributor.author Ronak, Abdul
dc.contributor.author Goud, Abhinav
dc.contributor.author Kasture, Neerja
dc.contributor.author Momin, Zaqi
dc.contributor.author Krishna, Aravind
dc.contributor.author Mekie, Joycee
dc.coverage.spatial United Kingdom
dc.date.accessioned 2025-01-24T15:05:29Z
dc.date.available 2025-01-24T15:05:29Z
dc.date.issued 2025-01
dc.identifier.citation Issac, Tom Glint; Mittal, Bhumika; Sharma, Santripta; Ronak, Abdul; Goud, Abhinav; Kasture, Neerja; Momin, Zaqi; Krishna, Aravind and Mekie, Joycee, "AxLaM: energy-efficient accelerator design for language models for edge computing", Philosophical Transactions of the Royal Society A, DOI: 10.1098/rsta.2023.0395, vol. 383, no. 2288, Jan. 2025.
dc.identifier.issn 1364-503X
dc.identifier.issn 1471-2962
dc.identifier.uri https://doi.org/10.1098/rsta.2023.0395
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/10961
dc.description.abstract Modern language models such as bidirectional encoder representations from transformers (BERT) have revolutionized natural language processing (NLP) tasks but are computationally intensive, limiting their deployment on edge devices. This paper presents an energy-efficient accelerator design tailored for encoder-based language models, enabling their integration into mobile and edge computing environments. The data-flow-aware hardware accelerator design, inspired by Simba, uses approximate fixed-point POSIT-based multipliers and high-bandwidth memory (HBM) to achieve significant improvements in computational efficiency, power consumption, area and latency over the hardware-realized scalable accelerator Simba. Compared to Simba, AxLaM achieves a ninefold energy reduction, a 58% area reduction and a 1.2-fold latency improvement, making it suitable for deployment in edge devices. The energy efficiency of AxLaM is 1.8 TOPS/W, 65% higher than that of FACT, which requires pre-processing of the language model before implementing it on the hardware.
dc.description.statementofresponsibility by Tom Glint Issac, Bhumika Mittal, Santripta Sharma, Abdul Ronak, Abhinav Goud, Neerja Kasture, Zaqi Momin, Aravind Krishna and Joycee Mekie
dc.format.extent vol. 383, no. 2288
dc.language.iso en_US
dc.publisher The Royal Society
dc.subject Transformer accelerator
dc.subject Language model BERT
dc.subject Hardware accelerator
dc.title AxLaM: energy-efficient accelerator design for language models for edge computing
dc.type Article
dc.relation.journal Philosophical Transactions of the Royal Society A
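
The abstract above refers to approximate fixed-point POSIT-based multipliers. As a rough illustration of the underlying arithmetic only, and not of the AxLaM multiplier itself (the paper's design details are behind the DOI above), the Python sketch below decodes a posit bit pattern into a real value and approximates a product with Mitchell's logarithmic method, a classic approximate-multiplication technique. The posit width n=8, the exponent field width es=1, the function names and the example operands are all assumptions made for this illustration.

    # Illustrative sketch only: decode a posit bit pattern and multiply two
    # decoded values with Mitchell's logarithmic approximation. The width
    # n=8 and es=1 are assumptions for the example, not taken from the paper.
    import math

    def decode_posit(bits, n=8, es=1):
        """Decode an n-bit posit (exponent width es) to a Python float."""
        if bits == 0:
            return 0.0
        if bits == 1 << (n - 1):
            return float("nan")              # NaR (not a real)
        sign = -1.0 if bits >> (n - 1) else 1.0
        if sign < 0:
            bits = (-bits) & ((1 << n) - 1)  # two's complement for negatives
        body = bits & ((1 << (n - 1)) - 1)   # drop the sign bit
        i = n - 2                            # MSB index of the body
        first = (body >> i) & 1
        k = 0
        while i >= 0 and ((body >> i) & 1) == first:  # regime: run of equal bits
            k += 1
            i -= 1
        regime = (k - 1) if first else -k
        if i >= 0:
            i -= 1                           # consume the regime terminator bit
        bits_left = i + 1
        exp_bits = min(es, bits_left)
        exponent = (body >> (bits_left - exp_bits)) & ((1 << exp_bits) - 1) if exp_bits else 0
        exponent <<= es - exp_bits           # pad truncated exponent bits with zeros
        bits_left -= exp_bits
        frac = body & ((1 << bits_left) - 1)
        fraction = 1.0 + frac / (1 << bits_left) if bits_left else 1.0
        return sign * fraction * 2.0 ** (regime * (1 << es) + exponent)

    def mitchell_mul(a, b):
        """Approximate a*b via log2(1+f) ~= f (max error about 11%)."""
        if a == 0.0 or b == 0.0:
            return 0.0
        sign = -1.0 if (a < 0.0) != (b < 0.0) else 1.0
        s = _approx_log2(abs(a)) + _approx_log2(abs(b))
        k = math.floor(s)
        return sign * 2.0 ** k * (1.0 + (s - k))

    def _approx_log2(x):
        m, e = math.frexp(x)                 # x = m * 2**e with m in [0.5, 1)
        return (e - 1) + (2.0 * m - 1.0)     # exponent plus linearized mantissa

    a, b = decode_posit(0x48), decode_posit(0x50)   # 1.5 and 2.0 under n=8, es=1
    print(a * b, mitchell_mul(a, b))                # exact 3.0 vs approximation

Because both posits and Mitchell's method work on a sign, a power-of-two scale and a short fraction, an approximate multiplier of this general shape avoids a full fraction multiplier in hardware; how AxLaM actually realizes its multipliers is specified in the paper itself.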

