Bloomberg, a long-time pioneer in applying Artificial Intelligence to finance, has released a research paper introducing BloombergGPT™, a large-scale generative AI model trained specifically for the financial domain. BloombergGPT outperforms comparably sized models on financial Natural Language Processing (NLP) tasks while remaining competitive on general Large Language Model (LLM) benchmarks.
A Paradigm Shift in Financial AI
Recent advances in LLMs have enabled a wave of new applications across many domains. The complexity and specialised terminology of the financial sector, however, call for a model tailored to the domain. BloombergGPT is a first step in that direction, and it stands to improve existing financial NLP tasks such as sentiment analysis, named entity recognition, news classification, and question answering. Beyond these tasks, BloombergGPT will open up new ways to marshal the vast quantities of data available on the Bloomberg Terminal, bringing the full potential of AI to the financial domain.
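To make those task names concrete, the sketch below shows how financial sentiment analysis can be posed to a generative model as a few-shot prompt. BloombergGPT itself is not publicly available, so the example loads a generic causal LM through the Hugging Face transformers API; the placeholder model name and the prompt format are illustrative assumptions, not details from the paper.

```python
# Illustrative only: BloombergGPT is not publicly released, so a generic
# causal LM from the Hugging Face Hub stands in for it here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Few-shot prompt for financial sentiment analysis (assumed format).
prompt = (
    "Headline: Company X misses quarterly earnings estimates.\n"
    "Sentiment: negative\n"
    "Headline: Company Y announces record dividend and share buyback.\n"
    "Sentiment:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False)
# Decode only the newly generated tokens after the prompt.
completion = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:])
print(completion.strip())  # a well-trained financial LM should answer "positive"
```

The same prompt-and-complete pattern extends to the other listed tasks, for example asking the model to label entity spans or assign a news category.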
Bloomberg’s Decade-Long AI Odyssey
For more than a decade, Bloomberg has been at the forefront of applying AI, Machine Learning, and NLP in the finance domain. Today, the company supports a very large and diverse set of NLP tasks that would benefit from a finance-aware language model. By taking a mixed approach that combines financial data with general-purpose datasets, Bloomberg researchers have built a model that achieves best-in-class results on financial benchmarks while remaining competitive on general LLM benchmarks.
Constructing BloombergGPT
BloombergGPT is the product of a collaboration between Bloomberg’s ML Product and Research group and the firm’s AI Engineering team. Together, they constructed one of the largest domain-specific datasets yet, drawing on the company’s four-decade archive of financial language documents. This financial corpus of 363 billion tokens was combined with a 345 billion token public dataset, yielding a training corpus of more than 700 billion tokens.
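The token counts above imply a roughly even split between financial and general-purpose text. The snippet below works out those proportions and sketches one simple way such a mixture could be sampled during training; the proportional-sampling scheme is an assumption for illustration, as the section above does not describe how the two sources are interleaved.

```python
# Mixing proportions implied by the corpus sizes quoted above.
# The actual sampling strategy used for BloombergGPT is not stated here;
# proportional sampling is an illustrative assumption.
import random

FINANCIAL_TOKENS = 363e9  # Bloomberg's financial-language documents
PUBLIC_TOKENS = 345e9     # general-purpose public datasets

total = FINANCIAL_TOKENS + PUBLIC_TOKENS
print(f"total tokens: {total / 1e9:.0f}B")                 # 708B, i.e. "over 700 billion"
print(f"financial share: {FINANCIAL_TOKENS / total:.1%}")  # ~51.3%
print(f"public share: {PUBLIC_TOKENS / total:.1%}")        # ~48.7%

def sample_source(rng: random.Random) -> str:
    """Draw a training document's source in proportion to corpus size."""
    return "financial" if rng.random() < FINANCIAL_TOKENS / total else "public"

rng = random.Random(0)
draws = [sample_source(rng) for _ in range(10_000)]
print(f"empirical financial share: {draws.count('financial') / len(draws):.1%}")
```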
From this corpus, the team trained a 50-billion parameter decoder-only causal language model. The model was validated on finance-specific NLP benchmarks, a suite of Bloomberg-internal benchmarks, and broad categories of general-purpose NLP tasks. Notably, BloombergGPT outperforms existing open models of similar size on financial tasks, while matching or exceeding their performance on general NLP benchmarks.
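For readers unfamiliar with the term, a "decoder-only causal" model is a stack of Transformer blocks in which each position can attend only to earlier positions, and the model is trained to predict the next token. The toy sketch below shows that attention pattern in PyTorch; the class name and hyperparameters are arbitrary placeholders and bear no relation to BloombergGPT's actual 50-billion parameter configuration, which is not public.

```python
# Minimal sketch of the causal (decoder-only) attention pattern.
# Hyperparameters are tiny placeholders, not BloombergGPT's configuration.
import torch
import torch.nn as nn

class TinyCausalLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        seq_len = token_ids.shape[1]
        # Upper-triangular -inf mask: position i cannot attend to positions
        # after i, which is what makes the model "causal".
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        )
        h = self.blocks(self.embed(token_ids), mask=causal_mask)
        return self.lm_head(h)  # next-token logits at every position

model = TinyCausalLM()
logits = model(torch.randint(0, 1000, (1, 16)))
print(logits.shape)  # torch.Size([1, 16, 1000])
```

Note that a decoder-only model needs no separate decoder class: an encoder stack plus a causal mask yields the same computation, which is the idiom used here.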
A Glimpse into the Future
Shawn Edwards, Bloomberg’s Chief Technology Officer, emphasised the significance of the milestone and the value of building the first LLM focused on the financial domain. He stated, “BloombergGPT will enable us to tackle many new types of applications, while it delivers much higher performance out-of-the-box than custom models for each application, at a faster time-to-market.”
Gideon Mann, Head of Bloomberg’s ML Product and Research team, pointed to the central role that data plays in the quality of ML and NLP models. He explained, “Thanks to the collection of financial documents Bloomberg has curated over four decades, we were able to carefully create a large and clean, domain-specific dataset to train a LLM that is best suited for financial use cases. We’re excited to use BloombergGPT to improve existing NLP workflows, while also imagining new ways to put this model to work to delight our customers.”