How Enterprises Can Build Their Own Large Language Model Similar to OpenAI’s ChatGPT | by Pronojit Saha | Jul, 2023


Want to build your own ChatGPT? Here are three ways you can do so

Language models have gained significant attention in recent years, revolutionizing various fields such as natural language processing, content generation, and virtual assistants. One of the most prominent examples is OpenAI’s ChatGPT, a large language model that can generate human-like text and engage in interactive conversations. This has sparked the curiosity of enterprises, leading them to explore the idea of building their own large language models (LLMs).

However, the decision to embark on building an LLM should be reviewed carefully. It requires significant resources, both in terms of computational power and data availability. Enterprises must weigh the benefits against the costs, evaluate the technical expertise required, and assess whether it aligns with their long-term goals.

In this article, we show you three ways of building your own LLM, similar to OpenAI’s ChatGPT. By the end of this article, you will have a clearer understanding of the challenges, requirements, and potential rewards associated with building your own large language model. So let’s dive in!

To understand whether enterprises should build their own LLM, let’s explore the three primary ways they can leverage such models.

Fig 1: Different Ways to Leverage Large Language Models (Image by Author)

1. Closed-source LLMs: Enterprises can utilize pre-existing LLM services like OpenAI’s ChatGPT, Google’s Bard, or similar offerings from other providers. These services provide a ready-to-use solution, allowing businesses to leverage the power of LLMs without significant infrastructure investment or technical expertise (a minimal API-call sketch follows the cons list below).

Pros:

  • Quick and easy deployment, saving time and effort.
  • Good performance on generic text-generation tasks.

Cons:

  • Limited control over the model’s behavior and responses.
  • Lower accuracy on domain- or enterprise-specific data.
  • Data privacy concerns, since data is sent to the third party hosting the service.
  • Dependency on third-party providers and potential pricing fluctuations.
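To make option 1 concrete, here is a minimal sketch of calling a hosted, closed-source LLM, using the OpenAI Python SDK as an example. The model name and prompts are illustrative, and the exact client interface depends on the SDK version you install.

```python
# Minimal sketch: calling a hosted, closed-source LLM service (OpenAI SDK >= 1.0).
# The model name and prompts are placeholders; adapt them to your use case.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant for our support team."},
        {"role": "user", "content": "Summarize our refund policy in two sentences."},
    ],
)
print(response.choices[0].message.content)
```

Note that every prompt, along with any data embedded in it, leaves your environment and is processed by the provider, which is the privacy concern listed above.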

2. Using domain-specific LLMs: Another approach is to use domain-specific language models, such as BloombergGPT for finance, BioMedLM for biomedical applications, MarketingGPT for marketing applications, CommerceGPT for e-commerce applications, etc. These models are trained on domain-specific data, enabling more accurate and tailored responses in their respective fields.

Pros:

  • Improved accuracy in specific domains due to training on relevant data.
  • Availability of pre-trained models tailored to specific industries.

Cons:

  • Limited flexibility in adapting the model beyond its designated domain.
  • Dependency on the provider’s updates and availability of domain-specific models.
  • Accuracy is better than a generic model’s but still limited, since the model is not trained on your enterprise’s own data.
  • Data privacy concerns, since data is sent to the third party hosting the service.

3. Build and host a custom LLM: The most comprehensive option is for enterprises to build and host their own LLM using their specific data. This approach provides the highest level of customization and privacy control over the generated content. It allows organizations to fine-tune the model to their unique requirements, ensuring domain-specific accuracy and alignment with their brand voice.

Pros:

  • Complete customization and control: A custom model enables businesses to generate responses that align precisely with their brand voice, industry-specific terminology, and unique requirements.
  • Cost-effective if set up properly (fine-tuning costs are on the order of hundreds of dollars).
  • Transparent: the enterprise has full visibility into both the data and the model.
  • Best accuracy: By training the model on enterprise-specific data & requirements, it can better understand and respond to enterprise-specific queries, resulting in more accurate and contextually relevant outputs.
  • Privacy friendly: Data & Model stay in your environment. Having a custom model allows enterprises to retain control over their sensitive data, minimizing concerns related to data privacy and security breaches.
  • Competitive Advantage: A custom large language model can be a significant differentiator in industries where personalized and accurate language processing plays a crucial role.

Cons:

  • Requires significant ML and LLM expertise to build and maintain a custom large language model.

It’s important to note that the right approach depends on various factors, including the enterprise’s budget, time constraints, required accuracy, and the desired level of control. However, as the comparison above shows, building a custom LLM on enterprise-specific data offers numerous benefits.

Custom large language models offer unparalleled customization, control, and accuracy for specific domains, use cases, and enterprise requirements. Thus, enterprises should look to build their own enterprise-specific custom large language model to unlock a world of possibilities tailored to their needs, industry, and customer base.

You can build your custom LLM in three ways, ranging from low to high complexity, as shown in the image below.

Fig 2: Three Ways to Build Your Custom LLM (Image by Author)

L1. Utilization Tuned LLM

One prevalent method for leveraging pre-trained LLMs involves devising effective prompting techniques to address diverse tasks. An example of a common prompting approach is In-Context Learning (ICL), which entails expressing task descriptions and/or demonstrations in natural language text. Furthermore, Chain-of-Thought (CoT) prompting can augment in-context learning by incorporating a sequence of intermediate reasoning steps within prompts.

Fig 3: A comparative illustration of ICL & CoT prompting. (Image from the paper “A survey of Large Language Models” — Reference №7)

To build an L1 LLM,

  1. Begin by selecting a suitable pre-trained LLM (which can be found in the Hugging Face model library or other online resources), ensuring its compatibility with commercial use by reviewing the license.
  2. Next, identify relevant data sources for your specific domain or use case, assembling a diverse and comprehensive dataset that encompasses a wide range of topics and language variations. For L1 LLM, labeled data is not required.
  3. In the customization process, the model parameters of the chosen pre-trained LLM remain unaltered. Instead, prompt engineering techniques are employed to tailor the LLM’s responses to the dataset.
  4. As mentioned above, In-Context Learning and Chain-of-Thought Prompting are two popular prompt engineering approaches. These techniques, collectively known as Resource Efficient Tuning (RET), offer a streamlined means of obtaining responses without requiring significant infrastructure resources. A minimal prompting sketch follows this list.
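As a concrete illustration of steps 3 and 4, here is a minimal sketch of ICL plus CoT prompting against an off-the-shelf pre-trained LLM via the Hugging Face transformers library. The model name and the demonstrations are placeholders; pick a base model whose license permits your use case and demonstrations drawn from your own domain data.

```python
# Minimal sketch: In-Context Learning (few-shot demonstrations) combined with
# Chain-of-Thought prompting. No model parameters are changed; only the prompt
# steers the behaviour. Model name and demonstrations are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="tiiuae/falcon-7b-instruct")

prompt = (
    "Answer the customer-support question, reasoning step by step.\n\n"
    # ICL demonstration, written with explicit intermediate reasoning (CoT):
    "Q: My invoice shows $120 but the quote was $100. Why?\n"
    "A: The quote was $100. Sales tax of 10% adds $10, and expedited shipping "
    "adds another $10, so $100 + $10 + $10 = $120. The difference is tax and shipping.\n\n"
    # New query for which the model should reproduce the same reasoning pattern:
    "Q: I ordered 3 licenses at $40 each but was charged $132. Why?\n"
    "A:"
)

output = generator(prompt, max_new_tokens=80, do_sample=False)
print(output[0]["generated_text"])
```

Because nothing is trained, this approach needs only inference infrastructure, which is why these techniques are grouped here under Resource Efficient Tuning.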

L2. Instruction Tuned LLM

Instruction tuning is the approach of fine-tuning a pre-trained LLM on a collection of instruction-formatted instances expressed in natural language; it is closely related to supervised fine-tuning and multi-task prompted training. With instruction tuning, LLMs can follow task instructions for new tasks without explicit examples (akin to a zero-shot capability), giving them improved generalization ability. To build this instruction-tuned L2 LLM,

  1. Begin by selecting a suitable pre-trained LLM (which can be found in the Hugging Face model library or other online resources), ensuring its compatibility with commercial use by reviewing the license.
  2. Next, identify relevant data sources for your target domain or use case. A labeled dataset containing a variety of instructions specific to your domain or use case is necessary. For instance, you can refer to the dolly-15k dataset provided by Databricks, which offers instructions in different formats such as closed-qa, open-qa, classification, information retrieval, and more. This dataset can serve as a template to construct your own instruction dataset.
  3. Moving on to the supervised fine-tuning process, we introduce new model parameters to the original base LLM chosen in step 1. By adding these parameters, we can train the model for specific epochs to fine-tune it for the given instructions. The advantage of this approach is that it avoids the need to update billions of parameters present in the base LLM, instead focusing on a smaller number of additional parameters (thousands or millions) while still achieving accurate results in the desired task. This approach also helps reduce costs.
  4. The next step is the fine-tuning itself. Various fine-tuning techniques exist, such as prefix tuning, adapters, and low-rank adaptation (LoRA); these will be elaborated on in a future article. The way new model parameters are added (point 3 above) also depends on the chosen technique. For more detailed information, please refer to the references section. These techniques fall under the category of Parameter-Efficient Fine-Tuning (PEFT), as they enable customization without updating all parameters of the base LLM. A minimal PEFT sketch follows this list.
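To make steps 3 and 4 concrete, here is a hedged sketch of parameter-efficient instruction tuning with LoRA, using the Hugging Face transformers, datasets, and peft libraries. The base model, prompt format, and hyperparameters are illustrative assumptions, not prescriptions, and loss masking of padding tokens is omitted for brevity.

```python
# Minimal PEFT sketch: LoRA instruction tuning on databricks-dolly-15k.
# Base model, prompt format, and hyperparameters are illustrative.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

base_model_name = "EleutherAI/pythia-2.8b"  # example of a permissively licensed base LLM
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token   # this tokenizer has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Step 3: freeze the base model and add a small set of trainable LoRA parameters.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights will be updated

# Step 2 data: instruction/response pairs formatted into plain-text training examples.
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

def format_example(example):
    text = f"Instruction: {example['instruction']}\nResponse: {example['response']}"
    tokens = tokenizer(text, truncation=True, max_length=512, padding="max_length")
    tokens["labels"] = tokens["input_ids"].copy()  # causal-LM objective on the full text
    return tokens

tokenized = dataset.map(format_example, remove_columns=dataset.column_names)

# Step 4: run the (parameter-efficient) fine-tuning itself.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="l2-instruction-tuned",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           learning_rate=2e-4),
    train_dataset=tokenized,
)
trainer.train()
```

Since only the adapter weights (a small fraction of the base model’s parameters) are updated, a run like this can fit on a single modern GPU, which is broadly consistent with the fine-tuning cost figure mentioned earlier.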

L3. Alignment Tuned LLM

Since LLMs are trained to capture the data characteristics of pre-training corpora (including both high-quality and low-quality data), they are likely to generate toxic, biased, or even harmful content. Thus it may be necessary to align LLMs with human values, e.g., to be helpful, honest, and harmless. For this alignment purpose, we use reinforcement learning from human feedback (RLHF), an effective tuning approach that enables LLMs to follow the expected instructions. It incorporates humans in the training loop with elaborately designed labeling strategies. To build this alignment-tuned L3 LLM,

  1. Begin by selecting an open-source pre-trained LLM (which can be found in the Hugging Face model library or other online resources) or your L2 LLM as your base model.
  2. The primary technique for building an alignment-tuned LLM is RLHF, which combines supervised learning and reinforcement learning. It starts by taking the fine-tuned LLM from step 1 (trained on a specific domain or instruction corpus) and using it to generate responses. Those responses are then annotated by humans, and the annotations are used to train a supervised reward model (typically built on another pre-trained LLM). Finally, the LLM from step 1 is fine-tuned again via reinforcement learning (PPO) against the reward model to generate the final responses (a sketch of the reward-model loss follows this list).
  3. Thus two LLMs are trained: one for the reward model and another for fine-tuning the LLM for generating the final response. The base model parameters in both cases can be updated selectively, depending on the desired accuracy in the response. For example, in some RLHF methods, only the parameters in specific layers or components involved in reinforcement learning are updated to avoid overfitting and retain the general knowledge captured by the pre-trained LLM.
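To make step 2 more concrete, here is a minimal, framework-agnostic sketch of the pairwise loss commonly used to train the reward model from human preference annotations. The scores below are toy values; in practice the scalar rewards come from a value/classification head on top of the reward LLM.

```python
# Minimal sketch of the pairwise (Bradley-Terry style) reward-model loss used in
# RLHF: for each prompt, the human-preferred ("chosen") response should receive a
# higher scalar reward than the rejected one. Scores below are toy values.
import torch
import torch.nn.functional as F

def reward_model_loss(chosen_scores: torch.Tensor,
                      rejected_scores: torch.Tensor) -> torch.Tensor:
    """-log sigmoid(r_chosen - r_rejected), averaged over the batch."""
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Example: scalar rewards produced by the reward model for a batch of 3 preference pairs.
chosen = torch.tensor([1.2, 0.4, 2.0])     # rewards for the human-preferred responses
rejected = torch.tensor([0.3, 0.9, 1.1])   # rewards for the rejected responses
print(reward_model_loss(chosen, rejected))  # loss shrinks as chosen scores exceed rejected
```

In the subsequent PPO stage, the policy LLM is updated to maximize this learned reward, usually with a KL-divergence penalty against the original model so its outputs do not drift too far from the supervised baseline.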

An interesting artifact of this process is that successful RLHF systems to date have used reward language models of varying sizes relative to the text-generation model (e.g., OpenAI paired a 175B LM with a 6B reward model, Anthropic used LMs and reward models ranging from 10B to 52B, and DeepMind used 70B Chinchilla models for both). An intuition is that these preference models need roughly the same capacity to understand the text given to them as the model that generates it.

There is also RLAIF (Reinforcement Learning from AI Feedback), which can be used in place of RLHF. The main difference is that, instead of human feedback, an AI model serves as the evaluator or critic, providing feedback to the agent during the reinforcement learning process.

Enterprises can harness the extraordinary potential of custom LLMs to achieve exceptional customization, control, and accuracy that align with their specific domains, use cases, and organizational demands. Building an enterprise-specific custom LLM empowers businesses to unlock a multitude of tailored opportunities, perfectly suited to their unique requirements, industry dynamics, and customer base.

The journey to building your own custom LLM has three levels, ranging from low model complexity, accuracy, and cost to high model complexity, accuracy, and cost. Enterprises must balance this tradeoff to best suit their needs and extract ROI from their LLM initiative.

Fig 4: Tradeoffs among the three LLM levels (Image by Author)

References

  1. What is prompt engineering?
  2. In-Context Learning (ICL) — Q. Dong, L. Li, D. Dai, C. Zheng, Z. Wu, B. Chang, X. Sun, J. Xu, L. Li, and Z. Sui, “A survey for in-context learning,” CoRR, vol. abs/2301.00234, 2023.
  3. How does in-context learning work? A framework for understanding the differences from traditional supervised learning | SAIL Blog (stanford.edu)
  4. Chain of Thought Prompting — J. Wei, X. Wang, D. Schuurmans, M. Bosma, E. H. Chi, Q. Le, and D. Zhou, “Chain of thought prompting elicits reasoning in large language models,” CoRR, vol. abs/2201.11903, 2022.
  5. Language Models Perform Reasoning via Chain of Thought — Google AI Blog (googleblog.com)
  6. Instruction Tuning — J. Wei, M. Bosma, V. Y. Zhao, K. Guu, A. W. Yu, B. Lester, N. Du, A. M. Dai, and Q. V. Le, “Fine-tuned language models are zero-shot learners,” in The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25–29, 2022. OpenReview.net, 2022.
  7. A survey of Large Language Models — Wayne Xin Zhao, Kun Zhou*, Junyi Li*, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie and Ji-Rong Wen, arXiv:2303.18223v4 [cs.CL], April 12, 2023


