Before a business or a developer adopts generative artificial intelligence (GenAI), they often wonder how to get business value from integrating AI into their business. With this in mind, a fundamental question arises: Which approach will deliver the best return on investment, a large all-encompassing proprietary model or an open source AI model that can be molded and fine-tuned for a company’s needs? AI adoption strategies fall within a wide spectrum, from accessing a cloud service from a large proprietary frontier model like OpenAI’s GPT-4o to building an internal solution in the company’s compute environment with a small open source model using indexed company data for a targeted set of tasks. Current AI solutions go well beyond the model itself, with a whole ecosystem of retrieval systems, agents, and other functional components such as AI accelerators, which are beneficial for both large and small models. The emergence of cross-industry collaborations like the Open Platform for Enterprise AI (OPEA) furthers the promise of streamlining the access and structuring of end-to-end open source solutions.
This basic choice between the open source ecosystem and a proprietary setting impacts countless business and technical decisions, making it “the AI developer’s dilemma.” I believe that for most enterprise and other business deployments, it makes sense to initially use proprietary models to learn about AI’s potential and minimize early capital expenditure (CapEx). However, for broad sustained deployment, in many cases companies would use ecosystem-based open source targeted solutions, which allow for a cost-effective, adaptable strategy that aligns with evolving business needs and industry trends.
GenAI Transition from Consumer to Business Deployment
When GenAI burst onto the scene in late 2022 with OpenAI’s GPT-3 and ChatGPT 3.5, it mainly garnered consumer interest. As businesses began investigating GenAI, two approaches to deploying it quickly emerged in 2023: using giant frontier models like ChatGPT vs. the newly introduced small, open source models originally inspired by Meta’s LLaMA model. By early 2024, two basic approaches had solidified, as shown in the columns in Figure 1. With the proprietary AI approach, the company relies on a large closed model to provide all the needed technology value. For example, taking GPT-4o as a proxy for the left column, AI developers would use OpenAI technology for the model, data, security, and compute. With the open source ecosystem AI approach, the company or developer may opt for the right-sized open source model, using corporate or private data, customized functionality, and the necessary compute and security.
Both directions are valid and have advantages and disadvantages. It is not an absolute partition and developers can choose components from either approach, but taking either a proprietary or ecosystem-based open source AI path provides the company with a strategy with high internal consistency. While it is expected that both approaches will be broadly deployed, I believe that after an initial learning and transition period, most companies will follow the open source approach. Depending on the usage and setting, open source internal AI may provide significant benefits, including the ability to fine-tune the model and drive deployment using the company’s current infrastructure to run the model at the edge, on the client, in the data center, or as a dedicated service. With new AI fine-tuning tools, deep expertise is less of a barrier.
Across all industries, AI developers are using GenAI for a variety of applications. An October 2023 poll by Gartner found that 55% of organizations reported increasing investment in GenAI since early 2023, and many companies are in pilot or production mode for the growing technology. At the time of the survey, companies were mainly investing in using GenAI for software development, followed closely by marketing and customer service functions. Clearly, the range of AI applications is growing rapidly.
Large Proprietary Models vs. Small and Large Open Source Models
In my blog Survival of the Fittest: Compact Generative AI Models Are the Future for Cost-Effective AI at Scale, I provide a detailed evaluation of large models vs. small models. In essence, following the introduction of Meta’s LLaMA open source model in February 2023, there has been a virtuous cycle of innovation and rapid improvement in which academia and the broad-based ecosystem are creating highly effective models that are 10x to 100x smaller than the large frontier models. A crop of small models, which in 2024 were mostly less than 30 billion parameters, could closely match the capabilities of ChatGPT-style large models containing well over 100B parameters, especially when targeted for particular domains. While GenAI is already being deployed throughout industries for a wide range of business usages, the use of compact models is rising.
In addition, open source models are mostly lagging only six to 12 months behind the performance of proprietary models. On the broad language benchmark MMLU, the improvement pace of the open source models is faster and the gap with proprietary models appears to be closing. For example, OpenAI’s GPT-4o came out on May 13, 2024 with major multimodal features, while Microsoft’s small open source Phi-3-vision was introduced just a week later on May 21. In rudimentary comparisons done on visual recognition and understanding, the models showed some similar competencies, with several tests even favoring the Phi-3-vision model. Initial evaluations of Meta’s Llama 3.2 open source release suggest that its “vision models are competitive with leading foundation models, Claude 3 Haiku and GPT4o-mini on image recognition and a range of visual understanding tasks.”
Large models have incredible all-in-one versatility. Developers can choose from a variety of large commercially available proprietary GenAI models, including OpenAI’s GPT-4o multimodal model. Google’s Gemini 1.5 natively multimodal model is available in four sizes: Nano for mobile device app development, Flash small model for specific tasks, Pro for a wide range of tasks, and Ultra for highly complex tasks. And Anthropic’s Claude 3 Opus, rumored to have approximately 2 trillion parameters, has a 200K token context window, allowing users to upload large amounts of information. There is also another category of out-of-the-box large GenAI models that businesses can use for employee productivity and creative development. Microsoft 365 Copilot integrates the Microsoft 365 Apps suite, Microsoft Graph (content and context from emails, files, meetings, chats, calendars, and contacts), and GPT-4.
Large and small open source models are often more transparent about application frameworks, tool ecosystem, training data, and evaluation platforms. Model architecture, hyperparameters, response quality, input modalities, context window size, and inference cost are partially or fully disclosed. These models often provide information on the dataset so that developers can determine if it meets copyright or quality expectations. This transparency allows developers to easily interchange models for future versions. Among the growing number of small commercially available open source models, Meta’s Llama 3 and 3.1 are based on transformer architecture and available in 8B, 70B, and 405B parameters. The Llama 3.2 multimodal model comes in 11B and 90B versions, with smaller versions at 1B and 3B parameters. Built in collaboration with NVIDIA, Mistral AI’s Mistral NeMo is a 12B model that features a large 128k context window, while Microsoft’s Phi-3 (3.8B, 7B, and 14B) offers Transformer models for reasoning and language understanding tasks. Microsoft highlights Phi models as an example of “the surprising power of small language models” while investing heavily in OpenAI’s very large models. Microsoft’s diverse interest in GenAI indicates that it is not a one-size-fits-all market.
Model-Integrated Data (with RAG) vs. Retrieval-Centric Generation (RCG)
The next key question that AI developers need to address is where to find the data used during inference: within the model’s parametric memory or outside the model (accessible by retrieval). It might be hard to believe, but the first ChatGPT launched in November 2022 did not have any access to data outside the model. Its training data ended around September 2021, and it notoriously had no inkling of events and data past its training date. This major oversight was addressed in 2023 when retrieval plug-ins were added. Today, most models are coupled with a retrieval front-end, with exceptions in cases where there is no expectation of accessing large or continuously updating information, such as dedicated programming models.
Current models have made significant progress on this issue by enhancing the solution platforms with a retrieval-augmented generation (RAG) front-end to allow for extracting information external to the model. An efficient and secure RAG is a requirement in enterprise GenAI deployment, as shown by Microsoft’s introduction of GPT-RAG in late 2023. Furthermore, in the blog Knowledge Retrieval Takes Center Stage, I cover how in the transition from consumer to business deployment for GenAI, solutions should be built primarily around information external to the model using retrieval-centric generation (RCG).
RCG models can be defined as a special case of RAG GenAI solutions designed for systems where the vast majority of data resides outside the model’s parametric memory and is mostly not seen in pre-training or fine-tuning. With RCG, the primary role of the GenAI model is to interpret rich retrieved information from a company’s indexed data corpus or other curated content. Rather than memorizing data, the model focuses on fine-tuning for targeted constructs, relationships, and functionality. The quality of data in generated output is expected to approach 100% accuracy and timeliness.
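To make the retrieval-centric flow concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the corpus contents, the document IDs, and the function names `retrieve` and `build_prompt` are hypothetical, and the keyword-overlap ranking stands in for the vector or hybrid search a production system would more likely use. The point it shows is that the facts come from the indexed corpus, while the model is only asked to interpret what was retrieved.

```python
import re
from collections import Counter

# Hypothetical indexed company corpus (placeholder text for illustration only).
CORPUS = {
    "policy-001": "Employees may carry over up to five vacation days into the next year.",
    "policy-002": "Expense reports must be submitted within 30 days of purchase.",
    "policy-003": "Remote work requires manager approval and a secure VPN connection.",
}


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, corpus, top_k=2):
    """Return up to top_k document IDs ranked by query-term occurrences."""
    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, text in corpus.items():
        counts = Counter(tokenize(text))
        scores[doc_id] = sum(counts[term] for term in query_terms)
    ranked = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    # Drop documents that share no terms with the query.
    return [doc_id for doc_id in ranked[:top_k] if scores[doc_id] > 0]


def build_prompt(query, corpus, top_k=2):
    """Assemble a prompt that tells the model to answer only from retrieved text."""
    doc_ids = retrieve(query, corpus, top_k)
    context = "\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


question = "How many vacation days can employees carry over?"
print(build_prompt(question, CORPUS))
```

The resulting prompt string would then be passed to whichever model the deployment uses; that step is left out because it depends on the chosen model and serving stack.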
OPEA is a cross-ecosystem effort to ease the adoption and tuning of GenAI systems. Using this composable framework, developers can create and evaluate “open, multi-provider, robust, and composable GenAI solutions that harness the best innovation across the ecosystem.” OPEA is expected to simplify the implementation of enterprise-grade composite GenAI solutions, including RAG, agents, and memory systems.
All-in-One General Purpose vs. Targeted Customized Models
Models like GPT-4o, Claude 3, and Gemini 1.5 are general purpose all-in-one foundation models. They are designed to perform a broad range of GenAI tasks, from coding to chat to summarization. The latest models have rapidly expanded to perform vision/image tasks, changing their function from just large language models to large multimodal models or vision language models (VLMs). Open source foundation models are headed in the same direction, with integrated multimodalities.
However, rather than adopting the first wave of consumer-oriented GenAI models in this general-purpose form, most businesses are electing to use some form of specialization. When a healthcare company deploys GenAI technology, it would not use one general model for managing the supply chain, coding in the IT department, and deep medical analytics for managing patient care. Businesses deploy more specialized versions of the technology for each use case. There are several different ways that companies can build specialized GenAI solutions, including domain-specific models, targeted models, customized models, and optimized models.
Domain-specific models are specialized for a particular field of business or an area of interest. There are both proprietary and open source domain-specific models. For example, BloombergGPT, a 50B parameter proprietary large language model specialized for finance, beats the larger GPT-3 175B parameter model on various financial benchmarks. However, small open source domain-specific models can provide an excellent alternative, as demonstrated by FinGPT, which provides accessible and transparent resources to develop FinLLMs. FinGPT 3.3 uses Llama 2 13B as a base model targeted for the financial sector. In recent benchmarks, FinGPT surpassed BloombergGPT on a variety of tasks and beat GPT-4 handily on financial benchmark tasks like FPB, FiQA-SA, and TFNS. To understand the tremendous potential of this small open source model, it should be noted that FinGPT can be fine-tuned to incorporate new data for less than $300 per fine-tuning.
Targeted models specialize in a family of tasks or functions, such as separate targeted models for coding, image generation, question answering, or sentiment analysis. A recent example of a targeted model is SetFit from Intel Labs, Hugging Face, and the UKP Lab. This few-shot text classification approach for fine-tuning Sentence Transformers is faster at inference and training, achieving high accuracy with a small amount of labeled training data, such as only eight labeled examples per class on the Customer Reviews (CR) sentiment dataset. This small 355M parameter model can best the GPT-3 175B parameter model on the diverse RAFT benchmark.
It’s important to note that targeted models are independent from domain-specific models. For example, a sentiment analysis solution like SetFitABSA has targeted functionality and can be applied to various domains like industrial, entertainment, or hospitality. However, models that are both targeted and domain specialized can be more effective.
Customized models are further fine-tuned and refined to meet particular needs and preferences of companies, organizations, or individuals. By indexing particular content for retrieval, the resulting system becomes highly specific and effective on tasks related to this data (private or public). The open source field offers an array of options to customize the model. For example, Intel Labs used direct preference optimization (DPO) to improve on a Mistral 7B model to create the open source Intel NeuralChat. Developers can also fine-tune and customize models by using low-rank adaptation of large language models (LoRA) and its more memory-efficient version, QLoRA.
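A quick back-of-the-envelope calculation shows why LoRA lowers the barrier to customization. LoRA freezes a weight matrix of shape d_out x d_in and trains two small matrices instead, B (d_out x r) and A (r x d_in), so the trainable parameter count per adapted matrix drops from d_out * d_in to r * (d_out + d_in). The sketch below works through that arithmetic; the hidden size of 4096 and rank of 8 are assumed values chosen for illustration, not figures from any particular model discussed here.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for a LoRA adapter: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


# Assumed, illustrative sizes: one square projection matrix and a small rank.
hidden_size = 4096
rank = 8

full = full_finetune_params(hidden_size, hidden_size)
lora = lora_params(hidden_size, hidden_size, rank)

print(f"Full fine-tuning: {full:,} trainable parameters per matrix")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters per matrix")
print(f"Reduction factor: {full / lora:.0f}x")
```

QLoRA follows the same adapter idea but keeps the frozen base weights in a quantized form, which further reduces the memory needed during fine-tuning.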
Optimization capabilities are available for open source models. The objective of optimization is to retain the functionality and accuracy of a model while substantially reducing its execution footprint, which can significantly improve cost, latency, and optimal execution on an intended platform. Some methods used for model optimization include distillation, pruning, compression, and quantization (to 8-bit or even 4-bit). Some methods like mixture of experts (MoE) and speculative decoding can be considered forms of execution optimization. For example, GPT-4 is reportedly composed of eight smaller MoE models with 220B parameters each. The execution only activates parts of the model, allowing for much more economical inference.
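As a rough illustration of what quantization does, the following sketch applies simple symmetric 8-bit quantization to a short list of weights and estimates the weight-storage footprint at different bit widths. It is a simplified teaching example, not the scheme used by any specific toolkit: real quantizers typically work per channel or per block and handle outliers more carefully. The sample weights and the 8B parameter count are assumptions chosen for illustration.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127] with one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


def model_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (weights only, no runtime overhead)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed sample weights for illustration.
weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("quantized:", quantized)
print("max reconstruction error:", max_error)

# Footprint of an assumed 8B-parameter model at different precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {model_size_gb(8e9, bits):.1f} GB")
```

The reconstruction error stays within half a quantization step, which is the trade the technique makes: a small, bounded loss of precision in exchange for a footprint that shrinks in proportion to the bit width.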
Generative-as-a-Service Cloud Execution vs. Managed Execution Environment for Inference
Another key choice for developers to consider is the execution environment. If the company chooses a proprietary model direction, inference execution is done through API or query calls to an abstracted and obscured image of the model running in the cloud. The size of the model and other implementation details are insignificant, except when translated to availability and the cost charged by some pricing metric (per token, per query, or unlimited compute license). This approach, sometimes referred to as a generative-as-a-service (GaaS) cloud offering, is the principal way for companies to consume very large proprietary models like GPT-4o, Gemini Ultra, and Claude 3. However, GaaS can also be offered for smaller models like Llama 3.2.
There are clear positive aspects to using GaaS for the outsourced intelligence approach. For example, access is usually immediate and easy to use out-of-the-box, alleviating in-house development efforts. There is also the implied promise that when the models or their environment get upgraded, the AI solution developers have access to the latest updates without substantial effort or changes to their setup. Also, the costs are almost entirely operational expenditures (OpEx), which is preferred if the workload is initial or limited. For early-stage adoption and intermittent use, GaaS offers more support.
In contrast, when companies choose an internal intelligence approach, the model inference cycle is incorporated and managed within the compute environment and the existing business software setting. This is a viable solution for relatively small models (approximately 30B parameters or less in 2024) and potentially even medium models (50B to 70B parameters in 2024) on a client device, network, on-prem data center, or on-cloud cycles in an environment set with a service provider such as a virtual private cloud (VPC).
Models like Llama 3.1 8B or similar can run on the developer’s local machine (Mac or PC). Using optimization techniques like quantization, the needed user experience can be achieved while operating within the local setting. Using a tool and framework like Ollama, developers can manage inference execution locally. Inference cycles can be run on legacy GPUs, Intel Xeon, or Intel Gaudi AI accelerators in the company’s data center. If inference is run on the model at a service provider, it will be billed as infrastructure-as-a-service (IaaS), using the company’s own setting and execution choices.
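For readers who want to see what local inference looks like in practice, here is a minimal sketch that talks to a locally running Ollama server over its REST interface using only the Python standard library. It assumes Ollama is installed and serving on its default address (http://localhost:11434) and that a Llama 3.1 8B model has already been pulled; the model tag `llama3.1:8b` and the helper names `build_generate_request` and `generate` are illustrative choices, so check them against your own installation.

```python
import json
import urllib.request


def build_generate_request(prompt, model="llama3.1:8b", host="http://localhost:11434"):
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With streaming disabled, the generated text is returned in the "response" field.
    return body["response"]
```

With the server running, calling `generate("Summarize our vacation policy in one sentence.")` returns the model output as a string. No data leaves the machine in this setup, which is the property that makes the local path attractive for sensitive workloads.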
When inference execution is done in the company compute environment (client, edge, on-prem, or IaaS), there is a higher requirement for CapEx for ownership of the computer equipment if it goes beyond adding a workload to existing hardware. While the comparison of OpEx vs. CapEx is complex and depends on many variables, CapEx is preferable when deployment requires broad, continuous, stable usage. This is especially true as smaller models and optimization technologies allow for running advanced open source models on mainstream devices and processors and even local notebooks/desktops.
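The OpEx vs. CapEx trade-off can be roughed out with a simple break-even calculation. The sketch below is a deliberately simplified model: every number in it (token volume, price per million tokens, hardware cost, monthly running cost) is a hypothetical placeholder rather than a quoted price, and it ignores factors such as staffing, depreciation, and utilization swings that a real analysis would include.

```python
def gaas_monthly_cost(tokens_per_month, price_per_million_tokens):
    """Monthly spend when paying a cloud service per token."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def break_even_months(capex, monthly_owned_cost, monthly_gaas_cost):
    """Months until owned hardware pays for itself, or None if it never does."""
    monthly_savings = monthly_gaas_cost - monthly_owned_cost
    if monthly_savings <= 0:
        return None
    return capex / monthly_savings


# Hypothetical inputs for illustration only.
tokens_per_month = 2_000_000_000   # sustained, steady workload
price_per_million = 5.0            # assumed blended GaaS price in dollars
hardware_capex = 60_000            # assumed up-front server purchase in dollars
owned_running_cost = 2_000         # assumed power, hosting, upkeep per month

gaas_cost = gaas_monthly_cost(tokens_per_month, price_per_million)
months = break_even_months(hardware_capex, owned_running_cost, gaas_cost)

print(f"GaaS cost per month: ${gaas_cost:,.0f}")
print(f"Break-even after {months} months")
```

With these placeholder inputs, the owned hardware pays for itself in well under a year. Lowering the token volume pushes the break-even point out quickly, which matches the intuition that GaaS suits early or intermittent use while owned infrastructure suits broad, steady workloads.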
Running inference in the company compute environment allows for tighter control over aspects of security and privacy. Reducing data movement and exposure can be valuable in preserving privacy. Furthermore, a retrieval-based AI solution run in a local setting can be supported with fine controls to address potential privacy concerns by giving user-controlled access to information. Security is frequently mentioned as one of the top concerns of companies deploying GenAI, and confidential computing is a primary ask. Confidential computing protects data in use by computing in an attested hardware-based Trusted Execution Environment (TEE).
Smaller, open source models can run within a company’s most secure application setting. For example, a model running on Xeon can be fully executed within a TEE with limited overhead. As shown in Figure 8, encrypted data remains protected while not in compute. The model is checked for provenance and integrity to protect against tampering. The actual execution is protected from any breach, including by the operating system or other applications, preventing viewing or alteration by untrusted entities.
Summary
Generative AI is a transformative technology now under evaluation or active adoption by most companies across all industries and sectors. As AI developers consider their options for the best solution, one of the most important questions they need to address is whether to use external proprietary models or rely on the open source ecosystem. One path is to rely on a large proprietary black-box GaaS solution using RAG, such as GPT-4o or Gemini Ultra. The other path uses a more adaptive and integrative approach: small models, selected and exchanged as needed from a large open source model pool, mainly utilizing company information, customized and optimized based on particular needs, and executed within the existing infrastructure of the company. As mentioned, there could be a combination of choices within these two base strategies.
I believe that as numerous AI solution developers face this essential dilemma, most will eventually (after a learning period) choose to embed open source GenAI models in their internal compute environment, data, and business setting. They will ride the incredible advancement of the open source and broad ecosystem virtuous cycle of AI innovation, while maintaining control over their costs and destiny.
Let’s give AI the final word in solving the AI developer’s dilemma. In a staged AI debate, OpenAI’s GPT-4 argued with Microsoft’s open source Orca 2 13B on the merits of using proprietary vs. open source GenAI for future development. Using GPT-4 Turbo as the judge, open source GenAI won the debate. The winning argument? Orca 2 called for a “more distributed, open, collaborative future of AI development that leverages worldwide talent and aims for collective advancements. This model promises to accelerate innovation and democratize access to AI, and ensure ethical and transparent practices through community governance.”
Learn More: GenAI Series
Survival of the Fittest: Compact Generative AI Models Are the Future for Cost-Effective AI at Scale
Have Machines Just Made an Evolutionary Leap to Speak in Human Language?
References
- Hello GPT-4o. (2024, May 13). https://openai.com/index/hello-gpt-4o/
- Open platform for enterprise AI. (n.d.). Open Platform for Enterprise AI (OPEA). https://opea.dev/
- Gartner Poll Finds 55% of Organizations are in Piloting or Production. (2023, October 3). Gartner. https://www.gartner.com/en/newsroom/press-releases/2023-10-03-gartner-poll-finds-55-percent-of-organizations-are-in-piloting-or-production-mode-with-generative-ai
- Singer, G. (2023, July 28). Survival of the fittest: Compact generative AI models are the future for Cost-Effective AI at scale. Medium. https://towardsdatascience.com/survival-of-the-fittest-compact-generative-ai-models-are-the-future-for-cost-effective-ai-at-scale-6bbdc138f618
- Introducing LLaMA: A foundational, 65-billion-parameter language model. (n.d.). https://ai.meta.com/blog/large-language-model-llama-meta-ai/
- #392: OpenAI’s improved ChatGPT should delight both expert and novice developers, & more - ARK Invest. (n.d.). Ark Invest. https://ark-invest.com/newsletter_item/1-openais-improved-chatgpt-should-delight-both-expert-and-novice-developers
- Bilenko, M. (2024, May 22). New models added to the Phi-3 family, available on Microsoft Azure. Microsoft Azure Blog. https://azure.microsoft.com/en-us/blog/new-models-added-to-the-phi-3-family-available-on-microsoft-azure/
- Matthew Berman. (2024, June 2). Open-Source Vision AI - Surprising Results! (Phi3 Vision vs LLaMA 3 Vision vs GPT4o) [Video]. YouTube. https://www.youtube.com/watch?v=PZaNL6igONU
- Llama 3.2: Revolutionizing edge AI and vision with open, customizable models. (n.d.). https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
- Gemini - Google DeepMind. (n.d.). https://deepmind.google/technologies/gemini/#introduction
- Introducing the next generation of Claude. (n.d.). Anthropic. https://www.anthropic.com/news/claude-3-family
- Thompson, A. D. (2024, March 4). The Memo - Special edition: Claude 3 Opus. The Memo by LifeArchitect.ai. https://lifearchitect.substack.com/p/the-memo-special-edition-claude-3
- Spataro, J. (2023, May 16). Introducing Microsoft 365 Copilot - your copilot for work - The Official Microsoft Blog. The Official Microsoft Blog. https://blogs.microsoft.com/blog/2023/03/16/introducing-microsoft-365-copilot-your-copilot-for-work/
- Introducing Llama 3.1: Our most capable models to date. (n.d.). https://ai.meta.com/blog/meta-llama-3-1/
- Mistral AI. (2024, March 4). Mistral NeMo. Mistral AI | Frontier AI in Your Hands. https://mistral.ai/news/mistral-nemo/
- Beatty, S. (2024, April 29). Tiny but mighty: The Phi-3 small language models with big potential. Microsoft Research. https://news.microsoft.com/source/features/ai/the-phi-3-small-language-models-with-big-potential/
- Hughes, A. (2023, December 16). Phi-2: The surprising power of small language models. Microsoft Research. https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/
- Azure. (n.d.). GitHub — Azure/GPT-RAG. GitHub. https://github.com/Azure/GPT-RAG/
- Singer, G. (2023, November 16). Knowledge Retrieval Takes Center Stage - Towards Data Science. Medium. https://towardsdatascience.com/knowledge-retrieval-takes-center-stage-183be733c6e8
- Introducing the open platform for enterprise AI. (n.d.). Intel. https://www.intel.com/content/www/us/en/developer/articles/news/introducing-the-open-platform-for-enterprise-ai.html
- Wu, S., Irsoy, O., Lu, S., Dabravolski, V., Dredze, M., Gehrmann, S., Kambadur, P., Rosenberg, D., & Mann, G. (2023, March 30). BloombergGPT: A large language model for finance. arXiv.org. https://arxiv.org/abs/2303.17564
- Yang, H., Liu, X., & Wang, C. D. (2023, June 9). FinGPT: Open-Source Financial Large Language Models. arXiv.org. https://arxiv.org/abs/2306.06031
- AI4Finance-Basis. (n.d.). FinGPT. GitHub. https://github.com/AI4Finance-Foundation/FinGPT
- Starcoder2. (n.d.). GitHub. https://huggingface.co/docs/transformers/v4.39.0/en/model_doc/starcoder2
- SetFit: Efficient Few-Shot Learning Without Prompts. (n.d.). https://huggingface.co/blog/setfit
- SetFitABSA: Few-Shot Aspect Based Sentiment Analysis Using SetFit. (n.d.). https://huggingface.co/blog/setfit-absa
- Intel/neural-chat-7b-v3-1. Hugging Face. (2023, October 12). https://huggingface.co/Intel/neural-chat-7b-v3-1
- Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2021, June 17). LoRA: Low-Rank Adaptation of Large Language Models. arXiv.org. https://arxiv.org/abs/2106.09685
- Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, May 23). QLoRA: Efficient Finetuning of Quantized LLMs. arXiv.org. https://arxiv.org/abs/2305.14314
- Leviathan, Y., Kalman, M., & Matias, Y. (2022, November 30). Fast Inference from Transformers via Speculative Decoding. arXiv.org. https://arxiv.org/abs/2211.17192
- Bastian, M. (2023, July 3). GPT-4 has more than a trillion parameters - Report. THE DECODER. https://the-decoder.com/gpt-4-has-a-trillion-parameters/
- Andriole, S. (2023, September 12). LLAMA, ChatGPT, Bard, Co-Pilot & all the rest. How large language models will become huge cloud services with massive ecosystems. Forbes. https://www.forbes.com/sites/steveandriole/2023/07/26/llama-chatgpt-bard-co-pilot–all-the-rest–how-large-language-models-will-become-huge-cloud-services-with-massive-ecosystems/?sh=78764e1175b7
- Q8-Chat LLM: An efficient generative AI experience on Intel® CPUs. (n.d.). Intel. https://www.intel.com/content/www/us/en/developer/articles/case-study/q8-chat-efficient-generative-ai-experience-xeon.html#gs.36q4lk
- Ollama. (n.d.). Ollama. https://ollama.com/
- AI Accelerated Intel® Xeon® Scalable Processors Product Brief. (n.d.). Intel. https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/ai-accelerators-product-brief.html
- Intel® Gaudi® AI Accelerator products. (n.d.). Intel. https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi-overview.html
- Confidential Computing Solutions - Intel. (n.d.). Intel. https://www.intel.com/content/www/us/en/security/confidential-computing.html
- What is a Trusted Execution Environment? (n.d.). Intel. https://www.intel.com/content/www/us/en/content-details/788130/what-is-a-trusted-execution-environment.html
- Adeojo, J. (2023, December 3). GPT-4 Debates Open Orca-2-13B with Surprising Results! Medium. https://pub.aimind.so/gpt-4-debates-open-orca-2-13b-with-surprising-results-b4ada53845ba
- Data Centric. (2023, November 30). Surprising Debate Showdown: GPT-4 Turbo vs. Orca-2-13B - Programmed with AutoGen! [Video]. YouTube. https://www.youtube.com/watch?v=JuwJLeVlB-w