What’s the minimum viable infrastructure your enterprise needs for AI?


As we approach the midpoint of the 2020s, enterprises of all sizes and sectors are increasingly looking at how to adopt generative AI to increase efficiencies and reduce time spent on repetitive, onerous tasks.

In some ways, having a generative AI application or assistant is rapidly moving from a “nice to have” to a “must have.”

But what is the minimum viable infrastructure needed to achieve these benefits? Whether you’re a large organization or a small business, understanding the essential components of an AI solution is crucial.

This guide — informed by leaders in the sector including experts at Hugging Face and Google — outlines the key elements, from data storage and large language model (LLM) integration to development resources, costs and timelines, to help you make informed decisions.


Data storage and data management

The foundation of any effective gen AI system is data: specifically, your company’s data, or at least data relevant to your firm’s business and goals.

Yes, your business can immediately use off-the-shelf chatbots powered by LLMs such as Google’s Gemini, OpenAI’s ChatGPT or Anthropic’s Claude, all readily available on the web, and these may assist with specific company tasks. And it can do so without inputting any company data.

However, unless you feed these models your company’s data, which may not be allowed due to security concerns or company policies, you won’t be able to reap the full benefits of what LLMs can offer.

So step one in developing any helpful AI product for your company to use, internally or externally, is understanding what data you have and can share with an LLM, whether that model is a public one or a private one you control on your own servers. You’ll also need to know where that data is located and whether it is structured or unstructured.

Structured data is typically organized in databases and spreadsheets, with clearly defined fields such as dates, numbers and text entries. Financial records or customer data that fit neatly into rows and columns are examples of structured data.

Unstructured data, on the other hand, lacks a consistent format and is not organized in a predefined manner. It includes various types of content like emails, videos, social media posts and documents, which do not fit easily into traditional databases. This type of data is more challenging to analyze due to its diverse and non-uniform nature.
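
As a quick hypothetical illustration of the difference, here is how one record of each kind might look in code (the values below are invented for the example):

```python
# Hypothetical examples: structured data fits predefined fields,
# while unstructured data has no consistent format.

structured_record = {  # e.g. one row from a customer orders table
    "customer_id": 1042,
    "order_date": "2024-05-01",
    "total_usd": 349.99,
}

unstructured_record = (  # e.g. a free-form customer support email
    "Hi, the legs on the oak chair I ordered last week arrived scratched. "
    "Can you send replacements before Friday?"
)
```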

Company data can include everything from customer interactions and HR policies to sales records and training materials. Depending on your use case, whether you’re developing AI products internally for employees or externally for customers, the route you take will likely differ.

Let’s take a hypothetical furniture maker — the “Chair Company” — that makes chairs for consumers and businesses out of wood.

This Chair Company wants to create an internal chatbot for employees to use that can answer common questions such as how to file expenses, how to request time off and where files for building chairs are located.

The Chair Company may in this case already have these files stored on a cloud service such as Google Cloud, Microsoft Azure or AWS. For many businesses, integrating AI capabilities directly into existing cloud platforms can significantly simplify the deployment process.

Google Workspace, combined with Vertex AI, enables enterprises to leverage their existing data across productivity tools like Docs and Gmail.

A Google spokesperson explained to VentureBeat, “With Vertex AI’s Model Garden, businesses can choose from over 150 pre-built models to fit their specific needs, integrating them seamlessly into their workflows. This integration allows for the creation of custom agents within Google Workspace apps, streamlining processes and freeing up valuable time for employees.”

For example, Bristol Myers Squibb used Vertex AI to automate document processes in their clinical trials, demonstrating how powerful these integrations can be in transforming business operations. For smaller businesses or those new to AI, this integration provides a user-friendly entry point to harness the power of AI without extensive technical overhead.
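
As a rough sketch of what that entry point can look like in code, here is how an application might call a Gemini model through the Vertex AI Python SDK. The project ID, model name and prompt below are placeholders, and exact model names vary by release:

```python
# Minimal sketch of calling a Gemini model via the Vertex AI Python SDK.
# Requires: pip install google-cloud-aiplatform, plus an authenticated
# Google Cloud environment (e.g. via gcloud auth).
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-gcp-project-id", location="us-central1")

model = GenerativeModel("gemini-1.5-pro")  # placeholder model name
response = model.generate_content(
    "Summarize our expense-filing policy in three bullet points."
)
print(response.text)
```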

But what if the company has data stored only on an intranet or local private servers? The Chair Company, or any other business in a similar boat, can still leverage LLMs and build a chatbot to answer company questions. However, it will likely want to deploy one of the many open-source models available from the AI community Hugging Face instead.

“If you’re in a highly regulated industry like banking or healthcare, you might need to run everything in-house,” explained Jeff Boudier, head of product and growth at Hugging Face, in a recent interview with VentureBeat. “In such cases, you can still use open-source tools hosted on your own infrastructure.”

Boudier recorded the following demo video for VentureBeat showing how to use Hugging Face’s website and available models and tools to create an AI assistant for a company.
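
For a sense of what self-hosting can look like at the code level, here is a minimal sketch (separate from Boudier’s demo) using the Hugging Face transformers library. The model shown is just one example of an openly available model, and the prompt is hypothetical:

```python
# Minimal sketch: running an openly available model on your own
# infrastructure with the Hugging Face transformers library.
# Requires: pip install transformers torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",  # one example of an open model
)

prompt = "Question: How does an employee request time off?\nAnswer:"
result = generator(prompt, max_new_tokens=100)
print(result[0]["generated_text"])
```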

A large language model (LLM)

Once you’ve determined what company data you can and want to feed into an AI product, the next step is selecting the LLM that will power it.

Choosing the right LLM is a critical step in building your AI infrastructure. LLMs such as OpenAI’s GPT-4 and Google’s Gemini, conversational platforms such as Google’s Dialogflow and the open models hosted on Hugging Face all offer different capabilities and levels of customization. The choice depends on your specific needs, data privacy concerns and budget.

Those charged with overseeing and implementing AI integration at a company will need to assess and compare different LLMs, which they can do using websites and services such as the LMSYS Chatbot Arena Leaderboard on Hugging Face.

If you go the route of a proprietary LLM such as OpenAI’s GPT series, Anthropic’s Claude family or Google’s Gemini series, you’ll need to connect your application and data to the model through the provider’s application programming interface (API).
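
As an illustrative sketch, here is roughly what such an API call looks like using OpenAI’s Python client; the model name, prompts and use case are placeholders, and other providers’ SDKs follow a similar pattern:

```python
# Minimal sketch of calling a proprietary LLM through its provider's API.
# Requires: pip install openai, plus an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; choose the model tier that fits your budget
    messages=[
        {"role": "system", "content": "You answer Chair Company HR questions."},
        {"role": "user", "content": "How do I file an expense report?"},
    ],
)
print(response.choices[0].message.content)
```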

Meanwhile, if the Chair Company or your business wants to host a model on its own private infrastructure for enhanced control and data security, then an open-source LLM is likely the way to go.

As Boudier explains, “The main benefit of open models is that you can host them yourself. This ensures that your application’s behavior remains consistent, even if the original model is updated or changed.”
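
One concrete way to achieve the consistency Boudier describes is to pin a self-hosted model to a fixed revision. Below is a minimal sketch with the transformers library; the model name is just an example, and to fully pin you would substitute a specific commit hash from the model’s repository:

```python
# Minimal sketch: loading a self-hosted open model at a fixed revision so
# upstream changes to the model repository cannot alter your app's behavior.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",  # example openly available model
    revision="main",  # replace "main" with a specific commit hash to pin exactly
)
```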

Already, VentureBeat has reported on the growing number of businesses adopting open-source LLMs such as Meta’s Llama, as well as models from other providers and independent developers.

Retrieval-Augmented Generation (RAG) framework

For a chatbot or AI system to provide accurate and relevant responses, integrating a retrieval-augmented generation (RAG) framework is essential.

This involves using a retriever to search for relevant documents based on user queries and a generator (an LLM) to synthesize the information into coherent responses.

Implementing a RAG framework requires a vector database like Pinecone or Milvus, which stores document embeddings: numerical representations of your data that make it easy for the AI to retrieve relevant information.

The RAG framework is particularly useful for enterprises that need to integrate proprietary company data stored in various formats, such as PDFs, Word documents and spreadsheets.

This approach allows the AI to pull relevant data dynamically, ensuring that responses are up-to-date and contextually accurate.

According to Boudier, “Creating embeddings or vectorizing documents is a crucial step in making data accessible to the AI. This intermediate representation allows the AI to quickly retrieve and utilize information, whether it’s text-based documents or even images and diagrams.”
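
To make those moving parts concrete, here is a minimal, self-contained sketch of the retrieval step. It uses the sentence-transformers library for embeddings and a simple in-memory similarity search standing in for a vector database like Pinecone or Milvus; the documents and query are hypothetical:

```python
# Minimal RAG sketch: embed documents, retrieve the best match for a query.
# The generation step (passing the match to an LLM) is omitted here.
# Requires: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open embedding model

# Hypothetical internal documents; in practice these come from PDFs,
# Word files, wikis and other company sources.
docs = [
    "Expense reports are filed through the finance portal within 30 days.",
    "Time off requests go to your manager via the HR system.",
    "Chair blueprints are stored on the shared drive under /designs.",
]
doc_vectors = model.encode(docs, normalize_embeddings=True)

query = "How do I get reimbursed for travel?"
query_vector = model.encode([query], normalize_embeddings=True)[0]

# Cosine similarity; a vector database performs this search at scale.
scores = doc_vectors @ query_vector
best_doc = docs[int(np.argmax(scores))]

# The retrieved text would then be inserted into the LLM prompt as context.
print(f"Context for the LLM: {best_doc}")
```

In production, the top-scoring passages would be inserted into the LLM’s prompt so the generator can synthesize an answer grounded in company data.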

Development expertise and resources

While AI platforms are increasingly user-friendly, some technical expertise is still required for implementation. Here’s a breakdown of what you might need:

  • Basic Setup: For straightforward deployment using pre-built models and cloud services, your existing IT staff with some AI training should suffice.
  • Custom Development: For more complex needs, such as fine-tuning models or deep integration into business processes, you’ll need data scientists, machine learning engineers and software developers experienced in NLP and AI model training.

For businesses lacking in-house resources, partnering with an external agency is a viable option. Development costs for a basic chatbot range from $15,000 to $30,000, while more complex AI-driven solutions can exceed $150,000.

“Building a custom AI model is accessible with the right tools, but you’ll need technical expertise for more specialized tasks, like fine-tuning models or setting up a private infrastructure,” Boudier noted. “With Hugging Face, we provide the tools and community support to help businesses, but having or hiring the right talent is still essential for successful implementation.”

For businesses without extensive technical resources, Google’s AppSheet offers a no-code platform that allows users to create custom applications by simply describing their needs in natural language. Integrated with AI capabilities like Gemini, AppSheet enables rapid development of tools for tasks such as facility inspections, inventory management and approval workflows, all without traditional coding skills. This makes it a powerful tool for automating business processes and creating customized chatbots.

Time and budget considerations

Implementing an AI solution involves both time and financial investment. Here’s what to expect:

  • Development Time: A basic chatbot can be developed in 1-2 weeks using pre-built models. However, more advanced systems that require custom model training and data integration may take several months.
  • Cost: For in-house development, budget around $10,000 per month, with total costs potentially reaching $150,000 for complex projects. Subscription-based models offer more affordable entry points, with costs ranging from $0 to $5,000 per month depending on features and usage.

Deployment and maintenance

Once developed, your AI system will need regular maintenance and updates to stay effective. This includes monitoring, fine-tuning and possibly retraining the model as your business needs and data evolve. Maintenance costs can start at $5,000 per month, depending on the complexity of the system and the volume of interactions.

If your enterprise operates in a regulated industry like finance or healthcare, you may need to host the AI system on private infrastructure to comply with data security regulations. Boudier explained, “For industries where data security is paramount, hosting the AI model internally ensures compliance and full control over data and model behavior.”

Final takeaways

To set up a minimum viable AI infrastructure for your enterprise, you need:

  • Data Storage and Management: Organize and manage your data efficiently using an intranet, private servers, private clouds, hybrid clouds or commercial cloud platforms like Google Cloud, Azure or AWS.
  • A Suitable LLM: Choose a model that fits your needs, whether hosted on a cloud platform or deployed on private infrastructure.
  • A RAG Framework: Implement this to dynamically pull and integrate relevant data from your knowledge base.
  • Development Resources: Consider in-house expertise or external agencies for building, deploying and maintaining your AI system.
  • Budget and Time Allocation: Prepare for initial costs ranging from $15,000 to $150,000 and development time of a few weeks to several months, depending on complexity.
  • Ongoing Maintenance: Regular updates and monitoring are necessary to ensure the system remains effective and aligned with business goals.

By aligning these elements with your business needs, you can create a robust AI solution that drives efficiency, automates tasks and provides valuable insights, all while maintaining control over your technology stack.


