Unlocking Data Privacy: How to Build Your Private Enterprise Data App with Private GPT and Llama 2

Ravindra Elicherla · Published in Generative AI · 5 min read · Jul 25, 2023


“Build cool apps with private and secured data”

This week, Meta released an open-source version of its large language model (LLM), Llama 2, for public use. The model can be used to build a ChatGPT-like chatbot, and the latest version is now accessible to individuals, creators, researchers, and businesses. Many believe Llama 2 is the industry’s most important release since ChatGPT in November 2022. If you are eager to read the research paper, you can find it here.

Meta released two versions:

  1. Llama-2, an updated version of Llama 1, trained on a new mix of publicly available data. Meta increased the size of the pretraining corpus by 40%, doubled the context length of the model, and adopted grouped-query attention. Llama 2 was released with 7B, 13B, and 70B parameters.
  2. Llama 2-Chat, a fine-tuned version of Llama 2 that is optimized for dialogue use cases. The variants of this model have 7B, 13B, and 70B parameters as well.
Training of Llama 2-Chat (Source: https://ai.meta.com/llama/)
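Since Llama 2-Chat is tuned for dialogue, it expects its input wrapped in a specific prompt template. If you ever call the chat model directly (outside of Private GPT), using this format generally gives better results. A minimal sketch in Python, based on the template in Meta’s llama repository (the system prompt below is just an example value):

# Llama 2-Chat prompt template; the system prompt is an example, not required.
def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_prompt(
    "You are a helpful assistant. Answer concisely.",
    "Tell me more about Apple.",
)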

Pretraining data: The Llama 2 training corpus is a new mix of data from publicly available sources and does not include data from Meta’s products or services. Meta also removed data from certain sites known to contain a high volume of personal information about private individuals. The model was trained on 2 trillion tokens, as this provides a good performance–cost trade-off, with the most factual sources up-sampled in an effort to increase knowledge and dampen hallucinations.

Fine-tuning: Llama 2-Chat is the result of several months of research and iterative application of alignment techniques, including both instruction tuning and RLHF (reinforcement learning from human feedback), requiring significant computational and annotation resources. RLHF is a training procedure applied to an already fine-tuned language model to further align its behavior with human preferences and instruction following.
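To make the reward-modelling part of RLHF a little more concrete: the reward model is trained on pairs of responses ranked by human annotators, using a binary ranking loss (the Llama 2 paper also adds a margin term reflecting how strongly annotators preferred one response over the other). A rough sketch of that loss in PyTorch (tensor names are illustrative):

import torch.nn.functional as F

def reward_ranking_loss(chosen_scores, rejected_scores, margin):
    # Push the reward of the human-preferred response above the
    # rejected one by at least `margin`:
    # L = -log(sigmoid(r_chosen - r_rejected - margin))
    return -F.logsigmoid(chosen_scores - rejected_scores - margin).mean()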

The paper is very exhaustive and requires a deep understanding of AI. For now, let’s get on to how to use it.

Private GPT: The main objective of Private GPT is to let you interact with your documents using the power of GPT-style models, 100% privately, with no data leaks. It is one of the most popular repos on GitHub, with 34k+ stars.

Let’s combine these to do something useful: chat with private documents.

Here are the steps:

  1. Git clone the repo
git clone https://github.com/imartinez/privateGPT.git

2. If you do not have Poetry, install it from https://python-poetry.org/docs/#installing-with-the-official-installer

3. Change into the privateGPT directory, then install the dependencies and activate the environment with Poetry:

cd privateGPT
poetry install
poetry shell

4. Download the model from https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/main


Download the file “llama-2-7b-chat.ggmlv3.q8_0.bin”.

5. Open the code in VS Code or any IDE and create a folder called “models”. Also, rename “example.env” to “.env”. Move the llama-2-7b-chat.ggmlv3.q8_0.bin file from step 4 into the “models” folder.
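If you prefer scripting the download instead of clicking through the browser, the huggingface_hub client can fetch the file straight into the models folder (a sketch; assumes pip install huggingface_hub):

from huggingface_hub import hf_hub_download

# Download the quantized chat model directly into the models folder.
hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGML",
    filename="llama-2-7b-chat.ggmlv3.q8_0.bin",
    local_dir="models",
)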

Change the contents of the .env file as below:

PERSIST_DIRECTORY=db
MODEL_TYPE=LlamaCpp
MODEL_PATH=models/llama-2-7b-chat.ggmlv3.q8_0.bin
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
MODEL_N_CTX=2048
MODEL_N_BATCH=1024
TARGET_SOURCE_CHUNKS=4
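For reference, privateGPT reads these settings at startup with python-dotenv, roughly like this (a simplified sketch of what the repo’s scripts do):

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file in the current directory

model_type = os.environ.get("MODEL_TYPE")
model_path = os.environ.get("MODEL_PATH")
model_n_ctx = int(os.environ.get("MODEL_N_CTX"))
model_n_batch = int(os.environ.get("MODEL_N_BATCH", 8))
target_source_chunks = int(os.environ.get("TARGET_SOURCE_CHUNKS", 4))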

6. For testing, I am using the Apple Inc. article from https://simple.wikipedia.org/wiki/Apple_Inc. Go to Tools, choose “Download as PDF”, and save the file into the “source_documents” folder.

We are done with setup now.

7. Run ingest.py:

python ingest.py

If you get the error “ImportError: `PyMuPDF` package not found, please install it with `pip install pymupdf`”, run:

pip install pymupdf

You should then see a message something like this:

Creating new vectorstore
Loading documents from source_documents
Loading new documents: 100%|██████████████████████| 1/1 [00:02<00:00, 2.41s/it]
Loaded 8 new documents from source_documents
Split into 48 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Ingestion complete! You can now run privateGPT.py to query your documents
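Under the hood, ingest.py is doing exactly what the log describes: loading each document, splitting it into chunks of at most 500 tokens, embedding the chunks with the sentence-transformers model named in .env, and persisting everything to a Chroma vector store in the db folder. A condensed sketch of that pipeline (simplified from the repo; the PDF filename is whatever you saved in step 6):

from langchain.document_loaders import PyMuPDFLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Load the PDF, split it into overlapping chunks, embed, and persist.
documents = PyMuPDFLoader("source_documents/Apple_Inc.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma.from_documents(chunks, embeddings, persist_directory="db")
db.persist()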

8. There seems to be a bug in the privateGPT.py program.

Just after

match model_type:
case "LlamaCpp":

remove the existing code and change the llm statement as below. In the original code, n_ctx was not being passed to LlamaCpp, so the model fell back to the library’s small default context window (512 tokens) and could fail on longer prompts.

match model_type:
    case "LlamaCpp":
        llm = LlamaCpp(model_path=model_path, max_tokens=model_n_ctx,
                       n_ctx=2048, n_batch=model_n_batch,
                       callbacks=callbacks, verbose=True)

If everything goes well, you will see a message like this:

llama.cpp: loading model from models/llama-2-7b-chat.ggmlv3.q8_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 7 (mostly Q8_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
llama_model_load_internal: mem required = 8620.72 MB (+ 1026.00 MB per state)
llama_new_context_with_model: kv self size = 1024.00 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |

At the “Enter a query:” prompt, I typed “Tell me more about Apple” and got back an answer drawn from the ingested Wikipedia PDF.
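For the curious, privateGPT.py wires these pieces together with a LangChain RetrievalQA chain: your query is embedded, the TARGET_SOURCE_CHUNKS most similar chunks are pulled from the Chroma store, and the LlamaCpp model answers using only those chunks as context. A condensed sketch (simplified from the repo):

from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import LlamaCpp
from langchain.vectorstores import Chroma

# Reopen the persisted vector store and build a retriever over it.
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma(persist_directory="db", embedding_function=embeddings)
retriever = db.as_retriever(search_kwargs={"k": 4})  # TARGET_SOURCE_CHUNKS

llm = LlamaCpp(model_path="models/llama-2-7b-chat.ggmlv3.q8_0.bin",
               n_ctx=2048, n_batch=1024, verbose=True)
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff",
                                 retriever=retriever,
                                 return_source_documents=True)

res = qa("Tell me more about Apple")
print(res["result"])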
