Pakistan's First Oracle Blog


Easy Tutorial to Build Full Free RAG Pipeline from Scratch with Your Own Data

Fri, 2024-08-23 20:08

This video shows how to install Haystack with Ollama locally to build a free, end-to-end RAG pipeline over your own documents.



Code:


conda create -n hay python=3.11 -y && conda activate hay



pip install torch

pip install haystack-ai==2.2.4

pip install haystack-experimental==0.1.0

pip install sentence-transformers==3.0.1

pip install transformers==4.42.3

pip install ollama-haystack





conda install jupyter -y

pip uninstall charset_normalizer -y

pip install charset_normalizer

jupyter notebook



import transformers

import torch



from haystack_integrations.components.generators.ollama import OllamaGenerator



generator = OllamaGenerator(model="llama3.1",

                            url = "http://localhost:11434/api/generate",

                            generation_kwargs={

                              "num_predict": 100,

                              "temperature": 0.9,

                              })



print(generator.run("Who is the best American actor?"))



========



from haystack_integrations.components.generators.ollama import OllamaGenerator



from haystack import Pipeline, Document

from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

from haystack.components.builders.prompt_builder import PromptBuilder

from haystack.document_stores.in_memory import InMemoryDocumentStore



template = """

Given the following information, answer the question.



Context:

{% for document in documents %}

    {{ document.content }}

{% endfor %}



Question: {{ query }}?

"""



docstore = InMemoryDocumentStore()

docstore.write_documents([Document(content="I really like summer"),

                          Document(content="My favorite sport is soccer"),

                          Document(content="I don't like reading sci-fi books"),

                          Document(content="I don't like crowded places"),])



generator = OllamaGenerator(model="llama3.1",

                            url = "http://localhost:11434/api/generate",

                            generation_kwargs={

                              "num_predict": 100,

                              "temperature": 0.9,

                              })



pipe = Pipeline()

pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))

pipe.add_component("prompt_builder", PromptBuilder(template=template))

pipe.add_component("llm", generator)

pipe.connect("retriever", "prompt_builder.documents")

pipe.connect("prompt_builder", "llm")



query = "What is my favorite sport?"  # example question about the documents above

result = pipe.run({"prompt_builder": {"query": query}, "retriever": {"query": query}})



print(result)
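pipe.run returns a dict keyed by component name; with Haystack 2.x generators the answer text sits in the llm component's "replies" list, so you can print just the reply:

print(result["llm"]["replies"][0])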


Categories: DBA Blogs

Roop - One-Click Face Swap in Video with AI - Step by Step Tutorial

Fri, 2024-08-23 01:03

This video shows how to locally install Roop, which lets you take a video and replace the face in it with a face of your choice. You only need one image of the desired face: no dataset, no training. It is an alternative to the Deep Live Cam tool for AI face swap.



Code:


conda create -n roop python=3.11 -y && conda activate roop

git clone https://github.com/s0md3v/roop.git && cd roop

pip install -r requirements.txt

python3 run.py --execution-provider cuda
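run.py also accepts the source image, target video, and output path as flags for non-interactive use (flag names as listed in the roop README; run python3 run.py --help to confirm on your version):

python3 run.py -s face.jpg -t input_video.mp4 -o swapped_video.mp4 --execution-provider cuda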
Categories: DBA Blogs

Install MinerU Locally to Create LLM Dataset from PDF Files

Thu, 2024-08-22 16:41

This video shows how to install MinerU, an LLM-powered tool that converts PDFs into machine-readable formats (e.g., Markdown, JSON), allowing easy extraction into any format to create datasets.


Code:

git clone https://github.com/opendatalab/MinerU.git && cd MinerU

conda create -n MinerU python=3.10 && conda activate MinerU

pip install magic-pdf[full]==0.7.0b1 --extra-index-url https://wheels.myhloli.com

magic-pdf --version

git lfs install

mkdir model
cd model
git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit

Edit magic-pdf.json to point models-dir at the downloaded models and set the device to CUDA (a sketch follows below).
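For reference, the relevant entries in magic-pdf.json end up looking roughly like this; the models path and the exact key names (notably "device-mode") should be checked against the sample config shipped with your MinerU release:

{
    "models-dir": "/path/to/MinerU/model/PDF-Extract-Kit/models",
    "device-mode": "cuda"
}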

wget https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf

magic-pdf -p small_ocr.pdf
Categories: DBA Blogs

Adobe Magic Fixup - Edit Images with Simple Cut and Paste - Install Locally

Thu, 2024-08-22 01:27

This video shows how to install Magic Fixup locally. It lets users edit images with a simple cut-and-paste approach and then fixes up those edits automatically.


Code:

git clone https://github.com/adobe-research/MagicFixup.git && cd MagicFixup

conda env create -f environment.yaml -v

conda activate MagicFixup

# Note: this is a Google Drive "view" link, so wget fetches an HTML page rather than the checkpoint itself;
# download the file manually in a browser (or with a tool such as gdown) and save it as magic_fu_open_source_full_model.pt
wget https://drive.google.com/file/d/1zOcDcJzCijbGr9I9adC0Cv6yzW60U9TQ/view?usp=share_link

python3 magicfu_gradio.py  --checkpoint magic_fu_open_source_full_model.pt
Categories: DBA Blogs

Install Phi 3.5 Vision Locally for OCR and Image Chat

Tue, 2024-08-20 20:43

This video shows how to locally install Phi-3.5-vision, a lightweight, state-of-the-art open multimodal model trained with a focus on very high-quality, reasoning-dense data across both text and vision.


Code:

pip install torch
pip install --upgrade transformers
pip install accelerate huggingface_hub
pip install numpy Pillow Requests torchvision

jupyter notebook

from IPython.display import Markdown, display
from PIL import Image
import requests
from transformers import AutoModelForCausalLM
from transformers import AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"

# Note: set _attn_implementation='eager' if you don't have flash_attn installed
model = AutoModelForCausalLM.from_pretrained(
  model_id,
  device_map="cuda",
  trust_remote_code=True,
  torch_dtype="auto",
  _attn_implementation='flash_attention_2'    
)

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Load the local image
image = Image.open("/home/Ubuntu/images/1.png")
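# The requests import above is only needed if you load the image from a URL instead of a local file, e.g.:
# image = Image.open(requests.get("https://example.com/sample.png", stream=True).raw)  # placeholder URL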

# Prepare the input
messages = [
    {"role": "user", "content": "<|image_1|> Describe this image.",}
]

prompt = processor.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = processor(prompt, [image], return_tensors="pt").to("cuda:0")

# Generate the response
generation_args = {
    "max_new_tokens": 1000,
    "temperature": 0.0,
    "do_sample": False,
}

generate_ids = model.generate(**inputs,
                              eos_token_id=processor.tokenizer.eos_token_id,
                              **generation_args)

# Remove input tokens
generate_ids = generate_ids[:, inputs['input_ids'].shape[1]:]
response = processor.batch_decode(generate_ids,
                                  skip_special_tokens=True,
                                  clean_up_tokenization_spaces=False)[0]

print(response)
Categories: DBA Blogs

CogVideoX-2B - Install Locally to Create Videos from Text

Mon, 2024-08-19 03:37

 This video shows how to locally install CogVideoX-2B which is an open-source video generation model.


Code:

conda create -n cog python=3.11 -y && conda activate cog

git clone https://github.com/THUDM/CogVideo.git && cd CogVideo

pip install -r requirements.txt
pip install --upgrade opencv-python transformers diffusers

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene."

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
)

pipe.enable_model_cpu_offload()

prompt_embeds, _ = pipe.encode_prompt(
    prompt=prompt,
    do_classifier_free_guidance=True,
    num_videos_per_prompt=1,
    max_sequence_length=226,
    device="cuda",
    dtype=torch.float16,
)

video = pipe(
    num_inference_steps=50,
    guidance_scale=6,
    prompt_embeds=prompt_embeds,
).frames[0]

export_to_video(video, "output.mp4", fps=8)
Categories: DBA Blogs

Free LLM Dataset Creation with Ollama Locally - Easy Tutorial

Sat, 2024-08-17 21:34

This video is a step-by-step tutorial to create your own custom dataset from your database schema locally with a free model from Ollama.



Code:

import json
import ollama

def make_llama_3_prompt(user, system="", assistant=""):
    system_prompt = ""
    if system:
        system_prompt = (
            f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        )
   
    user_prompt = f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
    assistant_prompt = f"<|start_header_id|>assistant<|end_header_id|>\n\n{assistant}<|eot_id|>" if assistant else "<|start_header_id|>assistant<|end_header_id|>\n\n"
   
    return f"<|begin_of_text|>{system_prompt}{user_prompt}{assistant_prompt}"

def get_movie_schema():
    return """\
    0|Title|TEXT eg. "Inception"
    1|Director|TEXT eg. "Christopher Nolan"
    2|Year|INT eg. "2010"
    3|Rating|TEXT eg. "PG-13"
    4|Runtime|TEXT eg. "148 min" castable to int
    5|Genre|TEXT eg. "Sci-Fi"
    6|Box_Office|TEXT eg. "$829,895,144" and when null has a value "N/A"
    """

def generate_question_and_query():
    system = "You are a data analyst with 10 years of experience writing complex SQL queries.\n"
    system += (
        "Consider a table called 'movies' with the following schema (columns)\n"
    )
    system += get_movie_schema()
    system += "Consider the following questions, and queries used to answer them:\n"

    question = """What is the highest-grossing movie of all time?"""
    sql = "SELECT Title, Box_Office FROM movies WHERE Box_Office != 'N/A' ORDER BY CAST(REPLACE(Box_Office, ',', '') AS INTEGER) DESC LIMIT 1;"

    system += "Question: " + question + "\n"
    system += "Query: " + sql + "\n"

    user = "Write a question and a query that are similar but different to those above.\n"
    user += "Format the question and query as a JSON object, i.e.\n"
    user += '{"question" : str, "sql_query": str }.\n'

    user += "Make sure to only return me valid sqlite SQL query generated as response to the question. Don't give me any comments. Just return question and query as JSON objects. Make sure query is relevant to the question. Make sure each query is complete and ends with a ;\n"

    prompt = make_llama_3_prompt(user, system)

    # Generate the result from the model
    result = ollama.generate(model='llama3.1', prompt=prompt)

    # Inspect and parse the result['response']
    response_str = result['response']
    try:
        response_dict = json.loads(response_str)
    except json.JSONDecodeError as e:
        print("Failed to parse response as JSON:", e)
        response_dict = {}

    return response_dict

def save_to_jsonl(data, file_path):
    with open(file_path, 'a') as f:
        for entry in data:
            f.write(json.dumps(entry) + '\n')

def main():
    output_file_path = 'questions_queries.jsonl'
    num_iterations = 10  # Define how many questions and queries you want to generate
    all_questions_queries = []

    for _ in range(num_iterations):
        question_and_query = generate_question_and_query()
        all_questions_queries.append(question_and_query)

    save_to_jsonl(all_questions_queries, output_file_path)
    print(f"Saved {num_iterations} questions and queries to {output_file_path}")

# Execute the main function
if __name__ == "__main__":
    main()
Categories: DBA Blogs

How to Install Flux AI Models Locally for Image Generation Easily

Thu, 2024-08-15 15:04

This video shows how to install the Flux.1-Dev and Flux.1-Schnell models locally in ComfyUI and generate Midjourney-like images.



Code:

conda create -n comfy python=3.11 -y && conda activate comfy

git clone https://github.com/comfyanonymous/ComfyUI.git && cd ComfyUI

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121

pip install -r requirements.txt

python3 main.py

http://localhost:8188

Copy clip_l.safetensors and t5xxl_fp16.safetensors from https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main into ComfyUI/models/clip/

Copy ae.safetensors from https://huggingface.co/black-forest-labs/FLUX.1-dev into ComfyUI/models/vae/

Copy flux1-dev.safetensors from https://huggingface.co/black-forest-labs/FLUX.1-dev into ComfyUI/models/unet/

Copy flux_realism_lora.safetensors from https://huggingface.co/comfyanonymous/flux_RealismLora_converted_comfyui/tree/main into ComfyUI/models/loras/ (a scripted download sketch follows below)
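If you prefer to script these downloads, something like the following should work, assuming huggingface-cli is installed, you run it from the ComfyUI directory, and you have accepted the FLUX.1-dev license on Hugging Face:

pip install -U "huggingface_hub[cli]"
huggingface-cli login
huggingface-cli download comfyanonymous/flux_text_encoders clip_l.safetensors t5xxl_fp16.safetensors --local-dir models/clip
huggingface-cli download black-forest-labs/FLUX.1-dev ae.safetensors --local-dir models/vae
huggingface-cli download black-forest-labs/FLUX.1-dev flux1-dev.safetensors --local-dir models/unet
huggingface-cli download comfyanonymous/flux_RealismLora_converted_comfyui flux_realism_lora.safetensors --local-dir models/loras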
 
Go To https://comfyanonymous.github.io/ComfyUI_examples/flux/  
Categories: DBA Blogs

Deep Live Cam Local Installation Easy Guide for Face Swap and Deepfake Video on Webcam

Fri, 2024-08-09 20:16

This is a step-by-step tutorial to install Deep Live Cam locally on Windows for real-time face swap and one-click video deepfakes (uncensored) from only a single image.


Code:


1- Install Choco

Open Powershell as Administrator and run following:

Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

2- Install following pre-reqs

choco install python --version=3.10.0
choco install git
choco install ffmpeg

3- Install Visual Studio C++ Build Tools from https://visualstudio.microsoft.com/visual-cpp-build-tools/

4- git clone https://github.com/hacksider/Deep-Live-Cam.git and cd Deep-Live-Cam

5- Download the 2 models from https://huggingface.co/hacksider/deep-live-cam/tree/main and put them in the Deep-Live-Cam\models folder

6- pip install -r requirements.txt

7- If on CPU, run python run.py

For GPU:

8- Install CUDA Toolkit 11.8 from https://developer.nvidia.com/cuda-11-8-0-download-archive

9- Install dependencies:

pip uninstall onnxruntime onnxruntime-gpu
pip install onnxruntime-gpu==1.16.3

10- python run.py --execution-provider cuda

Enjoy
Categories: DBA Blogs

Mem0 with Ollama Locally - Memory Layer for Personalized AI

Mon, 2024-08-05 03:57

This video is a step-by-step tutorial to install Mem0 locally and integrate it with a local Ollama model.


Code:

conda create -n mem python=3.11 -y && conda activate mem

pip install torch
pip install -U transformers sentencepiece accelerate
pip install sentence_transformers
pip install ollama
pip install mem0ai

import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = ""

config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.1:latest",
            "temperature": 0.1,
            "max_tokens": 2000,
        }
    }
}

m = Memory.from_config(config)
m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})



# Get all memories
all_memories = m.get_all()
print(all_memories)

# Get a single memory by ID
specific_memory = m.get("59565340-c742-4e09-8128-702e810cb4fd")
print(specific_memory)

related_memories = m.search(query="alice hobbies?", user_id="alice")
print(related_memories)

result = m.update(memory_id="59565340-c742-4e09-8128-702e810cb4fd", data="Visited Brisbane in Winter")
print(result)

m.delete(memory_id="59565340-c742-4e09-8128-702e810cb4fd") # Delete a memory

m.delete_all(user_id="alice") # Delete all memories

all_memories = m.get_all()
print(all_memories)
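Note that with only the llm block configured, mem0 falls back to its default embedder, which expects an OpenAI key (hence the OPENAI_API_KEY line above). If you want the pipeline fully local, mem0 also accepts an embedder block; the provider and model names below are assumptions to verify against the mem0 docs for your installed version:

config = {
    "llm": {
        "provider": "ollama",
        "config": {"model": "llama3.1:latest", "temperature": 0.1, "max_tokens": 2000},
    },
    # Assumed keys: check the "embedder" provider/model names in the mem0 documentation
    "embedder": {
        "provider": "ollama",
        "config": {"model": "nomic-embed-text"},
    },
}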

Categories: DBA Blogs

Workflows in LlamaIndex - Tutorial to Build Complex AI Applications with Events

Sat, 2024-08-03 02:47

This video shows how to install and use LlamaIndex Workflows, a mechanism for orchestrating actions in increasingly complex AI applications.


Code:

conda create -n workflow python=3.11 -y && conda activate workflow

pip install llama-index
pip install llama-index-llms-openai

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)

from llama_index.llms.openai import OpenAI

class WeatherEvent(Event):
    location: str
    forecast: str | None

class WeatherFlow(Workflow):
    llm = OpenAI()

    @step()
    async def get_location(self, ev: StartEvent) -> WeatherEvent:
        location = "Sydney"
        forecast = ""  # or some default value
        return WeatherEvent(location=location, forecast=forecast)

    @step()
    async def get_forecast(self, ev: WeatherEvent) -> WeatherEvent:
        location = ev.location
        prompt = f"Get the current weather forecast for {location}."
        response = await self.llm.acomplete(prompt)
        return WeatherEvent(location=location, forecast=str(response))

    @step()
    async def format_forecast(self, ev: WeatherEvent) -> StopEvent:
        location = ev.location
        forecast = ev.forecast
        formatted_forecast = f"Weather in {location}: {forecast}"
        return StopEvent(result=formatted_forecast)

w = WeatherFlow(timeout=60, verbose=False)
result = await w.run()
print(str(result))
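The await w.run() call as written only works where top-level await is available, such as a Jupyter notebook; in a plain Python script you would wrap it roughly like this:

import asyncio

async def main():
    w = WeatherFlow(timeout=60, verbose=False)
    result = await w.run()
    print(str(result))

if __name__ == "__main__":
    asyncio.run(main())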
Categories: DBA Blogs

Install Perplexica with SearXNG and Ollama and Llama 3.1 for Local AI Search Engine for Free

Wed, 2024-07-31 02:53

This video shows how to locally install Perplexica with SearXNG and the Ollama Llama 3.1 model and do AI-powered search.



Code:
conda create -n px python=3.11 -y && conda activate px

pip install torch transformers accelerate huggingface_hub sentencepiece

SearXNG:

git clone https://github.com/searxng/searxng && cd searxng

In the settings.yml file under the searx directory, change the following:

search:
  formats:
    - html
    - json

sudo chmod 666 /var/run/docker.sock
make docker.build  

docker run --rm -d -p 32768:8080 -v "${PWD}/searxng:/etc/searxng" -e "BASE_URL=http://localhost:32768/" -e "INSTANCE_NAME=my-instance" searxng/searxng

http://localhost:32768

Ollama:

curl -fsSL https://ollama.com/install.sh | sh

ollama pull llama3
ollama pull bgesmall
             
             
Perplexica:

git clone https://github.com/ItzCrazyKns/Perplexica.git && cd Perplexica

cp sample.config.toml config.toml
vi config.toml and change the following:

[API_ENDPOINTS]
SEARXNG = "http://localhost:32768"
OLLAMA = "http://localhost:11434"

sudo chmod 666 /var/run/docker.sock
docker compose up -d

http://localhost:3000
Categories: DBA Blogs

Get Llama 3.1 70B-Level AI Quality from 8B with Ollama Locally for Free

Tue, 2024-07-30 05:35

This video is a step-by-step tutorial to get Llama 3.1 70B-level quality from Llama 3.1 8B with Ollama locally. It's inspired by Matt Shumer's GPT Prompt Engineer.


Code:

import os
import re
import json
import sys

from ollama import Client
client = Client(host='http://localhost:11434')

# Define model names
small_model = "llama3.1"
big_model = "llama3.1:70b"

def generate_candidate_prompts(task, prompt_example, response_example):
    system_prompt = """Given an example training sample, create seven additional samples for the same task that are even better.
    Each example should contain:
    1. Ensure the new examples are diverse and unique from one another.
    2. They should all be perfect. If you make a mistake, this system won't work.

    Respond in this format:
    PUT_PROMPT_HERE
    PUT_RESPONSE_HERE

    PUT_PROMPT_HERE
    PUT_RESPONSE_HERE
    ...
    """
    user_content = f"""{task}
    {prompt_example}
    {response_example}
    """

    response = client.chat(
        model=big_model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content}
        ],
        options={
            "max_tokens": 4000,
            "temperature": 0.5
        }
    )
    response_text = response['message']['content']

    # Parse out the prompts and responses
    prompts_and_responses = []
    # Split examples by the delimiter
    examples = response_text.split('PUT_PROMPT_HERE')[1:]

    for example in examples:
        parts = example.split('PUT_RESPONSE_HERE')
        if len(parts) == 2:
            prompt, response = parts
            prompts_and_responses.append({'prompt': prompt.strip(), 'response': response.strip()})

    return prompts_and_responses

def generate_system_prompt(task, prompt_examples):
    system_prompt = """Given a user-description of their task and a set of prompt / response pairs (it'll be in JSON for easy reading)
                    for the types of outputs we want to generate given inputs, write a fantastic system prompt that describes
                    the task to be done perfectly.
                    1. Do this perfectly.
                    2. Respond only with the system prompt, and nothing else. No other text will be allowed.
                    Respond in this format:
                    WRITE_SYSTEM_PROMPT_HERE
                    """
    user_content = f"""{task}
    {json.dumps(prompt_examples, indent=2)}
    """

    response = client.chat(
        model=big_model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content}
        ],
        options={
            "max_tokens": 4000,
            "temperature": 0.5
        }
    )

    response_text = response['message']['content']

    # Directly use the response text since the prompt specifies it should be the only content
    generated_system_prompt = response_text.strip()

    return generated_system_prompt

def test_small_model(generated_examples, prompt_example, system_prompt):
    messages = [{"role": "system", "content": system_prompt}]

    for example in generated_examples:
        messages.append({"role": "user", "content": example['prompt']})
        messages.append({"role": "assistant", "content": example['response']})

    messages.append({"role": "user", "content": prompt_example.strip()})

    response = client.chat(
        model=small_model,
        messages=messages,
        options={
            "max_tokens": 2000,
            "temperature": 0.5
        }
    )

    response_text = response['message']['content']

    return response_text

def run_conversion_process(task, prompt_example, response_example):
    print('Generating the prompts / responses...')
    # Generate candidate prompts
    generated_examples = generate_candidate_prompts(task, prompt_example, response_example)

    print('Prompts / responses generated. Now generating system prompt...')

    # Generate the system prompt
    system_prompt = generate_system_prompt(task, generated_examples)

    print('System prompt generated:', system_prompt)

    print(f'\n\nTesting the new prompt on {small_model}, using your input example...')
    # Test the generated examples and system prompt with the small model
    small_model_response = test_small_model(generated_examples, prompt_example, system_prompt)

    print(f'{small_model} responded with:')
    print(small_model_response)

    print('\n\n!! CHECK THE FILE DIRECTORY, THE PROMPT IS NOW SAVED THERE !!')

    # Create a dictionary with all the relevant information
    result = {
        "task": task,
        "initial_prompt_example": prompt_example,
        "initial_response_example": response_example,
        "generated_examples": generated_examples,
        "system_prompt": system_prompt,
        "small_model_response": small_model_response
    }

    # Save the small model prompt to a Python file
    with open("small_model_prompt.py", "w") as file:
        file.write('system_prompt = """' + system_prompt + '"""\n\n')

        file.write('messages = [\n')
        for example in generated_examples:
            file.write('    {"role": "user", "content": """' + example['prompt'] + '"""},\n')
            file.write('    {"role": "assistant", "content": """' + example['response'] + '"""},\n')

        file.write('    {"role": "user", "content": """' + prompt_example.strip() + '"""}\n')
        file.write(']\n')

    return result

task = "refactoring code"

prompt_example = """def hello():
                    total = 0
                    total = total + 1
                    return total"""

response_example = """def hello():
                   total = 1
                   return total
                 """

result = run_conversion_process(task, prompt_example, response_example)
print(result)
Categories: DBA Blogs

Step-by-Step Guide to Create Free Dataset with Ollama and Llama 3.1 Locally

Sun, 2024-07-28 03:42

This video shows an easy step-by-step guide to generate an aligned preference dataset locally using Ollama and the Llama 3.1 70B model.



Code:

conda create -n dt python=3.11 -y && conda activate dt

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

ollama run llama3.1:70b

import json

json_file = "/home/Ubuntu/Data/ift_dataset.json"

with open(json_file, "r") as file:
    json_data = json.load(file)

print("Dataset Rows:", len(json_data))

print(json_data[0])

def format_input(entry):
    return (
        "### Instruction:\n" + entry['instruction'] +
        ("\n\n### Input:\n" + entry['input'] if entry["input"] else "")
    )
   
import urllib.request
import json
def query_model(prompt, model="llama3.1:70b", url="http://localhost:11434/api/chat"):
    data = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "options": {"seed": 123, "temperature": 0},
    }

    request = urllib.request.Request(
        url, data=json.dumps(data).encode("utf-8"), method="POST"
    )
    request.add_header("Content-Type", "application/json")

    with urllib.request.urlopen(request) as response:
        response_data = json.loads(response.read().decode("utf-8"))
        return response_data["message"]["content"]  
       

import random
def generate_model_responses(json_data):
    for entry in json_data:
        entry.pop("impolite", None)
        entry.pop("polite", None)
        entry.pop("humorous", None)

    for i, entry in enumerate(json_data):
        tone = random.choice(["very funny", "very serious"])
        if tone == "very funny":
            prompt = (
                f"Rewrite `{format_input(entry)}` output to be hilarious and ridiculous: {entry['output']}. "
                "Add a joke or a pun if possible. Exaggerate the response for comedic effect."
            )
        else:
            prompt = (
                f"Rewrite `{format_input(entry)}` output to be very serious and professional: {entry['output']}. "
                "Avoid any humor or sarcasm. Emphasize the importance or gravity of the situation."
            )
        response = query_model(prompt)
       
        if tone == "very funny":
            # The humorous rewrite is the dispreferred answer: keep the original output as "chosen"
            json_data[i]["chosen"] = entry["output"]
            json_data[i]["rejected"] = response
        else:
            # The serious rewrite is the preferred answer
            json_data[i]["instruction"] = entry["instruction"]
            json_data[i]["input"] = entry["input"]
            json_data[i]["output"] = entry["output"]
            json_data[i]["chosen"] = response
            json_data[i]["rejected"] = entry["output"]

generate_model_responses(json_data)

with open("preference_dataset.json", "w") as file:
    json.dump(json_data, file, indent=4)            
Categories: DBA Blogs

How To Create Multi-Modal RAG Pipeline on Images and Text Locally - Step by Step Guide

Sat, 2024-07-27 21:10

This video is a step-by-step tutorial to build a multi-modal RAG pipeline on your own custom data, including images and text, using LlamaIndex.


Code:

conda create -n multirag python=3.11 -y && conda activate multirag

pip install --upgrade git+https://github.com/huggingface/transformers.git
pip install torch torchvision pillow
pip install -q llama-index-vector-stores-qdrant
pip install llama_index ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
pip install matplotlib scikit-image
pip install -U qdrant_client
pip install llama-index-embeddings-clip

export OPENAI_API_KEY=""

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

data_path="/home/Ubuntu/multidata/"
import os
image_metadata_dict = {}

for file in os.listdir(data_path):
    if file.endswith(".txt"):
        filename = file
        img_path = data_path + file.replace(".txt", ".jpg")
        if os.path.exists(img_path):
            image_metadata_dict[len(image_metadata_dict)] = {
                "filename": filename,
                "img_path": img_path
            }
        else:
            img_path = data_path + file.replace(".txt", ".png")
            if os.path.exists(img_path):
                image_metadata_dict[len(image_metadata_dict)] = {
                    "filename": filename,
                    "img_path": img_path
                }

print(image_metadata_dict)

 
import qdrant_client
from llama_index.core import SimpleDirectoryReader
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.core.indices import MultiModalVectorStoreIndex

# Create a local Qdrant vector store
client = qdrant_client.QdrantClient(path="qdrant_d_0")

text_store = QdrantVectorStore(
    client=client, collection_name="text_collection_0"
)
image_store = QdrantVectorStore(
    client=client, collection_name="image_collection_0"
)
storage_context = StorageContext.from_defaults(
    vector_store=text_store, image_store=image_store
)

# Create the MultiModal index
documents = SimpleDirectoryReader(data_path).load_data()
index = MultiModalVectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)


from PIL import Image
import matplotlib.pyplot as plt
import os


def plot_images(image_metadata_dict):
    original_images_urls = []
    images_shown = 0
    for image_id in image_metadata_dict:
        img_path = image_metadata_dict[image_id]["img_path"]
        if os.path.isfile(img_path):
            filename = image_metadata_dict[image_id]["filename"]
            image = Image.open(img_path).convert("RGB")

            plt.subplot(8, 8, len(original_images_urls) + 1)
            plt.imshow(image)
            plt.xticks([])
            plt.yticks([])

            original_images_urls.append(filename)
            images_shown += 1
            if images_shown >= 64:
                break

    plt.tight_layout()


plot_images(image_metadata_dict)

# Helper to display the retrieved images in a small grid
def plot_images(image_paths):
    images_shown = 0
    plt.figure(figsize=(16, 9))
    for img_path in image_paths:
        if os.path.isfile(img_path):
            image = Image.open(img_path)

            plt.subplot(2, 3, images_shown + 1)
            plt.imshow(image)
            plt.xticks([])
            plt.yticks([])

            images_shown += 1
            if images_shown >= 9:
                break

from llama_index.core.response.notebook_utils import display_source_node
from llama_index.core.schema import ImageNode
   
test_query = "Who is Fahd Mirza?"
# generate  retrieval results
retriever = index.as_retriever(similarity_top_k=1, image_similarity_top_k=1)
retrieval_results = retriever.retrieve(test_query)

retrieved_image = []
for res_node in retrieval_results:
    if isinstance(res_node.node, ImageNode):
        retrieved_image.append(res_node.node.metadata["file_path"])
    else:
        display_source_node(res_node, source_length=200)

plot_images(retrieved_image)

test_query = "What is outback?"
# generate  retrieval results
retriever = index.as_retriever(similarity_top_k=1, image_similarity_top_k=1)
retrieval_results = retriever.retrieve(test_query)

retrieved_image = []
for res_node in retrieval_results:
    if isinstance(res_node.node, ImageNode):
        retrieved_image.append(res_node.node.metadata["file_path"])
    else:
        display_source_node(res_node, source_length=200)

plot_images(retrieved_image)

test_query = "Where is Meenakshi Temple?"
# generate  retrieval results
retriever = index.as_retriever(similarity_top_k=1, image_similarity_top_k=1)
retrieval_results = retriever.retrieve(test_query)

retrieved_image = []
for res_node in retrieval_results:
    if isinstance(res_node.node, ImageNode):
        retrieved_image.append(res_node.node.metadata["file_path"])
    else:
        display_source_node(res_node, source_length=200)

plot_images(retrieved_image)
Categories: DBA Blogs

Install Stable Video 4D (SV4D) Locally for Video to Video Generation

Thu, 2024-07-25 22:18

 Stable Video 4D (SV4D) is a generative model based on Stable Video Diffusion (SVD) and Stable Video 3D (SV3D), which takes in a single-view video of an object and generates multiple novel-view videos (4D image matrix) of that object.


Code:

conda create -n sdv4 python=3.11 -y && conda activate sdv4

conda install -c conda-forge nccl
conda install pytorch torchvision torchaudio -c pytorch

git clone https://github.com/Stability-AI/generative-models.git && cd generative-models

pip install accelerate huggingface_hub

pip install black==23.7.0
pip install chardet==5.1.0
pip install clip@git+https://github.com/openai/CLIP.git
pip install "einops>=0.6.1"
pip install "fairscale>=0.4.13"
pip install "fire>=0.5.0"
pip install "fsspec>=2023.6.0"
pip install "invisible-watermark>=0.2.0"
pip install kornia==0.6.9
pip install "matplotlib>=3.7.2"
pip install "natsort>=8.4.0"
pip install "ninja>=1.11.1"
pip install "numpy>=1.24.4"
pip install "omegaconf>=2.3.0"
pip install "open-clip-torch>=2.20.0"
pip install opencv-python==4.6.0.66
pip install "pandas>=2.0.3"
pip install "pillow>=9.5.0"
pip install "pudb>=2022.1.3"
pip install pytorch-lightning==2.0.1
pip install "pyyaml>=6.0.1"
pip install rembg
pip install "scipy>=1.10.1"
pip install "streamlit>=0.73.1"
pip install tensorboardx==2.6
pip install "timm>=0.9.2"
pip install tokenizers
pip install torchdata==0.6.1
pip install "torchmetrics>=1.0.1"
pip install "tqdm>=4.65.0"
pip install transformers==4.19.1
pip install triton==2.0.0
pip install "urllib3<1.27,>=1.25.4"
pip install "wandb>=0.15.6"
pip install "webdataset>=0.2.33"
pip install "wheel>=0.41.0"
pip install "xformers>=0.0.20"
pip install gradio
pip install streamlit-keyup==0.2.0
pip install imageio==2.19.3
pip install imageio-ffmpeg==0.4.7

huggingface-cli login   # get the key from huggingface.co
export HF_TOKEN=""

wget --header="Authorization: Bearer $HF_TOKEN" -O checkpoints/sv4d.safetensors https://huggingface.co/stabilityai/sv4d/resolve/main/sv4d.safetensors
wget --header="Authorization: Bearer $HF_TOKEN" -O checkpoints/sv3d_u.safetensors https://huggingface.co/stabilityai/sv3d/resolve/main/sv3d_u.safetensors
ls -lh checkpoints/*.safetensors

python3 scripts/sampling/simple_video_sample_4d.py --input_path assets/test_video1.mp4 --output_folder outputs/sv4d
Categories: DBA Blogs

Free Fine-Tuning on Custom Datasets with GPT4o Mini

Thu, 2024-07-25 02:38

This video is an easy step-by-step tutorial to fine-tune GPT-4o mini on your own custom dataset.


Code:


{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "What's the capital of Australia?"}, {"role": "assistant", "content": "Fair dinkum! Canberra's the capital, mate. Good on ya for askin'!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "Who wrote 'My Brilliant Career'?"}, {"role": "assistant", "content": "Good on ya for askin'! Miles Franklin wrote the Aussie classic 'My Brilliant Career'. Ripper of a book, mate!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "How long's the Great Barrier Reef?"}, {"role": "assistant", "content": "Mate, the Great Barrier Reef's a whoppin' 2,300 kilometres long! That's as long as a road trip from Sydney to Perth, fair dinkum!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "What's the highest mountain in Australia?"}, {"role": "assistant", "content": "No worries, mate! Mount Kosciuszko's the highest peak, and it's a ripper of a hike. Get out there and enjoy the bush!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "Who's the most famous Aussie cricketer?"}, {"role": "assistant", "content": "Good question, mate! Sir Donald Bradman's the legend of Aussie cricket. He's the real deal, the best of the best!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "What's the largest desert in Australia?"}, {"role": "assistant", "content": "Fair dinkum! The Great Victoria Desert's the biggest, mate. Don't get lost in the outback!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "Who's the lead singer of AC/DC?"}, {"role": "assistant", "content": "Ripper of a question! Bon Scott and later Brian Johnson were the lead singers, mate. Rock on!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "What's the famous Australian beach?"}, {"role": "assistant", "content": "Good one, mate! Bondi Beach is the iconic Aussie beach, but there are heaps of other ripper beaches too!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "Who's the famous Aussie bushranger?"}, {"role": "assistant", "content": "Ned Kelly's the legendary bushranger, mate! His story's as Aussie as a meat pie."}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "What's the Australian national animal?"}, {"role": "assistant", "content": "Fair dinkum! The kangaroo's our national icon, mate. Don't mess with a roo!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "Who wrote 'Cloudstreet'?"}, {"role": "assistant", "content": "Good on ya for askin'! Tim Winton wrote the Aussie classic 'Cloudstreet', mate. Top book!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "What's the famous Australian river?"}, {"role": "assistant", "content": "The Murray River's the longest and most famous, mate. Take a trip and enjoy the scenery!"}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "Who's the famous Aussie surfer?"}, {"role": "assistant", "content": "Layne Beachley's the surfing legend, mate! She's a true blue Aussie champion."}]}
{"messages": [{"role": "system", "content": "Bazza is a factual chatbot that's as Aussie as a meat pie."}, {"role": "user", "content": "What's the Australian national day?"}, {"role": "assistant", "content": "January 26th is Australia Day, mate! Let's celebrate with a barbie and some snags!"}]}



#pip install -U openai
#export OPENAI_API_KEY=""

from openai import OpenAI
import os

## Set the API key (read from the OPENAI_API_KEY environment variable) and model name
MODEL = "gpt-4o-mini"
client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

vFile=client.files.create(
  file=open("dataset.jsonl", "rb"),
  purpose="fine-tune"
)

vJob=client.fine_tuning.jobs.create(
  training_file=vFile.id,
  model="gpt-4o-mini-2024-07-18"
)

print(client.fine_tuning.jobs.list(limit=10))
print(client.fine_tuning.jobs.retrieve(vJob.id))
print(client.fine_tuning.jobs.list_events(fine_tuning_job_id=vJob.id, limit=10))
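The fine_tuned_model field on the job stays empty until training finishes, so it helps to poll the job before the chat call below; a minimal sketch (the 30-second sleep is an arbitrary choice):

import time

while True:
    vJob = client.fine_tuning.jobs.retrieve(vJob.id)
    if vJob.status == "succeeded":
        break
    if vJob.status in ("failed", "cancelled"):
        raise RuntimeError(f"Fine-tuning job ended with status: {vJob.status}")
    time.sleep(30)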

completion = client.chat.completions.create(
  model=vJob.fine_tuned_model,
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)
print(completion.choices[0].message)
Categories: DBA Blogs

How-To Use Mistral Large 2 Model in Amazon Bedrock - Step by Step Tutorial

Wed, 2024-07-24 17:08

This video is a step-by-step tutorial to use the Mistral Large 2 model via the console and API in the AWS Bedrock service.


Code:

import boto3
import json

bedrock = boto3.client(service_name="bedrock-runtime",region_name='ap-southeast-2')

prompt = "<s>[INST] What is Happiness? [/INST]"

body = json.dumps({
    "prompt": prompt,
    "max_tokens": 512,
    "top_p": 0.8,
    "temperature": 0.5,
})

modelId = "mistral.mistral-large-2402-v1:0"  # Mistral Large; for Mistral Large 2, use "mistral.mistral-large-2407-v1:0" where available in your region

accept = "application/json"
contentType = "application/json"

response = bedrock.invoke_model(
    body=body,
    modelId=modelId,
    accept=accept,
    contentType=contentType
)

print(json.loads(response.get('body').read()))
Categories: DBA Blogs

Run Llama 3.1 with Ollama and Google Colab for Free Using AdalFlow

Wed, 2024-07-24 15:52

This video is a hands-on tutorial on how to run the Llama 3.1 8B model with Ollama on free Google Colab using AdalFlow.


Code:

!sudo apt-get install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh # download ollama api
from IPython.display import clear_output

# Create a Python script to start the Ollama API server in a separate thread

import os
import threading
import subprocess
import requests
import json

def ollama():
    os.environ['OLLAMA_HOST'] = '0.0.0.0:11434'
    os.environ['OLLAMA_ORIGINS'] = '*'
    subprocess.Popen(["ollama", "serve"])

ollama_thread = threading.Thread(target=ollama)
ollama_thread.start()

from IPython.display import clear_output
!ollama pull llama3.1:8b  
clear_output()

!pip install -U lightrag[ollama]

from lightrag.core.generator import Generator
from lightrag.core.component import Component
from lightrag.core.model_client import ModelClient
from lightrag.components.model_client import OllamaClient, GroqAPIClient

import time


qa_template = r"""<SYS>
You are a helpful assistant.
</SYS>
User: {{input_str}}
You:"""

class SimpleQA(Component):
    def __init__(self, model_client: ModelClient, model_kwargs: dict):
        super().__init__()
        self.generator = Generator(
            model_client=model_client,
            model_kwargs=model_kwargs,
            template=qa_template,
        )

    def call(self, input: dict) -> str:
        return self.generator.call({"input_str": str(input)})

    async def acall(self, input: dict) -> str:
        return await self.generator.acall({"input_str": str(input)})
       
from lightrag.components.model_client import OllamaClient
from IPython.display import Markdown, display
model = {
    "model_client": OllamaClient(),
    "model_kwargs": {"model": "llama3.1:8b"}
}
qa = SimpleQA(**model)
output=qa("what is happiness")
display(Markdown(f"**Answer:** {output.data}"))
Categories: DBA Blogs

Install Meta Llama 3.1 Locally and Test - Hands-on Tutorial

Tue, 2024-07-23 16:36

This video shows how to locally install the Meta Llama 3.1 8B model and test it on various benchmarks.


Code:

conda create -n newllama python=3.11 -y && conda activate newllama

pip install torch
pip install --upgrade transformers
pip install accelerate huggingface_hub

huggingface-cli login   # get the key from huggingface.co

conda install jupyter -y
pip uninstall charset_normalizer -y
pip install charset_normalizer
jupyter notebook

import transformers
import torch

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"   # make sure to accept the terms on model card.

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are helpful assistant!"},
    {"role": "user", "content": "What is the smallest country in the world?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
)
from IPython.display import Markdown, display
output_text = outputs[0]["generated_text"][-1]['content']
display(Markdown(output_text))
Categories: DBA Blogs
