sentence-transformers

Introduction

This note shows how to run sentence-transformers in containers: build a base image with the dependency pre-installed, save the pre-trained models once, and then compute text-image similarity with a multilingual text encoder that shares CLIP's vector space.

prepare base image

  • the sentence-transformers dependency is large, so we pre-install it in a base image instead of reinstalling it for every container run; a quick import check follows the build command below
  • prepare base.dockerfile
    • FROM docker.io/library/python:3.12.1-bullseye
      
      # the Aliyun mirror speeds up PyPI downloads in mainland China; drop -i to use the default index
      RUN pip install -i https://mirrors.aliyun.com/pypi/simple/ sentence-transformers==2.6.1
      
      
  • build the base image
    • podman build -t sentence-transformers-base:latest -f base.dockerfile .
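
  • verify the install (optional): run a quick import check inside the freshly built image; a minimal sketch that only assumes the version pin from base.dockerfile above
    • import sentence_transformers
      
      # should print the version pinned in base.dockerfile
      print(sentence_transformers.__version__)  # expect 2.6.1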
      

prepare pre-trained model

  • the models are large, so we download and save them once in advance instead of re-downloading them on every run; a quick load check is sketched after the commands below
  • prepare save-model.py
    • import os
      from sentence_transformers import SentenceTransformer
      
      # model name on the Hugging Face hub, and the local directory to save it to
      model_name = os.getenv("MODEL_NAME", default="clip-ViT-B-32")
      model_path = os.getenv("MODEL_PATH", default=f"/app/models/{model_name}")
      
      # downloads the model on first use, then writes a self-contained copy to model_path
      model = SentenceTransformer(model_name)
      model.save(model_path)
      
      
  • run with container
    • mkdir -p models
      # save the CLIP image model locally
      podman run --rm \
          -v $(pwd)/save-model.py:/app/save-model.py \
          -v $(pwd)/models:/app/models \
          -e MODEL_NAME=clip-ViT-B-32 \
          -e MODEL_PATH=/app/models/clip-ViT-B-32 \
          -it sentence-transformers-base:latest \
              python3 /app/save-model.py
      # save the multilingual text model that shares the CLIP vector space
      podman run --rm \
          -v $(pwd)/save-model.py:/app/save-model.py \
          -v $(pwd)/models:/app/models \
          -e MODEL_NAME=sentence-transformers/clip-ViT-B-32-multilingual-v1 \
          -e MODEL_PATH=/app/models/sentence-transformers/clip-ViT-B-32-multilingual-v1 \
          -it sentence-transformers-base:latest \
              python3 /app/save-model.py
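
  • verify a saved model (optional): load it from the local directory and encode a test sentence; a minimal sketch, run inside the container the same way as save-model.py above
    • from sentence_transformers import SentenceTransformer
      
      # load from the directory populated by save-model.py; no network access needed
      model = SentenceTransformer("/app/models/clip-ViT-B-32")
      embedding = model.encode("a quick sanity check")
      print(embedding.shape)  # (512,) for clip-ViT-B-32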
      

similarity between text and images

  1. prepare similarity.py
    • from sentence_transformers import SentenceTransformer, util
      from PIL import Image
      import os
      import requests
      import torch
      
      
      def load_image(url_or_path):
          # fetch remote images over HTTP(S); open local files directly
          if url_or_path.startswith("http://") or url_or_path.startswith("https://"):
              return Image.open(requests.get(url_or_path, stream=True, timeout=30).raw)
          else:
              return Image.open(url_or_path)
      
      
      # original clip-ViT-B-32 for encoding images
      image_model_path = os.getenv("IMAGE_MODEL_PATH", default="/app/models/clip-ViT-B-32")
      # text embedding model is aligned to the img_model and maps 50+ languages to the same vector space
      text_model_path = os.getenv(
          "TEXT_MODEL_PATH",
          default="/app/models/sentence-transformers/clip-ViT-B-32-multilingual-v1",
      )
      
      img_model = SentenceTransformer(image_model_path)
      text_model = SentenceTransformer(text_model_path)
      img_paths = [
          # Dog image
          "https://unsplash.com/photos/QtxgNsmJQSs/download?ixid=MnwxMjA3fDB8MXxhbGx8fHx8fHx8fHwxNjM1ODQ0MjY3&w=640",
          # Cat image
          "https://unsplash.com/photos/9UUoGaaHtNE/download?ixid=MnwxMjA3fDB8MXxzZWFyY2h8Mnx8Y2F0fHwwfHx8fDE2MzU4NDI1ODQ&w=640",
          # Beach image
          "https://unsplash.com/photos/Siuwr3uCir0/download?ixid=MnwxMjA3fDB8MXxzZWFyY2h8NHx8YmVhY2h8fDB8fHx8MTYzNTg0MjYzMg&w=640",
      ]
      images = [load_image(img) for img in img_paths]
      # Map images to the vector space
      img_embeddings = img_model.encode(images)
      # Now we encode our text:
      texts = [
          "A dog in the snow",
          "Eine Katze",  # German: A cat
          "Una playa con palmeras.",  # Spanish: a beach with palm trees
      ]
      text_embeddings = text_model.encode(texts)
      # Compute cosine similarities:
      cos_sim = util.cos_sim(text_embeddings, img_embeddings)
      # for each text, report the best-matching image
      for text, scores in zip(texts, cos_sim):
          max_img_idx = torch.argmax(scores)
          print("Text:", text)
          print("Score:", scores[max_img_idx].item())
          print("Path:", img_paths[max_img_idx], "\n")
      
      
  2. run with container
    • podman run --rm \
          -v $(pwd)/similarity.py:/app/similarity.py \
          -v $(pwd)/models:/app/models \
          -e IMAGE_MODEL_PATH=/app/models/clip-ViT-B-32 \
          -e TEXT_MODEL_PATH=/app/models/sentence-transformers/clip-ViT-B-32-multilingual-v1 \
          -it sentence-transformers-base:latest \
              python3 /app/similarity.py
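
  3. optional: for each text, rank all images with util.semantic_search; a minimal sketch to append to similarity.py, reusing texts, img_paths, text_embeddings, and img_embeddings from above
    • # hits[i] is a ranked list of {"corpus_id", "score"} dicts for texts[i]
      hits = util.semantic_search(text_embeddings, img_embeddings, top_k=2)
      for text, ranked in zip(texts, hits):
          print("Text:", text)
          for hit in ranked:
              print("  score=%.4f" % hit["score"], img_paths[hit["corpus_id"]])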
      

reference

  • https://github.com/UKPLab/sentence-transformers
  • https://huggingface.co/sentence-transformers/clip-ViT-B-32-multilingual-v1