Model Integration¶
Tensoract Studio supports integration with custom models, which can be used for pre-labeling and auto-labeling of tasks in a project, document extraction in datasets, and as prompts. The following steps describe how to add a custom model to the workbench.
Navigate to Models on the left navigation bar.
Click Add Model to bring up the model setup screen.
Currently supported model types are listed below:
NER Labeling - used to pre/auto label tasks in a NER project.
OCR Labeling - used to pre/auto label tasks in an OCR project.
Bulk Image Classification-Labeling - used to pre/auto label tasks in a Bulk Image Classification project.
OCR - used for document extraction in Scanned (OCR) datasets.
PII Redaction - hides PII in dataset documents or replaces it with ** characters.
Classification (prompt) - classifies the documents in a dataset.
Summarization (prompt) - summarizes the documents in a dataset.
Q&A (prompt) - takes a question-based input and generates relevant answers or information.
Text Embeddings - converts words or phrases into numerical vectors in a way that captures semantic meaning.
Image Embeddings - transforms images into numerical vectors that encode visual characteristics.
Video Embeddings - generates compact numerical representations of videos.
Audio Embeddings - creates condensed numerical representations of audio signals.
1. Labeling models and OCR model¶
The subsequent steps outline the process of configuring the labeling and OCR models.
Select the model type: NER Labeling, OCR Labeling, Bulk Image Classification-Labeling, or OCR Model.
Provide a name for the custom model.
Specify the REST API endpoint URL for the custom model. Note that Tensoract Studio must have network connectivity to this endpoint URL for the custom model integration to work.
Specify any optional model parameters. For example, you can include model authentication parameters as key-value pairs in the Header.
You can test the model integration with the default payload, which can be customized. Click Test and validate the model output to ensure the integration works as expected.
For model request payloads and responses, refer to Model Integration: Request Payloads and Responses.
Configure one or more Labels and Relationships for the model.
Click Save Model. The model is now ready for use in projects for pre/auto labeling of tasks.
Refer to the following video for an overview of custom model integration steps:
2. Classification, Summarization and Q&A Models¶
Three platforms, OpenAI, Amazon Bedrock, and Meta, offer models for setting up Classification, Summarization, and Q&A models.
1. Model Setup with OpenAI platform¶
- Model Info
Model Type - Select the model type: Classification, Summarization, or Q&A.
Model Name - Provide a name for the model.
- Model Parameters
LLM Platform - Select OpenAI as the LLM platform.
LLM Models - This menu shows the models available on the selected platform. Select one of the models:
GPT-4
GPT-3.5-Turbo
Davinci
Curie
openai_api_key - Set the OpenAI API key.
Default Temperature (0-1) - Enter a temperature value. The temperature setting of an LLM controls the randomness of its output: a higher temperature makes the output more creative but less focused, while a lower temperature makes it more predictable and controlled.
Max. number of concurrent invocations - Sets an upper bound on the number of simultaneous model invocations allowed.
- Test Model
Click Test Model and validate the model output to ensure the integration works as expected.
Click Save Model. The model is now ready for use in datasets.
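To illustrate how the Default Temperature field maps onto a request, the sketch below builds the JSON body of a chat-completion call. The helper name and prompt are illustrative assumptions; only the payload shape (model, messages, temperature) follows the OpenAI REST API.

```python
def build_completion_payload(model: str, prompt: str, temperature: float) -> dict:
    """Build the JSON body for a chat-completion call.

    `temperature` mirrors the Default Temperature (0-1) field: values near 0
    give more predictable output, values near 1 give more varied output.
    """
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
```

For a Summarization model, for example, a low temperature such as 0.2 keeps summaries close to the source text.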
2. Model Setup with Amazon Bedrock platform¶
- Model Info
Model Type - Select the model type: Classification, Summarization, or Q&A.
Model Name - Provide a name for the model.
- Model Parameters
LLM Platform - Select Amazon Bedrock as the LLM platform.
LLM Models - This menu shows the models available on the selected platform. Select one of the models:
Amazon Titan TG1 Large
Anthropic Claude V2
Jurassic-2-Ultra
Default Temperature (0-1) - Enter a temperature value. The temperature setting of an LLM controls the randomness of its output: a higher temperature makes the output more creative but less focused, while a lower temperature makes it more predictable and controlled.
Max. number of concurrent invocations - Sets an upper bound on the number of simultaneous model invocations allowed.
- Test Model
Click Test Model and validate the model output to ensure the integration works as expected.
Click Save Model. The model is now ready for use in datasets.
3. Model Setup with Meta platform¶
- Model Info
Model Type - Select the model type: Classification, Summarization, or Q&A.
Model Name - Provide a name for the model.
- Model Parameters
LLM Platform - Select Meta as the LLM platform.
LLM Models - This menu shows the models available on the selected platform. Select one of the models.
Default Temperature (0-1) - Enter a temperature value. The temperature setting of an LLM controls the randomness of its output: a higher temperature makes the output more creative but less focused, while a lower temperature makes it more predictable and controlled.
Max. number of concurrent invocations - Sets an upper bound on the number of simultaneous model invocations allowed.
- Test Model
Click Test Model and validate the model output to ensure the integration works as expected.
Click Save Model. The model is now ready for use in datasets.
3. PII Redaction¶
The PII redaction model automatically detects and masks personally identifiable information (PII) from text, ensuring data privacy and security.
- Model Info
Model Type - Select the model type as PII Redaction.
Model Name - Provide a name for the model.
- Model Parameters
Max. number of concurrent invocations - Sets an upper bound on the number of simultaneous model invocations allowed.
- Test Model
Click Test Model and validate the model output to ensure the integration works as expected.
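As a rough illustration of the redaction behavior described above, the sketch below masks a few common PII patterns with ** characters. The regex patterns are simplified stand-ins of my own; a production PII model relies on trained entity recognition rather than fixed patterns.

```python
import re

# Hypothetical patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str, mask: str = "**") -> str:
    """Replace each detected PII span with the mask characters."""
    for pattern in PII_PATTERNS.values():
        text = pattern.sub(mask, text)
    return text
```

For example, `redact("Mail jane@example.com or call 555-123-4567.")` masks both the email address and the phone number.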
4. Embedding Models¶
- Model Info
Model Type - Select the model type: Text Embeddings, Image Embeddings, Video Embeddings, or Audio Embeddings.
Model Name - Provide a name for the model.
- Model Parameters
Embedding Model - The drop-down menu displays a list of embedding models, as outlined below.
mpnet
e5_base_v2
e5_large_v2
Max. number of concurrent invocations - Sets an upper bound on the number of simultaneous model invocations allowed.
Text Chunk Size - The size of each text chunk used when generating embeddings.
Text Overlap Size (between chunks) - The amount of text shared between consecutive chunks.
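To clarify how Text Chunk Size and Text Overlap Size interact, the sketch below splits text into overlapping chunks. It is character-based for illustration; the platform may chunk by tokens instead (an assumption on my part).

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into chunks of chunk_size characters, where each chunk
    repeats the last `overlap` characters of the previous one.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - overlap
    chunks = []
    i = 0
    while i < len(text):
        chunks.append(text[i:i + chunk_size])
        if i + chunk_size >= len(text):
            break  # last chunk reached the end of the text
        i += step
    return chunks
```

With a chunk size of 4 and an overlap of 2, `"abcdefghij"` becomes `["abcd", "cdef", "efgh", "ghij"]`; the overlap keeps context that spans a chunk boundary visible to both embeddings.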