Post Pipeline
Multiple Document Types Extraction
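This page explains how to use the POST /pipelines endpoint to extract data from multiple document types in a single request.
The payload below extracts data from two PDFs and one web page at the same time. Each entry in "workloads" names its own list of schemas, so different documents can be extracted against different schemas.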
{
  "workloads": [
    {
      "raw_data": "base64_encoded_pdf_content",
      "schemas": ["schema1", "schema2"]
    },
    {
      "raw_data": "base64_encoded_pdf_2_content",
      "schemas": ["schema3"]
    },
    {
      "data_source": "web",
      "documents_location": "https://www.example.com/article1",
      "schemas": ["schema4"]
    }
  ],
  "provider_type": "openai",
  "provider_model_name": "gpt-4o",
  "api_key": "sk-..."
}
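As a rough sketch, a payload of this shape could be assembled in Python as shown here. The helper names (`pdf_workload`, `web_workload`, `build_payload`) are hypothetical and not part of the API. The sketch also assumes that "raw_data" expects the standard base64 encoding of the PDF bytes, which the placeholder values above suggest but do not confirm.

```python
import base64
import json


def pdf_workload(pdf_bytes: bytes, schemas: list) -> dict:
    """Workload entry for an inline PDF.

    Assumption: "raw_data" carries the standard base64 text of the file bytes.
    """
    return {
        "raw_data": base64.b64encode(pdf_bytes).decode("ascii"),
        "schemas": list(schemas),
    }


def web_workload(url: str, schemas: list) -> dict:
    """Workload entry for a web page, using the field names from the payload above."""
    return {
        "data_source": "web",
        "documents_location": url,
        "schemas": list(schemas),
    }


def build_payload(workloads: list, provider_type: str,
                  provider_model_name: str, api_key: str) -> dict:
    """Combine the workloads with the provider settings into one request body."""
    return {
        "workloads": list(workloads),
        "provider_type": provider_type,
        "provider_model_name": provider_model_name,
        "api_key": api_key,
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for real PDF files read from disk.
    payload = build_payload(
        [
            pdf_workload(b"%PDF-1.4 first file", ["schema1", "schema2"]),
            pdf_workload(b"%PDF-1.4 second file", ["schema3"]),
            web_workload("https://www.example.com/article1", ["schema4"]),
        ],
        provider_type="openai",
        provider_model_name="gpt-4o",
        api_key="sk-...",
    )
    print(json.dumps(payload, indent=2))
```

Keeping one helper per workload type makes it harder to mix the inline fields ("raw_data") with the web fields ("data_source", "documents_location") in a single entry.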
The endpoint responds with a task ID and a status message:
{
  "task_id": "b6781f5b-022b-485e-b93c-6a958e51b992",
  "message": "Pipeline processing started"
}
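The following is a minimal sketch of sending such a request and reading the task ID from the response, using only the Python standard library. The base URL is a placeholder, since this page does not state the API host. The function names are hypothetical, and any authentication headers the service may require are not shown. How the task ID is later used to fetch results is not covered here.

```python
import json
import urllib.request


def make_pipeline_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Prepare a POST /pipelines request with a JSON body (nothing is sent yet)."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/pipelines",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_task_id(response_body: str) -> str:
    """Extract "task_id" from a response shaped like the example above."""
    body = json.loads(response_body)
    return body["task_id"]


def submit_pipeline(base_url: str, payload: dict, timeout: float = 30.0) -> str:
    """Send the request and return the task ID reported by the server."""
    request = make_pipeline_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_task_id(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Parsing the sample response shown above; no network call is made here.
    sample = (
        '{"task_id": "b6781f5b-022b-485e-b93c-6a958e51b992", '
        '"message": "Pipeline processing started"}'
    )
    print(parse_task_id(sample))
```

To submit for real, call `submit_pipeline("https://<your-api-host>", payload)` with the actual host of your deployment. The message "Pipeline processing started" suggests the work continues after the response is returned, so the task ID is worth storing.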