
Optimum Transformers

Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.
Disclaimer

This project was inspired by Hugging Face Infinity, and the first step was done by Suraj Patil.

"@huggingface's pipeline API is awesome, right? And onnxruntime is super fast! Wouldn't it be great to combine these two?" – Tweet by Suraj Patil

It was under this slogan that I started this project, with the goal of showing my work and getting into the @huggingface team.
How to use

Quick start

The usage is exactly the same as the original pipelines, except for minor improvements:
from optimum_transformers import pipeline
pipe = pipeline("text-classification", use_onnx=True, optimize=True)
pipe("This restaurant is awesome")
# [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
use_onnx - converts the default model to an ONNX graph
optimize - optimizes the converted ONNX graph with Optimum

Optimum config

Read the Optimum documentation for more details:
from optimum_transformers import pipeline
from optimum.onnxruntime import ORTConfig
ort_config = ORTConfig(quantization_approach="dynamic")
pipe = pipeline("text-classification", use_onnx=True, optimize=True, ort_config=ort_config)
pipe("This restaurant is awesome")
# [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
Benchmark

With notebook

You can benchmark pipelines more easily with the benchmark_pipelines notebook.
With your own script

from optimum_transformers import Benchmark
task = "sentiment-analysis"
model_name = "philschmid/MiniLM-L6-H384-uncased-sst2"
num_tests = 100
benchmark = Benchmark(task, model_name)
results = benchmark(num_tests, plot=True)
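Under the hood, a latency benchmark of this kind amounts to timing repeated calls to the pipeline and aggregating the results. A minimal self-contained sketch of that loop (using a stand-in workload instead of a real pipeline, and hypothetical stat names, so it runs without any models installed):

```python
import time
import statistics

def run_benchmark(fn, num_tests=100, warmup=10):
    """Time repeated calls to fn and report latency stats in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurements
    latencies = []
    for _ in range(num_tests):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000)
    return {
        "mean_ms": statistics.mean(latencies),
        "p95_ms": sorted(latencies)[int(0.95 * len(latencies)) - 1],
    }

# Stand-in workload; replace with e.g. lambda: pipe("This restaurant is awesome")
results = run_benchmark(lambda: sum(i * i for i in range(10_000)), num_tests=50)
print(results)
```

The real Benchmark class additionally runs the same task with and without ONNX so the two latency curves can be plotted side by side.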
Results

Note: These results were collected on my local machine, so if you have a high-performance machine to benchmark on, please contact me.

sentiment-analysis

Almost the same as in the Infinity launch video:

AWS VM: g4dn.xlarge
GPU: NVIDIA T4
128 tokens: 2.6 ms

zero-shot-classification

With typeform/distilbert-base-uncased-mnli:

token-classification

More results are available in the project repository on GitHub.