Optimum Transformers
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.
Disclaimer
This project was inspired by Hugging Face Infinity, and the first step was done by Suraj Patil.
@huggingface's pipeline API is awesome, right? And onnxruntime is super fast! Wouldn't it be great to combine these two?
– Tweet by Suraj Patil
It was under this slogan that I started this project! And the main goal was to prove myself and get into the @huggingface team.
How to use
Quick start
Usage is exactly the same as with the original pipelines, except for a few minor improvements:
from optimum_transformers import pipeline

# Drop-in replacement for transformers.pipeline, with ONNX conversion and optimization
pipe = pipeline("text-classification", use_onnx=True, optimize=True)
pipe("This restaurant is awesome")
# [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
use_onnx - converts the default model to an ONNX graph
optimize - optimizes the converted ONNX graph with Optimum (a rough latency comparison against the stock pipeline is sketched below)
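For intuition, here is a minimal timing sketch (not part of the library) comparing the stock transformers pipeline with the ONNX-backed one; the loop size is illustrative and absolute numbers will depend on your hardware:

import time
from transformers import pipeline as hf_pipeline
from optimum_transformers import pipeline

text = "This restaurant is awesome"

# Stock PyTorch pipeline vs. ONNX-converted and optimized pipeline
baseline = hf_pipeline("text-classification")
accelerated = pipeline("text-classification", use_onnx=True, optimize=True)

for name, pipe in [("transformers", baseline), ("optimum_transformers", accelerated)]:
    start = time.perf_counter()
    for _ in range(100):
        pipe(text)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed / 100 * 1000:.2f} ms/call")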
Optimum config
Read the Optimum documentation for more details.
from optimum_transformers import pipeline
from optimum.onnxruntime import ORTConfig

# Dynamic quantization: weights are quantized to int8, no calibration data needed
ort_config = ORTConfig(quantization_approach="dynamic")
pipe = pipeline("text-classification", use_onnx=True, optimize=True, ort_config=ort_config)
pipe("This restaurant is awesome")
# [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
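Quantization trades a little accuracy for speed, so it is worth sanity-checking the quantized pipeline against the plain ONNX one. A quick hypothetical check, using only the API shown above (the sample sentences are made up):

from optimum_transformers import pipeline
from optimum.onnxruntime import ORTConfig

ort_config = ORTConfig(quantization_approach="dynamic")
plain = pipeline("text-classification", use_onnx=True)
quantized = pipeline("text-classification", use_onnx=True, optimize=True, ort_config=ort_config)

# Scores should stay close; a large gap would signal quantization damage
for text in ["This restaurant is awesome", "The service was painfully slow"]:
    print(text, plain(text), quantized(text))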
Benchmark
With notebook
You can benchmark pipelines more easily with the benchmark_pipelines notebook.
With your own script
from optimum_transformers import Benchmark

task = "sentiment-analysis"
model_name = "philschmid/MiniLM-L6-H384-uncased-sst2"
num_tests = 100

benchmark = Benchmark(task, model_name)
results = benchmark(num_tests, plot=True)  # plot=True draws the latency comparison
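To cover several tasks in one run, the same two calls can be looped. A small sketch using only the API shown above; the task/model pairs mirror the ones benchmarked below:

from optimum_transformers import Benchmark

# Task -> model pairs taken from the results below
tasks = {
    "sentiment-analysis": "philschmid/MiniLM-L6-H384-uncased-sst2",
    "zero-shot-classification": "typeform/distilbert-base-uncased-mnli",
}

for task, model_name in tasks.items():
    benchmark = Benchmark(task, model_name)
    results = benchmark(100, plot=True)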
Results
Note: these results were collected on my local machine, so if you have a high-performance machine to benchmark on, please contact me.
sentiment-analysis
Almost the same result as in the Infinity launch video:
AWS VM: g4dn.xlarge
GPU: NVIDIA T4
Sequence length: 128 tokens
Latency: 2.6 ms
zero-shot-classification
With typeform/distilbert-base-uncased-mnli
token-classification
More results are available in the project repository on GitHub.