Beers with Engineers S3 Ep 6: Embeddings Inference - Maximizing Throughput
Run:ai Official

Published on Mar 6, 2024

Check out the next episode of Beers with Engineers at AI Infra Club. Together with Michael Feil from Gradient, we will talk about embeddings inference using Infinity: https://lnkd.in/dw73eqK8 💫

📜 Here is a short overview of the agenda:
- Why open-sourcing an embedding engine is important
- Which models to run, and why choose Python over Rust and C++ (I am especially looking forward to this one 👀)
- TL;DR of some tricks to improve throughput, reranking, and classification
- What Infinity does at a high level, plus its roadmap and a demo

#beerswithengineers #aiinfrastructure #AIOps #mlops #AIDevOps #GPUComputing #CloudAI #AIScaling #MachineLearningInfrastructure #runai #aistack #ml
