

On-device machine learning model serving is a difficult task, especially given the limited bandwidth of early-stage startups. As machine learning usage continues to permeate across industries, we see broadening diversity in deployment targets, with companies choosing to run models locally on-client rather than as cloud-based services for security, performance, and cost reasons. This guest post from the team at Pieces shares the problems and solutions they evaluated for their on-device model serving stack, and how ONNX Runtime serves as the backbone of their success.

Pieces is a code snippet management tool that allows developers to save, search, and reuse their snippets without interrupting their workflow. The magic of Pieces is that it automatically enriches these snippets so that they're more useful to the developer after being stored in Pieces. A large part of this enrichment is driven by our machine learning models, which provide programming language detection, concept tagging, semantic description, snippet clustering, optical character recognition, and much more. To enable full coverage of the developer workflow, we must run these models from the desktop, terminal, integrated development environment, browser, and team communication channels.

First, in order to maintain a seamless developer workflow, our models must have low latency. Like many businesses, our first instinct was to serve these models as cloud endpoints; however, we realized this wouldn't suit our needs for a few reasons.
