LM Studio

Cloud APIs put a frontier model behind a single HTTPS call. That convenience is hard to beat, and for most production workloads it remains the right choice. But something has shifted over the last couple of years: the gap between “what a hosted model can do” and “what a model running on your laptop can do” has narrowed enough that local inference is no longer a curiosity. For developers, especially those of us building on Apple Silicon, it has become a serious option. ...