Starred repositories
Kimi K2 is the large language model series developed by the Moonshot AI team.
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Witness the aha moment of VLM with less than $3.
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
🤗 smolagents: a barebones library for agents that think in code.
Official repository for our work on micro-budget training of large-scale diffusion models.
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.
Experimental CUDA kernel framework unifying typed dimensions, NVRTC JIT specialization, and ML‑guided tuning.
Automate browser-based workflows with AI
A simple screen-parsing tool towards a pure vision-based GUI agent
Free-to-use online tool for labelling photos. https://makesense.ai
Navigate dreamscapes with a click – your chosen point guides the drone’s flight in a thrilling visual journey.
Make huge neural nets fit in memory
Use late-interaction multi-modal models such as ColPali in just a few lines of code.
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Efficient Triton Kernels for LLM Training
Helpful tools and examples for working with flex-attention
This is a Phi Family of SLMs book for getting started with Phi models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…
The fastest way to create an HTML app
Curated list of datasets and tools for post-training.
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
A PyTorch quantization backend for Optimum
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.





