Overview of various AI LLMs

Below is a compiled overview of interesting and well-known OPEN SOURCE AI LLM models.

Most of these are quantized GGUF models, a kind of compressed counterpart to the safetensors format. They are easy to use with LM Studio and are suitable for, among others, 8 GB, 12 GB, and 16 GB graphics cards.
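
A quick way to judge whether a given GGUF quantization will fit on your card: the file size is roughly parameters × bits-per-weight ÷ 8, plus some headroom for the KV cache and runtime buffers. The Python sketch below is only a rough illustration (the ~4.8 bits per weight for Q4_K_M and the 1.5 GB overhead are assumptions), not a tool from LM Studio or llama.cpp.

```python
# Rough back-of-the-envelope check: does a quantized GGUF model fit in VRAM?
# Illustrative only; real GGUF files also contain metadata, and inference
# needs extra memory for the KV cache and compute buffers.

def gguf_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate GGUF file size in GB: parameters * bits / 8."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead_gb: float = 1.5) -> bool:
    """Leave some headroom for KV cache, context, and runtime buffers (assumed 1.5 GB)."""
    return gguf_size_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb

# Example sizes at ~4.8 bits per weight (roughly Q4_K_M)
for name, params in [("8B", 8), ("14B", 14), ("27B", 27)]:
    size = gguf_size_gb(params, 4.8)
    print(f"{name} @ Q4_K_M: ~{size:.1f} GB -> fits in 12 GB VRAM: {fits(params, 4.8, 12)}")
```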

Mistral AI – Mistral 3 (release date 2025-12-02) #

Website: https://mistral.ai/news/mistral-3

Huggingface: https://huggingface.co/mistralai

LM studio: https://lmstudio.ai/blog/lmstudio-v0.3.33

About #

Today, we announce Mistral 3, the next generation of Mistral models. Mistral 3 includes three state-of-the-art small, dense models (14B, 8B, and 3B) and Mistral Large 3 – our most capable model to date – a sparse mixture-of-experts trained with 41B active and 675B total parameters. All models are released under the Apache 2.0 license. Open-sourcing our models in a variety of compressed formats empowers the developer community and puts AI in people’s hands through distributed intelligence.

The Ministral models represent the best performance-to-cost ratio in their category. At the same time, Mistral Large 3 joins the ranks of frontier instruction-fine-tuned open-source models.

For edge and local use cases, we release the Ministral 3 series, available in three model sizes: 3B, 8B, and 14B parameters. Furthermore, for each model size, we release base, instruct, and reasoning variants to the community, each with image understanding capabilities, all under the Apache 2.0 license. When married with the models’ native multimodal and multilingual capabilities, the Ministral 3 family offers a model for all enterprise or developer needs.

Furthermore, Ministral 3 achieves the best cost-to-performance ratio of any OSS model. In real-world use cases, both the number of generated tokens and model size matter equally. The Ministral instruct models match or exceed the performance of comparable models while often producing an order of magnitude fewer tokens. 

For settings where accuracy is the only concern, the Ministral reasoning variants can think longer to produce state-of-the-art accuracy amongst their weight class – for instance 85% on AIME ‘25 with our 14B variant.

The Ministral 3 series is a family of open-weights models from MistralAI. These models were trained alongside Mistral’s larger Mistral-Large model, and they are available in three model sizes: 3B, 8B, and 14B parameters. These models support image understanding and multilingual capabilities, available under the Apache 2.0 license.

Key Features

Ministral 3 8B consists of two main architectural components:

  • 8.4B Language Model
  • 0.4B Vision Encoder


The Ministral 3 8B Instruct model offers the following capabilities:

  • Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
  • Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
  • Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
  • Large Context Window: Supports a 256k context window.
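
To illustrate the agentic bullet above: LM Studio exposes an OpenAI-compatible local server (by default at http://localhost:1234/v1), so native function calling can be exercised with the standard openai Python client. This is a minimal sketch; the model id, tool schema, and port are assumptions that depend on your own LM Studio setup.

```python
# Minimal sketch: native function calling against a local LM Studio server.
# Assumes LM Studio is running its OpenAI-compatible server (default port 1234)
# with a Ministral 3 Instruct GGUF loaded; the model id and tool are examples.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="ministral-3-8b-instruct-2512",  # adjust to the id LM Studio shows
    messages=[{"role": "user", "content": "What's the weather in Amsterdam?"}],
    tools=tools,
)

# If the model decides to call the tool, the arguments come back as JSON.
print(response.choices[0].message.tool_calls)
```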

Download #

Ministral-3-3B-Instruct-2512 (GGUF): https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512-GGUF/tree/main

Ministral-3-8B-Instruct-2512 (GGUF): https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512-GGUF/tree/main

Ministral-3-14B-Instruct-2512 (GGUF): https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512-GGUF/tree/main

Ministral-3-3B-Reasoning-2512 (GGUF): https://huggingface.co/mistralai/Ministral-3-3B-Reasoning-2512-GGUF/tree/main

Ministral-3-8B-Reasoning-2512 (GGUF): https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512-GGUF/tree/main

Ministral-3-14B-Reasoning-2512 (GGUF): https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512-GGUF/tree/main


Alibaba – Qwen3 (release date 2025-04-28) #

Website: https://qwenlm.github.io/blog/qwen3

Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model: https://qwenlm.github.io/blog/qwen2.5-max

About #

Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., when compared to other top-tier models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro. Additionally, the small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B, which has 10 times as many activated parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct.

We are open-weighting two MoE models: Qwen3-235B-A22B, a large model with 235 billion total parameters and 22 billion activated parameters, and Qwen3-30B-A3B, a smaller MoE model with 30 billion total parameters and 3 billion activated parameters. Additionally, six dense models are also open-weighted, including Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B, under the Apache 2.0 license.

The post-trained models, such as Qwen3-30B-A3B, along with their pre-trained counterparts (e.g., Qwen3-30B-A3B-Base), are now available on platforms like Hugging Face, ModelScope, and Kaggle. For deployment, we recommend using frameworks like SGLang and vLLM. For local usage, tools such as Ollama, LM Studio, MLX, llama.cpp, and KTransformers are highly recommended. These options ensure that users can easily integrate Qwen3 into their workflows, whether in research, development, or production environments.
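
As a small illustration of local usage: once a Qwen3 GGUF has been pulled into Ollama, it can be queried through Ollama's local REST API (default port 11434). This is only a sketch; the model tag below is an example and depends on what you actually pulled.

```python
# Minimal sketch: query a locally running Ollama server hosting a Qwen3 model.
# Assumes the ollama service is running and a Qwen3 tag has been pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3:8b",   # example tag; use whatever tag you pulled
        "prompt": "Explain mixture-of-experts in two sentences.",
        "stream": False,        # return one JSON object instead of a token stream
    },
    timeout=300,
)
print(resp.json()["response"])
```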

We believe that the release and open-sourcing of Qwen3 will significantly advance the research and development of large foundation models. Our goal is to empower researchers, developers, and organizations around the world to build innovative solutions using these cutting-edge models.

Download #

Qwen3-Next-80B-A3B-Instruct (GGUF): https://huggingface.co/unsloth/Qwen3-Next-80B-A3B-Instruct-GGUF/tree/main


DeepSeek – DeepSeek V3-0324 (release date 2025-03-24) #

About #

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key areas:

Reasoning Capabilities

  • Significant improvements in benchmark performance:
    • MMLU-Pro: 75.9 → 81.2 (+5.3)
    • GPQA: 59.1 → 68.4 (+9.3)
    • AIME: 39.6 → 59.4 (+19.8)
    • LiveCodeBench: 39.2 → 49.2 (+10.0)


Front-End Web Development

  • Improved the executability of the code
  • More aesthetically pleasing web pages and game front-ends

Chinese Writing Proficiency

  • Enhanced style and content quality:
    • Aligned with the R1 writing style
    • Better quality in medium-to-long-form writing
  • Feature Enhancements
    • Improved multi-turn interactive rewriting
    • Optimized translation quality and letter writing

Chinese Search Capabilities

  • Enhanced report analysis requests with more detailed outputs


Function Calling Improvements

  • Increased accuracy in Function Calling, fixing issues from previous V3 versions

Download #

DeepSeek-V3-0324 (GGUF): https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF/tree/main


Google – Gemma 3 (release date 2025-03-12) #

Website: https://deepmind.google/models/gemma/gemma-3

Gemma 3: How to Run & Fine-tune: https://docs.unsloth.ai/models/gemma-3-how-to-run-and-fine-tune

About #

Gemma is a series of open-source large language models developed by Google DeepMind. It is based on similar technologies as Gemini. The first version was released in February 2024, followed by Gemma 2 in June 2024 and Gemma 3 in March 2025. Variants of Gemma have also been developed, such as the vision-language model PaliGemma and the model MedGemma for medical consultation topics.


Capabilities

  • Handle complex tasks: Gemma 3’s 128K-token context window lets your applications process and understand vast amounts of information, enabling more sophisticated AI features.
  • Multilingual communication: Unparalleled multilingual capabilities let you communicate effortlessly across countries and cultures. Develop applications that reach a global audience, with support for over 140 languages.
  • Multimodal understanding: Easily build applications that analyze images, text, and video, opening up new possibilities for interactive and intelligent applications.

270M – Compact model designed for both task-specific fine-tuning and strong instruction-following.

1B – Lightweight text model, ideal for small applications.

4B – Balanced for performance and flexibility, with multimodal support.

12B – Strong language capabilities, designed for complex tasks.

27B – Enhanced understanding, great for sophisticated applications.

Download #

gemma-3-270m-it (GGUF): https://huggingface.co/unsloth/gemma-3-270m-it-GGUF/tree/main

gemma-3-1b-it (GGUF): https://huggingface.co/unsloth/gemma-3-1b-it-GGUF/tree/main

gemma-3-4b-it (GGUF): https://huggingface.co/unsloth/gemma-3-4b-it-GGUF/tree/main

gemma-3-12b-it (GGUF): https://huggingface.co/unsloth/gemma-3-12b-it-GGUF/tree/main

gemma-3-27b-it (GGUF): https://huggingface.co/unsloth/gemma-3-27b-it-GGUF/tree/main


DeepSeek – DeepSeek R1 (release date 2025-01-20) #

About #

The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528. In the latest update, DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic. Its overall performance is now approaching that of leading models, such as O3 and Gemini 2.5 Pro.

Compared to the previous version, the upgraded model shows significant improvements in handling complex reasoning tasks. For instance, in the AIME 2025 test, the model’s accuracy has increased from 70% in the previous version to 87.5% in the current version. This advancement stems from enhanced thinking depth during the reasoning process: in the AIME test set, the previous model used an average of 12K tokens per question, whereas the new version averages 23K tokens per question.

Beyond its improved reasoning capabilities, this version also offers a reduced hallucination rate, enhanced support for function calling, and better experience for vibe coding.

Download #

deepseek-r1-0528-qwen3-8b (GGUF): https://huggingface.co/samuelchristlie/DeepSeek-R1-0528-Qwen3-8B-GGUF


Microsoft – PHI4 (release date 2024-12-28) #

Website: https://azure.microsoft.com/en-us/products/phi/

Huggingface: https://huggingface.co/microsoft/phi-4

About #

Model Summary

  • Developers: Microsoft Research
  • Description: phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning. phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
  • Architecture: 14B parameters, dense decoder-only Transformer model
  • Inputs: Text, best suited for prompts in the chat format
  • Context length: 16K tokens
  • GPUs: 1920 H100-80G
  • Training time: 21 days
  • Training data: 9.8T tokens
  • Outputs: Generated text in response to input
  • Dates: October 2024 – November 2024
  • Status: Static model trained on an offline dataset with cutoff dates of June 2024 and earlier for publicly available data
  • Release date: December 12, 2024
  • License: MIT

Intended Use

Primary Use Cases: Our model is designed to accelerate research on language models, for use as a building block for generative AI powered features. It provides uses for general purpose AI systems and applications (primarily in English) which require:

1. Memory/compute constrained environments.
2. Latency bound scenarios.
3. Reasoning and logic.

Out-of-Scope Use Cases: Our model is not specifically designed or evaluated for all downstream purposes, thus:

1. Developers should consider common limitations of language models as they select use cases, and evaluate and mitigate for accuracy, safety, and fairness before using within a specific downstream use case, particularly for high-risk scenarios.
2. Developers should be aware of and adhere to applicable laws or regulations (including privacy, trade compliance laws, etc.) that are relevant to their use case, including the model’s focus on English.
3. Nothing contained in this Model Card should be interpreted as or deemed a restriction or modification to the license the model is released under.

Training Datasets

Our training data is an extension of the data used for Phi-3 and includes a wide variety of sources from:

  1. Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code.
  2. Newly created synthetic, “textbook-like” data for the purpose of teaching math, coding, common sense reasoning, general knowledge of the world (science, daily activities, theory of mind, etc.).
  3. Acquired academic books and Q&A datasets.
  4. High quality chat format supervised data covering various topics to reflect human preferences on different aspects such as instruct-following, truthfulness, honesty and helpfulness.

Multilingual data constitutes about 8% of our overall data. We are focusing on the quality of data that could potentially improve the reasoning ability for the model, and we filter the publicly available documents to contain the correct level of knowledge.

Benchmark datasets

We evaluated phi-4 using OpenAI’s SimpleEval and our own internal benchmarks to understand the model’s capabilities, more specifically:

  • MMLU: Popular aggregated dataset for multitask language understanding.
  • MATH: Challenging competition math problems.
  • GPQA: Complex, graduate-level science questions.
  • DROP: Complex comprehension and reasoning.
  • MGSM: Multi-lingual grade-school math.
  • HumanEval: Functional code generation.
  • SimpleQA: Factual responses.

Safety Approach

phi-4 has adopted a robust safety post-training approach. This approach leverages a variety of both open-source and in-house generated synthetic datasets. The overall technique employed to do the safety alignment is a combination of SFT (Supervised Fine-Tuning) and iterative DPO (Direct Preference Optimization), including publicly available datasets focusing on helpfulness and harmlessness as well as various questions and answers targeted to multiple safety categories.

Safety Evaluation and Red-Teaming

Prior to release, phi-4 followed a multi-faceted evaluation approach. Quantitative evaluation was conducted with multiple open-source safety benchmarks and in-house tools utilizing adversarial conversation simulation. For qualitative safety evaluation, we collaborated with the independent AI Red Team (AIRT) at Microsoft to assess safety risks posed by phi-4 in both average and adversarial user scenarios. In the average user scenario, AIRT emulated typical single-turn and multi-turn interactions to identify potentially risky behaviors. The adversarial user scenario tested a wide range of techniques aimed at intentionally subverting the model’s safety training including jailbreaks, encoding-based attacks, multi-turn attacks, and adversarial suffix attacks.

Please refer to the technical report for more details on safety alignment.

Model Quality

To understand the capabilities, we compare phi-4 with a set of models over OpenAI’s SimpleEval benchmark.

The phi-4 model card provides a high-level overview of model quality on representative benchmarks; in its comparison table, higher numbers indicate better performance.

Download #

Phi-4 (GGUF): https://huggingface.co/microsoft/phi-4-gguf/tree/main


Alibaba – Qwen2.5 (release date 2024-09-30) #

Website: https://qwenlm.github.io/blog/qwen2.5-max

Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model: https://qwenlm.github.io/blog/qwen2.5-max

About #

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

  • Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to our specialized expert models in these domains.
  • Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots.
  • Long-context support up to 128K tokens, with generation of up to 8K tokens.
  • Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

This repo contains the instruction-tuned 7B Qwen2.5 model, which has the following features:

  • Type: Causal Language Models
  • Training Stage: Pretraining & Post-training
  • Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
  • Number of Parameters: 7.61B
  • Number of Parameters (Non-Embedding): 6.53B
  • Number of Layers: 28
  • Number of Attention Heads (GQA): 28 for Q and 4 for KV
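
Those GQA numbers (28 query heads sharing only 4 KV heads) are what keep the 128K-token KV cache affordable, since only the KV heads have to be cached. The sketch below is a back-of-the-envelope estimate assuming a head dimension of 128 (hidden size 3584 / 28 heads) and an fp16 cache; it is an illustration, not something taken from the Qwen repository.

```python
# Back-of-the-envelope KV-cache estimate for Qwen2.5-7B-Instruct
# (GQA: 28 query heads, 4 KV heads, 28 layers; head_dim = 128 assumed, fp16 cache).

layers, kv_heads, q_heads, head_dim, bytes_per_val = 28, 4, 28, 128, 2

def kv_cache_gb(context_tokens: int, n_kv_heads: int) -> float:
    # 2x for keys and values, per layer, per token
    return 2 * layers * n_kv_heads * head_dim * bytes_per_val * context_tokens / 1e9

ctx = 128_000  # full 128K context
print(f"GQA (4 KV heads):            {kv_cache_gb(ctx, kv_heads):.1f} GB")
print(f"Full MHA (28 heads, hypoth.): {kv_cache_gb(ctx, q_heads):.1f} GB")
```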

Download #

Qwen2.5-7B-Instruct (GGUF): https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-GGUF/tree/main


Mistral AI – Mistral 7B (release date 2024-05-27) #

Website: https://mistral.ai/news/announcing-mistral-7b

About #

Mistral-7B-Instruct v0.3 is the latest refined instruction variant of the Mistral-7B model, designed to respond more efficiently, more sensibly, and more reliably in real-world applications. Mistral 7B belongs to the current generation of “small-but-powerful” open models: compact enough to run locally on consumer hardware, yet with performance surprisingly close to much larger models.

The v0.3 instruct version has been specifically trained and optimized to understand user instructions, answer cleanly, follow context, and handle ambiguous or complex questions more safely. The model is known for its fast inference, low VRAM requirements, and solid reasoning within its size class.

Model overview

  • Name: Mistral-7B-Instruct v0.3
  • Size: 7 billion parameters
  • Type: Instruction fine-tune (specialized for human prompts)
  • Available formats: GGUF, safetensors, HF transformers
  • Purpose: Chat, Q&A, tool use, reasoning, compact AI assistance


Performance & Strengths

  • Improved instruction following compared to v0.2
    Understands instructions better, works more consistently, and follows formatting more accurately.
  • Strong reasoning within the 7B class
    Good balance between speed and logic; often better than Llama-2-7B and Gemma-7B.
  • Fast inference
    Very well suited to CPU use, smaller GPUs, and even some edge devices.
  • Large context windows (depending on the build)
    Suitable for RAG systems, document summarization, and longer conversations.

Download #

Mistral-7B-Instruct-v0.3 (GGUF): https://huggingface.co/MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF


Microsoft – PHI3 (release date 2024-04-23) #

Website: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct

Introducing Phi-3: Redefining what’s possible with SLMs: https://azure.microsoft.com/en-us/blog/introducing-phi-3-redefining-whats-possible-with-slms/

About #

We are excited to introduce Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks. This release expands the selection of high-quality models for customers, offering more practical choices as they compose and build generative AI applications.

Starting today, Phi-3-mini, a 3.8B language model is available on Microsoft Azure AI Studio, Hugging Face, and Ollama. 

  • Phi-3-mini is available in two context-length variants—4K and 128K tokens. It is the first model in its class to support a context window of up to 128K tokens, with little impact on quality.
  • It is instruction-tuned, meaning that it’s trained to follow different types of instructions reflecting how people normally communicate. This ensures the model is ready to use out-of-the-box.
  • It is available on Azure AI to take advantage of the deploy-eval-finetune toolchain, and is available on Ollama for developers to run locally on their laptops.
  • It has been optimized for ONNX Runtime with support for Windows DirectML along with cross-platform support across graphics processing unit (GPU), CPU, and even mobile hardware.
  • It is also available as an NVIDIA NIM microservice with a standard API interface that can be deployed anywhere. And has been optimized for NVIDIA GPUs. 

In the coming weeks, additional models will be added to the Phi-3 family to offer customers even more flexibility across the quality-cost curve. Phi-3-small (7B) and Phi-3-medium (14B) will be available in the Azure AI model catalog and other model gardens shortly.

Microsoft continues to offer the best models across the quality-cost curve and today’s Phi-3 release expands the selection of models with state-of-the-art small models.

Groundbreaking performance at a small size

Phi-3 models significantly outperform language models of the same and larger sizes on key benchmarks (see benchmark numbers below, higher is better). Phi-3-mini does better than models twice its size, and Phi-3-small and Phi-3-medium outperform much larger models, including GPT-3.5T.  

All reported numbers are produced with the same pipeline to ensure that the numbers are comparable. As a result, these numbers may differ from other published numbers due to slight differences in the evaluation methodology. More details on benchmarks are provided in our technical paper. 

Note: Phi-3 models do not perform as well on factual knowledge benchmarks (such as TriviaQA) as the smaller model size results in less capacity to retain facts. 

Download #

Phi-3-mini-4k-instruct (GGUF): https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/tree/main


Meta – Llama 3 (release date 2024-04-18) #

Website: https://www.llama.com/models/llama-3

Llama 3 Guide: Everything You Need to Know About: https://kili-technology.com/blog/llama-3-guide-everything-you-need-to-know-about-meta-s-new-model-and-its-data


About #

In April 2024, Meta released Llama 3, a large language model touted to be more powerful and diverse than its predecessors, Llama 1 and 2. The family of models has a 70B-parameter and a smaller 8B-parameter version, both of which have instruction-tuned versions as well. One way Llama 3 is set apart from other large language models, like Claude, Gemini, and GPT, is that Meta has released the model as open source – which means it is available for research and commercial purposes. However, the license is custom and requires that users adhere to specific regulations to avoid misuse.

Model Details


Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.

Model developers Meta

Variations Llama 3 comes in two sizes — 8B and 70B parameters — in pre-trained and instruction tuned variants.

Input Models input text only.

Output Models generate text and code only.

Model Architecture Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

  • Llama 3 8B – Training data: a new mix of publicly available online data; Context length: 8k; GQA: yes; Token count: 15T+; Knowledge cutoff: March 2023.
  • Llama 3 70B – Training data: a new mix of publicly available online data; Context length: 8k; GQA: yes; Token count: 15T+; Knowledge cutoff: December 2023.

Llama 3 family of models. Token counts refer to pretraining data only. Both the 8 and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability.

LLAMA 3 (April 2024)

Official release.

  • Models: Llama-3-8B, Llama-3-70B, later also instruct-tuned versions.
  • Big leap over Llama 2 in:
    • reasoning
    • code
    • language understanding
    • natural chat
  • Reasonably efficient, but not yet extremely optimized.
  • Very widely used in AI development throughout 2024.

Use: general assistants, chatbots, code, reasoning.


LLAMA 3.1 (July 2024)

Official update from Meta.
And the big mainstream upgrade for local and enterprise LLMs.

  • Models: 3.1-8B, 3.1-70B, 3.1-405B (!).
  • Improvements:
    • Much better safety + instruction following
    • Better tool calling
    • Better long context (128k context)
    • Stronger logic, math, and code generation
    • More efficient inference (hardware optimizations)
  • More stability for API use and agents.

Use:
Perfect for n8n, LM Studio, agents, RAG, code, reasoning.

Download #

Meta-Llama-3-8B (GGUF): https://huggingface.co/QuantFactory/Meta-Llama-3-8B-GGUF/tree/main

Meta-Llama-3.1-8B-Instruct (GGUF): https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main


Microsoft – PHI1.5 (release date 2023-09-24) #

Website: https://huggingface.co/microsoft/phi-1_5

Phi-1.5 Model: A Case of Comparing Apples to Oranges?: https://pratyushmaini.github.io/phi-1_5

About #

The language model Phi-1.5 is a Transformer with 1.3 billion parameters. It was trained using the same data sources as phi-1, augmented with a new data source that consists of various NLP synthetic texts. When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-1.5 demonstrates a nearly state-of-the-art performance among models with less than 10 billion parameters.

We did not fine-tune Phi-1.5 either for instruction following or through reinforcement learning from human feedback. The intention behind crafting this open-source model is to provide the research community with a non-restricted small model to explore vital safety challenges, such as reducing toxicity, understanding societal biases, enhancing controllability, and more.

For a safer model release, we exclude generic web-crawl data sources such as common-crawl from the training. This strategy prevents direct exposure to potentially harmful online content, enhancing the model’s safety without RLHF. However, the model is still vulnerable to generating harmful content. We hope the model can help the research community to further study the safety of language models.

Phi-1.5 can write poems, draft emails, create stories, summarize texts, write Python code (such as downloading a Hugging Face transformer model), etc.

Download #

phi-1_5 (GGUF): https://huggingface.co/TKDKid1000/phi-1_5-GGUF/tree/main