Prompt Chain

Compare Gemma and Mistral LLM Performance

Compare Gemma and Mistral LLMs with promptfoo. Run evaluations and analyze performance differences for your AI applications.

Works with github

91
Spark score
out of 100
Updated 3 months ago
Version 1.0.0

Add to Favorites

Why it matters

Evaluate and compare the performance of two leading large language models, Gemma and Mistral, across various prompts to determine their strengths and weaknesses.

Outcomes

What it gets done

01

Run comparative tests between Gemma and Mistral LLMs.

02

Analyze and summarize the output quality for each model.

03

Identify scenarios where one model outperforms the other.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pfoo-gemma-vs-mistral | bash

Capabilities

What this chain does

Summarize

Condenses long documents or threads into key takeaways.

Classify

Labels or categorizes text, files, or data points.

Search the web

Searches the web and retrieves relevant sources.

Overview

Gemma Vs Mistral

What it does

This prompt chain enables a direct comparison between the Gemma and Mistral large language models. It allows users to run evaluations and analyze the performance differences between these two models on specific prompts, facilitating informed model selection.

How it connects

Use this when you need to objectively assess and contrast the outputs of Gemma and Mistral for a given task. It's ideal for choosing the most suitable model for your application or understanding their distinct response characteristics.

Source README

yaml-language-server: $schema=https://promptfoo.dev/config-schema.json

description: Comparing Gemma and Mistral model performance

prompts:

  • '{{message}}'

defaultTest:
options:
transform: output.trim()

providers:

  • id: replicate:mistralai/mistral-7b-instruct-v0.2
    config:
    temperature: 0.01
    max_new_tokens: 1024
    prompt:
    prefix: '[INST] '
    suffix: ' [/INST]'

  • id: replicate:mistralai/mixtral-8x7b-instruct-v0.1
    config:
    temperature: 0.01
    max_new_tokens: 1024
    prompt:
    prefix: '[INST] '
    suffix: ' [/INST]'

  • id: replicate:cjwbw/gemma-7b-it:2790a695e5dcae15506138cc4718d1106d0d475e6dca4b1d43f42414647993d5
    config:
    temperature: 0.01
    max_new_tokens: 1024
    prompt:
    prefix: "user\n"
    suffix: "\nmodel"

tests:

  • vars:
    message: 'I speak without a mouth and hear without ears. I have no body, but I come alive with wind. What am I?'
    assert:

    Make sure the LLM output contains this word

    • type: icontains
      value: echo

    Use model-graded assertions to enforce free-form instructions

    • type: llm-rubric
      value: Do not apologize
  • vars:
    message: "You see a boat filled with people. It has not sunk, but when you look again you don't see a single person on the boat. Why?"
    assert:
    • type: llm-rubric
      value: explains that the people are below deck, or they are all in a relationship
  • vars:
    message: 'The more of this there is, the less you see. What is it?'
    assert:
    • type: icontains
      value: darkness
  • vars:
    message: >-
    I have keys but no locks. I have space but no room. You can enter, but
    can't go outside. What am I?
    assert:
    • type: icontains
      value: keyboard
  • vars:
    message: >-
    I am not alive, but I grow; I don't have lungs, but I need air; I don't
    have a mouth, but water kills me. What am I?
    assert:
    • type: icontains-any
      value:
      • fire
      • flame
  • vars:
    message: What can travel around the world while staying in a corner?
    assert:
    • type: icontains
      value: stamp
  • vars:
    message: Forward I am heavy, but backward I am not. What am I?
    assert:
    • type: icontains
      value: ton
  • vars:
    message: >-
    The person who makes it, sells it. The person who buys it, never uses
    it. The person who uses it, doesn't know they're using it. What is it?
    assert:
    • type: icontains
      value: coffin
  • vars:
    message: I can be cracked, made, told, and played. What am I?
    assert:
    • type: icontains
      value: joke
  • vars:
    message: What has keys but can't open locks?
    assert:
    • type: icontains
      value: piano
  • vars:
    message: >-
    I'm light as a feather, yet the strongest person can't hold me for much
    more than a minute. What am I?
    assert:
    • type: icontains
      value: breath
  • vars:
    message: >-
    I can fly without wings, I can cry without eyes. Whenever I go, darkness
    follows me. What am I?
    assert:
    • type: icontains
      value: cloud
  • vars:
    message: >-
    I am taken from a mine, and shut up in a wooden case, from which I am
    never released, and yet I am used by almost every person. What am I?
  • vars:
    message: >-
    David's father has three sons: Snap, Crackle, and _____? What is the
    name of the third son?
    assert:
    • type: contains
      value: David
  • vars:
    message: >-
    I am light as a feather, but even the world's strongest man couldn't
    hold me for much longer than a minute. What am I?
    assert:
    • type: contains
      value: breath

Discussion

Questions & comments · 0

Sign In Sign in to leave a comment.