Prompt Chain

Generate and Execute Code from Math Problems

Name: Generate and Execute Code from Math Problems
Availability: OnlineOnly
Author: PromptFlow

Generate and execute code from math questions using ChatGPT for numerical answers.

Works with GitHub, ChatGPT

Updated 6 days ago

Version promptflow_1.17.1

Models

gpt 4o

Why it matters

Automate the process of translating mathematical questions into executable code and obtaining numerical answers by leveraging advanced language models.

Outcomes

What it gets done

Convert mathematical expressions into functional code snippets.

Execute generated code to compute numerical results.

Debug and refine code for accurate problem-solving.

Integrate with language models for intelligent code generation.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pf-standard-maths-to-code | bash

Capabilities

What this chain does

Generate code

Writes source code or scripts from a description.

Query a database

Writes and executes SQL or NoSQL queries on databases.

Debug

Traces errors to their root cause and suggests fixes.

Overview

Maths To Code

What it does

This project leverages the ChatGPT model to convert natural language math questions into executable code. It then runs the generated code to produce the final numerical answer to the math problem.

How it connects

Use this workflow when you need to solve math problems programmatically or want to automate the process of translating mathematical queries into code and obtaining their results.

Source README

Math to Code

Math to Code is a project that uses the ChatGPT model to generate code that models a math question, then executes the generated code to obtain the final numerical answer.

Note: Building a system that generates executable code from user input with an LLM is a complex problem with potential security risks (see https://developer.nvidia.com/blog/securing-llm-systems-against-prompt-injection/). This example is a demonstration rather than something you can use directly in production. To build such a system correctly, you should address key security considerations such as input validation and additional sanitization of the generated code, or, better, run the generated code in a sandbox environment.
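
To make the sandboxing advice above more concrete, here is a minimal sketch that runs a generated snippet in a separate Python process with a timeout. It is not the flow's actual implementation, and it is not a real sandbox: the child process still has the same filesystem and network access as the user running it. The function name run_generated_code and the 5-second default are assumptions made for illustration.

```python
import subprocess
import sys


def run_generated_code(code: str, timeout_s: float = 5.0) -> tuple[bool, str]:
    """Run a code string in a child interpreter and return (ok, output).

    Hypothetical helper for illustration only. Process isolation plus a
    timeout limits runaway code but does NOT restrict file or network access.
    """
    try:
        proc = subprocess.run(
            # -I starts Python in isolated mode (ignores user site and env vars).
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    if proc.returncode != 0:
        # Surface the error text so the caller can count this as a failed case.
        return False, proc.stderr.strip()
    return True, proc.stdout.strip()
```

A caller would treat a False result as "no numerical answer" and would still need stronger isolation (for example a container or restricted user) before handling untrusted input.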

Tools used in this flow:

python tool
built-in llm tool

Connections used in this flow:

open_ai connection

Prerequisites

Install the promptflow SDK and other dependencies:

pip install -r requirements.txt

Set up the connection

Prepare your Azure OpenAI resource by following the Azure OpenAI setup instructions, and get your api_key if you don't have one.

Note: this example uses the chat API, so use a gpt-35-turbo or gpt-4 model deployment.

Create the connection if you haven't already. Make sure your Azure OpenAI endpoint key is in the azure_openai.yml file.

# Override keys with --set to avoid yaml file changes
pf connection create -f ../../../connections/azure_openai.yml --set api_key=<your_api_key> api_base=<your_api_base>

Ensure you have created the open_ai_connection connection:

pf connection show -n open_ai_connection

Run the flow locally

Run locally with a single-line input

# test with default input value in flow.dag.yaml
pf flow test --flow .
# test with specific input
pf flow test --flow . --inputs math_question='If a rectangle has a length of 10 and width of 5, what is the area?'
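
For the rectangle question above, the kind of code the flow aims to generate and execute might look like the sketch below. This is an illustration written by hand, not captured model output; what the model actually returns will vary, and the function name rectangle_area is an assumption.

```python
def rectangle_area(length: float, width: float) -> float:
    """Area of a rectangle is length times width."""
    return length * width


# Values taken from the example question: length 10, width 5.
answer = rectangle_area(10, 5)
print(answer)  # 50
```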

Run with multi-line data

Create a run

# create a random run name
run_name="math_to_code_"$(openssl rand -hex 12)
pf run create --flow . --data ./math_data.jsonl --column-mapping math_question='${data.question}' --name $run_name --stream
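
The column mappings in this README reference ${data.question} and ${data.answer}, which suggests each line of the data file is a JSON object with at least those two fields. The sketch below writes and reads a file in that shape. The sample rows, the numeric type of answer, and the helper names write_jsonl and read_jsonl are assumptions; check the real math_data.jsonl for the exact format.

```python
import json
from pathlib import Path


def write_jsonl(path, rows):
    """Write one JSON object per line (JSON Lines)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Illustrative rows only; field names are inferred from the column mappings.
rows = [
    {
        "question": "If a rectangle has a length of 10 and width of 5, what is the area?",
        "answer": 50,
    },
    {"question": "What is 12 divided by 4?", "answer": 3},
]
```

Calling write_jsonl("sample_math_data.jsonl", rows) would produce a small file that can be passed to --data in place of ./math_data.jsonl for a quick trial.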

Get the accuracy using the evaluation flow

Use eval-accuracy-maths-to-code to evaluate accuracy and error rate metrics against the math-to-code flow.

accuracy: if the generated code executes correctly and produces a final numerical answer, that answer is compared with the groundtruth in the test data. For a single instance, the result is True if the final number equals the groundtruth and False otherwise. Accuracy measures the percentage of correct answers across the test data.
error_rate: in some cases the flow cannot produce a numerical answer, for example when the generated code cannot be executed because of a code parsing error or because a dependent package is not available in the conda env. Error rate measures the percentage of such cases in the test data.
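
The two metrics defined above can be sketched in a few lines. This is not the code of the eval-accuracy-maths-to-code flow; it is an illustration that assumes a failed case is represented as None and that numbers are compared with a small absolute tolerance, both of which are choices made for this example.

```python
import math


def evaluate(predictions, groundtruths, abs_tol=1e-9):
    """Return accuracy and error_rate over the whole test set.

    predictions: list of numbers, with None where no numerical answer
    was produced (hypothetical convention for this sketch).
    """
    if len(predictions) != len(groundtruths) or not predictions:
        raise ValueError("inputs must be non-empty and the same length")
    total = len(predictions)
    # A case counts as an error when the flow produced no number at all.
    errors = sum(1 for p in predictions if p is None)
    # A case counts as correct when a number was produced and matches.
    correct = sum(
        1
        for p, g in zip(predictions, groundtruths)
        if p is not None and math.isclose(p, g, rel_tol=0.0, abs_tol=abs_tol)
    )
    return {"accuracy": correct / total, "error_rate": errors / total}
```

Note that a wrong numerical answer lowers accuracy without raising the error rate, so the two metrics do not have to sum to 1.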

# create a random eval run name
eval_run_name="math_to_code_eval_run_"$(openssl rand -hex 12)

# invoke accuracy and error rate evaluation against math-to-code batch run
pf run create --flow ../../evaluation/eval-accuracy-maths-to-code/ --data ./math_data.jsonl --column-mapping groundtruth='${data.answer}' prediction='${run.outputs.answer}' --run $run_name --name $eval_run_name --stream

# view the run details
pf run show-details -n $eval_run_name
pf run show-metrics -n $eval_run_name
