FEATURE: T2B tool calling accuracy benchmark

**Is your feature request related to a problem? Please describe.**
We would like to assess how precisely T2B performs tool calling compared with state-of-the-art simulators.

**Describe the solution you'd like**
We would like to simulate models using the open-source Python package basico and compare the resulting time-course outputs with those obtained from T2B via tool calling, where T2B would use automatically generated user prompts to execute the simulations. Exact values at specific simulation time points should be compared to assess the precision of T2B's answers regarding the concentration of species _X_ at time _T_.
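The comparison described above could be sketched roughly as follows. This is only an illustration: the helper names `value_at` and `is_hit` and the default tolerances are assumptions, not part of T2B or basico, and the reference time course is passed in as plain lists (in practice it would come from a basico simulation).

```python
import math


def value_at(times, values, t):
    """Linearly interpolate a reference time course at time t.

    Hypothetical helper: `times` must be sorted in ascending order.
    """
    if not times or len(times) != len(values):
        raise ValueError("times and values must be non-empty and equally long")
    if t < times[0] or t > times[-1]:
        raise ValueError("t lies outside the simulated time range")
    for i in range(1, len(times)):
        if t <= times[i]:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            if t1 == t0:
                return v1
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return values[-1]


def is_hit(reference, answer, rel_tol=1e-3, abs_tol=1e-9):
    """Count a T2B answer as a hit if it matches the reference within tolerance.

    The tolerances are placeholder assumptions; the issue does not specify them.
    """
    return math.isclose(answer, reference, rel_tol=rel_tol, abs_tol=abs_tol)
```

A tolerance-based check is used here instead of strict equality because floating-point results from two code paths rarely match bit for bit; whether "exact values" should mean a tighter tolerance is left open.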

The following operations would be tested:
* time course simulation
* parameter scanning
* readout of values at a specified time point
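As one possible way to generate user prompts automatically for the operations listed above, a small template table could be used. The prompt wording, the operation keys, and the field names below are all hypothetical and would need to be adapted to what T2B actually expects.

```python
from string import Template

# Hypothetical prompt templates, one per tested operation.
PROMPT_TEMPLATES = {
    "time_course": Template(
        "Simulate model $model_id for $duration time units."
    ),
    "parameter_scan": Template(
        "Scan parameter $parameter of model $model_id from $start to $stop "
        "in $steps steps and simulate each for $duration time units."
    ),
    "readout": Template(
        "What is the concentration of $species at time $time in model $model_id?"
    ),
}


def build_prompt(operation, **fields):
    """Fill the template for `operation`; raises KeyError if a field is missing."""
    if operation not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown operation: {operation}")
    return PROMPT_TEMPLATES[operation].substitute(**fields)
```

For example, `build_prompt("readout", species="X", time=10, model_id="M1")` yields a single question about species X at time 10.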

Approximately 100 models from open-source repositories will be employed to evaluate T2B performance in a high-throughput manner.

**Results**
The results should be presented as code in Jupyter notebook(s), including statistical analysis of T2B hit and miss rates.
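A minimal sketch of one such statistic, assuming each comparison is scored as a binary hit or miss: the observed hit rate together with a Wilson score interval. The choice of the Wilson interval and the function name are assumptions made for illustration; the issue does not prescribe a particular analysis.

```python
import math


def hit_rate_with_wilson_ci(hits, total, z=1.96):
    """Return (hit_rate, lower, upper) using the Wilson score interval.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0 or not 0 <= hits <= total:
        raise ValueError("need total > 0 and 0 <= hits <= total")
    p = hits / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, max(0.0, center - half_width), min(1.0, center + half_width)
```

The Wilson interval stays within [0, 1] and behaves reasonably when the hit rate is close to 0 or 1, which may matter with only about 100 models.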

FEATURE: T2B tool calling accuracy benchmark #256

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

FEATURE: T2B tool calling accuracy benchmark #256

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions