Use Case: Automated Research Assistant
Imagine you’re a biotech researcher working on drug discovery. Your daily workflow involves:
- Literature Review: Searching through thousands of research papers to find relevant studies on target proteins, pathways, and compounds
- Data Analysis: Processing experimental data from assays, analyzing protein structures, comparing sequences
- Protocol Management: Maintaining and updating lab protocols, ensuring compliance with regulatory requirements
- Experiment Tracking: Recording experimental conditions, results, and observations across multiple projects
- Report Generation: Creating research summaries, grant proposals, and regulatory documentation
The Challenge: You need an AI assistant that can:
- Search and synthesize information from scientific literature databases
- Analyze experimental data and generate insights
- Maintain accurate records of experiments and protocols
- Generate compliant documentation for regulatory submissions
- Work with specialized biotech tools and databases (PubChem, UniProt, PDB, etc.)
- Handle sensitive research data securely
The Solution: The Claude Agent SDK provides the tools to build specialized biotech research agents. By giving Claude access to a computer (via terminal, file system, and tools), you can create agents that work like human researchers—searching databases, analyzing data, writing code, and iterating on results.
This guide explores how to build a biotech research agent using the Claude Agent SDK, covering the agent loop (gather context → take action → verify work), tool creation, and biotech-specific workflows.
Prerequisites
Before getting started, ensure you have:
- Claude Agent SDK:
```shell
# Install the Claude Agent SDK (Node.js)
npm install @anthropic-ai/claude-agent-sdk

# Or, for Python
pip install claude-agent-sdk
```

- Development Environment:
- Node.js 18+ or Python 3.10+
- Access to Claude API
- Terminal/command line access
- Biotech Tools (optional but recommended):
- Access to scientific databases (PubChem, UniProt, PDB)
- Bioinformatics tools (BLAST, sequence analysis tools)
- Data analysis libraries (BioPython, pandas, numpy)
- API Keys:
```shell
export ANTHROPIC_API_KEY=your-api-key-here
```
Understanding the Agent Loop
The Claude Agent SDK is built around a core feedback loop, as described in the Anthropic engineering article:
- Gather Context: Search files, databases, and previous work
- Take Action: Execute tools, write code, make API calls
- Verify Work: Check results, validate outputs, iterate
This loop enables agents to work autonomously, iterating until they achieve the desired outcome.
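The loop above can be sketched in a few lines of JavaScript. Note that `gatherContext`, `takeAction`, and `verify` are hypothetical placeholders standing in for whatever tools your agent exposes; the SDK manages this loop for you, but the control flow is worth internalizing:

```javascript
// Minimal sketch of the gather -> act -> verify loop.
// gatherContext, takeAction, and verify are hypothetical callbacks,
// not part of the SDK API.
async function agentLoop(task, { gatherContext, takeAction, verify, maxIterations = 5 }) {
  let result = null;
  for (let i = 0; i < maxIterations; i++) {
    const context = await gatherContext(task, result); // 1. gather context
    result = await takeAction(task, context);          // 2. take action
    if (await verify(task, result)) return result;     // 3. verify; stop when good
  }
  throw new Error(`No satisfactory result after ${maxIterations} iterations`);
}
```

The key design choice is the `verify` step: without it, the loop degenerates into a single-shot call and the agent cannot iterate on its own mistakes.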
Biotech Agent Architecture
Architecture Overview
```
┌──────────────────────────────────────────────────────────┐
│                  Biotech Research Agent                  │
└────────────────────────────┬─────────────────────────────┘
                             │
       ┌─────────────────────┼─────────────────────┐
       │                     │                     │
       ▼                     ▼                     ▼
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   Context    │      │    Action    │      │ Verification │
│  Gathering   │      │  Execution   │      │  & Quality   │
│              │      │              │      │   Control    │
└──────┬───────┘      └──────┬───────┘      └──────┬───────┘
       │                     │                     │
       └─────────────────────┼─────────────────────┘
                             │
       ┌─────────────────────┼─────────────────────┐
       │                     │                     │
       ▼                     ▼                     ▼
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│  Literature  │      │     Data     │      │   Protocol   │
│    Search    │      │   Analysis   │      │  Management  │
└──────────────┘      └──────────────┘      └──────────────┘
```
Building the Biotech Agent
1. Basic Agent Setup
```javascript
// biotech-agent.js
import { Agent } from '@anthropic-ai/claude-agent-sdk';
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const agent = new Agent({
  name: 'biotech-researcher',
  model: 'claude-3-5-sonnet-20241022',
  systemPrompt: `You are a biotech research assistant specializing in drug discovery,
protein analysis, and experimental design. You help researchers by:
1. Searching scientific literature and databases
2. Analyzing experimental data
3. Managing protocols and experimental records
4. Generating research reports and documentation

You have access to scientific databases, analysis tools, and can write code to
process data and generate visualizations.`,
  tools: [
    // Tools will be defined below
  ],
  allowFileSystem: true, // Enable file system access
  allowBash: true,       // Enable bash commands
});
```
2. Gathering Context: Literature Search Tool
```javascript
// tools/literature-search.js
import axios from 'axios';

export const literatureSearchTool = {
  name: 'search_literature',
  description: 'Search scientific literature from PubMed, bioRxiv, and other sources',
  parameters: {
    type: 'object',
    properties: {
      query: {
        type: 'string',
        description: 'Search query (e.g., "protein kinase inhibitors", "CRISPR gene editing")'
      },
      database: {
        type: 'string',
        enum: ['pubmed', 'biorxiv', 'arxiv'],
        description: 'Database to search'
      },
      max_results: {
        type: 'number',
        description: 'Maximum number of results to return',
        default: 10
      }
    },
    required: ['query']
  },
  execute: async ({ query, database = 'pubmed', max_results = 10 }) => {
    try {
      if (database === 'pubmed') {
        // Search PubMed via NCBI E-utilities
        const response = await axios.get('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi', {
          params: {
            db: 'pubmed',
            term: query,
            retmax: max_results,
            retmode: 'json'
          }
        });
        const pmids = response.data.esearchresult.idlist;

        // Fetch abstracts
        const fetchResponse = await axios.get('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi', {
          params: {
            db: 'pubmed',
            id: pmids.join(','),
            retmode: 'xml'
          }
        });

        return {
          success: true,
          results: pmids.map((pmid, index) => ({
            pmid,
            title: `Paper ${index + 1}`,  // Parse from fetchResponse XML
            abstract: 'Abstract text...'  // Parse from fetchResponse XML
          })),
          count: pmids.length
        };
      }
      // Add other database searches (bioRxiv, etc.)
      return { success: false, error: 'Database not implemented' };
    } catch (error) {
      return { success: false, error: error.message };
    }
  }
};
```
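The tool above leaves the title and abstract as placeholders. A real implementation should parse the efetch XML with a proper parser (e.g. fast-xml-parser); a minimal regex-based sketch is enough to show the shape of the data:

```javascript
// Minimal sketch: pull <ArticleTitle> text out of PubMed efetch XML.
// A production tool should use a real XML parser instead of a regex.
function extractTitles(efetchXml) {
  const titles = [];
  const re = /<ArticleTitle>([\s\S]*?)<\/ArticleTitle>/g;
  let match;
  while ((match = re.exec(efetchXml)) !== null) {
    // Collapse whitespace that efetch wraps across lines
    titles.push(match[1].replace(/\s+/g, ' ').trim());
  }
  return titles;
}
```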
3. Gathering Context: Protein Database Search
```javascript
// tools/protein-search.js
import axios from 'axios';

export const proteinSearchTool = {
  name: 'search_protein',
  description: 'Search protein databases (UniProt, PDB) for protein information',
  parameters: {
    type: 'object',
    properties: {
      protein_id: {
        type: 'string',
        description: 'Protein ID (UniProt ID, PDB ID, or gene name)'
      },
      database: {
        type: 'string',
        enum: ['uniprot', 'pdb', 'ncbi'],
        description: 'Database to search'
      }
    },
    required: ['protein_id']
  },
  execute: async ({ protein_id, database = 'uniprot' }) => {
    try {
      if (database === 'uniprot') {
        // UniProt REST API (the legacy www.uniprot.org/uniprot endpoint is deprecated)
        const response = await axios.get(`https://rest.uniprot.org/uniprotkb/${protein_id}.json`);
        return {
          success: true,
          protein: {
            id: response.data.primaryAccession,
            name: response.data.proteinDescription?.recommendedName?.fullName?.value,
            sequence: response.data.sequence?.value,
            function: response.data.comments?.find(c => c.commentType === 'FUNCTION')?.texts?.[0]?.value,
            domains: response.data.features?.filter(f => f.type === 'Domain')?.map(d => ({
              name: d.description,
              start: d.location.start.value,
              end: d.location.end.value
            }))
          }
        };
      }
      if (database === 'pdb') {
        const response = await axios.get(`https://data.rcsb.org/rest/v1/core/entry/${protein_id}`);
        return {
          success: true,
          structure: {
            id: response.data.rcsb_id,
            title: response.data.struct?.title,
            resolution: response.data.rcsb_entry_info?.resolution_combined?.[0]
          }
        };
      }
      return { success: false, error: 'Database not implemented' };
    } catch (error) {
      return { success: false, error: error.message };
    }
  }
};
```
4. Taking Action: Data Analysis Tool
```javascript
// tools/data-analysis.js
export const dataAnalysisTool = {
  name: 'analyze_experimental_data',
  description: 'Analyze experimental data from assays, generate statistics, and create visualizations',
  parameters: {
    type: 'object',
    properties: {
      data_file: {
        type: 'string',
        description: 'Path to data file (CSV, Excel, or JSON)'
      },
      analysis_type: {
        type: 'string',
        enum: ['dose_response', 'binding_affinity', 'time_series', 'comparative'],
        description: 'Type of analysis to perform'
      },
      output_format: {
        type: 'string',
        enum: ['summary', 'plot', 'table'],
        description: 'Desired output format'
      }
    },
    required: ['data_file', 'analysis_type']
  },
  execute: async ({ data_file, analysis_type, output_format = 'summary' }) => {
    // This tool provides the interface; the actual analysis happens via
    // Python code the agent generates and runs.
    return {
      success: true,
      message: 'Analysis will be performed via generated Python code',
      data_file,
      analysis_type
    };
  }
};
```
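When the agent generates an analysis script, something has to execute it. The SDK's bash tool handles this, but a hand-rolled equivalent (a hypothetical helper, not part of the SDK) shows what happens under the hood:

```javascript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Hypothetical helper: run an agent-generated script and capture stdout,
// e.g. runScript('python3', ['analysis.py', 'dose_response_data.csv']).
async function runScript(interpreter, args) {
  const { stdout, stderr } = await execFileAsync(interpreter, args);
  if (stderr) console.warn(stderr); // surface warnings back to the agent loop
  return stdout.trim();
}
```

Using `execFile` rather than `exec` avoids shell interpolation of file names, which matters when paths come from model output.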
5. Taking Action: Protocol Management
```javascript
// tools/protocol-manager.js
import fs from 'fs/promises';
import path from 'path';

export const protocolManagerTool = {
  name: 'manage_protocol',
  description: 'Create, update, or retrieve lab protocols',
  parameters: {
    type: 'object',
    properties: {
      action: {
        type: 'string',
        enum: ['create', 'update', 'retrieve', 'list'],
        description: 'Action to perform'
      },
      protocol_name: {
        type: 'string',
        description: 'Name of the protocol'
      },
      protocol_content: {
        type: 'string',
        description: 'Protocol content (for create/update)'
      }
    },
    required: ['action']
  },
  execute: async ({ action, protocol_name, protocol_content }) => {
    const protocolsDir = path.join(process.cwd(), 'protocols');
    try {
      await fs.mkdir(protocolsDir, { recursive: true });

      if (action === 'create' || action === 'update') {
        if (!protocol_name || !protocol_content) {
          return { success: false, error: 'Protocol name and content required' };
        }
        const filePath = path.join(protocolsDir, `${protocol_name}.md`);
        await fs.writeFile(filePath, protocol_content);
        return {
          success: true,
          message: `Protocol ${action}d successfully`,
          path: filePath
        };
      }

      if (action === 'retrieve') {
        if (!protocol_name) {
          return { success: false, error: 'Protocol name required' };
        }
        const filePath = path.join(protocolsDir, `${protocol_name}.md`);
        const content = await fs.readFile(filePath, 'utf-8');
        return { success: true, protocol_name, content };
      }

      if (action === 'list') {
        const files = await fs.readdir(protocolsDir);
        const protocols = files.filter(f => f.endsWith('.md'));
        return {
          success: true,
          protocols: protocols.map(p => p.replace('.md', ''))
        };
      }

      return { success: false, error: 'Invalid action' };
    } catch (error) {
      return { success: false, error: error.message };
    }
  }
};
```
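One caveat: `protocol_name` flows straight into a file path, so a model-supplied name like `../../etc/passwd` could escape the protocols directory. A real deployment should sanitize it first; a minimal guard (hypothetical, not part of the SDK) could be:

```javascript
// Hypothetical guard: restrict protocol names to a safe character set
// so path-traversal sequences can never escape the protocols directory.
function safeProtocolFileName(protocolName) {
  const cleaned = protocolName.trim().replace(/[^a-zA-Z0-9 _-]/g, '');
  if (!cleaned) throw new Error(`Unsafe protocol name: ${protocolName}`);
  return `${cleaned}.md`;
}
```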
6. Complete Agent with All Tools
```javascript
// biotech-agent.js
import { Agent } from '@anthropic-ai/claude-agent-sdk';
import { literatureSearchTool } from './tools/literature-search.js';
import { proteinSearchTool } from './tools/protein-search.js';
import { dataAnalysisTool } from './tools/data-analysis.js';
import { protocolManagerTool } from './tools/protocol-manager.js';

const agent = new Agent({
  name: 'biotech-researcher',
  model: 'claude-3-5-sonnet-20241022',
  systemPrompt: `You are a biotech research assistant with expertise in:
- Drug discovery and development
- Protein structure and function analysis
- Experimental design and data analysis
- Scientific literature review
- Protocol management and compliance

When working with researchers:
1. Always verify data sources and cite references
2. Ensure protocols comply with regulatory requirements
3. Maintain accurate experimental records
4. Generate clear, reproducible analyses

You have access to:
- Scientific databases (PubMed, UniProt, PDB)
- Data analysis capabilities (Python, R)
- Protocol management system
- File system for organizing research data`,
  tools: [
    literatureSearchTool,
    proteinSearchTool,
    dataAnalysisTool,
    protocolManagerTool
  ],
  allowFileSystem: true,
  allowBash: true,
});

// Run the agent
async function main() {
  const query = process.argv[2] || 'Search for recent papers on kinase inhibitors';
  const response = await agent.run(query);
  console.log(response);
}

main().catch(console.error);
```
Advanced Features
1. Using Subagents for Parallel Research
```javascript
// The agent can spawn subagents for parallel tasks
const agent = new Agent({
  // ... configuration
  allowSubagents: true, // Enable subagents
  systemPrompt: `... You can use subagents to:
- Search multiple databases in parallel
- Analyze different datasets simultaneously
- Review multiple papers concurrently
...`
});
```
2. Code Generation for Data Analysis
The agent can write Python scripts for complex analyses:
```javascript
// Example: Agent-generated Python code for dose-response analysis
const analysisCode = `
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Load experimental data
data = pd.read_csv('dose_response_data.csv')

# Fit dose-response curve (four-parameter Hill equation)
def hill_equation(x, EC50, nH, bottom, top):
    return bottom + (top - bottom) / (1 + (EC50 / x) ** nH)

params, _ = curve_fit(hill_equation, data['concentration'], data['response'])
EC50 = params[0]

# Concentration grid for plotting the fitted curve
x_fit = np.logspace(np.log10(data['concentration'].min()),
                    np.log10(data['concentration'].max()), 200)

# Generate plot
plt.figure(figsize=(10, 6))
plt.plot(data['concentration'], data['response'], 'o', label='Data')
plt.plot(x_fit, hill_equation(x_fit, *params), label='Fit')
plt.xscale('log')
plt.xlabel('Concentration (nM)')
plt.ylabel('Response (%)')
plt.title(f'Dose-Response Curve (EC50 = {EC50:.2f} nM)')
plt.legend()
plt.savefig('dose_response_plot.png')
`;
```
3. Verification: Quality Control
```javascript
// tools/quality-control.js
export const qualityControlTool = {
  name: 'validate_experimental_data',
  description: 'Validate experimental data for quality, completeness, and compliance',
  parameters: {
    type: 'object',
    properties: {
      data_file: {
        type: 'string',
        description: 'Path to data file'
      },
      validation_rules: {
        type: 'array',
        items: { type: 'string' },
        description: 'List of validation rules to apply'
      }
    },
    required: ['data_file']
  },
  execute: async ({ data_file, validation_rules = [] }) => {
    // The agent writes the validation code; this tool provides the
    // structure for quality checks.
    return {
      success: true,
      message: 'Validation will be performed via generated code',
      validation_rules
    };
  }
};
```
Practical Workflow Examples
Example 1: Literature Review Workflow
```javascript
// User query
const query = `
I'm researching protein kinase inhibitors for cancer treatment.
Please:
1. Search PubMed for recent papers (last 2 years)
2. Find relevant protein targets in UniProt
3. Summarize key findings
4. Create a research report
`;

const response = await agent.run(query);
```
What the agent does:
- Uses `search_literature` to find papers
- Extracts protein names from abstracts
- Uses `search_protein` for each protein
- Generates a comprehensive report with citations
Example 2: Experimental Data Analysis
```javascript
const query = `
Analyze the experimental data in assays/binding_assay_2025_01.csv.
This is a binding affinity assay. Please:
1. Calculate IC50 values
2. Generate dose-response curves
3. Compare results across compounds
4. Create a summary report with visualizations
`;

const response = await agent.run(query);
```
What the agent does:
- Reads the CSV file
- Writes Python code to analyze binding data
- Generates plots and statistics
- Creates a formatted report
Example 3: Protocol Management
```javascript
const query = `
I need to update the "Cell Culture Protocol" with new safety requirements.
Retrieve the current protocol, add the new requirements, and save the updated version.
Also create a compliance checklist.
`;

const response = await agent.run(query);
```
What the agent does:
- Uses `manage_protocol` to retrieve the existing protocol
- Updates the content with the new requirements
- Saves the updated protocol
- Generates a compliance checklist
Best Practices for Biotech Agents
1. Context Gathering
File System as Context: Organize research data in a structured way:
```
research/
├── literature/
│   ├── papers/
│   └── summaries/
├── experiments/
│   ├── raw_data/
│   ├── analyzed/
│   └── reports/
├── protocols/
└── databases/
```
The agent can search these directories using bash commands like `grep`, `find`, and `cat`.
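For example, the kinds of searches the agent issues look like this (the setup lines just create a tiny scratch copy of the layout so the sketch is runnable; real queries run against your actual research tree):

```shell
# Setup for a runnable sketch: create a tiny research/ tree
mkdir -p research/experiments/reports research/experiments/raw_data research/protocols
echo "Binding data for ABL1 kinase" > research/experiments/reports/abl1_binding.md
touch research/experiments/raw_data/binding_assay_2025_01.csv

# Find every report that mentions a target protein (case-insensitive)
grep -ril "abl1" research/experiments/reports/

# Locate raw data files by date stamp
find research/experiments/raw_data -name "*2025_01*"
```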
Semantic Search (optional): For very large document collections, consider adding vector search, but start with agentic file system search first.
2. Tool Design
Primary Actions: Design tools around the most common research tasks:
- Literature searching
- Protein/database queries
- Data analysis
- Protocol management
Composability: Tools should work together. For example, literature search → extract protein names → query UniProt → analyze sequences.
3. Verification Strategies
Rules-Based Validation:
```javascript
// Define validation rules for experimental data
const validationRules = {
  requiredFields: ['compound_id', 'concentration', 'response'],
  valueRanges: {
    concentration: { min: 0, max: 10000 }, // nM
    response: { min: 0, max: 100 }         // percent
  },
  dataQuality: {
    minReplicates: 3,
    maxCV: 0.2 // maximum coefficient of variation
  }
};
```
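These rules can be enforced with a small validator that runs before any analysis; a minimal sketch covering the `requiredFields` and `valueRanges` checks:

```javascript
// Minimal sketch: apply rules-based validation to an array of data rows.
function validateRows(rows, rules) {
  const errors = [];
  for (const [i, row] of rows.entries()) {
    // Every required field must be present
    for (const field of rules.requiredFields) {
      if (!(field in row)) errors.push(`row ${i}: missing field "${field}"`);
    }
    // Numeric fields must fall inside their allowed range
    for (const [field, { min, max }] of Object.entries(rules.valueRanges)) {
      const v = row[field];
      if (typeof v === 'number' && (v < min || v > max)) {
        errors.push(`row ${i}: ${field}=${v} outside [${min}, ${max}]`);
      }
    }
  }
  return { valid: errors.length === 0, errors };
}
```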
Code Linting: When the agent generates analysis code, run linters such as pylint, flake8, or mypy over it to catch errors before execution.
Visual Feedback: For plots and visualizations, the agent can review generated images to verify they match requirements.
Testing Your Agent
1. Representative Test Cases
```javascript
// test-cases.js
const testCases = [
  {
    name: 'Literature search',
    query: 'Find recent papers on CRISPR gene editing',
    expectedTools: ['search_literature'],
    expectedOutput: 'List of papers with abstracts'
  },
  {
    name: 'Protein analysis',
    query: 'Get information about protein P00519 (ABL1)',
    expectedTools: ['search_protein'],
    expectedOutput: 'Protein details including sequence and function'
  },
  {
    name: 'Data analysis',
    query: 'Analyze dose-response data in assays/compound_001.csv',
    expectedTools: ['analyze_experimental_data'],
    expectedOutput: 'IC50 values and dose-response plot'
  }
];

// Run tests
for (const testCase of testCases) {
  const response = await agent.run(testCase.query);
  // Validate response against expectedTools and expectedOutput
}
```
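One concrete check for the test cases above is whether the agent actually invoked the expected tools. Assuming the SDK response exposes the tools it called (the `toolsUsed` list here is a hypothetical shape, not a documented field), the check reduces to:

```javascript
// Hypothetical checker: did the agent call every tool the test case expects?
function checkToolUsage(testCase, toolsUsed) {
  const missing = testCase.expectedTools.filter(t => !toolsUsed.includes(t));
  return { passed: missing.length === 0, missing };
}
```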
2. Error Handling
Monitor agent failures and improve:
- Missing information → improve search tools
- Repeated failures → add validation rules
- Can’t fix errors → provide better tools
- Performance varies → build test suite
Deployment Considerations
1. Security
- API Keys: Store securely, never in code
- Data Access: Limit file system access to research directories
- Network: Restrict external API calls to approved databases
2. Performance
- Caching: Cache database queries and search results
- Parallelization: Use subagents for independent tasks
- Compaction: Enable automatic context summarization for long sessions
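Database lookups are good caching candidates because UniProt and PubMed records change slowly relative to an agent session. A minimal in-memory cache with a time-to-live (a sketch, not an SDK feature) illustrates the pattern:

```javascript
// Minimal in-memory cache with time-to-live, keyed by query string.
function createCache(ttlMs = 60_000) {
  const store = new Map();
  return {
    async get(key, fetchFn) {
      const hit = store.get(key);
      if (hit && Date.now() - hit.time < ttlMs) return hit.value; // fresh hit
      const value = await fetchFn();                              // miss: fetch
      store.set(key, { value, time: Date.now() });
      return value;
    },
  };
}
```

Wrapping a tool's `execute` with `cache.get(queryString, () => realFetch())` avoids re-hitting the same endpoint across loop iterations.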
3. Compliance
- Audit Logs: Record all agent actions
- Data Retention: Follow research data retention policies
- Regulatory: Ensure generated documentation meets requirements
Conclusion
The Claude Agent SDK enables building powerful biotech research agents by:
- Giving Claude a Computer: Access to file system, terminal, and tools
- Agent Loop: Gather context → take action → verify work
- Specialized Tools: Literature search, protein databases, data analysis
- Code Generation: Write analysis scripts, create visualizations
- Iteration: Agents can refine their work until it meets requirements
Key takeaways:
- Design tools around primary research workflows
- Use file system structure as context engineering
- Enable code generation for complex analyses
- Implement verification at each step
- Test with representative use cases
- Monitor and improve based on failures
By following the agent loop pattern and providing the right tools, you can build biotech agents that work like human researchers—searching, analyzing, and iterating until they produce high-quality results.
Resources
- Building Agents with Claude Agent SDK - Anthropic Engineering Blog
- Claude Agent SDK Documentation - Official SDK docs
- Writing Effective Tools for Agents - Best practices for tool design
- Model Context Protocol (MCP) - Standardized integrations
- BioPython Documentation - Bioinformatics tools
- NCBI E-utilities - PubMed API