LLM File Processor

Version 1.0.0 | License: MIT | Built with Node.js

Automate, standardize, and enrich your files at scale with LLM-powered transformations

A flexible Node.js CLI that applies custom LLM prompts to files or entire directories—turn unstructured documentation, code, or data into consistent, structured, and actionable outputs with minimal effort.

Key Features

  • Rule-Driven Workflows: Define a single prompt file containing transformation rules, and let the CLI enforce them across every input file.
  • LLM-Agnostic: Swap models or providers via environment variables; works with any OpenAI-compatible API endpoint.
  • Batch & Parallel Processing: Process individual files or entire directories in configurable batch sizes, with optional delays for rate-limiting.
  • Dry-Run Mode: Preview combined prompts without making API calls, perfect for testing and validation.
  • JSON-First Output: Receive clean, machine-readable JSON responses for seamless integration into pipelines.
  • Prompt Validation: Built-in LLM-based prompt sanity checks to ensure your rules translate into valid transformations.
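
The batch-and-delay behavior described above can be pictured with a minimal sketch. This is not the project's actual implementation: `chunk`, `sleep`, and `processInBatches` are hypothetical names, and `processFile` stands in for whatever async function sends one file to the LLM. The defaults mirror the documented CLI defaults (batch size 1, delay 500 ms).

```javascript
// Hypothetical helper: split an array into batches of `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("batch size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run every file in a batch concurrently, then pause before the next batch.
// `processFile` is a caller-supplied async function (an assumption of this sketch).
async function processInBatches(files, processFile, { batchSize = 1, delayMs = 500 } = {}) {
  const results = [];
  const batches = chunk(files, batchSize);
  for (const [index, batch] of batches.entries()) {
    const batchResults = await Promise.all(batch.map((file) => processFile(file)));
    results.push(...batchResults);
    // No need to wait after the final batch.
    if (index < batches.length - 1) await sleep(delayMs);
  }
  return results;
}
```

Running files concurrently within a batch while pausing between batches keeps throughput reasonable and leaves room to stay under a provider's rate limit.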

Use Cases

  • Uniform Documentation: Standardize a scattered collection of Markdown files: add TOCs, enforce heading hierarchies, flag missing sections, and generate summary sections automatically.
  • Web Content Summarization: Crawl or aggregate dozens (or hundreds) of web pages, then compress and transform them into structured in-context learning data for your next prompt-engineering or fine-tuning project.
  • Automated Code Review & Linting: Feed diffs or code snippets through custom prompts to enforce style guides, detect anti-patterns, and suggest refactors at scale.
  • Test Case Generation: Generate unit or integration tests by providing source files and rules for expected behaviors; ideal for accelerating test coverage in legacy codebases.
  • Changelog & Release Notes: Scan commit messages or diff logs, then automatically produce human-friendly change summaries and release notes in your preferred format.
  • Data Extraction & Metadata Tagging: Transform CSVs, logs, or JSON files by extracting key fields, tagging records, or reformatting data for downstream analytics.
  • Migration of Legacy Formats: Batch-convert legacy documentation, configuration files, or proprietary formats into modern standards (e.g., Markdown → Markdown with frontmatter, YAML → JSON).
  • Localization & Internationalization: Automate translation or adaptation of text files by applying LLM-based translation prompts, with markers for review or missing strings.
  • CI/CD Integration: Incorporate the CLI into Git hooks or CI pipelines to enforce content and code health checks on every commit or pull request.
  • Training Data Preparation: Generate clean, structured training examples by defining in-context learning rules; ideal for building your own LLM benchmarks or fine-tuning datasets.

Installation

# Clone repository
git clone https://github.com/samestrin/llm-file-processor.git
cd llm-file-processor

# Install dependencies
npm install

# Make CLI executable
chmod +x llm-file-processor.js

# (Optional) Link globally
npm link

Configuration

Create a .env file in the project root:

OPENAI_API_KEY=your_api_key_here
# Model identifier, e.g. gpt-4.1
OPENAI_MODEL=your-model-identifier

Tip: Use any OpenAI-compatible endpoint by setting the OPENAI_API_URL environment variable.
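
As a rough illustration of how these variables might drive a request, here is a minimal sketch. It is not taken from the project's source: `loadConfig` and `buildChatRequest` are hypothetical names, and the sketch assumes that `OPENAI_API_URL` holds a base URL (ending in `/v1`) to which the standard `/chat/completions` path is appended. The tool itself may interpret the variable differently.

```javascript
// Read the settings named in this README from an environment object.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) throw new Error("OPENAI_API_KEY is not set");
  if (!env.OPENAI_MODEL) throw new Error("OPENAI_MODEL is not set");
  return {
    apiKey: env.OPENAI_API_KEY,
    model: env.OPENAI_MODEL,
    // Assumption: the variable is a base URL; default to OpenAI's public API.
    baseUrl: (env.OPENAI_API_URL || "https://api.openai.com/v1").replace(/\/+$/, ""),
  };
}

// Build the URL and fetch options for an OpenAI-compatible chat completion call.
function buildChatRequest(config, prompt) {
  return {
    url: `${config.baseUrl}/chat/completions`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify({
        model: config.model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

On Node.js 18 or later, the result can be passed straight to the built-in `fetch`, for example `const { url, options } = buildChatRequest(loadConfig(), prompt); await fetch(url, options);`.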

Usage

# Process a single file
llm-file-processor --prompt-file path/to/prompt.txt --file path/to/doc.md

# Process an entire directory
llm-file-processor --prompt-file path/to/prompt.txt --directory path/to/project/docs

# Preview prompts without API calls
llm-file-processor -p prompt.txt -f file.md --dry-run

# Generate test files with modified filenames
llm-file-processor -p test-generation.txt -f userAuthentication.js --insert-before-ext ".test"

# Process log files and output as JSON
llm-file-processor -p extract-data.txt -d logs/ --output-ext json

# Process multiple files and merge results into a single output
llm-file-processor -p extract-data.txt -d logs/ -m json

# Process files and merge with custom extension
llm-file-processor -p summarize.txt -d articles/ -m md --output-ext summary.md

# Batch process with custom settings
llm-file-processor -p rules.txt -d src -b 5 --delay 1000

CLI Options

Option                        Description
-p, --prompt-file <file>      Path to the prompt file (required)
-f, --file <file>             Path to a single file to process
-d, --directory <dir>         Path to a directory of files to process
-o, --output <dir>            Specify a custom output directory (default: ./processed-<timestamp>)
--insert-before-ext <text>    Insert text before file extension (e.g., ".test" for "file.test.js" from "file.js")
--output-ext <extension>      Change or add file extension (e.g., "json" to save as "file.log.json")
-m, --merge <filename>        Merge all processed files into a single output file
--dry-run                     Combine prompts and files without sending to LLM
-b, --batch-size <number>     Number of files per batch (default: 1)
--delay <ms>                  Milliseconds to wait between API batches (default: 500)
-h, --help                    Display help information
-v, --version                 Display version information
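
The two filename options can be read as a simple transformation of the input name. The sketch below reproduces the two examples from the table; `outputName` is a hypothetical helper, and how the real CLI behaves when both options are combined is an assumption here.

```javascript
const path = require("node:path");

// Hypothetical helper: derive an output file name from an input path.
// insertBeforeExt is placed between the base name and the original extension;
// outputExt is appended after the original extension (as in "file.log.json").
function outputName(inputPath, { insertBeforeExt = "", outputExt = "" } = {}) {
  const { name, ext } = path.parse(inputPath);
  let result = `${name}${insertBeforeExt}${ext}`;
  if (outputExt) {
    // Accept both "json" and ".json".
    result += `.${outputExt.replace(/^\./, "")}`;
  }
  return result;
}

console.log(outputName("file.js", { insertBeforeExt: ".test" })); // file.test.js
console.log(outputName("file.log", { outputExt: "json" })); // file.log.json
```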

Writing Effective Prompts

Craft transformation rules in your prompt file to guide the LLM. Example:

1. Generate a table of contents.
2. Normalize all headings to Markdown `##`, `###`, etc.
3. Flag sections missing a required `Summary` header.
4. Append a `## Key Takeaways` section at the end.
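
To make the relationship between the prompt file, the input file, and `--dry-run` concrete, here is a minimal sketch. The delimiter layout and the names `combinePrompt` and `processFile` are assumptions made for illustration; the real CLI may combine rules and file content differently.

```javascript
// Hypothetical layout: the rules first, then the file content under a labelled delimiter.
function combinePrompt(rules, fileName, fileContent) {
  return [rules.trim(), "", `--- FILE: ${fileName} ---`, fileContent].join("\n");
}

// In dry-run mode, return the combined prompt without calling the LLM.
// `send` is a caller-supplied async function that submits a prompt (an assumption).
async function processFile(rules, fileName, fileContent, { dryRun = false, send } = {}) {
  const prompt = combinePrompt(rules, fileName, fileContent);
  if (dryRun) return { fileName, prompt, sent: false };
  return { fileName, response: await send(prompt), sent: true };
}
```

Inspecting the combined prompt this way before spending API calls is a cheap check that the rules and the file content line up as intended.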

Contribute

Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes or improvements.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Keywords

llm

Package last updated on 16 May 2025
